How I participated in the AppFog OpenStack Birthday Contest
This week Boris Renski and I are visiting OSCON 2012 in Portland, OR. So far it's been a lot of fun. One of the guys we met here was Lucas Carlson, CEO of AppFog, who gave us an invitation to join their contest.
The rules were very simple. You had to write an obfuscated program which outputs "Happy Birthday OpenStack", and deploy it using AppFog. I have to admit it's a great marketing move. Judging by the fact that the "last 10 submissions" window was constantly changing at http://openstack.appfog.com/, I think more than a hundred of OSCON developers have successfully created and deployed their first app using AppFog.
So I decided to give it a try as well. (No, I don't work for Google. It's just a t-shirt).
After reviewing a number of existing submissions it became obvious that 75% of them follow a very straightforward approach of having encoded versions of "Happy Birthday OpenStack" in the code. Think along the lines of Base64, ROT13, large arrays with constants and magic numbers. I have also seen unreadable code which seems to have incorrect syntax from the first glance, but actually can be executed by the language interpreter. There were also nicely formatted programs out there that look more like pictures formed out of pseudo graphics. People were even crawling web pages and scraping the required phrase out of them.
But I wanted to take a try at something quite different and write source code in such a way that you can read and understand every individual line, but can't quite tell why it actually prints what is supposed to print.
Part 1. Random
The main idea that came to my mind was that "Happy Birthday OpenStack" could possibly be printed using a random number generator. To start, I reduced the alphabet to 27 characters by lowercasing everything and making it "happy birthday openstack". Then, to make the alphabet contiguous, every space was replaced with "{" which actually equals to "z"+1, and the entire string became "happy{birthday{openstack".
Now it was a question of generating sequences of characters 'a' - '{' and finding the right rand seed to start from. So, let's imagine we are to generate all possible strings of 27 characters that have length of 24 (it's the length of the string about openstack that needs to be produced). The total amount of different strings is 27^24 and we actually only need one of those. It's quite impossible to deal with that large number of combinations, so it makes sense to split the entire string into 4 parts having 6 characters each (fortunately 24 = 4 * 6). So, the probability that a given rand seed produces the required 6 characters is 1/(27^6). It means that likely 1 out of 387420489 different rand seeds is going to be the right one. It's something we can deal with! In fact it's just a matter of brute forcing different rand seeds and checking if one produces the required substring. A few minutes of coding plus 12 minutes of computation time, and now we have the magic seeds to feed into random number generator: 873257270, 29148309632, 535114769, and 124848222.
Part 2. Hiding digits
Now I thought it would be nice to get rid of the integer constants that I have just found. It's quite hard to do as they still have to be encoded inside the program in some way, but the point is that they would rather not be on the surface. To keep the source code very readable, I decided to find English words that would hash to the given numbers using my own hash functions. Another 18 minutes of computation time over a large English dictionary and also over Hamlet by William Shakespeare, and I had all the words with the properties I needed. There were multiple options, but I ended up with choosing the following subset: "unpleasure", "indolence", "enumeration", and "hotdog". It made the things better - instead of having 4 numbers with 9 digits each, the code started to contain 4 different hash functions and a number of smaller integers, plus the chosen words themselves of course.
It would also be cool to hide numbers in the code structure itself (e.g. you can pull method return types, method name, method parameters through Java Reflection and construct the final numbers out of this data)!
Part 3. Reducing the number of constants
To keep the things more interesting, I am going to stop right there and leave the rest for you guys to figure out! The program is here and it generates "Happy Birthday OpenStack" out of nowhere (well, almost).