Today’s post is contributed by our Summer 2011 team intern, Chris Bunch. Chris did some great work on our Logs and MapReduce APIs and is also the first “App Engine Triple Crown” winner for developing the Experimental Logs Reader API in Python, Java and Go simultaneously.
Four years ago, I was a brand-new Ph.D. student at the University of California, Santa Barbara, and when our research group (the RACELab) heard about Google App Engine, we were intrigued. We thought it presented a new model that enabled apps to scale the right way without severely constricting the types of programs users could write.
But we wanted to experiment with the core functionality of App Engine: the APIs, the scheduler, etc., and so we built AppScale, an open-source implementation of the Google App Engine APIs that allows users to deploy applications written in Python, Java, and Go to the infrastructure of their choice.
Wherever possible, we implement support for the App Engine APIs with alternative open-source technologies. We’ve added support for nine different databases, database-agnostic transactions, a REST interface that users of any programming language can communicate with (via an App Engine app), and the ability to run high-performance computing programs over the whole thing and talk to them from your App Engine app. And here’s my favorite part: it all deploys automatically! You don’t need to tell it what block size you want for the distributed file system or the size of the read buffers; we configure the necessary services automatically. And since AppScale is completely open source, if you don’t like the defaults, change them!
After creating our own system to run Google App Engine apps, I wanted to see how Google does it. Therefore, I decided to become an intern on the App Engine team and see if I could give them (and by extension, the App Engine community) something amazing over the summer. I started off with some work on the MapReduce API, making the sample app much easier to use and prettier all around. I also made a YouTube video showing how it all works and how easy it is to run MapReduce jobs over App Engine.
I then looked at a recurring question that App Engine users encounter: “How can I get at my application’s logging information to answer data analytics questions?” It was an excellent problem to tackle, since many users want to answer application-specific queries that Google Analytics and the Admin Console don’t cover. Currently, users have to use appcfg to download all of their application’s logs to a remote machine and run an analysis script over them.
To solve this problem, I created the Logs API, which gives applications programmatic access to their logs from within App Engine itself. Applications can use it to query small numbers of logs within a single request, and they can combine it with the Pipeline, MapReduce, or Backends APIs when they have lots of logs to analyze. Logs contain both request-level information (e.g., the URL accessed and the HTTP response code returned) and logging info generated by the application (via the logging module in Python, the Logger class in Java, or the logging methods that Go’s appengine package provides). As of App Engine 1.6.1, the Logs API is available to programmers using the Python, Java, or Go runtimes, in both the production environment and the local SDK.
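To give a flavor of what this looks like in Python, here is a minimal sketch of fetching recent request logs from inside your app. Treat it as illustrative: it assumes the logservice.fetch() call and the RequestLog/AppLog attribute names shown below, so check the official reference for the exact signatures.

```python
import logging
import time

from google.appengine.api.logservice import logservice


def summarize_recent_requests(hours=1):
    """Logs a one-line summary for each request served in the last `hours`."""
    end = time.time()
    start = end - hours * 3600

    # Fetch completed requests in the time window, including the
    # application-level log lines each request emitted. minimum_log_level
    # restricts results to requests that logged at INFO or above.
    for request_log in logservice.fetch(start_time=start,
                                        end_time=end,
                                        minimum_log_level=logservice.LOG_LEVEL_INFO,
                                        include_app_logs=True):
        # Request-level information: the URL accessed and the HTTP status.
        logging.info('%s -> %d', request_log.resource, request_log.status)

        # Application-level lines written via the logging module.
        for app_log in request_log.app_logs:
            logging.info('    %s', app_log.message)
```

For anything more than a handful of requests, you would hand a fetch like this to a MapReduce or Pipeline job (or run it on a backend) rather than doing all the work inside a single user-facing request.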
I had a great time putting the Logs API together, and had a unique experience interning with the App Engine team. Programming in Python, Java, and Go on a daily basis was an exciting new challenge, and I loved it!
Interested in interning with the App Engine team? Check out google.com/students for more information on internships.