Load Testing Arc2Earth Cloud Services
In this post we’ll look at how well A2E CS holds up under heavy concurrent load on Google’s production servers and what that means in terms of cost. Right now Google AppEngine is still a “preview release”, but billing has been activated and it has proven to be a relatively stable platform. Also, at this time, A2E CS will only run on Google’s implementation of AppEngine. However, there are several open source projects in the works that should allow you to run it on your own server farm or on another Cloud infrastructure provider like EC2…although I would highly recommend staying on Google’s.
Today we will be testing two different types of functionality: simple spatial searches on vector Datasources, and using a TileSet’s staticmap api to draw a single map image from multiple tiles. This represents a hypothetical iPhone webapp where layers are drawn over the standard base layers (for instance, those in Google Maps V3) as a single image and where the user can click on the map to search for features. The datasources and test scenarios:
- US County polygons (3100 features, 30+ attributes)
- ZipCode polygons (30K features, 10 attributes)
- US Counties TileSet drawn with Population renderer. Zoom levels 4-8
- US County Identify – spatial search based on a random lat/lng, buffering the point by 10K meters (polygon/polygon search). Returns GeoJSON
- US ZipCodes Identify – spatial search based on a random lat/lng (point/polygon search). Returns GeoJSON
- Static Map – creates a 300x400 image of the County Population tileset based on a random center lat/lng and zoom level. Returns a raw image
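The county identify scenario buffers the clicked point by 10,000 meters before running the polygon/polygon search. A2E CS does this server-side; purely as an illustration of the idea (this function is hypothetical, not part of the A2E CS API), here is a rough sketch of turning a lat/lng point plus a distance in meters into a search polygon, using a simple equirectangular approximation:

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius

def buffer_point(lat, lng, dist_m, segments=32):
    """Approximate a circular buffer around (lat, lng) as a closed ring of
    lat/lng vertices roughly dist_m meters from the center.
    Equirectangular approximation: fine for small buffers away from the
    poles, not a substitute for a real projected buffer."""
    ring = []
    # Meters per degree of latitude is near-constant; longitude degrees
    # shrink by cos(latitude).
    m_per_deg_lat = math.pi * EARTH_RADIUS_M / 180.0
    m_per_deg_lng = m_per_deg_lat * math.cos(math.radians(lat))
    for i in range(segments):
        theta = 2.0 * math.pi * i / segments
        ring.append((lat + dist_m * math.sin(theta) / m_per_deg_lat,
                     lng + dist_m * math.cos(theta) / m_per_deg_lng))
    ring.append(ring[0])  # close the ring, GeoJSON-style
    return ring

# Buffer a point in Kansas by 10K meters, as in the county identify scenario
poly = buffer_point(38.5, -98.0, 10000)
```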
To test this, I altered this python script created by Joe Gregorio, one of the engineers on the Google AppEngine team. The test script randomly runs one of the above scenarios across 15 different threads, downloads the response and immediately chooses another scenario. When choosing random lat/lng values, I tried to cover as much of the US as possible, especially the dense east coast, without too much open water. I then ran the script for 30 minutes from my local desktop and from two different EC2 instances, so that bandwidth from any single location would not limit the potential throughput. Below you can see the Google AppEngine dashboard for the test period; we achieved about 60 requests/second during peak load. That’s a little over 100K requests with about 1.5GB of outgoing bandwidth (json, image data). Average response times were in the 200-300ms range.
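The script described above is Joe Gregorio’s, modified; as a rough sketch of the same idea in modern Python 3 (the endpoint URLs below are placeholders, not the real A2E CS URL scheme, and the bounding box is an assumption about the coverage described):

```python
import random
import threading
import urllib.request

# Rough continental-US bounding box; the real script biased its random
# points toward the denser east coast.
LAT_RANGE = (25.0, 49.0)
LNG_RANGE = (-124.0, -67.0)

# Placeholder endpoints -- NOT the real A2E CS URL scheme.
SCENARIOS = {
    "county_identify": "http://example.appspot.com/identify/counties?lat={lat}&lng={lng}",
    "zip_identify":    "http://example.appspot.com/identify/zipcodes?lat={lat}&lng={lng}",
    "static_map":      "http://example.appspot.com/staticmap/population?center={lat},{lng}&zoom={zoom}",
}

def build_request():
    """Pick a random scenario and fill in random lat/lng (and zoom)."""
    name = random.choice(list(SCENARIOS))
    lat = random.uniform(*LAT_RANGE)
    lng = random.uniform(*LNG_RANGE)
    zoom = random.randint(4, 8)  # matches the tileset's zoom levels
    return name, SCENARIOS[name].format(lat=lat, lng=lng, zoom=zoom)

def worker(stop_event):
    """Fetch, discard the response, immediately pick another scenario."""
    while not stop_event.is_set():
        _, url = build_request()
        try:
            urllib.request.urlopen(url, timeout=30).read()
        except OSError:
            pass  # a real run would count and log errors

def run(num_threads=15, duration_s=1800):
    """15 threads for 30 minutes, as in the test above."""
    stop = threading.Event()
    threads = [threading.Thread(target=worker, args=(stop,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    stop.wait(duration_s)
    stop.set()
    for t in threads:
        t.join()
```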
The reason AppEngine can manage this load so quickly is that the handlers are small and transient; there is no huge startup time or large memory overhead. No single handler holds any spatial indexes, and each one is capable of running searches on any of your layers. Spatial indexes held in memory are extremely fast, much faster than what we can accomplish on AppEngine. However, what we trade in performance (depending on the query, somewhere in the range of 75-100ms) we gain in scalability. Any handler can search any datasource, do its work and shut down, only to emerge again on a completely different server (or data center, for that matter). To me, this is one of the major draws of PaaS over IaaS: this lightweight nature allows it to scale easily and is perfect for micro-billing of usage.
What’s significant about the chart above is the cost estimate at the bottom. The $0.07 cost is a little deceiving since it is offset by the free daily quota, but without that offset this setup would probably run about $15/day, given that you probably won’t be running at full throttle for 24 hours straight. $15 a day is not bad for basic mapping functionality at 60 requests/second. Even better, if you’re a much smaller site running at 1-2 requests per second (many of our existing clients would fall into this category), there’s a good chance your daily bill will be measured in cents, and for the really lucky folks: free.
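As a back-of-envelope check on those numbers: the per-request figures below are derived directly from the ~100K requests and ~1.5GB measured over the 30-minute test, then extrapolated to a full day at the 60 req/s peak (the $15/day figure itself comes from the dashboard estimate, not recomputed here from Google's rate card):

```python
# Figures measured during the 30-minute test
requests_measured = 100_000
bandwidth_gb = 1.5
duration_s = 30 * 60

avg_rps = requests_measured / duration_s                          # ~56 req/s average
kb_per_request = bandwidth_gb * 1024 * 1024 / requests_measured   # ~15.7 KB/request

# Extrapolate a full day at the 60 req/s peak
peak_rps = 60
requests_per_day = peak_rps * 86_400                              # ~5.2M requests
gb_per_day = requests_per_day * kb_per_request / (1024 * 1024)    # ~78 GB outgoing

print(f"{avg_rps:.0f} req/s avg, {kb_per_request:.1f} KB/request")
print(f"full day at peak: {requests_per_day:,} requests, {gb_per_day:.0f} GB out")
```

Roughly 5.2 million requests and 78GB of outgoing bandwidth for about $15 is the scale of economy being described.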
It’s also important to realize what that price includes: software, hardware, bandwidth and, most importantly, the Google IT support staff who keep the whole thing running 24/7. That last one is critical. To run this test, I fired off 3 python scripts and went and got a cup of coffee. No server configuration issues, no random database connectivity or firewall problems, just a pleasant silence as all of this was handled by experts somewhere else. That doesn’t mean there won’t be problems, it just means there is a dedicated staff to deal with them. Another important consideration is that these costs probably live in different parts of your budget; they should all be considered in aggregate when thinking about moving to the Cloud. As much as we programmers (myself included!) want to make Cloud computing about the technology itself, remember that at the end of the day it’s about economies of scale and the dramatic reduction in costs they can deliver. If someone is talking about the Cloud and not mentioning this, you might want to question their angle.
At this point, you may be asking yourself when you will ever need 60 requests/second of throughput. If you intend to keep the status quo of mapping sites then there is a good chance you never will. A simple viewer that does not expose hooks for outside users and developers will serve its purpose but will remain static. However, if you want to embrace Web 2.0 and expose your data in a controlled manner, then you may start running into the limits of a single-server setup. Here are some examples:
- iPhone/Mobile Apps – Build a custom webapp for your data or expose it to other developers. If the App is popular, then your server will almost instantly be swamped with requests
- Vector Streaming – Richer client side apps like Google Earth, Flex/Silverlight and desktop apps that need a continuous stream of data also generate a lot of simultaneous requests
- Wave Robots – Wave robots provide autonomous support to Wave participants. This could mean running searches, custom services or map tiles/images for gadgets. The point is that a single robot will be monitoring traffic and requests from potentially thousands of simultaneous Waves, so it absolutely needs to be fast and scalable.
- Search Engines – When you serve Geo Sitemaps, search engines have the potential to push a ton of traffic to your site, especially if you expose “preview” resources to the crawlers that point back to the full set of data.