Google App Engine Tip 1: How to reproduce your production datastore locally

Note: this post refers to the Python version of Google App Engine.

Google App Engine provides developers with an efficient platform for building highly scalable applications and deploying them to the cloud. Underpinning the scalability of the datastore is BigTable, a proprietary data storage system that distributes data across many machines.

Google provides a local development environment for building and testing sites on your machine before deploying to the cloud for production.  This local environment simulates the BigTable datastore, but it's not exactly the same.  So if you need to test your site locally using the data from the production site, you can't simply copy a single file from the cloud to your development machine.

Fortunately, the App Engine SDK includes appcfg.py, a Python command-line script for interacting with App Engine. You can pull the entire contents of the production datastore to your local machine with a command like this:

appcfg.py download_data --application=s~your-app-id --url=http://your-app-id.appspot.com/_ah/remote_api --filename=backup1

That will create a file on your local machine named backup1. This is a slow process that can take anywhere from several minutes to several hours, depending on the amount of data, so you'll want to have something else to work on while the command runs.
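If you pull production data regularly, the download can be wrapped in a short Python script. The following is only a minimal sketch: it assumes appcfg.py is executable and on your PATH, and the function names and the "your-app-id" placeholder are hypothetical, chosen for illustration. It uses only the flags shown in the command above.

```python
import subprocess


def build_download_command(app_id, filename):
    """Return the appcfg.py download_data command as an argument list."""
    return [
        "appcfg.py",
        "download_data",
        "--application=s~%s" % app_id,
        "--url=http://%s.appspot.com/_ah/remote_api" % app_id,
        "--filename=%s" % filename,
    ]


def download_datastore(app_id, filename):
    """Run the download and return appcfg.py's exit code.

    Assumes appcfg.py can be launched directly from the PATH.
    The call blocks until the download finishes, which may take hours.
    """
    return subprocess.call(build_download_command(app_id, filename))


# Show the command that would be run, without actually running it.
print(" ".join(build_download_command("your-app-id", "backup1")))
```

Calling download_datastore("your-app-id", "backup1") would then start the actual transfer.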

When the download is complete, you can load that file into your local development environment with a command like this:

appcfg.py upload_data --application=dev~your-app-id --url=http://localhost:8000/_ah/remote_api --filename=backup1 --num_threads=1

Once again, be patient: depending on the amount of data, this command might also take a long time. But the reward is worth the wait, because when you're finished you'll have an exact copy of your production data, perfect for testing and debugging issues that only show up with your production data.
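The upload step can also be scripted. This is a rough sketch under a few assumptions: appcfg.py is executable and on your PATH, the local development server is already running and serving remote_api on port 8000 as in the command above, and the function names are hypothetical. It checks that the backup file exists before starting, so a mistyped filename fails immediately rather than partway through.

```python
import os
import subprocess


def build_upload_command(app_id, filename, port=8000):
    """Return the appcfg.py upload_data command as an argument list."""
    return [
        "appcfg.py",
        "upload_data",
        "--application=dev~%s" % app_id,
        "--url=http://localhost:%d/_ah/remote_api" % port,
        "--filename=%s" % filename,
        "--num_threads=1",
    ]


def upload_to_dev_server(app_id, filename, port=8000):
    """Load a downloaded backup into the local development datastore.

    Raises IOError if the backup file is missing; otherwise returns
    appcfg.py's exit code. Assumes the dev server is already running.
    """
    if not os.path.exists(filename):
        raise IOError("backup file not found: %s" % filename)
    return subprocess.call(build_upload_command(app_id, filename, port))


# Show the command that would be run, without actually running it.
print(" ".join(build_upload_command("your-app-id", "backup1")))
```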

Source: http://www.finitewisdom.com/people/phil-pl...