August 19, 2014 Simon Raper

A new home for pifreak


pifreak is my Twitter bot. It started tweeting the digits of pi in April 2012 and has tweeted the next 140 digits at 3:14 pm GMT every day since. It is not especially useful or popular (only 48 followers), but I’ve grown fond of she/he/it.
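For anyone wondering how "the next 140 digits" might be worked out each day, here is a minimal sketch. It assumes the digits are already available as one long string and that the offset is derived from the number of days since the first tweet. The names `chunk_for_day`, `CHUNK_SIZE` and `START_DATE` are made up for illustration, and the exact start date is a guess (the bot may equally well keep a stored counter instead).

```python
from datetime import date

CHUNK_SIZE = 140  # digits per tweet, as described above
START_DATE = date(2012, 4, 1)  # assumed: the post only says "April 2012"


def chunk_for_day(digits, today, start=START_DATE, size=CHUNK_SIZE):
    """Return the slice of `digits` due on `today`, or '' if none is due."""
    day_index = (today - start).days
    if day_index < 0:
        return ""  # before the first tweet
    offset = day_index * size
    # Slicing past the end of the string simply returns what is left.
    return digits[offset:offset + size]
```

Deriving the offset from the date keeps the job stateless, which is convenient on a host where you would otherwise need somewhere to store a counter, at the cost of skipping a chunk if a day's run fails.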


I was housing her on an AWS EC2 micro instance, but my one year of free hire ran out and it has become a little too expensive to keep that box running.

So I’ve been looking at alternatives. I’ve settled on Google App Engine, which I’m hoping will come out pretty close to free hosting.

So here are a few notes for anyone else who might be thinking of using Google App Engine for automated posting on Twitter.

It was reasonably simple to set up

  1. Download the GAE Python SDK. This provides a GUI for testing your code locally and then deploying it to the cloud when you are happy with it.
  2. Create a new folder for your app and place your Python modules in it, together with an app.yaml file, which configures the application, and a cron.yaml file, which schedules your task. It’s all very well documented in the App Engine documentation, including the cron scheduling.
  3. Open the App Engine Launcher (which is effectively the SDK), add your folder, then either hit run to test locally or deploy to push to the cloud. (You’ll be taken to some forms to register your app if you’ve not already done so.)
  4. Finally, if you click on dashboard from the launcher, you’ll get lots of useful information about your deployed app, including error logs and the schedule for your tasks.
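To make step 2 a little more concrete, here is a rough sketch of what the Python module could look like. It assumes the app is exposed as a plain WSGI callable that the scheduled request hits at the root URL. `post_next_digits` is a hypothetical placeholder for the code that actually builds and sends the tweet, and this is not necessarily how pifreak itself is structured.

```python
def post_next_digits():
    """Hypothetical placeholder for the code that builds and sends the tweet."""
    return "posted"


def application(environ, start_response):
    """Minimal WSGI app: run the job when the root URL is requested."""
    if environ.get("PATH_INFO", "/") != "/":
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found"]
    result = post_next_digits()
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [result.encode("utf-8")]
```

With a layout like this, the scheduled task only needs to request `/`, which matches the single forward slash used in the url fields of both YAML files.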

The things that caught me out were:

  1. Make sure that the application name in your app.yaml file is the same as the one you register with Google (when it takes you through to the form the first time you deploy).
  2. There wasn’t a lot in the documentation about the use of the url field in the cron and app YAML files. I ended up just putting a forward slash in both, since in my very simple app the Python module is in the root.
  3. Don’t forget that module names are case sensitive, so when you add your Python module in the script section of the app file you’ll need to get this right.
  4. YAML files follow an indentation protocol that is similar to Python’s. You’ll need to ensure it’s all lined up correctly.
  5. Any third-party libraries you need that are not in the list of libraries bundled with the GAE runtime will need to be included in your app folder. For example, I had to include tweepy and some of its dependencies.
  6. Where the third-party library that you need is included in the GAE runtime environment, you need to add it to the app file using the following syntax:

    libraries:
    - name: ssl
      version: "latest"
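Points 3 and 5 are easy to check locally before deploying. The snippet below is a small sketch of such a check: it tries to import each module your app needs and reports the ones that fail. The helper name `missing_modules` is made up for illustration, and which names you pass in depends on your own app.

```python
import importlib


def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # "tweepy" and "ssl" come from the notes above; add whatever else you import.
    print(missing_modules(["tweepy", "ssl"]))
```

Run it from inside the app folder so that libraries copied there are found the same way they would be once deployed.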

And here finally is a link to the code.


About the Author

Simon Raper. I am an RSS-accredited statistician with over 15 years’ experience working in data mining and analytics and many more in coding and software development. My specialities include machine learning, time series forecasting, Bayesian modelling, market simulation and data visualisation. I am the founder of Coppelia, an analytics startup that uses agile methods to bring machine learning and other cutting-edge statistical techniques to businesses that are looking to extract value from their data. My current interests are in scalable machine learning (Mahout, Spark, Hadoop), interactive visualisations (D3 and similar) and applying the methods of agile software development to analytics. I have worked for Channel 4, Mindshare, News International, Credit Suisse and AOL. I am co-author with Mark Bulling of Drunks and Lampposts, a blog on computational statistics, machine learning, data visualisation, R, Python and cloud computing. It has had over 310K visits and appeared in the online editions of The New York Times and The New Yorker. I am a regular speaker at conferences and events.
