Voter Relationship Management

Customer Relationship Management (CRM) seems to be coming into the mainstream, with the New York Times recently reporting how Target has used such analytics to identify expectant mothers based on their shopping habits and was then able to target them appropriately with special offers and vouchers.

As the 2012 US election approaches, data analysis seems to be coming of age, increasingly being used to target voters on a scale not seen before. It was credited in part for Obama’s win in 2008, when voters were profiled and segmented just as advertisers segment and cluster their customers based on behaviours, demographics and attitudes.

The growth of Facebook, Twitter and the like since 2008 has added a new dimension to what was a fairly static dataset, one that shied away from the behavioural dimension. This newly available data brings massive new opportunities for market research and targeting. The reaction to a new ad can be evaluated in real time, and A/B testing can help to pick out the messages that work.
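As a rough illustration of how A/B testing picks out a message that works, here is a minimal sketch in Python. It compares the response rates of two message variants with a two-proportion z-test. The function name and the counts are hypothetical, made up for this example, and are not taken from any campaign.

```python
from math import erf, sqrt


def two_proportion_z_test(resp_a, n_a, resp_b, n_b):
    """Return (z, two-sided p-value) for the difference in response rates.

    resp_a / resp_b are the numbers of people who responded to each variant,
    n_a / n_b the numbers who were shown it. Assumes the pooled response
    rate is strictly between 0 and 1 (otherwise the standard error is zero).
    """
    rate_a = resp_a / n_a
    rate_b = resp_b / n_b
    pooled = (resp_a + resp_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    p_value = 2 * (1 - cdf)
    return z, p_value


# Hypothetical counts: variant A and variant B each sent to 5,000 people.
z, p = two_proportion_z_test(resp_a=120, n_a=5000, resp_b=165, n_b=5000)

if p < 0.05:
    winner = "B" if z > 0 else "A"
else:
    winner = "no clear winner"

print(f"z = {z:.2f}, p = {p:.4f}, winner: {winner}")
```

The 0.05 threshold is a conventional choice rather than a rule. A real test-and-learn programme running many variants at once would also need to account for multiple comparisons.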

Obama’s Data Crunchers

There have been a few interesting pieces about how Obama’s re-election campaign is using methods more commonly associated with consumer marketing to target supporters and voters. This piece in the New York Times gives an overview of some of the team behind the analysis, which includes Rayid Ghani, who was previously at Accenture Technology Lab and has written extensively on data and text mining.

Social CRM seems to be one of the growing areas of buzz, promising a “holistic view of the customer”. Several players claim to be able to join a user’s various online accounts (e.g. Facebook, Twitter, LinkedIn) together in one place to give a single view, so-called “Social Identity Mapping”. How well this works is still up for debate; Infochimps offers an API with this capability, and the results seem to be biased towards the more socially savvy.

A recent set of fundraising emails shows how the data and the analysts are starting to be put to use, with emails tailored to the individual. It doubtless also has a large test-and-learn component, where the emails that yield the highest response are then used more widely. At the same time, Google and Facebook are coining it in by serving ads based on what people say and do online.

Facebook and Politico

Facebook has made its first foray into publishing insight from its data-collecting machine, aggregating individuals’ wall posts and status updates to report back on the Republican primaries and how the various candidates are performing. The results are then published by Politico. This seemed to generate a lot more buzz when it was announced than the ongoing analysis of the campaigns has. I suspect this is a dry run ahead of November.

And just to be clear, Facebook isn’t handing over users’ information to the Republicans! 

Book Review: The Art of R Programming

Over the past few months, there have been several glowing reviews of The Art of R Programming by Norman Matloff, which led me to give it a go and buy it for my Kindle. I’ve just finished reading it and am adding to that list of glowing reviews. 

I have been using R in varying amounts over the past three years, and my approach has typically been task-orientated: learning as I go and using resources like the R Mailing List, Stack Overflow, Google and the vignettes and help files. This has sometimes led to banging my head against a wall for several hours, but ultimately I’ve learnt quite a bit about the language.

As I move more into writing functions, building packages and thinking about developing GUI widgets that sit on top of R, this has meant understanding more about the different data structures in R and starting to get an appreciation for things like S3 and S4 classes.

I’ve found reading the book a very enjoyable experience – to summarise, it is well written, builds the foundations from the ground up and provides good examples throughout to really illustrate the ideas and concepts that are being discussed.

The early chapters look at the various data structures in R, starting with Vectors before moving on to Matrices, Arrays, Lists and Data Frames, giving good insights into all of these and an appreciation of when each can be used. The middle chapters move on to topics like writing functions, running simulations, string manipulation and graphics. Later chapters cover advanced topics like parallelisation and debugging.

There is a big emphasis on the benefit of writing efficient code, whether in terms of speed or memory allocation, and the book really highlights where vectorisation can be used to solve a problem. I’m now a better programmer in R than I was before reading it and am applying the concepts that I’ve learnt day in, day out.

Although I’m still getting my head around some of the more advanced concepts in the book (it will be a while until I’m writing my own C code to improve the performance of an R function!), it has worked well as a front-to-back read and will work well in the future as a reference book. If you’re in a similar boat to me, having learnt R in a relatively unstructured way, I’d highly recommend it – at £16 for the Amazon Kindle edition, it’s great value for money.

Machine Learning and Analytics based in London, UK