June 20, 2012 Mark Bulling

R: Dealing with package updates


Here’s a very short post to highlight one of the “highlights” of my week that I thought was worth sharing with the wider community.

One of the things I find great about R is the rapidly evolving ecosystem where new packages are being constantly created and others are being updated.

Up until now, I’ve found this to be a very good thing, but I experienced the other side this week, when an upgrade to a package broke a pretty big script that I’d been working on.

The “quick” solution in my case was to use the CRAN archive to download the source for earlier versions of the offending package (and its dependencies), then build and install them over the upgraded versions. Knowing how to build a package from source on Windows came in very handy; you can read more about how to do it in a previous D&L post.
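As a rough illustration of the archive approach, here is a minimal sketch. It assumes a reasonably recent version of R, where `install.packages()` accepts a URL when `repos = NULL`, and it assumes the usual CRAN archive layout of `Archive/<package>/<package>_<version>.tar.gz`. The helper names `cran_archive_url()` and `install_archived()` are my own, and the package name and version in the example are placeholders rather than the package that actually bit me.

```r
# Build the CRAN archive URL for a specific package version.
# Assumes the standard layout: Archive/<pkg>/<pkg>_<version>.tar.gz
cran_archive_url <- function(pkg, version,
                             base = "https://cran.r-project.org/src/contrib/Archive") {
  sprintf("%s/%s/%s_%s.tar.gz", base, pkg, pkg, version)
}

# Install an archived version from source, replacing whatever is installed.
# On Windows, packages with compiled code need the Rtools toolchain for this.
install_archived <- function(pkg, version) {
  install.packages(cran_archive_url(pkg, version), repos = NULL, type = "source")
}

# Example with a placeholder package and version:
# install_archived("somepkg", "1.2.3")
```

Dependencies are not resolved when installing this way, so any archived dependencies need to be installed first, in order.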

The longer-term solution is a lot trickier, particularly where R is being used in a collaborative environment and reproducibility is important, either between machines or over time.

I suspect one solution is to have a common, centrally maintained library and to change the default location where R looks for installed packages (which is also handy for avoiding having to download all your packages again once you’ve upgraded base R).
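One way to point R at a shared library is `.libPaths()`, sketched below. The helper `use_shared_library()` and the network path in the example are hypothetical; something like this could live in a site-wide or per-user R profile so that it runs at startup.

```r
# Put a shared library directory at the front of R's package search path.
# Falls back to the existing paths (with a warning) if the directory is missing.
use_shared_library <- function(shared_lib) {
  if (!dir.exists(shared_lib)) {
    warning("Shared library not found: ", shared_lib)
    return(invisible(.libPaths()))
  }
  # Keep the existing locations as fallbacks after the shared one.
  .libPaths(c(shared_lib, .libPaths()))
  invisible(.libPaths())
}

# Example with a hypothetical network location:
# use_shared_library("//fileserver/R/library")
# .libPaths()  # shows where R will now look for installed packages
```

Setting the `R_LIBS_SITE` environment variable is another route to the same result if you would rather not touch profile scripts.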

On top of this, I suspect there would then need to be some type of test suite which, once updates to packages were available, checked in a development environment that existing scripts and processes still worked. None of this is new to software and IT folk, but it’s a novel issue at the moment for the analyst community I suspect.
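A very small version of such a check might look like the sketch below: run each script in its own environment and record whether it completed without error. The function name `check_scripts()` is my own invention, and a real test suite would also want to compare outputs rather than only catching errors.

```r
# Run each script in a fresh environment and report whether it completed.
check_scripts <- function(scripts) {
  status <- vapply(scripts, function(path) {
    tryCatch({
      # Evaluate in a new environment so scripts do not interfere with each other.
      source(path, local = new.env(parent = globalenv()))
      "ok"
    }, error = function(e) paste("failed:", conditionMessage(e)))
  }, character(1))
  data.frame(script = scripts, status = unname(status), stringsAsFactors = FALSE)
}

# Example with hypothetical script names:
# check_scripts(c("monthly_report.R", "model_refresh.R"))
```

Run against a development library after updating packages, any row that is not "ok" flags a script to look at before the update is rolled out to everyone.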

