November 8, 2013 Simon Raper

Box Me


Here’s a short R function I wrote to turn a long data set into a wide one for viewing. It’s not the most exciting function ever, but I find it quite useful when my screen is wide and short. It simply cuts the data set horizontally into blocks of nrow rows and puts them side by side, padding the last block with NA rows if the rows do not divide evenly. Lazy, I know!

#' boxMe
#'
#' Turns an overly long data frame into something easier to look at
#'
#' @param d A data frame or matrix
#' @param nrow The number of rows you would like to see in the new data frame
#' @examples
#' test.set<-data.frame(x=rnorm(100), y=rnorm(100))
#' boxMe(test.set, 18)
#'
#' library(ggplot2)
#' boxMe(diamonds, 10)
boxMe<-function(d, nrow){
  d<-as.data.frame(d) # Matrices become data frames so the column names line up
  # Number of rows and columns
  r<-dim(d)[1]
  n.col<-dim(d)[2]
  if (r<=nrow) return(d) # Nothing to fold
  rem<-r %% nrow # Number of rows left over after the last full fold
  reps<-floor(r/nrow) # Number of full folds
  s<-seq(1, reps*nrow, by=nrow) # First row of each full fold
  box<-d[1:nrow, , drop=FALSE] # First block
  for (i in s[-1]){
    ap<-d[i:(i+nrow-1), , drop=FALSE]
    box<-cbind(box, ap)
  }
  #Append remainder, padded with NA rows up to nrow
  if (rem>0){
    n.null.rows<-nrow-rem
    rem.rows<-d[(reps*nrow+1):r, , drop=FALSE]
    null.block<-as.data.frame(matrix(rep(NA, (n.null.rows*n.col)), nrow=n.null.rows))
    names(null.block)<-names(rem.rows)
    last.block<-rbind(rem.rows, null.block)
    box<-cbind(box, last.block)
  }
  return(box)
}
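
For comparison, the same cut-and-place-side-by-side idea can be written without the loop. The following is only a sketch, not the function above: the name boxMeSketch is hypothetical, and it assumes that padding the data with NA rows first and then splitting it into blocks is an acceptable substitute for appending the remainder separately.

```r
# Hypothetical loop-free variant of boxMe (a sketch, not the original function)
boxMeSketch <- function(d, nrow) {
  d <- as.data.frame(d)
  r <- dim(d)[1]
  n.folds <- ceiling(r / nrow)          # blocks needed, counting a partial last one
  # One row index per cell of the folded layout; positions past the end become NA
  idx <- seq_len(n.folds * nrow)
  idx[idx > r] <- NA
  # Indexing a data frame with NA returns a row of NAs, which pads the last block
  padded <- d[idx, , drop = FALSE]
  rownames(padded) <- NULL
  # Cut into consecutive blocks of nrow rows each
  folds <- split(padded, rep(seq_len(n.folds), each = nrow))
  folds <- lapply(folds, function(f) { rownames(f) <- NULL; f })
  # unname() stops cbind from prefixing column names with the block labels
  do.call(cbind, unname(folds))
}

# Seven rows folded into blocks of three: two full blocks plus one padded block
small <- data.frame(x = 1:7, y = letters[1:7])
box <- boxMeSketch(small, 3)
dim(box)  # 3 rows, 6 columns
```

Because the padding happens before the split, every block has exactly nrow rows and no special case is needed for the remainder. As with the original, the repeated column names (x, y, x, y, ...) are kept as they are.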

About the Author

Simon Raper: I am an RSS accredited statistician with over 15 years’ experience working in data mining and analytics and many more in coding and software development. My specialities include machine learning, time series forecasting, Bayesian modelling, market simulation and data visualisation. I am the founder of Coppelia, an analytics startup that uses agile methods to bring machine learning and other cutting-edge statistical techniques to businesses that are looking to extract value from their data. My current interests are in scalable machine learning (Mahout, Spark, Hadoop), interactive visualisations (D3 and similar) and applying the methods of agile software development to analytics. I have worked for Channel 4, Mindshare, News International, Credit Suisse and AOL. I am co-author with Mark Bulling of Drunks and Lampposts, a blog on computational statistics, machine learning, data visualisation, R, Python and cloud computing. It has had over 310K visits and appeared in the online editions of The New York Times and The New Yorker. I am a regular speaker at conferences and events.

