
This is a very quick post to share a tip on how to add non-overlapping labels to a scatterplot in ggplot2 using a great package called directlabels. The trick is to make each point a single-member group using an aesthetic like colour, and then apply the direct.label function with the first.qp method. Some example code is below:

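library(ggplot2)
library(directlabels)

# Ten random points, each labelled with a county name from ggplot2's midwest data
x <- runif(10)
y <- rnorm(10)
z <- as.character(midwest$county[1:10])

# Colouring by z makes every point its own group, so each one gets its own label
q <- qplot(x, y) + geom_point(aes(colour = z))
direct.label(q, first.qp)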

If there are better ways then I’d love to know, but this works well for me and has the added advantage that the labels are matched to the points by colour.
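
Since qplot() has been deprecated in recent ggplot2 releases (3.4.0 onwards), here is a minimal sketch of the same trick written with ggplot() and a data frame. The data frame name d and the set.seed() call are my own additions for illustration and are not part of the original example.

```r
library(ggplot2)
library(directlabels)

set.seed(1)  # assumption: fixed seed so the random points are reproducible

# One row per point; z holds the label for each point
d <- data.frame(
  x = runif(10),
  y = rnorm(10),
  z = as.character(midwest$county[1:10])
)

# Mapping colour to z puts every point in its own single-member group
p <- ggplot(d, aes(x, y, colour = z)) +
  geom_point()

# direct.label() swaps the colour legend for labels placed by first.qp
labelled <- direct.label(p, first.qp)
```

Printing labelled draws the plot. Nothing about the labelling changes here; only the plot construction is different.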

Comments (8)

  1. Tyler Rinker

    I too have been annoyed by this and am glad to see someone has tackled the problem. I want to try your code but it doesn’t appear to be standard R code. If I try to run it I get an error saying x not found.

    • Thanks for spotting this. For some reason WordPress converted the < sign when I pasted the code in from RStudio. It should be fixed now, but I will double check when I'm back in work. It doesn't look like the correction goes immediately onto the R-bloggers site, so if you came from there you'd have to visit the actual blog.

      Cheers

      Simon

      • Nice post.

        FYI, WordPress tends to mangle your code if you switch between the Visual and HTML views. Also, it seems to have an algorithm to check that what you’ve written is suitably important, and not saved anywhere else before deciding to randomly replace parts of your post with HTML.

        It’s interesting to see what happens with directlabels when you stress test it by making it try to draw more labels than is possible (because there isn’t enough room).

        p <- ggplot(midwest, aes(area, poptotal, colour = county)) +
        geom_point() +
        scale_y_log10()
        (p <- direct.label(p, first.qp))

        In this case it appears to just make up locations, then gives up once it fails to find a position. (Compare, for example, the label for MARINETTE with its data, subset(midwest, county == "MARINETTE").)

        Using a different algorithm can yield radically different results. If you swap first.qp for first.bumpup, then everything gets labelled, even if the labels overlap.

        In practice, you'll likely have to try a few labelling algorithms to see which one is most effective.
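
        One way to try several algorithms side by side is sketched below. It assumes direct.label() accepts the method as a character string naming it, and the names methods and plots are only illustrative.

```r
library(ggplot2)
library(directlabels)

# Same stress-test plot as above: one colour group per county
p <- ggplot(midwest, aes(area, poptotal, colour = county)) +
  geom_point() +
  scale_y_log10()

# Candidate positioning methods, referred to by name
methods <- c("first.qp", "first.bumpup")

# Build one labelled plot per method; nothing is drawn until printed
plots <- lapply(methods, function(m) direct.label(p, m))
names(plots) <- methods
```

        Printing plots[["first.qp"]] and plots[["first.bumpup"]] in turn then lets you compare the two results by eye.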

        • Thanks Richie. That’s really useful. I think you are right about experimentation. I started with some of the labelling strategies recommended for scatterplots, but they didn’t work so well. I assume this is because they are geared towards labelling clusters of points. The method I used and the one that you suggested are both recommended for line plots, so I guess the points are treated as lines of just one point and the label is positioned accordingly. You’ve probably already seen it, but there’s a useful list at http://directlabels.r-forge.r-project.org/docs/index.html and I believe you can also create your own.
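
          As a rough sketch of what creating your own might look like: a positioning method can be given as a list that chains existing methods and sets label properties. The name my.method is hypothetical, and this assumes "first.points" and "bumpup" can be combined this way with a cex setting in between.

```r
library(ggplot2)
library(directlabels)

# Hypothetical custom method: start each label at the group's first point,
# shrink the text a little, then bump overlapping labels upwards
my.method <- list("first.points", cex = 0.8, "bumpup")

d <- data.frame(
  x = runif(10),
  y = rnorm(10),
  z = as.character(midwest$county[1:10])
)

p <- ggplot(d, aes(x, y, colour = z)) +
  geom_point()

labelled <- direct.label(p, my.method)
```

          Whether this gives better placement than first.qp will depend on the data, so treat it as a starting point for experimenting.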

  2. Roey Angel

    Hi Simon,
    Great little hack.
    I have a question though.
    As I see from your example, direct.label() takes its label values from the categories of z.
    What if I were to have two different aesthetics (colour and size) or two different geoms? How would you set direct.label() to use only one of them?

    To be more specific my problem at the moment is that my labels from geom_text() overlap each other and I’d like to have them moved around. I assumed the argument position_dodge would solve it but apparently that’s not what it’s for.

    Thanks
    Roey

    • Hi Roey,

      To be honest I’m not that sure. You’d have to check out the documentation for the directlabels package and play with it a bit. I believe the package is more geared towards labelling groups rather than individual points, so I kind of fudged it a bit by ensuring each of my points is treated as a separate group. I would be interested to hear if you find out anything interesting.

      Cheers

      Simon
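
      One possibility to try, offered as an untested sketch rather than a definitive answer: directlabels also provides geom_dl(), a layer that takes its own label aesthetic, so the labels can be tied to one variable regardless of what the other layers map. The size variable w and the data frame d are made up for illustration.

```r
library(ggplot2)
library(directlabels)

d <- data.frame(
  x = runif(10),
  y = rnorm(10),
  z = as.character(midwest$county[1:10]),
  w = runif(10)  # hypothetical second variable, mapped to size
)

# Points use both colour and size; the labels come only from z via geom_dl()
p <- ggplot(d, aes(x, y)) +
  geom_point(aes(colour = z, size = w)) +
  geom_dl(aes(label = z), method = "first.qp")
```

      Unlike direct.label(), this leaves the legends in place, so you may want to hide the colour legend yourself.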


Machine Learning and Analytics based in London, UK