
I am using the following commands to produce a scatterplot with jitter:

library(lattice)
ddf <- data.frame(NUMS = rnorm(500), GRP = sample(LETTERS[1:5], 500, replace = TRUE))
stripplot(NUMS ~ GRP, data = ddf, jitter.data = TRUE)

I want to add boxplots over these points (one for every group). I searched, but I could not find code that plots all the points (not just the outliers) with jitter. How can I do this? Thanks for your help.

  • Does it have to be lattice? Otherwise, try something like with(ddf, { boxplot(NUMS~GRP); points(jitter(as.numeric(factor(GRP))), NUMS, col=rgb(0,0,0,.2), cex=.5, pch=19) }). Commented May 15, 2014 at 11:25
  • Using base graphics is preferred. Your solution works very well. Thanks. Commented May 15, 2014 at 11:55
  • Can this be done with ggplot2? I tried {ggplot(ddf,aes(x=GRP, y=NUMS))+geom_boxplot()+geom_jitter()} but it produces too much scatter. Could the jitter be reduced? Commented May 15, 2014 at 15:49
  • See this related question as well for points jittered by group: stackoverflow.com/questions/21468380/… Commented Jul 11, 2016 at 0:04

4 Answers


Here's one way using base graphics.

boxplot(NUMS ~ GRP, data = ddf, lwd = 2, ylab = 'NUMS')
stripchart(NUMS ~ GRP, vertical = TRUE, data = ddf, 
    method = "jitter", add = TRUE, pch = 20, col = 'blue')

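A small variation on the code above, as a sketch rather than the one right way to do it: since stripchart() already draws every observation, boxplot() can be told not to draw its outlier points (outline = FALSE), and the jitter argument of stripchart() controls how far the points spread sideways. The seed, the jitter value of 0.2 and the semi-transparent colour are my own choices for illustration.

```r
set.seed(42)  # assumed seed, only so the example is reproducible
ddf <- data.frame(NUMS = rnorm(500),
                  GRP  = sample(LETTERS[1:5], 500, replace = TRUE))

# outline = FALSE suppresses the boxplot's own outlier points;
# ylim is set by hand because the default range would then ignore them.
bp <- boxplot(NUMS ~ GRP, data = ddf, outline = FALSE,
              ylim = range(ddf$NUMS), lwd = 2, ylab = "NUMS")

# jitter = 0.2 widens the horizontal spread (the default is 0.1).
stripchart(NUMS ~ GRP, data = ddf, vertical = TRUE,
           method = "jitter", jitter = 0.2, add = TRUE,
           pch = 20, col = adjustcolor("blue", alpha.f = 0.4))
```

boxplot() returns its summary statistics invisibly, so bp can be inspected afterwards (for example bp$n for the group sizes).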


5 Comments

Yes, it works very well. Thanks. I was trying stripplot followed by boxplot and it was not working.
The add = TRUE argument is key. :)
add=T alone may not be enough since {stripplot(NUMS~GRP,data=ddf, jitter=T) ; boxplot(NUMS~GRP,data=ddf, add=T)} does not work; apparently one needs to put a 'plot' first followed by points or chart.
stripplot is in lattice. stripchart is a base graphics function.
Many years of programming in R and I didn't know this stripchart function from R base. Very good!
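On the ordering question raised in these comments: within base graphics, the points can also be drawn first and the boxes added second, as long as both calls are base functions. A minimal sketch, assuming a widened xlim (so the outer boxes are not clipped) and a transparent box fill (so the boxes do not hide the points); both are my additions, not part of the answer above.

```r
set.seed(1)  # assumed seed for reproducibility
ddf <- data.frame(NUMS = rnorm(500),
                  GRP  = sample(LETTERS[1:5], 500, replace = TRUE))

# Base-graphics points first: stripchart() opens the plot ...
stripchart(NUMS ~ GRP, data = ddf, vertical = TRUE,
           method = "jitter", pch = 20, col = "blue",
           xlim = c(0.5, 5.5), ylab = "NUMS")

# ... then boxplot() draws into the same plot with add = TRUE.
bp2 <- boxplot(NUMS ~ GRP, data = ddf, add = TRUE,
               outline = FALSE, col = "transparent", lwd = 2)
```

The lattice function stripplot() draws with the grid system, which is why a base boxplot(..., add = TRUE) cannot be layered on top of it.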

To do this in ggplot2, try:

library(ggplot2)

ggplot(ddf, aes(x=GRP, y=NUMS)) + 
  geom_boxplot(outlier.shape=NA) + # avoid plotting outliers twice
  geom_jitter(position=position_jitter(width=.1, height=0)) # horizontal jitter only

ggplot2 version of boxplot + jitter

Obviously you can adjust the width and height arguments of position_jitter() to your liking (although I'd recommend height=0 since height jittering will make your plot inaccurate).
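If the same plot needs to come out identically on every run, position_jitter() also takes a seed argument. The following is a sketch under my own assumptions (the seed values and the alpha setting are illustrative); it also shows that with height = 0 the plotted y values are exactly the original data.

```r
library(ggplot2)

set.seed(42)  # assumed seed for the example data
ddf <- data.frame(NUMS = rnorm(500),
                  GRP  = sample(LETTERS[1:5], 500, replace = TRUE))

p <- ggplot(ddf, aes(x = GRP, y = NUMS)) +
  geom_boxplot(outlier.shape = NA) +  # points layer already shows outliers
  geom_jitter(position = position_jitter(width = 0.1, height = 0, seed = 1),
              alpha = 0.4)            # semi-transparent to reduce overplotting
print(p)

# Data actually drawn by the second layer (the jittered points)
jit <- layer_data(p, 2)
```

Comparing jit$y with ddf$NUMS is a quick way to confirm that only the x positions were moved.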



I've written an R function called spreadPoints() in a package called basicPlotteR. The package can be installed directly into your R library using the following code:

install.packages("devtools")
library("devtools")
install_github("JosephCrispell/basicPlotteR")

For the example provided, I used the following code to generate the example figure below.

library(basicPlotteR)

ddf = data.frame(NUMS = rnorm(500), GRP = sample(LETTERS[1:5],500,replace=T))

boxplot(NUMS ~ GRP, data = ddf, lwd = 2, ylab = 'NUMS')

# spread the points of every group over its boxplot
spreadPointsMultiple(data=ddf, responseColumn="NUMS", categoriesColumn="GRP",
                     col="blue", plotOutliers=TRUE)


It is a work in progress (the lack of a formula as input is clunky!) but it provides a non-random method to spread points on the X axis that doubles as a violin-like summary of the data. Take a look at the source code, if you're interested.

2 Comments

Looks good. Is it possible to plot all groups with just one line of code rather than repeating code for each group: spreadPoints(ddf[ddf$GRP=="A", "NUMS"], position=1, col="blue", plotOutliers=TRUE) ?
@rnso I've created an additional function spreadPointsMultiple() that can spread the points for multiple boxplots with a single command (see edit above). I'm currently working on allowing spreadPoints() to have a formula as its first argument. Thanks for pointing this out :-)

For a lattice solution:

library(lattice)
ddf = data.frame(NUMS = rnorm(500), GRP = sample(LETTERS[1:5], 500, replace = T))
bwplot(NUMS ~ GRP, ddf, panel = function(...) {
  panel.bwplot(..., pch = "|")
  panel.xyplot(..., jitter.x = TRUE)})

The default median dot symbol was changed to a line with pch = "|". Other properties of the box and whiskers can be adjusted with the box.umbrella and box.rectangle settings through the trellis.par.set() function. The amount of jitter can be adjusted through the factor argument of panel.xyplot(), which is passed on to jitter(); a larger value such as factor = 1.5 widens the spread.

lattice solution to boxplot with scatter
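To make the jitter adjustment concrete, here is a sketch of the same panel function with factor set explicitly. The value 1.5, the colour, the alpha setting and the seed are my own illustrative choices.

```r
library(lattice)

set.seed(42)  # assumed seed for reproducibility
ddf <- data.frame(NUMS = rnorm(500),
                  GRP  = sample(LETTERS[1:5], 500, replace = TRUE))

p_lat <- bwplot(NUMS ~ GRP, data = ddf, panel = function(...) {
  panel.bwplot(..., pch = "|")          # median drawn as a line
  panel.xyplot(..., jitter.x = TRUE,    # jitter horizontally only
               factor = 1.5,            # passed on to jitter(); larger = wider
               col = "blue", alpha = 0.4)
})
print(p_lat)  # lattice objects are drawn when printed
```

Inside a script or a function the explicit print() is needed, since a trellis object is only drawn automatically at the top level of an interactive session.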

