geom_segment variable length error

Question

The following code worked in ggplot2 before I updated to version 2.2.0. Now it fails with: Error: Aesthetics must be either length 1 or the same as the data (30): x, y, xend, yend. The two geom_segment calls trigger the error.

drug1 <- c(.7, -1.6, -.2, -1.2, -.1, 3.4, 3.7, .8, 0, 2)
drug2 <- c(1.9, .8, 1.1, .1, -.1, 4.4, 5.5, 1.6, 4.6, 3.4)
d <- data.frame(Drug=c(rep('Drug 1', 10), rep('Drug 2', 10),
                  rep('Difference', 10)),
                extra=c(drug1, drug2, drug2 - drug1))

ggplot(d, aes(x=Drug, y=extra)) + 
  geom_boxplot(col='lightyellow1', alpha=.3, width=.5) + 
  geom_dotplot(binaxis='y', stackdir='center', position='dodge') +
  stat_summary(fun.y=mean, geom="point", col='red', shape=18, size=5) +
  geom_segment(aes(x=rep('Drug 1', 30), xend=rep('Drug 2', 30), y=drug1, yend=drug2),
               col=gray(.8)) +
  geom_segment(aes(x='Drug 1', xend='Difference', y=drug1, yend=drug2 - drug1),
               col=gray(.8)) +
  xlab('') + ylab('Extra Hours of Sleep') + coord_flip()
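
The length mismatch named in the error message can be checked without plotting anything. This is only a sketch in base R: both geom_segment layers inherit d (30 rows), while the vectors mapped to y and yend come from the global environment and have 10 elements each.

```r
drug1 <- c(.7, -1.6, -.2, -1.2, -.1, 3.4, 3.7, .8, 0, 2)
drug2 <- c(1.9, .8, 1.1, .1, -.1, 4.4, 5.5, 1.6, 4.6, 3.4)
d <- data.frame(Drug = c(rep('Drug 1', 10), rep('Drug 2', 10),
                         rep('Difference', 10)),
                extra = c(drug1, drug2, drug2 - drug1))

nrow(d)                # rows in the data the layer inherits: 30
length(drug1)          # length of the vector mapped to y: 10
length(rep(drug1, 3))  # repeating by hand gives a length that matches: 30
```

An aesthetic has to be length 1 or match the rows of the layer's data, so a length-10 vector against 30 rows is rejected.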

Update: Improved code that works:

drug1 <- c(.7, -1.6, -.2, -1.2, -.1, 3.4, 3.7, .8, 0, 2)
drug2 <- c(1.9, .8, 1.1, .1, -.1, 4.4, 5.5, 1.6, 4.6, 3.4)
d <- data.frame(Drug=c(rep('Drug 1', 10), rep('Drug 2', 10),
                  rep('Difference', 10)),
                extra=c(drug1, drug2, drug2 - drug1))
w <- data.frame(drug1, drug2, diff=drug2 - drug1)

ggplot(d, aes(x=Drug, y=extra)) +
  geom_boxplot(col='lightyellow1', alpha=.3, width=.5) + 
  geom_dotplot(binaxis='y', stackdir='center', position='dodge') +
  stat_summary(fun.y=mean, geom="point", col='red', shape=18, size=5) +
  geom_segment(data=w, aes(x='Drug 1', xend='Drug 2', y=drug1, yend=drug2),
               col=gray(.8)) +
  geom_segment(data=w, aes(x='Drug 1', xend='Difference', y=drug1, yend=drug2 - drug1),
               col=gray(.8)) +
  xlab('') + ylab('Extra Hours of Sleep') + coord_flip()

Your drug1 and drug2 are both length 10. Try y=rep(drug1, 3) and yend=rep(drug2, 3). (I also think it would be nicer to add these to a second data frame rather than leaving ggplot to look in the global env.)
– user2957945, Commented Dec 10, 2016 at 18:46
Excellent. I'm improving the code as you suggest, in the original posting.
– Frank Harrell, Commented Dec 10, 2016 at 21:44
@FrankHarrell so I understand the context, are drug1 and drug2 paired values (e.g. associated with the same subject)?
– davechilders, Commented Dec 10, 2016 at 21:55

davechilders · Accepted Answer · 2016-12-10 22:10:31Z

The updated code builds a data frame d that looks like this:

drug1 <- c(.7, -1.6, -.2, -1.2, -.1, 3.4, 3.7, .8, 0, 2)
drug2 <- c(1.9, .8, 1.1, .1, -.1, 4.4, 5.5, 1.6, 4.6, 3.4)
d <- data.frame(Drug=c(rep('Drug 1', 10), rep('Drug 2', 10),
                  rep('Difference', 10)),
                extra=c(drug1, drug2, drug2 - drug1))

> d
         Drug extra
1      Drug 1   0.7
2      Drug 1  -1.6
3      Drug 1  -0.2
4      Drug 1  -1.2
5      Drug 1  -0.1
6      Drug 1   3.4
7      Drug 1   3.7
8      Drug 1   0.8
9      Drug 1   0.0
10     Drug 1   2.0
11     Drug 2   1.9
12     Drug 2   0.8
13     Drug 2   1.1
14     Drug 2   0.1
15     Drug 2  -0.1
16     Drug 2   4.4
17     Drug 2   5.5
18     Drug 2   1.6
19     Drug 2   4.6
20     Drug 2   3.4
21 Difference   1.2
22 Difference   2.4
23 Difference   1.3
24 Difference   1.3
25 Difference   0.0
26 Difference   1.0
27 Difference   1.8
28 Difference   0.8
29 Difference   4.6
30 Difference   1.4

This is a problematic way to build the data frame, for two reasons:

1. The values of drug1 and drug2 exist both as vectors in the global environment and inside the data frame d. This creates the potential for confusion, masking, and other errors.
2. The only thing tying each Difference value to the values that produced it is the row ordering. For instance, the values in row 1 and row 11 produced the difference in row 21. This can cause problems if you later modify the data set.
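
A small base R sketch of the second point, assuming the same d as above (the name sorted is only illustrative): once the rows are reordered, the positional link between a difference and its two source values is gone.

```r
drug1 <- c(.7, -1.6, -.2, -1.2, -.1, 3.4, 3.7, .8, 0, 2)
drug2 <- c(1.9, .8, 1.1, .1, -.1, 4.4, 5.5, 1.6, 4.6, 3.4)
d <- data.frame(Drug = c(rep('Drug 1', 10), rep('Drug 2', 10),
                         rep('Difference', 10)),
                extra = c(drug1, drug2, drug2 - drug1))

# In the original row order, rows 11:20 minus rows 1:10 reproduce rows 21:30
ok_before <- isTRUE(all.equal(d$extra[11:20] - d$extra[1:10], d$extra[21:30]))

# Reorder the rows, for example by sorting on the measured value
sorted <- d[order(d$extra), ]

# The same positional arithmetic no longer recovers the differences
ok_after <- isTRUE(all.equal(sorted$extra[11:20] - sorted$extra[1:10],
                             sorted$extra[21:30]))

ok_before
ok_after
```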

I would suggest building the data frame like this instead:

d2 <- data.frame(
  pair = 1:10,
  drug1 = c(.7, -1.6, -.2, -1.2, -.1, 3.4, 3.7, .8, 0, 2),
  drug2 = c(1.9, .8, 1.1, .1, -.1, 4.4, 5.5, 1.6, 4.6, 3.4)
) 

   pair drug1 drug2
1     1   0.7   1.9
2     2  -1.6   0.8
3     3  -0.2   1.1
4     4  -1.2   0.1
5     5  -0.1  -0.1
6     6   3.4   4.4
7     7   3.7   5.5
8     8   0.8   1.6
9     9   0.0   4.6
10   10   2.0   3.4

There is an explicit pair variable that links the values, and no extra copies of drug1 and drug2 exist outside of d2.
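
As a rough sketch of how this paired layout could feed the plot from the question: the segments read their endpoints straight from d2, following the data= pattern of the updated code in the question. The column difference and the objects long and p are names assumed here for illustration, and the dot plot and mean layers are left out for brevity.

```r
d2 <- data.frame(
  pair  = 1:10,
  drug1 = c(.7, -1.6, -.2, -1.2, -.1, 3.4, 3.7, .8, 0, 2),
  drug2 = c(1.9, .8, 1.1, .1, -.1, 4.4, 5.5, 1.6, 4.6, 3.4)
)
d2$difference <- d2$drug2 - d2$drug1   # one row per pair, so no row-order bookkeeping

# Long frame for the box plots; pair is carried along so the link is never lost
long <- data.frame(
  pair  = rep(d2$pair, times = 3),
  Drug  = rep(c('Drug 1', 'Drug 2', 'Difference'), each = nrow(d2)),
  extra = c(d2$drug1, d2$drug2, d2$difference),
  stringsAsFactors = FALSE
)

# Plot only if ggplot2 is installed; the data preparation above is base R
if (requireNamespace('ggplot2', quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(long, aes(x = Drug, y = extra)) +
    geom_boxplot(col = 'lightyellow1', alpha = .3, width = .5) +
    geom_segment(data = d2, aes(x = 'Drug 1', xend = 'Drug 2',
                                y = drug1, yend = drug2), col = gray(.8)) +
    geom_segment(data = d2, aes(x = 'Drug 1', xend = 'Difference',
                                y = drug1, yend = difference), col = gray(.8)) +
    xlab('') + ylab('Extra Hours of Sleep') + coord_flip()
  print(p)
}
```

Each segment layer gets a 10-row data frame and 10-element (or length-1) aesthetics, so the length check from the error message is satisfied.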

You can then use tidyr to convert to tidy/long format (for nice use with ggplot and modeling packages):

tidyr::gather(d2, drug, value, drug1, drug2)

   pair  drug value
1     1 drug1   0.7
2     2 drug1  -1.6
3     3 drug1  -0.2
4     4 drug1  -1.2
5     5 drug1  -0.1
6     6 drug1   3.4
7     7 drug1   3.7
8     8 drug1   0.8
9     9 drug1   0.0
10   10 drug1   2.0
11    1 drug2   1.9
12    2 drug2   0.8
13    3 drug2   1.1
14    4 drug2   0.1
15    5 drug2  -0.1
16    6 drug2   4.4
17    7 drug2   5.5
18    8 drug2   1.6
19    9 drug2   4.6
20   10 drug2   3.4
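
If an extra package is not wanted, the same long layout can be assembled in base R. This is a minimal sketch, not a replacement for the tidyr call; long is an assumed name.

```r
d2 <- data.frame(
  pair  = 1:10,
  drug1 = c(.7, -1.6, -.2, -1.2, -.1, 3.4, 3.7, .8, 0, 2),
  drug2 = c(1.9, .8, 1.1, .1, -.1, 4.4, 5.5, 1.6, 4.6, 3.4)
)

# Stack the two measurement columns, repeating pair so each value keeps its subject
long <- data.frame(
  pair  = rep(d2$pair, times = 2),
  drug  = rep(c('drug1', 'drug2'), each = nrow(d2)),
  value = c(d2$drug1, d2$drug2),
  stringsAsFactors = FALSE
)

head(long, 3)
```

The result has the same columns and row order as the gather output shown above.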

3 Comments

Frank Harrell Over a year ago

I see what you are getting at, but in my humble opinion that is a very long way to do it. A separate package should not be needed for this application. I prefer the original, but you are right that it would be better not to have variables hanging around in the global environment. The improved code I posted is not confused by these global variables, though.

davechilders Over a year ago

When I look at the updated section of your original post, the two issues still remain. Not having an explicit variable like pair could easily lead to problems.

Frank Harrell Over a year ago

We'll have to agree to disagree on that point. I don't need pair when I have direct access to the two paired measurements.
