Plotting multiple variables via ggplot2

Question

I'd like to create a bar chart using factors and more than two variables! My data looks like this:

     Var1 Var2 ... VarN Factor1 Factor2
Obs1  1-5 1-5  ... 1-5     
Obs2  1-5 1-5  ... ...
Obs3  ... ...  ... ...

Each data point is a Likert item ranging from 1 to 5.

I am plotting total sums using a dichotomized version of the data (every item above 4 becomes 1, otherwise 0).

I converted the data like this:

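# Dichotomize every column: 1 if the rating is >= 4, else 0
# (stands in for the dichotomize() call; assumes only the item columns are present)
MyDataFrame[] <- lapply(MyDataFrame, function(x) as.integer(x >= 4))
p <- colSums(MyDataFrame)
p <- data.frame(var = names(p), value = p)
# stat = "identity" plots the precomputed sums instead of counting rows
ggplot(p, aes(var, value)) + geom_bar(stat = "identity") + coord_flip()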


Doing this, I lose the information provided by Factor1 etc. I'd like to use stacking to show which group of people each rating came from.

Is there an elegant solution to this problem? I read about using reshape to melt the data and then applying ggplot?

Yes, essentially reshape is your friend. You want one variable with the result and one variable with the label for that result.
– PaulHurleyuk, commented Feb 5, 2012 at 22:03
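To illustrate the comment above, here is a minimal sketch of what melting does. The toy data frame `wide` is hypothetical (only the column names are borrowed from the question), and it assumes the reshape2 package is installed.

```r
library(reshape2)

# Hypothetical toy data: two dichotomized items plus one grouping factor
wide <- data.frame(Var1 = c(1, 0, 1),
                   Var2 = c(0, 0, 1),
                   Factor1 = factor(c("a", "b", "a")))

# One row per (observation, item): 'variable' holds the item label,
# 'value' holds the result for that item; Factor1 is carried along as an id
long <- melt(wide, id.vars = "Factor1")
long
```

With the data in this long shape, the item label can be mapped to the x axis and the factor to the fill colour.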

chl · Accepted Answer · 2012-02-06 09:55:02Z

I would suggest the following: use one of your factors for stacking and the other one for faceting. You can remove position="fill" from geom_bar() to show counts instead of standardized values (proportions).

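# Simulate 100 respondents x 10 Likert items (1-5) plus two grouping factors
my.df <- data.frame(replicate(10, sample(1:5, 100, replace=TRUE)), 
                    F1=gl(4, 5, 100, labels=letters[1:4]), 
                    F2=gl(2, 50, labels=c("+","-")))
# Dichotomize the items: 1 if the rating is above 4, else 0
my.df[,1:10] <- apply(my.df[,1:10], 2, function(x) ifelse(x>4, 1, 0))
library(reshape2)
# Long format: one row per respondent and item, with F1 and F2 kept as ids
my.df.melt <- melt(my.df, id.vars=c("F1","F2"))
library(plyr)
# Sum the 1s within each F1 x F2 x item cell
res <- ddply(my.df.melt, c("F1","F2","variable"), summarize, sum=sum(value))
library(ggplot2)
# Stack by F1, facet by F2; position="fill" rescales each bar to proportions
ggplot(res, aes(y=sum, x=variable, fill=F1)) +
   geom_bar(stat="identity", position="fill") + 
   coord_flip() +
   facet_grid(. ~ F2) + 
   ylab("Percent") + xlab("Item")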

[Image: the plot produced by the code above]

In the above picture, I displayed observed frequencies of '1' (value above 4 on the Likert scale) for each combination of F1 (four levels) and F2 (two levels), where there are either 10 or 15 observations:

> xtabs(~ F1 + F2, data=my.df)
   F2
F1   +  -
  a 15 10
  b 15 10
  c 10 15
  d 10 15
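Because the cells differ in size (10 or 15 observations), counts and proportions can look quite different. As a sketch of the count-based variant mentioned earlier, the same pipeline can be run without position="fill". The seed, the "Count" axis label, and the name `p.counts` are my own choices for illustration, not part of the original answer.

```r
library(reshape2)
library(plyr)
library(ggplot2)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
my.df <- data.frame(replicate(10, sample(1:5, 100, replace = TRUE)),
                    F1 = gl(4, 5, 100, labels = letters[1:4]),
                    F2 = gl(2, 50, labels = c("+", "-")))
# Dichotomize the ten item columns: 1 if the rating is above 4, else 0
my.df[, 1:10] <- lapply(my.df[, 1:10], function(x) as.integer(x > 4))

# Sum the 1s within each F1 x F2 x item cell
res <- ddply(melt(my.df, id.vars = c("F1", "F2")),
             c("F1", "F2", "variable"), summarize, sum = sum(value))

# Default stacking: bar length is the number of 1s, not a proportion
p.counts <- ggplot(res, aes(x = variable, y = sum, fill = F1)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  facet_grid(. ~ F2) +
  ylab("Count") + xlab("Item")
p.counts
```

Here the total bar length varies by item, so items that rarely score above 4 are visibly shorter, which the filled version hides.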

I then computed conditional item sum scores with ddply,† using a 'melted' version of the original data.frame. I believe the remaining graphical commands are highly configurable, depending on what kind of information you want to display.

† In this simplified case, the ddply instruction is equivalent to with(my.df.melt, aggregate(value, list(F1=F1, F2=F2, variable=variable), sum)).
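The equivalence in the footnote can be checked with a small sketch. The smaller simulated data set (3 items, 40 rows, already dichotomized) and the names `by.ddply`, `by.aggr`, and `both` are assumptions made here for illustration.

```r
library(reshape2)
library(plyr)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
my.df <- data.frame(replicate(3, sample(0:1, 40, replace = TRUE)),
                    F1 = gl(4, 5, 40, labels = letters[1:4]),
                    F2 = gl(2, 20, labels = c("+", "-")))
my.df.melt <- melt(my.df, id.vars = c("F1", "F2"))

by.ddply <- ddply(my.df.melt, c("F1", "F2", "variable"),
                  summarize, sum = sum(value))
by.aggr  <- with(my.df.melt,
                 aggregate(value, list(F1 = F1, F2 = F2, variable = variable), sum))

# aggregate() names its result column 'x' and may order rows differently,
# so align the two results on the grouping columns before comparing
both <- merge(by.ddply, by.aggr, by = c("F1", "F2", "variable"))
all(both$sum == both$x)
```

The two results hold the same sums per cell; they differ only in the name of the summary column and possibly in row order.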
