I'd like to create a bar chart using factors and more than two variables! My data looks like this:

     Var1 Var2 ... VarN Factor1 Factor2
Obs1  1-5 1-5  ... 1-5     
Obs2  1-5 1-5  ... ...
Obs3  ... ...  ... ...

Each data point is a Likert item ranging from 1 to 5.

I am plotting the column totals of a dichotomized version of the data (every item above 4 becomes 1, everything else 0).

I converted the data like this:

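# Dichotomize the numeric item columns with the >= 4 cutoff; Factor1/Factor2 are skipped
is.item <- sapply(MyDataFrame, is.numeric)
MyDataFrame[is.item] <- lapply(MyDataFrame[is.item], function(x) as.integer(x >= 4))
# Column totals, reshaped into a two-column data frame for plotting
p <- colSums(MyDataFrame[is.item])
p <- data.frame(var = names(p), value = p)
library(ggplot2)
# stat = "identity" plots the precomputed totals instead of counting rows
ggplot(p, aes(var, value)) + geom_bar(stat = "identity") + coord_flip()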

[Figure: horizontal bar chart of the summed dichotomized scores per variable]

Doing this, I lose the information provided by Factor1 etc. I'd like to use stacking to show which group of people each rating came from.

Is there an elegant solution to this problem? I read about using reshape to melt the data and then applying ggplot.

  • Yes, essentially reshape is your friend. You want one variable with the result and one variable with the label for that result. Commented Feb 5, 2012 at 22:03

1 Answer

I would suggest the following: use one of your factors for stacking and the other one for faceting. You can remove position="fill" from geom_bar() to show counts instead of standardized values.

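# Simulated data: 10 Likert items (1-5) for 100 respondents, plus two grouping factors
my.df <- data.frame(replicate(10, sample(1:5, 100, replace=TRUE)), 
                    F1=gl(4, 5, 100, labels=letters[1:4]), 
                    F2=gl(2, 50, labels=c("+","-")))
# Dichotomize the items: 1 if the rating is above 4, else 0
my.df[,1:10] <- apply(my.df[,1:10], 2, function(x) ifelse(x>4, 1, 0))
library(reshape2)
# Long format: one row per respondent and item, with F1 and F2 kept as identifiers
my.df.melt <- melt(my.df, id.vars=c("F1","F2"))
library(plyr)
# Sum of the dichotomized scores for each F1 x F2 x item combination
res <- ddply(my.df.melt, c("F1","F2","variable"), summarize, sum=sum(value))
library(ggplot2)
# Bars stacked by F1 and rescaled to proportions, one panel per level of F2
ggplot(res, aes(y=sum, x=variable, fill=F1)) +
   geom_bar(stat="identity", position="fill") + 
   coord_flip() +
   facet_grid(. ~ F2) + 
   ylab("Percent") + xlab("Item")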

[Figure: horizontal stacked bar chart of each item's share by F1, faceted by F2]

In the above picture, I displayed observed frequencies of '1' (value above 4 on the Likert scale) for each combination of F1 (four levels) and F2 (two levels), where there are either 10 or 15 observations:

> xtabs(~ F1 + F2, data=my.df)
   F2
F1   +  -
  a 15 10
  b 15 10
  c 10 15
  d 10 15

I then computed conditional item sum scores with ddply, using a 'melted' version of the original data.frame. I believe the remaining graphical commands are highly configurable, depending on what kind of information you want to display.
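As one example of that configurability, here is a minimal sketch of the count variant mentioned at the top of this answer: the same plot without position="fill", so the bars show raw sums rather than proportions. To keep the sketch runnable on its own, it rebuilds the summary table with base R's stack() and aggregate() rather than melt() and ddply(). The seed, the names long, res.counts and p.counts, and the axis label are my own choices and are not part of the code above.

```r
library(ggplot2)

set.seed(1)  # arbitrary seed (an assumption), only for reproducibility
my.df <- data.frame(replicate(10, sample(1:5, 100, replace = TRUE)),
                    F1 = gl(4, 5, 100, labels = letters[1:4]),
                    F2 = gl(2, 50, labels = c("+", "-")))
my.df[, 1:10] <- lapply(my.df[, 1:10], function(x) as.integer(x > 4))

# Long format: stack() returns the scores in 'values' and the item name in 'ind'
long <- data.frame(F1 = rep(my.df$F1, times = 10),
                   F2 = rep(my.df$F2, times = 10),
                   stack(my.df[, 1:10]))
res.counts <- aggregate(values ~ F1 + F2 + ind, data = long, FUN = sum)

# Default stacking (no position = "fill"): bar segments are raw counts
p.counts <- ggplot(res.counts, aes(x = ind, y = values, fill = F1)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  facet_grid(. ~ F2) +
  ylab("Number of ratings above 4") + xlab("Item")
print(p.counts)
```

Because the F1 by F2 cells hold either 10 or 15 respondents, no single bar segment can exceed 15 in this variant.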

In this simplified case, the ddply instruction is equivalent to with(my.df.melt, aggregate(value, list(F1=F1, F2=F2, variable=variable), sum)).
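If you would rather avoid the reshape2 and plyr dependencies altogether, the whole summary step can be sketched in base R. This is only one possible approach, assuming the same simulated layout as above: stack() stands in for melt(), and the formula interface of aggregate() stands in for ddply(). The seed and the names long and res.base are hypothetical.

```r
set.seed(1)  # arbitrary seed (an assumption), only for reproducibility

# Same layout as my.df above: 10 dichotomized items plus two grouping factors
my.df <- data.frame(replicate(10, sample(1:5, 100, replace = TRUE)),
                    F1 = gl(4, 5, 100, labels = letters[1:4]),
                    F2 = gl(2, 50, labels = c("+", "-")))
my.df[, 1:10] <- lapply(my.df[, 1:10], function(x) as.integer(x > 4))

# Base-R stand-in for melt(): stack() concatenates the item columns one after
# another, so the factors are repeated once per item to stay aligned
long <- data.frame(F1 = rep(my.df$F1, times = 10),
                   F2 = rep(my.df$F2, times = 10),
                   stack(my.df[, 1:10]))
names(long)[3:4] <- c("value", "variable")

# Conditional sum scores: one row per F1 x F2 x item combination
res.base <- aggregate(value ~ F1 + F2 + variable, data = long, FUN = sum)
head(res.base)
```

The result has the same F1, F2 and variable columns as res, with the sums stored in value instead of sum, so aes(y=value) would replace aes(y=sum) in the plotting call.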
