
I have a dataframe:

>picard
count    reads
 1    20681318
 2     3206677
 3      674351
 4      319173
 5      139411
 6      117706

How do I plot log10(count) vs log10(reads) as a ggplot2 bar plot?

I tried:

ggplot(picard) + geom_bar(aes(x=log10(count),y=log10(reads)))

But it is not accepting y=log10(reads). How do I plot my y values?

3 Answers

14

You can do something like this, but plotting the x axis, which is not continuous, on a log10 scale doesn't make sense to me:

ggplot(picard) +
    geom_bar(aes(x=count,y=reads),stat="identity") +
    scale_y_log10() +
    scale_x_log10()


If you only want a y axis with a log10 scale, just do:

ggplot(picard) +
    geom_bar(aes(x=count,y=reads),stat="identity") +
    scale_y_log10()



5 Comments

Is there a way to increase the width of the bars?
@user2703967 Yes, use the width argument of geom_bar().
The graph is very staggered because of the data. Can I use geom_density() to plot these values? I would probably need to set my own density, but I don't know how to do that.
@user2703967, Please think carefully about what you really wish to achieve before people start spending their time trying to help you. A density plot is quite different from a bar plot.
The thing is, because the data is so varied, the barplot is very staggered. That is why I want a density plot: it would give me a smooth line, but the shape would essentially be the same as the barplot.
6

Use stat="identity":

ggplot(picard) + geom_bar(aes(x=log10(count),y=log10(reads)), stat="identity")

You will actually get a warning with your approach:

Mapping a variable to y and also using stat="bin". With stat="bin", it will attempt to set the y value to the count of cases in each group. This can result in unexpected behavior and will not be allowed in a future version of ggplot2. If you want y to represent counts of cases, use stat="bin" and don't map a variable to y. If you want y to represent values in the data, use stat="identity". See ?geom_bar for examples. (Deprecated; last used in version 0.9.2)


2

There's a direct way to do this, i.e. by using the geom_col() function. Just make a tiny adjustment to your code:

ggplot(picard) + geom_col(aes(x=log10(count), y=log10(reads)))

and it will give the same output as setting the stat argument to "identity" with geom_bar(). The thing is, geom_bar() uses stat="count" by default, so it does not take a variable for the y-axis. It simply uses the count, i.e. the number of occurrences of each x value, as its y value. I hope this answers your question.
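
One way to check this equivalence yourself is to compare the data that each layer actually draws. The sketch below assumes the question's data frame is rebuilt by hand, and it uses layer_data() from ggplot2 to pull out the computed bar heights; the names p_col, p_bar and same_heights are just illustrative.

```r
library(ggplot2)

# Data copied from the question (rebuilt by hand; an assumption about its structure)
picard <- data.frame(
  count = 1:6,
  reads = c(20681318, 3206677, 674351, 319173, 139411, 117706)
)

# Same mapping, once with geom_col() and once with geom_bar(stat = "identity")
p_col <- ggplot(picard) +
  geom_col(aes(x = log10(count), y = log10(reads)))
p_bar <- ggplot(picard) +
  geom_bar(aes(x = log10(count), y = log10(reads)), stat = "identity")

# layer_data() returns the data frame a layer draws from, after the stat is applied
same_heights <- isTRUE(all.equal(layer_data(p_col)$y, layer_data(p_bar)$y))
print(same_heights)
```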

