255 questions
34
votes
2
answers
29k
views
Understanding color scales in ggplot2
There are so many ways to define colour scales within ggplot2. After just loading ggplot2 I count 22 functions beginning with scale_color_* (or scale_colour_*) and the same number beginning with ...
141
votes
5
answers
25k
views
What are the differences between R's native pipe `|>` and the magrittr pipe `%>%`?
In R 4.1 (May 2021) a native pipe operator was introduced that is "more streamlined" than previous implementations. I already noticed one difference between the native |> and the magrittr ...
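For readers who have not used the native pipe yet, here is a minimal sketch of what it does, assuming R 4.1 or later. The vector `x` is made up for illustration. One difference that is easy to check is that the right-hand side of `|>` must be a function call with parentheses, whereas magrittr's `%>%` also accepts a bare function name. This is not necessarily the difference the question goes on to describe.

```r
# Requires R >= 4.1, where the native pipe was introduced.
x <- c(1, 4, 9)  # hypothetical example data

# The native pipe inserts the left-hand side as the first argument
# of the call on the right-hand side.
piped <- x |> sqrt() |> sum()

# The same computation written as a nested call.
nested <- sum(sqrt(x))

print(piped)
print(identical(piped, nested))
```

Writing `x |> sqrt` without the parentheses is a syntax error with the native pipe.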
16
votes
1
answer
5k
views
How to order data by value within ggplot facets
I have the following data frame:
library(tidyverse)
tdat <- structure(list(term = c("Hepatic Fibrosis / Hepatic Stellate Cell Activation",
"Cellular Effects of Sildenafil (Viagra)", "Epithelial ...
67
votes
3
answers
197k
views
How to debug "contrasts can be applied only to factors with 2 or more levels" error?
Here are all the variables I'm working with:
str(ad.train)
$ Date : Factor w/ 427 levels "2012-03-24","2012-03-29",..: 4 7 12 14 19 21 24 29 31 34 ...
$ Team ...
21
votes
1
answer
3k
views
What are primitive, internal, builtin, and special functions? [closed]
I have seen that some functions that call C-code are described as primitive, internal, builtin, or special. What are these functions?
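As a rough illustration of the terms in the question, base R lets you inspect how a function is implemented with `is.primitive()` and `typeof()`. This is a sketch only; the functions chosen here (`sum`, `if`, `mean`, `paste`) are just convenient examples.

```r
# Primitive functions are implemented directly in C and have no R-level body.
prim_sum  <- is.primitive(sum)    # TRUE
prim_mean <- is.primitive(mean)   # FALSE: mean is an ordinary R closure

# typeof() separates the two kinds of primitive from ordinary closures.
type_sum  <- typeof(sum)          # "builtin": arguments are evaluated first
type_if   <- typeof(`if`)         # "special": arguments arrive unevaluated
type_mean <- typeof(mean)         # "closure"

# Internal functions are R closures whose body calls .Internal().
# Printing the body of paste shows such a call.
print(body(paste))
```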
113
votes
2
answers
263k
views
What does "The following object is masked from 'package:xxx'" mean?
When I load a package, I get a message stating that:
"The following object is masked from 'package:xxx'
For example, if I load testthat then assertive, I get the following:
library(testthat)
library(...
30
votes
7
answers
80k
views
R: How to split a data frame into training, validation, and test sets?
I'm using R to do machine learning. Following standard machine learning methodology, I would like to randomly split my data into training, validation, and test data sets. How do I do that in R?
I ...
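One possible approach in base R, sketched under the assumption of a 60/20/20 split, is to build one label per row, shuffle the labels, and split on them. The data frame `df` and the proportions are hypothetical and not taken from the question.

```r
set.seed(1)  # for a reproducible shuffle

# Hypothetical example data with 100 rows.
df <- data.frame(id = 1:100, y = rnorm(100))

n       <- nrow(df)
n_train <- round(0.6 * n)
n_valid <- round(0.2 * n)
n_test  <- n - n_train - n_valid  # remainder, so the sizes always sum to n

# One label per row, in random order.
labels <- sample(rep(c("train", "validation", "test"),
                     times = c(n_train, n_valid, n_test)))

# split() returns a named list of data frames, one per label.
parts <- split(df, labels)
print(sapply(parts, nrow))
```

Because every row gets exactly one label, the three sets never overlap.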
19
votes
1
answer
4k
views
Issue when passing variable with dollar sign notation ($) to aes() in combination with facet_grid() or facet_wrap()
I am doing some analysis in ggplot2 at the moment for a project and by chance I stumbled across some (for me) weird behavior that I cannot explain. When I write aes(x = cyl, ...) the plot looks ...
134
votes
10
answers
107k
views
Get filename without extension in R
I have a file:
ABCD.csv
The part before the .csv is not fixed and can be of any length.
How can I extract the portion before the .csv?
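A short sketch of two ways this can be done in base R: `tools::file_path_sans_ext()` drops the extension, and `sub()` with an anchored pattern removes a known suffix. The directory in the last example is made up.

```r
filename <- "ABCD.csv"

# Drop whatever extension the file has.
stem_tools <- tools::file_path_sans_ext(filename)

# Remove a literal ".csv" only when it appears at the end of the string.
stem_sub <- sub("\\.csv$", "", filename)

# For a full path, strip the directory first with basename().
stem_path <- tools::file_path_sans_ext(basename("/some/dir/ABCD.csv"))

print(c(stem_tools, stem_sub, stem_path))
```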
67
votes
1
answer
12k
views
How and when should I use on.exit?
on.exit calls code when a function exits, but how and when should I use it?
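A typical use is restoring global state that a function changes temporarily. The sketch below uses a hypothetical helper, `with_digits()`, that changes the `digits` option and relies on `on.exit()` to put it back, whether the function returns normally or fails.

```r
# Hypothetical helper: format a value with a temporary 'digits' setting.
with_digits <- function(digits, value) {
  old <- options(digits = digits)    # options() returns the previous values
  on.exit(options(old), add = TRUE)  # runs when the function exits, even on error
  format(value)
}

before    <- getOption("digits")
formatted <- with_digits(3, pi)
after     <- getOption("digits")

print(formatted)
print(identical(before, after))  # the option was restored
```

Passing `add = TRUE` keeps any exit handlers registered earlier in the same function instead of replacing them.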
7
votes
4
answers
67k
views
What is the difference between = and == in R?
What is the difference between = and ==? I have found cases where the double equal sign will allow my script to run while one equal sign produces an error message. When should I use == instead of =?
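In short, `=` assigns a value (or names an argument in a function call), while `==` tests for equality and returns a logical value. A minimal sketch with made-up values:

```r
x = 5                  # assignment: x now holds 5 (same effect as x <- 5)

is_five  <- x == 5     # comparison: TRUE
is_three <- x == 3     # comparison: FALSE

# == is vectorised, so it compares element by element.
v <- c(1, 5, 5)
matches <- v == 5      # FALSE TRUE TRUE

# The usual place for == is a condition or a subset.
selected <- v[v == 5]

print(matches)
print(selected)
```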
45
votes
5
answers
338k
views
What does "Error: object '<myvariable>' not found" mean?
I got the error message:
Error: object 'x' not found
Or a more complex version like
Error in mean(x) :
error in evaluating the argument 'x' in selecting a method for function 'mean': ...
45
votes
1
answer
87k
views
Error in <my code> : target of assignment expands to non-language object
I received the error
Error in <my code> : target of assignment expands to non-language object
or
Error in <my code> : invalid (do_set) left-hand side to assignment
or
Error in <my ...
213
votes
7
answers
694k
views
What does %>% function mean in R?
I have seen the use of %>% (percent greater than percent) function in some packages like dplyr and rvest. What does it mean? Is it a way to write closure blocks in R?
18
votes
2
answers
12k
views
Why does summarize or mutate not work with group_by when I load `plyr` after `dplyr`?
Note: The title of this question has been edited to make it the canonical question for issues when plyr functions mask their dplyr counterparts. The rest of the question remains unchanged.
Suppose I ...
300
votes
10
answers
242k
views
Use dynamic name for new column/variable in `dplyr`
I want to use dplyr::mutate() to create multiple new columns in a data frame. The column names and their contents should be dynamically generated.
Example data from iris:
library(dplyr)
iris <- ...
723
votes
19
answers
1.1m
views
How should I deal with "package 'xxx' is not available (for R version x.y.z)" warning?
I tried to install a package, using
install.packages("foobarbaz")
but received the warning
Warning message:
package 'foobarbaz' is not available (for R version x.y.z)
Why doesn't R think that the ...
163
votes
18
answers
206k
views
Select the row with the maximum value in each group
I have a dataset with multiple observations for each subject. For each subject, I want to select the row that has the maximum value of 'pt'. For example, with the following dataset:
ID <- c(1,1,1,2,2,...
12
votes
1
answer
1k
views
Order of operator precedence when using ":" (the colon)
I am trying to extract values from a vector using numeric vectors expressed in two seemingly equivalent ways:
x <- c(1,2,3)
x[2:3]
# [1] 2 3
x[1+1:3]
# [1] 2 3 NA
I am confused why the ...
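The output in the excerpt follows from operator precedence: `:` binds more tightly than `+`, so `1+1:3` is read as `1 + (1:3)`. A small sketch that makes the grouping explicit:

```r
x <- c(1, 2, 3)

idx_default <- 1 + 1:3     # parsed as 1 + (1:3), giving 2 3 4
idx_grouped <- (1 + 1):3   # parentheses force 2:3, giving 2 3

# Index 4 is past the end of x, which is where the NA comes from.
res_default <- x[1 + 1:3]
res_grouped <- x[(1 + 1):3]

print(idx_default)
print(res_default)
print(res_grouped)
```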
71
votes
3
answers
66k
views
How to deal with nonstandard column names (white space, punctuation, starts with numbers)
df <- structure(list(`a a` = 1:3, `a b` = 2:4), .Names = c("a a", "a b"
), row.names = c(NA, -3L), class = "data.frame")
and the data looks like
  a a a b
1   1   2
2   2   3
3   3   4
Following ...
38
votes
3
answers
30k
views
How to generate permutations or combinations of object in R?
How to generate sequences of r objects from n objects? I'm looking for a way to do either permutations or combinations, with/without replacement, with distinct and non-distinct items (aka multisets).
...
287
votes
2
answers
161k
views
What are the main differences between R data files?
What are the main differences between .RData, .Rda and .Rds files?
Are there differences in compression, etc.?
When should each type be used?
How can one type be converted to another?
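As a rough sketch of the practical difference: `save()` and `load()` write and read .RData/.rda files, which can hold several objects and restore them under their original names, while `saveRDS()` and `readRDS()` handle a single object that you assign to any name you like. The objects and temporary file paths below are made up for illustration.

```r
obj   <- data.frame(a = 1:3)  # hypothetical objects to store
other <- "hello"

rdata_path <- tempfile(fileext = ".RData")
rds_path   <- tempfile(fileext = ".rds")

# .RData / .rda: any number of objects, stored together with their names.
save(obj, other, file = rdata_path)

# .rds: exactly one object, stored without a name.
saveRDS(obj, file = rds_path)

# load() recreates the objects under their saved names and returns those names.
env <- new.env()
loaded_names <- load(rdata_path, envir = env)

# readRDS() returns the object, so the caller chooses the name.
obj_copy <- readRDS(rds_path)

# Converting .RData to .rds: load, then save each object on its own.
converted_path <- tempfile(fileext = ".rds")
saveRDS(env$other, file = converted_path)

print(loaded_names)
print(identical(obj, obj_copy))
```

Loading into a separate environment avoids silently overwriting objects of the same name in the workspace.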
127
votes
21
answers
311k
views
Rename multiple columns by names
Someone should have asked this already, but I couldn't find an answer. Say I have:
x = data.frame(q=1,w=2,e=3, ...and many many columns...)
what is the most elegant way to rename an arbitrary ...
39
votes
4
answers
29k
views
Subset data frame based on number of rows per group
I have data like this, where some "name" occurs more than three times:
df <- data.frame(name = c("a", "a", "a", "b", "b", "c", "c", "c", "c"), x = 1:9)
name x
1 a 1
2 a 2
3 a 3
4 b ...
69
votes
10
answers
52k
views
Cleaning up factor levels (collapsing multiple levels/labels)
What is the most effective (i.e., efficient / appropriate) way to clean up a factor containing multiple levels that need to be collapsed? That is, how to combine two or more factor levels into one.
Here'...