I am trying to call different columns of a data.table inside a loop, to get unique values of each column.

Consider the simple data.table below.

> df <- data.table(var_a = rep(1:10, 2),
+                  var_b = 1:20)
> df
    var_a var_b
 1:     1     1
 2:     2     2
 3:     3     3
 4:     4     4
 5:     5     5
 6:     6     6
 7:     7     7
 8:     8     8
 9:     9     9
10:    10    10
11:     1    11
12:     2    12
13:     3    13
14:     4    14
15:     5    15
16:     6    16
17:     7    17
18:     8    18
19:     9    19
20:    10    20

My code works when I refer to a specific column outside a loop,

> unique(df$var_a)
 [1]  1  2  3  4  5  6  7  8  9 10
> unique(df[, var_a])
 [1]  1  2  3  4  5  6  7  8  9 10
> unique(df[, "var_a"])
    var_a
 1:     1
 2:     2
 3:     3
 4:     4
 5:     5
 6:     6
 7:     7
 8:     8
 9:     9
10:    10

but not when I do so within a loop that goes through different columns of the data.table.

> for(v in c("var_a","var_b")){
+   print(v)
+   df$v
+   unique(df[, .v])
+   unique(df[, "v"])
+ }
[1] "var_a"
Error in `[.data.table`(df, , .v) : 
  j (the 2nd argument inside [...]) is a single symbol but column name '.v' is not found. Perhaps you intended DT[, ...v]. This difference to data.frame is deliberate and explained in FAQ 1.1.
> 
> unique(df[, ..var_a])
Error in `[.data.table`(df, , ..var_a) : 
  Variable 'var_a' is not found in calling scope. Looking in calling scope because you used the .. prefix.
  • apply(df, 2, unique)? (Commented Feb 11, 2023 at 12:40)

4 Answers

1

For the first problem: when the column name is stored in a variable, you can either use the double-dot prefix (..v) or add with = FALSE to the data.table [ call:

for (v in c("var_a", "var_b")) {
  print(v)
  print(df$v)  # NULL: $ does not evaluate v, it looks up the literal name "v"
  ### either one of these will work:
  print(unique(df[, ..v]))
  # print(unique(df[, v, with = FALSE]))
}
# [1] "var_a"
# NULL
#     var_a
#     <int>
#  1:     1
#  2:     2
#  3:     3
#  4:     4
#  5:     5
#  6:     6
#  7:     7
#  8:     8
#  9:     9
# 10:    10
# [1] "var_b"
# NULL
#     var_b
#     <int>
#  1:     1
#  2:     2
#  3:     3
#  4:     4
#  5:     5
#  6:     6
#  7:     7
#  8:     8
#  9:     9
# 10:    10
# 11:    11
# 12:    12
# 13:    13
# 14:    14
# 15:    15
# 16:    16
# 17:    17
# 18:    18
# 19:    19
# 20:    20
#     var_b

But this just prints it without changing anything. If all you want to do is look at unique values within each column (and not change the underlying frame), then I'd likely go with

lapply(df[,.(var_a, var_b)], unique)
# $var_a
#  [1]  1  2  3  4  5  6  7  8  9 10
# $var_b
#  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20

which shows each column's name and its unique values. Using lapply (on df as a whole or on a subset of columns) is also preferable to the suggestion in the comments to use apply(df, 2, unique): apply converts the frame to a matrix first, which coerces every column to a common type. In this case, though, it returns the same results.
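To make the lapply approach work with column names held in a character vector, and to illustrate the apply() coercion, here is a minimal sketch. The names cols, sub, uniq and mixed are made up for illustration and are not from the question.

```r
library(data.table)

df <- data.table(var_a = rep(1:10, 2), var_b = 1:20)

# Column names held in a character vector (hypothetical name: cols)
cols <- c("var_a", "var_b")

# Select the columns first, then let lapply() visit each column in turn
sub  <- df[, ..cols]
uniq <- lapply(sub, unique)
str(uniq)

# Hypothetical mixed-type table to show the apply() pitfall
mixed <- data.table(id = 1:3, grp = c("x", "y", "x"))
class(lapply(mixed, unique)$id)    # column type is preserved
class(apply(mixed, 2, unique)$id)  # coerced, because apply() goes through as.matrix()
```

With only integer columns, as in the question, the two approaches agree; the difference shows up once a character column is present.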


1

Use .subset2 to refer to a column by its name:

for(v in c("var_a","var_b")) {
  print(unique(.subset2(df, v)))
}
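For comparison, the more commonly seen df[[v]] should give the same vector here. This is a small sketch under that assumption; the variable names by_subset2 and by_bracket are only for illustration.

```r
library(data.table)

df <- data.table(var_a = rep(1:10, 2), var_b = 1:20)

for (v in c("var_a", "var_b")) {
  by_subset2 <- .subset2(df, v)  # low-level extraction, no method dispatch
  by_bracket <- df[[v]]          # the usual list-style extraction by name
  print(identical(by_subset2, by_bracket))
  print(unique(by_bracket))
}
```

Both forms return a plain vector rather than a one-column data.table, which is usually what you want when collecting unique values.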


0

Following the hint in the first error message, this is the correct way to select the column in a loop:

for(v in c("var_a","var_b")){

    print(unique(df[, ..v]))

}
# won't print all the lines

As for the second error: you have not declared a variable called var_a. It looks like you want to select by name.

# works as you have shown
unique(df[, "var_a"])

# works once the variable is declared
var_a <- "var_a"
unique(df[, ..var_a])
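The same lookup works when the variable holds several column names. A minimal sketch, assuming a hypothetical variable cols; note that unique() on the resulting data.table deduplicates rows, not individual columns.

```r
library(data.table)

df <- data.table(var_a = rep(1:10, 2), var_b = 1:20)

# Hypothetical variable holding the names of the columns to select
cols <- c("var_a", "var_b")

sel_dots <- df[, ..cols]              # `..` looks up `cols` in the calling scope
sel_with <- df[, cols, with = FALSE]  # same selection, spelled differently

isTRUE(all.equal(sel_dots, sel_with))
nrow(unique(sel_dots))  # counts unique *rows* across both columns
```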


0

You may also be interested in the env argument of data.table's [ (see the development version). The illustration below is a single call, but you could use it in a loop too.

v="var_a"
df[, v, env=list(v=v)]

Output:

 [1]  1  2  3  4  5  6  7  8  9 10  1  2  3  4  5  6  7  8  9 10
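A loop version might look like the sketch below. It assumes an installed data.table whose [ accepts env (the development version mentioned above, or 1.15.0 and later as far as I know); col_values is a name made up for illustration.

```r
library(data.table)  # assumption: a version whose `[` supports the env argument

df <- data.table(var_a = rep(1:10, 2), var_b = 1:20)

for (v in c("var_a", "var_b")) {
  # The string held in `v` is substituted into `j` as a column symbol
  col_values <- df[, v, env = list(v = v)]
  print(unique(col_values))
}
```

Because the substituted j is a bare column symbol, each iteration yields a plain vector, matching the output shown above.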
