4

If I have a data.table defined as:

    > dt <- data.table(x=c(1,2,3,4), y=c("y","n","y","m"), z=c("pickle",3,8,"egg"))

    > dt
        x   y        z 
    1:  1   y   pickle
    2:  2   n        3
    3:  3   y        8
    4:  4   m      egg

And a variable

    fn <- "z"

I get that I can pull a column from the data.table by the following:

    > dt[,fn, with=FALSE]

What I don't know how to do is the data.table equivalent of the following:

    > factorFunction <- function(df, fn) {
      df[,fn] <- as.factor(df[,fn])
      return(df)
     }

If I set fn="x" and call factorFunction(data.frame(dt),fn) it works just fine.

So I tried it with a data.table, but this doesn't work:

    > factorFunction <- function(dt, fn) {
      dt[,fn, with=FALSE] <- as.factor(dt[,fn, with=FALSE])
      return(dt)
     }

    Error in sort.list(y) : 'x' must be atomic for 'sort.list'
    Have you called 'sort' on a list?

1
  • By the way, here's one (very unidiomatic) way: dt[,fn] <- as.factor(dt[,fn, with=FALSE][[1]]) It's very close to what you've written, I think. Commented Jun 18, 2015 at 19:52

3 Answers

4

You can try:

    dt[, (fn) := factor(.SD[[1L]]), .SDcols = fn]

If there are multiple columns, use lapply(.SD, factor).

Wrapping it in a function:

    factorFunction <- function(df, fn) {
      df[, (fn) := factor(.SD[[1L]]), .SDcols = fn]
    }

    str(factorFunction(dt, fn))
    #Classes ‘data.table’ and 'data.frame':    4 obs. of  3 variables:
    #$ x: num  1 2 3 4
    #$ y: chr  "y" "n" "y" "m"
    #$ z: Factor w/ 4 levels "3","8","egg",..: 4 1 2 3
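For the multiple-column case mentioned above, a minimal sketch might look like the following. The vector name `cols` is an assumption introduced here for illustration; the table is the one from the question.

```r
library(data.table)

# Same table as in the question
dt <- data.table(x = c(1, 2, 3, 4),
                 y = c("y", "n", "y", "m"),
                 z = c("pickle", 3, 8, "egg"))

# Hypothetical vector of column names to convert
cols <- c("y", "z")

# .SDcols restricts .SD to those columns; lapply() returns one factor per column,
# and (cols) := assigns them back by reference
dt[, (cols) := lapply(.SD, factor), .SDcols = cols]

sapply(dt, class)
```

After this runs, `x` should still be numeric while `y` and `z` are factors.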

1 Comment

@DavidWagle Glad to know that it works. .SDcols specifies which columns go into .SD, and the operation is then applied to .SD[[1L]]. Here, [[1L]] extracts the single column from .SD (a list) as a vector. More generally, it would be lapply(.SD, yourfunction).
3

Similar to @akrun's answer:

    class(dt[[fn]])
    #[1] "character"

    setFactor <- function(DT, col) {
      #change the column type by reference
      DT[, c(col) := factor(DT[[col]])]
      invisible(NULL)
    }

    setFactor(dt, fn)
    class(dt[[fn]])
    #[1] "factor"
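Because `:=` changes the column by reference, every name bound to the same table sees the change, and only an explicit `copy()` is protected. The sketch below illustrates this; the names `alias` and `snapshot` are assumptions made up for the example.

```r
library(data.table)

dt <- data.table(x = c(1, 2, 3, 4), z = c("pickle", "3", "8", "egg"))

setFactor <- function(DT, col) {
  # change the column type by reference
  DT[, c(col) := factor(DT[[col]])]
  invisible(NULL)
}

snapshot <- copy(dt)  # deep copy: independent of dt
alias <- dt           # plain assignment: a second name for the same table

setFactor(alias, "z")

class(alias[["z"]])     # changed through the function
class(dt[["z"]])        # changed too, since alias and dt are the same object
class(snapshot[["z"]])  # unchanged, because copy() made a separate table
```

If the caller's table must stay untouched, passing `copy(dt)` to the function is one way to get that.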

1 Comment

Or just use set: setFactor <- function(DT, col) set(DT, j = col, value = factor(DT[[col]]) )
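The `set()` approach from the comment above extends naturally to several columns with a plain loop. This is only a sketch; the function name `setFactors` and its `cols` argument are hypothetical and not part of data.table.

```r
library(data.table)

# Hypothetical helper: convert each named column to a factor by reference
setFactors <- function(DT, cols) {
  for (col in cols) {
    set(DT, j = col, value = factor(DT[[col]]))
  }
  invisible(DT)
}

dt <- data.table(x = c(1, 2, 3, 4),
                 y = c("y", "n", "y", "m"),
                 z = c("pickle", 3, 8, "egg"))

setFactors(dt, c("y", "z"))
sapply(dt, class)
```

Returning the table invisibly mirrors what `set()` itself does, so the call can be chained or ignored.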
2

I don't recommend this, since it's very unidiomatic:

    factorFunction <- function(df, col) {
      df[,col] <- factor(df[[col]])
      df
    }

The upside is that it works in both base R and data.table:

    df <- setDF(copy(dt))

    class(df[[fn]]) # character
    df <- factorFunction(df,fn)
    class(df[[fn]]) # factor

    class(dt[[fn]]) # character
    dt <- factorFunction(dt,fn)
    class(dt[[fn]]) # factor
