I have a factor in a data frame with levels like hot, warm, tepid, cold, very cold, freezing. I want to map them to an integer column with values in the range [-2, 2] for regression, with some values mapping to the same thing. I want to be able to specify the explicit mapping, so that very hot words map to 2, very cold words to -2, etc. How do I do this cleanly? I would love a function that I just pass some named list to, or something.

  • ouch, that downvote was harsh Commented Jan 28, 2013 at 7:28
  • FYI, [-2, 2] represents only 5 values, while your sample levels represent 6 values. Commented Jan 28, 2013 at 7:46
  • @AnandaMahto: in effect he says "with some values mapping to the same value". Commented Jan 28, 2013 at 10:08

2 Answers

Assume a factor vector x holds the categories.

temperatures <- c("hot", "warm", "tepid", "cold", "very cold", "freezing")
set.seed(1)
x <- as.factor(sample(temperatures, 10, replace=TRUE))
x
[1] warm     tepid    cold     freezing warm     freezing freezing cold    
[9] cold     hot     
Levels: cold freezing hot tepid warm

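Create a named numeric vector temp.map with the mapping, then index it by the level labels. The as.character() call matters: indexing by the factor itself would use its integer codes rather than its labels. Note that "hot" and "warm" map to the same value below, as do "very cold" and "freezing".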

temp.map <- c("hot"=2, "warm"=2, "tepid"=1, "cold"=0, "very cold"=-1, "freezing"=-1)    
y <- temp.map[as.character(x)]
y
warm    tepid     cold freezing     warm freezing freezing     cold 
   2        1        0       -1        2       -1       -1        0 
cold      hot 
   0        2 
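
The question asks for a function that takes a named list. The lookup above can be wrapped along these lines; this is only a sketch, and map_levels, score.map and df are hypothetical names introduced here, not part of the original answer. It accepts a named list or a named vector and stops if a level has no mapping, which plain indexing would silently turn into NA.

```r
# Hypothetical helper (not from the original answer): map factor levels to
# numbers using a named list or named vector, failing loudly on unmapped levels.
map_levels <- function(x, mapping) {
  mapping <- unlist(mapping)              # accept a named list or a named vector
  keys <- as.character(x)                 # look up by label, not by integer code
  unmapped <- setdiff(unique(keys[!is.na(keys)]), names(mapping))
  if (length(unmapped) > 0) {
    stop("No mapping for level(s): ", paste(unmapped, collapse = ", "))
  }
  unname(mapping[keys])                   # NA inputs stay NA
}

# Illustrative mapping onto [-2, 2]; the two coldest levels share -2.
score.map <- list("hot" = 2, "warm" = 1, "tepid" = 0,
                  "cold" = -1, "very cold" = -2, "freezing" = -2)

df <- data.frame(temp = factor(c("hot", "freezing", "tepid", "very cold")))
df$score <- as.integer(map_levels(df$temp, score.map))
df$score
# 2 -2 0 -2
```

Stopping on unmapped levels is a design choice made for this sketch; returning NA instead may suit data where unknown categories are expected.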

A factor can easily be converted to an integer using as.integer, which returns each value's position in the factor's levels.

For instance:

> temperatures <- c("Hot", "Warm", "Tepid", "Cold", "Very cold", "Freezing")
> set.seed(12345)
> a <- sample(temperatures, 10, replace = TRUE)
> a <- factor(a, levels = temperatures)
> a
 [1] Very cold Freezing  Very cold Freezing  Tepid     Hot       Warm     
 [8] Cold      Very cold Freezing 
Levels: Hot Warm Tepid Cold Very cold Freezing
> as.integer(a)
 [1] 5 6 5 6 3 1 2 4 5 6

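To shift the codes towards the [-2, 2] range, subtract 3. With six levels the result actually spans -2 to 3, and "Hot" gets the lowest value because it is the first level: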

> as.integer(a)-3
  [1]  2  3  2  3  0 -2 -1  1  2  3
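
Subtracting a constant keeps six distinct values, so fitting six levels into five integers requires merging at least two of them. One way to do that with base R is to assign a named list to levels(), where each name is a new level and each value lists the old levels it absorbs. The sketch below is an illustration only: the variable b and the particular grouping are assumptions made here, not part of the original answer.

```r
# One value per level, so the result is easy to check by eye.
b <- factor(c("Hot", "Warm", "Tepid", "Cold", "Very cold", "Freezing"))

# Assumed grouping, for illustration: names are the new levels,
# values are the old levels merged into them.
levels(b) <- list("-2" = c("Very cold", "Freezing"),
                  "-1" = "Cold",
                  "0"  = "Tepid",
                  "1"  = "Warm",
                  "2"  = "Hot")

# Convert the new labels (not the internal codes) to integers.
scores <- as.integer(as.character(b))
scores
# 2 1 0 -1 -2 -2
```

Going through as.character() reads the numbers from the level labels, so the result does not depend on the order in which the levels happen to be stored.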

2 Comments

I believe this solution does not address the requirement of mapping multiple levels to the same numeric value.
@Leo: ah, did not notice that part, your solution works well for that.
