
I have a programming question in R that I have not been able to solve, even after spending hours looking through possible answers on the internet and on Stack Overflow.

I have a factor variable in a column of a data.frame that looks like this:

Columnname
agsgssg
agsgssg
agsgssg
adgatata
ahagha
ahagha
ahagha
ahagha
aghaatah
ghssghs
ghssghs
ghssghs

The factor variable is not directly transformable into numeric with as.numeric(as.character()) because each level is a string, not a number.
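To make the problem concrete, here is a minimal sketch of why that idiom fails for text levels. The vectors `x` and `y` are made-up illustrations, not the real data.

```r
# Hypothetical factor with text levels, similar to the column above
x <- factor(c("agsgssg", "adgatata", "ahagha"))

# The strings cannot be parsed as numbers, so every element becomes NA
# (R also emits a coercion warning, suppressed here)
res_text <- suppressWarnings(as.numeric(as.character(x)))
print(res_text)  # NA NA NA

# The idiom only works when the levels themselves look like numbers
y <- factor(c("10", "20", "10"))
res_num <- as.numeric(as.character(y))
print(res_num)   # 10 20 10
```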

What I need is:

Columnname    Numericcolumnname
agsgssg       1
agsgssg       1
agsgssg       1
adgatata      2
ahagha        3
ahagha        3
ahagha        3
ahagha        3
aghaatah      4
ghssghs       5
ghssghs       5
ghssghs       5

I have tried several approaches without success. These include using levels() on the factor variable, and using freq() on it to work out how many rows there are for each level and then building a repeated number for each level with several "for" loops.

I feel that this should have a very simple solution, but I am just not seeing it.

Thank you for your consideration

  • From the example: df$Numericcolumnname <- as.numeric(df$Columnname) Commented Feb 22, 2016 at 16:34
  • match(df$Columnname, unique(df$Columnname))? Commented Feb 22, 2016 at 16:34
  • @PierreLafortune Your solution will not work if the levels are in a different order. Commented Feb 22, 2016 at 16:36
  • The user may not be looking for a particular order. It is not mentioned or hinted at. The intuition appears to be the underlying numeric equivalent of the factor variable as.numeric(x). Commented Feb 22, 2016 at 16:38
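
The ordering point raised in these comments can be illustrated with a small sketch. The vector `x` below is rebuilt from the values in the question and is assumed to be a plain factor with default (sorted) levels. `as.numeric(x)` returns the internal level codes, which follow the sorted level order, while `match(x, unique(x))` numbers the values by first appearance.

```r
# Values from the question, as a factor with default (alphabetically sorted) levels
x <- factor(c(rep("agsgssg", 3), "adgatata", rep("ahagha", 4),
              "aghaatah", rep("ghssghs", 3)))

# Default levels are sorted, not in order of appearance
print(levels(x))  # "adgatata" "aghaatah" "agsgssg" "ahagha" "ghssghs"

# Internal codes follow the sorted levels
codes_sorted <- as.numeric(x)
print(codes_sorted)      # 3 3 3 1 4 4 4 4 2 5 5 5

# Position of each value among the unique values, in order of first appearance
codes_appearance <- match(x, unique(x))
print(codes_appearance)  # 1 1 1 2 3 3 3 3 4 5 5 5
```

Both results give one number per level; they differ only in which number each level gets.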

1 Answer


If the levels are in a different order from the order in which the values appear, we can convert the column to a factor whose levels are the unique elements of that column (in order of appearance), and then coerce it to numeric/integer.

df1$Numericcolumnname <- as.numeric(factor(df1$Columnname, 
                  levels=unique(df1$Columnname)))
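
As a quick check, here is a self-contained sketch that applies the line above to the data from the question. The construction of `df1` is an assumption made for illustration: the column is rebuilt by hand and `stringsAsFactors = TRUE` is set so that it is a factor, as the question describes.

```r
# Rebuild the example column from the question as a factor
# (stringsAsFactors = TRUE is needed explicitly from R 4.0.0 onwards)
df1 <- data.frame(
  Columnname = c(rep("agsgssg", 3), "adgatata", rep("ahagha", 4),
                 "aghaatah", rep("ghssghs", 3)),
  stringsAsFactors = TRUE
)

# Re-level by order of first appearance, then take the numeric codes
df1$Numericcolumnname <- as.numeric(factor(df1$Columnname,
                                           levels = unique(df1$Columnname)))

print(df1$Numericcolumnname)  # 1 1 1 2 3 3 3 3 4 5 5 5
```

Using `as.integer()` in place of `as.numeric()` would give the same codes as integers rather than doubles.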