Change multiple column names in R

Question

I have a data frame called "wheat_cities".

The columns in my dataframe are as follows

 [1] "Date"                                  "Wheat..Maximum.Price"                 
 [3] "Wheat..Minimum.Price"                  "Wheat..Modal.Price"                   
 [5] "Wheat..North.Zone..Agra"               "Wheat..North.Zone..Amritsar"          
 [7] "Wheat..North.Zone..Bhatinda"           "Wheat..North.Zone..Chandigarh"        
 [9] "Wheat..North.Zone..Dehradun"           "Wheat..North.Zone..Delhi"             
[11] "Wheat..North.Zone..Gurgaon"            "Wheat..North.Zone..Haldwani"          
[13] "Wheat..North.Zone..Hisar"              "Wheat..North.Zone..Jammu"             
[15] "Wheat..North.Zone..Kanpur"             "Wheat..North.Zone..Karnal"            
[17] "Wheat..North.Zone..Lucknow"            "Wheat..North.Zone..Ludhiana"          
[19] "Wheat..North.Zone..Mandi"              "Wheat..North.Zone..Panchkula"         
[21] "Wheat..North.Zone..Shimla"             "Wheat..North.Zone..Srinagar"          
[23] "Wheat..North.Zone..Varanasi"           "Wheat..West.Zone..Ahmedabad"          
[25] "Wheat..West.Zone..Bhopal"              "Wheat..West.Zone..Bhuj"               
[27] "Wheat..West.Zone..Gwalior"             "Wheat..West.Zone..Indore"             
[29] "Wheat..West.Zone..Jabalpur"            "Wheat..West.Zone..Jaipur"             
[31] "Wheat..West.Zone..Jodhpur"             "Wheat..West.Zone..Kota"               
[33] "Wheat..West.Zone..Mumbai"              "Wheat..West.Zone..Nagpur"             
[35] "Wheat..West.Zone..Panaji"              "Wheat..West.Zone..Raipur"             
[37] "Wheat..West.Zone..Rajkot"              "Wheat..West.Zone..Rewa"               
[39] "Wheat..West.Zone..Sagar"               "Wheat..West.Zone..Surat"              
[41] "Wheat..East.Zone..Bhagalpur"           "Wheat..East.Zone..Bhubaneshwar"       
[43] "Wheat..East.Zone..Cuttack"             "Wheat..East.Zone..Patna"              
[45] "Wheat..East.Zone..Purnia"              "Wheat..East.Zone..Ranchi"             
[47] "Wheat..East.Zone..Rourkela"            "Wheat..East.Zone..Sambalpur"          
[49] "Wheat..East.Zone..Siliguri"            "Wheat..North.East.Zone..Aizwal"       
[51] "Wheat..North.East.Zone..Dimapur"       "Wheat..North.East.Zone..Guwahati"     
[53] "Wheat..North.East.Zone..Itanagar"      "Wheat..North.East.Zone..Shillong"     
[55] "Wheat..South.Zone..Bengaluru"          "Wheat..South.Zone..Chennai"           
[57] "Wheat..South.Zone..Coimbatore"         "Wheat..South.Zone..Dharwad"           
[59] "Wheat..South.Zone..Dindigul"           "Wheat..South.Zone..Ernakulam"         
[61] "Wheat..South.Zone..Hyderabad"          "Wheat..South.Zone..Karimnagar"        
[63] "Wheat..South.Zone..Kozhikode"          "Wheat..South.Zone..Mangalore"         
[65] "Wheat..South.Zone..Mysore"             "Wheat..South.Zone..Palakkad"          
[67] "Wheat..South.Zone..Port.Blair"         "Wheat..South.Zone..Puducherry"        
[69] "Wheat..South.Zone..Thiruchirapalli"    "Wheat..South.Zone..Thiruvananthapuram"
[71] "Wheat..South.Zone..Thrissur"           "Wheat..South.Zone..Tirunelveli"       
[73] "Wheat..South.Zone..Vijaywada"          "Wheat..South.Zone..Visakhapatnam"     
[75] "Wheat..South.Zone..Warangal"           "Wheat..South.Zone..Wayanad"           

I want to rename the columns so that columns 5 to 76 keep only the part after the second "..", and columns 2 to 4 keep only the part after the first "..".

Since the names differ in length, I am unable to use the substring command.

Please help. Thanks in advance!

akrun · Accepted Answer · 2018-01-24 11:40:43Z

3

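We could do this with sub: match any characters (.*) followed by two dots (\\.{2}), capture the remaining characters up to the end of the string ((.*)$) in a group, and replace the whole match with the backreference (\\1) to that group. Because the first .* is greedy, the match runs up to the last "..", so only the part after the final ".." is kept. Names that contain no "..", such as "Date", are left unchanged.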

names(data) <- sub(".*\\.{2}(.*)$", "\\1", names(data))
names(data)
#[1] "Date"          "Maximum.Price" "Minimum.Price" "Agra"

data

data <- data.frame(Date = c("2013-01-01", "2013-01-02"),
                   Wheat..Maximum.Price = 5:6, Wheat..Minimum.Price = 1:2,
                   Wheat..North.Zone..Agra = 6:7, stringsAsFactors = FALSE)
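
As a quick check of the greedy-match behaviour described above, here is a minimal sketch that applies the same pattern to a few names copied from the question's column list. The vectors nms and short are illustrative names, not part of the original answer. The sample includes a city name with a single dot ("Port.Blair") and the longer "North.East.Zone" prefix.

```r
# A few column names copied from the question (illustrative sample only)
nms <- c("Date",
         "Wheat..Modal.Price",
         "Wheat..North.East.Zone..Aizwal",
         "Wheat..South.Zone..Port.Blair")

# Greedy .* consumes everything up to the LAST "..";
# the capture group keeps whatever follows it
short <- sub(".*\\.{2}(.*)$", "\\1", nms)
print(short)
# [1] "Date"        "Modal.Price" "Aizwal"      "Port.Blair"
```

Single dots inside a name survive because the pattern only anchors on two consecutive dots.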

minem · Accepted Answer · 2018-01-24 11:47:34Z

3

# fixed = TRUE treats the pattern as a literal string, so "." is not a regex wildcard
names(data) <- gsub("Wheat..", "", names(data), fixed = TRUE)
names(data) <- gsub("North.Zone..", "", names(data), fixed = TRUE)
names(data)
# [1] "Date" "Maximum.Price" "Minimum.Price" "Modal.Price"   "Agra" "Amritsar"

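First we remove "Wheat.." from all column names and then we remove "North.Zone..". The other zone prefixes ("West.Zone..", "East.Zone..", "North.East.Zone..", "South.Zone..") need the same treatment.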
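
The same literal-replacement idea can be extended to every zone with a short loop. This is a minimal sketch under the assumption that the prefixes are exactly those visible in the question's column list. The names nms and prefixes are illustrative. Order matters: "North.East.Zone.." has to be removed before "East.Zone..", otherwise a stray "North." would be left behind.

```r
# A few column names copied from the question (illustrative sample only)
nms <- c("Date",
         "Wheat..Modal.Price",
         "Wheat..North.Zone..Agra",
         "Wheat..West.Zone..Bhopal",
         "Wheat..North.East.Zone..Aizwal")

# Longer prefix "North.East.Zone.." comes before "East.Zone.." on purpose
prefixes <- c("Wheat..", "North.East.Zone..", "North.Zone..",
              "West.Zone..", "East.Zone..", "South.Zone..")

# Strip each prefix as a literal string (no regex interpretation)
for (p in prefixes) {
  nms <- gsub(p, "", nms, fixed = TRUE)
}
print(nms)
# [1] "Date"        "Modal.Price" "Agra"        "Bhopal"      "Aizwal"
```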

1 Comment

Sandy Over a year ago

Thank you for your solution, it is very easy to follow.

Dave · Accepted Answer · 2018-01-24 12:03:54Z

0

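You could use strsplit, which splits a string on a given separator, ".." in this case. As you said, for columns 5 and up you want the piece after the last "..", and for columns 2 to 4 you want the second piece (the one after the first ".."). Pass fixed = TRUE so that ".." is matched literally rather than as a regular expression, and use tail(x, 1) from base R to take the last piece:

change_names <- strsplit(colnames(wheat_cities), "..", fixed = TRUE)

for (i in 2:ncol(wheat_cities)) {
  if (i %in% 2:4) {
    # "Wheat..Maximum.Price" splits into "Wheat" and "Maximum.Price"
    colnames(wheat_cities)[i] <- change_names[[i]][2]
  } else {
    # "Wheat..North.Zone..Agra" splits into "Wheat", "North.Zone", "Agra"
    colnames(wheat_cities)[i] <- tail(change_names[[i]], 1)
  }
}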
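
When the split is done on the literal string "..", the wanted piece is always the last one, so the loop can be replaced by a vectorised call. The sketch below assumes a small stand-in data frame, wheat_demo, which is hypothetical and only mimics the shape of wheat_cities.

```r
# Hypothetical stand-in for wheat_cities with three of its column names
wheat_demo <- data.frame(Date = "2013-01-01",
                         Wheat..Maximum.Price = 5,
                         Wheat..South.Zone..Port.Blair = 7)

# Split each name on the literal two-dot separator
parts <- strsplit(names(wheat_demo), "..", fixed = TRUE)

# Keep the last piece of every name; "Date" has a single piece and is unchanged
names(wheat_demo) <- vapply(parts, function(p) tail(p, 1), character(1))
print(names(wheat_demo))
# [1] "Date"          "Maximum.Price" "Port.Blair"
```

vapply with character(1) is used here so that an unexpected empty split would raise an error instead of silently producing a malformed name vector.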
