
I would like to run through specific columns in a dataframe and replace all NAs with 0s using a loop.

extract = read.csv("2013-09 Data extract.csv")
extract$Premium1[is.na(extract$Premium1)] <- 0
extract$Premium1

gives me the required result for Premium1 in dataframe extract, but I would like to loop through all 27 columns of premiums, so what I am trying is

extract = read.csv("2013-09 Data extract.csv")

for(i in 1:27) { 
  thispremium <- get(paste("extract$Premium", i, sep="")) 
  thispremium[is.na(thispremium)] <- 0
}

which gives

Error in get(paste("extract$Premium", i, sep = "")) : 
  object 'extract$Premium1' not found

Any idea on what is causing the error?

  • get() will not parse a string. Perhaps: get("extract")[[paste0("Premium",i)]], although it looks rather tortured. Why do you need to get 'extract'? Why not just: extract[[paste0("Premium",i)]] Commented Oct 15, 2013 at 14:39
  • Take a look at this answer: link Commented Oct 15, 2013 at 14:44
  • Thank you for that observation, DWin. I am now using for(i in 1:27) { extract[[paste0("Premium", i)]][is.na(extract[[paste0("Premium", i)]])] <- 0 } which gives the required result. Commented Oct 15, 2013 at 14:46
  • @user1886721 I do not want to replace all NAs in my dataframe; nevertheless an interesting read, thanks. Commented Oct 15, 2013 at 14:48
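To illustrate the point made in the comments: get() looks up an object by its exact name, so a string containing `$` is treated as one (non-existent) object name rather than evaluated as an expression. Note also that even if the lookup succeeded, assigning into `thispremium` would change a copy, not `extract` itself. A minimal sketch, assuming a small made-up data frame in place of the CSV (the values are hypothetical):

```r
# Toy stand-in for the CSV extract; the values are made up for illustration.
extract <- data.frame(Premium1 = c(10, NA, 30), Premium2 = c(NA, 5, 6))

# get() treats the whole string as a single object name, so this fails.
res <- try(get("extract$Premium1"), silent = TRUE)
inherits(res, "try-error")

# Build the column name as a string and index the data frame with [[ ]],
# assigning back into extract so the change is kept.
for (i in 1:2) {
  colname <- paste0("Premium", i)
  extract[[colname]][is.na(extract[[colname]])] <- 0
}
extract
```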

2 Answers


Do you need the loop because of other requirements? Because it works just fine without one:

extract[is.na(extract)] <- 0

If you want to do the replacement for some columns only, select those columns first, perform the replacement, and substitute the columns back into the original set:

first5 <- extract[, 1 : 5]
first5[is.na(first5)] <- 0
extract[, 1 : 5] <- first5

More generally, loops can (and should) mostly be avoided in R, especially when manipulating data frames. Operations often vectorise automatically (as above). When they don't, functions of the apply family can be used.
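As a rough sketch of the apply-family route, lapply() can run the replacement over a chosen set of columns. The toy data frame, the `cols` variable and the column names below are assumptions for illustration, not the real extract:

```r
# Toy data; the column names mimic the question, the values are made up.
extract <- data.frame(
  Premium1 = c(1, NA, 3),
  Premium2 = c(NA, 2, 2),
  ReasPremium1 = c(NA, 7, 8)
)

# Only these columns are processed; ReasPremium1 is left alone.
cols <- paste0("Premium", 1:2)

# lapply() visits each selected column; replace() swaps its NAs for 0.
extract[cols] <- lapply(extract[cols], function(x) replace(x, is.na(x), 0))
extract
```

For this particular task the vectorised `extract[cols][is.na(extract[cols])] <- 0` does the same job; the lapply() form is mainly useful when the per-column operation has no matrix-style equivalent.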


8 Comments

Thanks Konrad. I have seen this solution elsewhere on the internet but in my case I wish to remove the NAs only from a selected set of columns.
Thanks Konrad, I take note of your solution. Is there a way to make the column references relative so that the script will be editing the correct columns when the layout of the underlying dataframe changes?
@Adriaan Yes, of course. In fact you should use variables instead of hard-coded absolute columns. Just substitute the (variable) names of the columns instead of the range I’ve used. For instance, use something like c('Premium1', 'Premium3') instead of 1:5.
Wow, fantastic! What I am using now is: working <- extract[, c(paste0("Premium", 1:27))]; working[is.na(working)] <- 0; extract[, c(paste0("Premium", 1:27))] <- working. It solves the problem perfectly. Thanks!
@Adriaan In that case, no need for the c(…) even. Just use paste0("Premium", 1:27) directly.
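The name-based selection discussed in these comments can be sketched as follows. Because the columns are picked by name, the result does not depend on where they sit in the data frame. The data frame, the `Policy` column and the `premium_cols` variable are hypothetical stand-ins for the real extract:

```r
# Toy data with the premium columns deliberately not adjacent.
extract <- data.frame(
  Policy = c("A", "B", "C"),
  Premium1 = c(100, NA, 300),
  ReasPremium1 = c(NA, 20, 30),
  Premium2 = c(NA, 50, NA)
)

# paste0() already returns a character vector, so no c(...) wrapper is needed.
premium_cols <- paste0("Premium", 1:2)

# Select by name, replace the NAs, and write the columns back.
working <- extract[, premium_cols]
working[is.na(working)] <- 0
extract[, premium_cols] <- working
extract
```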

How about

for (colname in names(extract))
  extract[[colname]][is.na(extract[[colname]])] <- 0

(or even extract[is.na(extract)] <- 0)

Or, if you are not doing it to all the columns (I think I misread your question):

for(i in 1:27) { 
  colname <- paste0("Premium",i)
  extract[[colname]][is.na(extract[[colname]])] <- 0
}

Alternatively, you don't really need to know the number of such columns:

premium <- grep("^Premium[0-9]*$",names(extract))
extract[premium][is.na(extract[premium])] <- 0

3 Comments

Thanks sds; yes, I would like only to tamper with specific columns. Your and Dwin's solutions solve the problem but I found Konrad's solution more elegant :)
I get this error: Warning message: In grep(names(extract), "^Premium") : argument 'pattern' has length > 1 and only the first element will be used. I am not sure what this solution is trying to achieve, but if it is to select all columns with "Premium" in the heading then I'm wary of implementing it because there are other columns (such as ReasPremium1 for reinsurance) that I would like not to touch.
Sorry, fixed arg order. This regex selects the columns which start with "Premium" and then have digits.
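A quick way to check which columns such a pattern would touch is to run grep() with `value = TRUE` on the column names before doing any replacement. The names below are made up for illustration. The `^` anchor is what keeps ReasPremium1 out; note also that `[0-9]*` accepts zero digits, so a column called plain "Premium" would match, whereas `[0-9]+` requires at least one digit:

```r
# Hypothetical column names, including ones that should not be touched.
col_names <- c("Policy", "Premium1", "Premium27", "ReasPremium1", "Premium")

# Pattern with *: zero or more digits after "Premium".
loose <- grep("^Premium[0-9]*$", col_names, value = TRUE)
loose   # "Premium1" "Premium27" "Premium"

# Pattern with +: at least one digit is required.
strict <- grep("^Premium[0-9]+$", col_names, value = TRUE)
strict  # "Premium1" "Premium27"
```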
