Replacing rows in R

Question

In R, I am reading a tab-separated file that contains comment lines, using

read.data.raw = read.csv(inputfile, sep='\t', header=F, comment.char='')

The file looks like this:

#comment line 1
data 1<tab>x<tab>y
#comment line 2
data 2<tab>x<tab>y
data 3<tab>x<tab>y

Now I extract the uncommented lines using

comment_ind = grep( '^#.*', read.data.raw[[1]])
read.data = read.data.raw[-comment_ind,]

This leaves me with:

 data 1<tab>x<tab>y
 data 2<tab>x<tab>y
 data 3<tab>x<tab>y
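One edge case of the negative-index pattern is worth noting: if the file happens to contain no comment lines, grep() returns an empty vector and the negative index then drops every row. A minimal sketch on a toy vector (the data and the name `keep` are assumptions for illustration), showing a logical index from grepl() as one possible safeguard:

```r
# Toy first column with no comment lines (assumed data, not from the question).
col1 <- c("data 1", "data 2", "data 3")

# grep() returns integer(0) when nothing matches.
comment_ind <- grep("^#", col1)

# Negating an empty index is still an empty index, so everything is dropped.
dropped_all <- col1[-comment_ind]
length(dropped_all)

# A logical index from grepl() stays correct when nothing matches.
keep <- !grepl("^#", col1)
kept <- col1[keep]
length(kept)
```

The same logical vector can be used as a row index on the data frame, for example `read.data.raw[keep, ]`.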

I modify this data in a separate script, which keeps the number of rows and columns unchanged. I would then like to put the modified rows back into the originally read data (with the user's comments) and return it to the user like this:

#comment line 1
modified data 1<tab>x<tab>y
#comment line 2
modified data 2<tab>x<tab>y
modified data 3<tab>x<tab>y

Since read.data keeps the original row names (row.names(read.data)), I tried

original.read.data[as.numeric(row.names(read.data)),] = read.data

But that didn't work, and I got a bunch of NAs.

Any ideas?

How exactly did it change the data? If it turned factors into characters, or made similar changes in data types, that would account for the NAs.
– David Robinson, Commented Aug 27, 2012 at 19:52
Also, you're going to get NAs after the comment line in any column if you force the column to be numeric. R wasn't really meant to read in comment data along with the data frame, though you could find ways around it. In any case, you'd have to be more specific about the type of data you read in and how you modified it.
– David Robinson, Commented Aug 27, 2012 at 19:58
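The data-type explanation suggested in the comment above can be illustrated with a small sketch. It assumes toy data shaped like the question's file and assumes the columns were read as factors (the default for read.csv before R 4.0); the names `raw_factor`, `raw_char`, `broken`, and `restored` are made up for illustration. Assigning a string that is not an existing factor level yields NA, whereas the same row-name-based assignment works when the columns are plain character vectors:

```r
# Toy data mirroring the question's layout (assumed, not the asker's real file).
v1 <- c("#comment line 1", "data 1", "#comment line 2", "data 2", "data 3")
v2 <- c("", "x", "", "x", "x")

# Case 1: factor columns. A new value that is not a level becomes NA
# (R also emits an "invalid factor level" warning here).
raw_factor <- data.frame(V1 = factor(v1), V2 = factor(v2))
broken <- raw_factor
broken$V1[2] <- "modified data 1"
is.na(broken$V1[2])

# Case 2: character columns. The row-name-based assignment from the
# question behaves as intended.
raw_char <- data.frame(V1 = v1, V2 = v2, stringsAsFactors = FALSE)
comment_ind <- grep("^#", raw_char$V1)
dat <- raw_char[-comment_ind, ]
dat$V1 <- paste("modified", dat$V1)

restored <- raw_char
restored[as.numeric(row.names(dat)), ] <- dat
restored$V1
```

If factors are indeed the cause, passing `stringsAsFactors = FALSE` to read.csv would be one way to avoid the problem.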
The data I'm reading in has 5 columns: columns 1-3 are numeric and columns 4-5 are character strings. In most cases I am replacing values in specific cells of the data frame (for example, data[5,8]=NA), and sometimes replacing a whole column (for example, data[[3]]=1:100). I forced R to read the comment lines because when I set comment.char to '#', I lost them. Reading the file this way lets me extract the uncommented lines and leave the commented lines behind. At least, that was the logic behind my choices.
– Omar Wagih, Commented Aug 27, 2012 at 20:22
Why not edit your original question to include a fully reproducible example?
– David Robinson, Commented Aug 27, 2012 at 20:24
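One of the "ways around it" hinted at in the comments might be to keep the comments out of the data frame entirely: read the file as raw lines, parse only the data lines, and slot the modified rows back into the line vector. The sketch below is an assumption-laden illustration, not the approach used in this thread; the in-memory `lines` vector stands in for `readLines(inputfile)`, and the names `is_comment`, `dat`, and `out` are hypothetical.

```r
# Stand-in for readLines(inputfile); contents copied from the question's example.
lines <- c("#comment line 1", "data 1\tx\ty", "#comment line 2",
           "data 2\tx\ty", "data 3\tx\ty")

is_comment <- grepl("^#", lines)

# Parse only the data lines, so comment text never enters the data frame
# and numeric columns keep their type.
dat <- read.table(text = lines[!is_comment], sep = "\t", header = FALSE,
                  stringsAsFactors = FALSE, comment.char = "", quote = "")

# Example modification that keeps the number of rows and columns.
dat$V1 <- paste("modified", dat$V1)

# Format each row back into tab-separated text and put it in its old slot.
out <- lines
out[!is_comment] <- do.call(paste, c(dat, sep = "\t"))
out
# writeLines(out, "output.tsv")  # hypothetical output path
```

Note that paste() formats numeric columns with default conversion, so explicit formatting may be needed if the exact number layout matters.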

dcarlson · Accepted Answer · 2012-08-27 21:41:06Z


Does this do what you want?

read.data.raw <- structure(list(V1 = structure(c(1L, 3L, 2L, 4L, 5L),
   .Label = c("#comment line 1", "#comment line 2", "data 1", "data 2", 
   "data 3"), class = "factor"), V2 = structure(c(1L, 2L, 1L, 2L, 2L), 
   .Label = c("", "x"), class = "factor"), V3 = structure(c(1L, 2L, 1L,
   2L, 2L), .Label = c("", "y"), class = "factor")), .Names = c("V1", 
   "V2", "V3"), class = "data.frame", row.names = c(NA, -5L))

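# indices of the comment rows
comment_ind <- grep('^#.*', read.data.raw[[1]])
# keep only the data rows; their original row names are preserved
read.data <- read.data.raw[-comment_ind,]
# modify V1 (gsub() returns a character vector, so V1 is no longer a factor)
read.data$V1 <- gsub("data", "DATA", read.data$V1)
# rbind() the comment rows back on, then order() by the original row names
# so that every row returns to its original position
new.data <- rbind(read.data.raw[comment_ind,], read.data)
new.data <- new.data[order(as.numeric(rownames(new.data))),]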
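To hand the result back to the user as a file, the recombined data frame could be written out with write.table(). This is a minimal sketch under stated assumptions: the toy `new.data` below stands in for the result of the rbind()/order() step with character columns, and the output path is a hypothetical temporary file. Because the comment rows were read with empty trailing fields, they are written with trailing tabs, which the sketch strips afterwards.

```r
# Toy stand-in for new.data after the rbind()/order() step (assumed values).
new.data <- data.frame(
  V1 = c("#comment line 1", "DATA 1", "#comment line 2", "DATA 2", "DATA 3"),
  V2 = c("", "x", "", "x", "x"),
  V3 = c("", "y", "", "y", "y"),
  stringsAsFactors = FALSE
)

out_file <- tempfile(fileext = ".tsv")  # hypothetical output path

# Write tab-separated text without quotes, row names, or a header row.
write.table(new.data, out_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Strip the trailing tabs left by the empty fields on comment rows.
out_lines <- sub("\t+$", "", readLines(out_file))
writeLines(out_lines, out_file)
```

Stripping trailing tabs this way would also remove genuinely empty trailing fields on data rows, so it only fits files where data rows always fill every column.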


Omar Wagih Over a year ago

Ah! Sorting by the row names, I didn't think of that! Works like a charm! THANKS!
