I am using a list of variables to download and create dataframes in R. I'd like to be able to use this list to make changes to different columns in each dataframe, but I am having trouble calling particular columns using the list of variables.

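library(XML)  # provides xmlToDataFrame()

countries <- c("USA", "CHN")

# url and savedata are character vectors defined elsewhere, one entry per country
for (i in seq_along(countries)) {
    download.file(url[i], savedata[i])
    assign(countries[i], xmlToDataFrame(savedata[i]))
}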

Now I have dataframes that look like this:

head(USA)
        indicator       country date          value decimal
1 GDP (current US$) United States 2012 15684800000000       0
2 GDP (current US$) United States 2011 14991300000000       0
3 GDP (current US$) United States 2010 14419400000000       0
4 GDP (current US$) United States 2009 13898300000000       0
5 GDP (current US$) United States 2008 14219300000000       0
6 GDP (current US$) United States 2007 13961800000000       0

Next I would like to make several changes, such as converting the date column with as.Date() or changing the units of the value column, and I want to apply the same changes to both data frames (or to an arbitrary number of them, in case I add more entries to countries).

However, whenever I try this, I can't seem to use the names in the countries variable to get 'inside' each data frame. My initial guess was to put something like this in a loop:

assign(paste(countries[i],"date",sep="$"),
    as.Date(get(paste(countries[i],"date",sep="$"))))

In particular, I am confused about how get(paste(countries[i])) works when I am not trying to get the particular column date, and why paste(countries[i],"date",sep="$") prints the correct name but I still can't get just the one column I'd like to manipulate.

Additionally, I realize loops are not the ideal way of doing this, but I've been having the same problem with the apply functions, though that is likely due to my lack of experience with them. Suggestions for how to do it either in a loop or without one would be much appreciated. Super R novice here, just trying to learn. Also, if you've come across a clear explanation or answer for this somewhere else, I'd appreciate you pointing me towards it.

  • You might find it easier to flatten the structure and put all the data in a single data.frame. Using the list of data frames from @Ferdinand's answer below, you can stack them all into one data frame with the command Reduce(rbind, mylist). Once you have only one data frame, formatting operations become much easier. Commented Dec 29, 2016 at 15:48
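
A minimal sketch of the stacking suggestion in this comment, using two toy data frames in place of the downloaded ones (the value figures are placeholders, not real data, and the name all_data is only illustrative):

```r
# Toy list of data frames; the values are placeholders, not real figures
mylist <- list(
    USA = data.frame(country = "United States", date = "2012", value = 100),
    CHN = data.frame(country = "China", date = "2012", value = 200)
)

# Stack every data frame into one; do.call(rbind, mylist) is a common alternative
all_data <- Reduce(rbind, mylist)

# One formatting step now covers every country.
# The date column holds only a year, so build a full date before converting.
all_data$date <- as.Date(paste0(all_data$date, "-01-01"))

nrow(all_data)  # 2
```

The country column keeps track of which rows came from which data frame after stacking.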

1 Answer

It's much easier if you use lists. Start with an empty one:

mylist = list()

Then change this:

assign(countries[i],xmlToDataFrame(savedata[i]))

to this:

mylist[[i]] <- xmlToDataFrame(savedata[i])

Then make a function that does your formatting, for instance:

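f <- function(df){
    # The function is as.Date() (capital D). The date column here holds only a
    # year, so a month and day are appended before converting.
    within(df, date <- as.Date(paste0(date, "-01-01")))
}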

And use lapply to apply it to all dataframes:

mylist2 <- lapply(mylist, f)
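
The steps so far can be tried end to end with a self-contained sketch. It uses two small hand-made data frames in place of the downloaded ones; the China values are placeholders rather than real figures, and the helper name to_date is only illustrative. Since the date column in the question holds just a year, the sketch appends "-01-01" before calling as.Date():

```r
# Hand-made stand-ins for the downloaded data frames
mylist <- list(
    data.frame(country = "United States",
               date = c("2012", "2011"),
               value = c(15684800000000, 14991300000000)),
    data.frame(country = "China",
               date = c("2012", "2011"),
               value = c(100, 200))  # placeholder values, not real figures
)

# Formatting function: turn the year-only strings into Date objects
to_date <- function(df) {
    within(df, date <- as.Date(paste0(date, "-01-01")))
}

# Apply the same formatting to every data frame in the list
mylist2 <- lapply(mylist, to_date)

# Check the column class in each result
sapply(mylist2, function(df) class(df$date))
```

Adding another country only means adding another element to the list; the lapply call stays the same.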

If you want to access dataframes by name, use this:

names(mylist2) <- countries

And test:

mylist2[["USA"]]
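
As for why the attempt in the question fails: get() looks an object up by its exact name, so get("USA$date") searches for a variable literally called `USA$date` instead of the date column of USA, and assign() with that name would likewise create a new variable with that literal name. If you do want to keep the loop with separate variables, a minimal sketch (with a toy data frame standing in for the downloaded one) is to fetch the whole data frame, change the column, and assign the data frame back:

```r
# Toy stand-in for a downloaded data frame
USA <- data.frame(date = c("2012", "2011"),
                  value = c(15684800000000, 14991300000000))
countries <- c("USA")

for (nm in countries) {
    df <- get(nm)                                  # fetch the whole data frame by name
    df$date <- as.Date(paste0(df$date, "-01-01"))  # year-only strings need a full date
    assign(nm, df)                                 # write the modified copy back
}

class(USA$date)  # "Date"
```

The list version is usually easier to maintain, since it avoids building variable names from strings.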
1 Comment

That was very helpful and informative, both in understanding how to use lists and apply functions, as well as bringing my attention to the within function. Thought I was having some trouble in the end with it, but it was my own confusion, this worked well and was educational. Thank you.
