The function below, used after sorting within the grouping variable grp, is intended to provide cumulative shares that can be used for quantile measurement. Its rather odd structure is because all of these variables are about 6 million elements long, and every time I copy another variable and hold it in memory it increases the chance that my analysis will crash, so I try not to hold more than two variables in memory at any one time. (testX() is just my little object-testing program; it does an str, a summary, etc.)
popWt <- c(1,2,3,1,2,3,4)
year <- factor(c(1,1,1,2,2,2,2))
So the desired outcome from the data above after unlisting is roughly:
0.166666667 0.5 1 0.1 0.3 0.6 1
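For comparison, the target numbers above can be reproduced with base R's ave(), which applies a function within each group and returns a vector aligned with the input. This is only a sketch of the intended result, not the function in question: it assumes the data are already sorted within groups, it ignores the select.L option, and the helper name cumShare is made up for illustration.

```r
popWt <- c(1, 2, 3, 1, 2, 3, 4)
year  <- factor(c(1, 1, 1, 2, 2, 2, 2))

# Hypothetical helper: cumulative share within a single group
cumShare <- function(x) cumsum(x) / sum(x)

# ave() applies cumShare within each level of year and returns
# a plain numeric vector in the original element order
shares <- ave(popWt, year, FUN = cumShare)
print(shares)
```

Because ave() returns a vector rather than a list, no unlist() step is needed afterwards, which may also help with the memory concern.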
cumPopShare.L produces cumulative shares of the population, for groups defined by a factor (grp), and with an optional logical vector to select sub-samples prior to cumulation. Often results are most meaningful if population is sorted prior to cumulation.
cumPopShare.L <- function(pop, select.L=NULL, grp){
  if (!is.null(select.L)) {pop <- pop * select.L}
  groups <- split(pop, grp)
  gLengths <- lapply(groups, FUN=length)
  gSums <- lapply(groups, FUN=sum)
  function(groups, gLengths, gSums)
  out.L <- list(numeric())
  str(gLengths[1])
  out.L[[1]] <-list(numeric(length=as.numeric(gLengths[1])))
  testX(out.L)
  for (i in length(groups)){
    str(gLengths[i])
    testX(out.L)
    out.L[[i]] <- rep_len(1/gSums[[i]], length.out=gLengths[[i]]) *
      cumsum(groups[[i]])
  }
  out.L
}
cumPopShare.V <- unlist(cumPopShare.L(pop=popWt, grp=year), use.names=FALSE)
I am getting several slightly different versions of this error:
List of 1
 $ 1: int 3
Error in out.L[[1]] <- list(numeric(length = as.numeric(gLengths[1]))) :
  object 'out.L' not found
This error is from the second appearance of out.L, but when I put a summary or str in after the first appearance, R also denied that out.L exists.
I find this puzzling because in both cases I am trying to assign something to elements of the out.L variable with [[<-. I have tested these assignments at the command line, and both of them work fine, so I am guessing that this is a scoping issue. But I've been bashing my head against it for hours, and all I have gotten is a sore head.
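One possibility worth checking, offered as a guess rather than a diagnosis: a bare function(...) line with no body on it makes R's parser take the next expression as that function's body. The sketch below uses made-up names (f, y) to show that an assignment swallowed this way never runs in the enclosing function, which would be consistent with an "object not found" error on the following line.

```r
f <- function(x) {
  # This line starts an anonymous function. With no body on the same line,
  # the parser uses the NEXT expression (the assignment to y) as its body.
  function(a, b)
  y <- x * 2
  # The anonymous function was created and discarded, never called,
  # so y was never assigned in f's own environment.
  exists("y", envir = environment(), inherits = FALSE)
}

print(f(1))  # FALSE
```

Separately, for (i in length(groups)) visits only the last index; seq_along(groups) would visit every group.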
This is R 3.0.2, running under RStudio, on a cranky old Windows XP machine.
Any help or suggestions would be much appreciated.
Peace, andrewH