I have an Rcpp function that is called from inside an R function. The R function builds some object (say, a large list) and passes it to the Rcpp function. Inside the Rcpp function, I process the R object and load the results into a number of C++ classes. At that point the R object is no longer needed, and I want to delete it to free memory for the main algorithms.

The idea is:

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
void cppFun(List structuredData)
{
  // copy structuredData to C++ classes
  // Now I want structuredData gone to save memory
  // main algorithms ...
}

/***R
rFun <- function(input)
{
  # R creates structuredData from input
  cppFun(structuredData)
}
*/

I tried calling R's "rm()" from C++, but it only finds object names in R's global environment. For example:

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
void cppFun()
{
  Language("rm", "globalDat").eval(); // evaluated in the global environment
  Language("gc").eval();
}

/***R
globalDat = 1:10
ls() # shows that "globalDat" has been created
cppFun()
ls() # shows that "globalDat" is no longer in the environment
*/

However, the following does not work:

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
void cppFun()
{
  Language("rm", "locDat").eval(); // still looks for "locDat" in the global environment
  Language("gc").eval();
}

/***R
rFun <- function (x)
{
  locDat = x
  print(ls()) # shows that "x" and "locDat" exist inside the function
  cppFun()
  print(ls())
}

globalDat = 1:10
ls() # shows that "globalDat" has been created
rFun(globalDat) # prints "x", "locDat" twice and gives a warning message: In rm("locDat") : object 'locDat' not found

locDat = globalDat
rFun(globalDat) # this still removes "locDat" from the global environment, not from the function
*/

Am I on the right track to the goal? Is there any better way?

Thank you!

I thought of a hacky solution:

  1. Write a shell class wrapping references to all the necessary C++ structured data classes.

  2. In the R function, (i) process the input; (ii) feed the structured R data to the Rcpp function; (iii) in the Rcpp function, allocate a shell class object with new and load the structured R data into it; (iv) memcpy the shell class pointer into a double (8 bytes; on a 32-bit system, use an int); (v) return the double from the Rcpp function; (vi) return the double from the R function. Now the structured R object dies while the C++ shell object allocated with new still lives. Call gc() for garbage collection.

  3. Feed the double to the main C++/Rcpp function. memcpy this double back into a shell class pointer. Delete the shell class object through that pointer before the function returns.

Tests show the above works. I have just found the "external pointer", or Rcpp::XPtr. Is it designed for a similar purpose?
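The three steps above can be sketched in plain C++, without Rcpp, to show the pointer round trip on its own. This is only an illustration under assumptions: Shell, packShell, unpackShell and runMain are hypothetical names, the input is a std::vector<double> standing in for the structured R data, and the platform is assumed to have 8-byte pointers. In real Rcpp code, Rcpp::XPtr (an external pointer that can delete the C++ object through a finalizer when R garbage-collects it) would normally take the place of the raw double handle.

```cpp
#include <cassert>
#include <cstring>
#include <memory>
#include <vector>

// Hypothetical stand-in for the "shell" class from step 1.
struct Shell {
    std::vector<double> values;  // deep copy of the structured input
};

// The trick only works if a pointer fits exactly into a double.
static_assert(sizeof(double) == sizeof(Shell*),
              "this sketch assumes 8-byte pointers");

// Step 2 (iii)-(v): allocate the shell, copy the data in,
// then store the pointer's bytes inside a double.
double packShell(const std::vector<double>& input) {
    Shell* shell = new Shell{input};
    double handle;
    std::memcpy(&handle, &shell, sizeof(shell));
    return handle;
}

// Step 3, first half: recover the pointer from the double's bytes.
Shell* unpackShell(double handle) {
    Shell* shell;
    std::memcpy(&shell, &handle, sizeof(shell));
    return shell;
}

// Step 3, second half: run the "main algorithm" (here just a sum)
// and delete the shell before returning. The unique_ptr does the delete.
double runMain(double handle) {
    std::unique_ptr<Shell> shell(unpackShell(handle));
    double total = 0.0;
    for (double v : shell->values) {
        total += v;
    }
    return total;
}
```

The handle must be passed to runMain exactly once: a second call would delete the same object twice, and never calling it leaks the shell. That bookkeeping is what an external pointer with a finalizer is meant to take over.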

1 Answer

Doing something along these lines is an antipattern in Rcpp and highly counterproductive. The problem is that Rcpp performs a shallow copy when moving an R object to C++, which means the R object shares its memory allocation with the instantiated C++ object. If you remove the R object while a C++ object still references it, you may run into trouble later in the process, most likely as a segmentation fault (segfault).

Now, if you intend to make a deep copy of the R object into a C++ structure, this is not quite as toxic: after a deep copy, the data no longer references the original R object. However, deep copying is not the default behavior in Rcpp.
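As a rough analogy in plain C++ (not Rcpp itself), the difference between the two kinds of copy can be sketched as follows. ShallowView, makeShallow and makeDeep are hypothetical names used only for this illustration; the std::vector plays the role of the R object's memory.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-owning view: it borrows memory owned by someone else,
// much like a shallow copy shares the original allocation.
struct ShallowView {
    const double* data;
    std::size_t size;
};

// "Shallow copy": no elements are copied, only the address is stored.
ShallowView makeShallow(const std::vector<double>& source) {
    return ShallowView{source.data(), source.size()};
}

// "Deep copy": every element is duplicated into independent storage.
std::vector<double> makeDeep(const std::vector<double>& source) {
    return std::vector<double>(source.begin(), source.end());
}
```

A change to the source is visible through the view but not through the deep copy, and only the deep copy remains safe to use once the source has released its memory. The price is that both copies exist at the same time while the deep copy is being made.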

With this being said, I strongly discourage deleting objects mid-process. If you truly are memory strapped, try "chunking"/dividing the data more, perform operations with a database, buy additional RAM, or wait for ALTREP.
