0

I am having trouble using replicate() function in R to generate random numbers with Rcpp function. Consider the following function in R:

trial <- function(){rnorm(1)}
replicate(10, trial())

It generate 10 random numbers from Gaussian distribution. It works completely fine and produces result like this:

 [1]  0.7609912 -0.2949613  1.8684363 -0.3358377 -1.6043926  0.2706250  0.5528813  1.0228125 -0.2419092 -1.4761937 

However, I have a c++ function getRan() that generate a random number from Gaussian distribution. I again used replicate to call the function like this:

replicate(10,getRan())

It creates a vector of same number like this:

> replicate(10,getRan())
 [1] -1.374932 -1.374932 -1.374932 -1.374932 -1.374932 -1.374932 -1.374932 -1.374932 -1.374932 -1.374932
> replicate(10,getRan())
 [1] -0.3273785 -0.3273785 -0.3273785 -0.3273785 -0.3273785 -0.3273785 -0.3273785 -0.3273785 -0.3273785 -0.3273785
> replicate(10,getRan())
 [1] -0.7591953 -0.7591953 -0.7591953 -0.7591953 -0.7591953 -0.7591953 -0.7591953 -0.7591953 -0.7591953 -0.7591953
> replicate(10,getRan())
 [1] -1.698935 -1.698935 -1.698935 -1.698935 -1.698935 -1.698935 -1.698935 -1.698935 -1.698935 -1.698935

However, if i call the function multiple times, it works fine:

 getRan()
[1] 1.345227
> getRan()
[1] 0.3555393
> getRan()
[1] 1.587241
> getRan()
[1] 0.5313518

So what is the problem here? Does the replicate() function repeat the same function return from getRan() instead of calling getRan() multiple times? Is it a bug?

PS: I know I can use rnorm(n) to generate n normal random number, however, I want to use the c++ function to do more complex calculations based on generating random numbers

PPS: this is my c++ code:

double getRan(){
  unsigned seed = std::chrono::system_clock::now().time_since_epoch().count();
  std::default_random_engine generator(seed);
  std::normal_distribution<double> distribution (0.0,1.0);
  double epi = distribution(generator);
  return epi;
}
8
  • Interesting. Could you check what happens if you do lapply(1:10, getRan)? I suspect this is something to do with replicate taking an expression, but beyond this I ain't sure... Commented May 26, 2018 at 14:43
  • 1
    Post a reproducible example. This is so far just a (possibly well-intentioned, but still useless) rant ... Commented May 26, 2018 at 14:50
  • 1
    My crystal ball tells me that in your C++ a RNG is instantiated with the same seed. For more we need a minimal reproducible example. Commented May 26, 2018 at 16:36
  • @dash2 it produces: > lapply(1:10,getRan) Error in FUN(X[[i]], ...) : unused argument (X[[i]]) Commented May 26, 2018 at 16:40
  • @RalfStubner, thanks, i just realised the same thing too, i used time as the seed, so when the function is called with replicate(), the seed is the same, im figuring out how to solve it, here is my C++ code: double getRan(){ unsigned seed = std::chrono::system_clock::now().time_since_epoch().count(); std::default_random_engine generator(seed); std::normal_distribution<double> distribution (0.0,1.0); double epi = distribution(generator); return epi; } Commented May 26, 2018 at 16:42

2 Answers 2

2

Here is a counter-example showing that it works just fine:

Code

trialR <- function() { rnorm(1) }
Rcpp::cppFunction("double trialC() { return R::rnorm(0.0, 1.0); }")
Rcpp::cppFunction("Rcpp::NumericVector trialSugar() { return Rcpp::rnorm(1.0, 0.0, 1.0); }")

set.seed(123); replicate(3, trialR())
set.seed(123); replicate(3, trialC())
set.seed(123); replicate(3, trialSugar())

Output

Via Rscript to ensure fresh session etc pp:

edd@rob:/tmp$ Rscript so50543659.R 
[1] -0.560476 -0.230177  1.558708
[1] -0.560476 -0.230177  1.558708
[1] -0.560476 -0.230177  1.558708
edd@rob:/tmp$ 
Sign up to request clarification or add additional context in comments.

Comments

1

Dirks answer is correct. You should use R's RNG. If you insist on using a RNG in C++, you can use something like this:

#include <Rcpp.h>
// [[Rcpp::plugins(cpp11)]]
#include <random>

namespace {
  std::default_random_engine generator(std::random_device{}());
  std::normal_distribution<double> distribution (0.0,1.0);  
}

// [[Rcpp::export]]
double getRan(){
  return distribution(generator);
}

/*** R
replicate(10,getRan())
*/

This avoids creating a new instance of std::default_random_engine (and std::normal_distribution) at every function call. This is important since the properties of a RNG are only guaranteed for repeated draws from one RNG. Not for repeated draws from different RNGs seeded with (hopefully different) seeds.

BTW, on my system your original code does not produce the same number multiple times. If you are having troubles with std::random_device and are working on windows, you might be affected by this mingw bug. In that case seed by time is the better alternative.

2 Comments

What is the purpose of the namespace declaration if you don't include a name?
@thc It‘s an anonymous namespace. See for example here for more stackoverflow.com/questions/5793334/…

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.