2

I have this simple "for loop" written in R which I have to convert in to C++. Following is the reproducible example of R code:

# Parameters required 
a <- 1.8
b <- 1
time.dt <- 0.1
yp <- 40 
insp.int <- 7
ph <- 2000

dt <- seq(0,ph,time.dt) # Time sequence
MD.set <- c(seq(insp.int, ph, insp.int), ph) # Decision points to check and set next inspection date

# Initialization
cum.y <- rep(0,length = length(dt)) 
init.y <- 0
flag <- FALSE

# At each iteration, the following loop generates a gamma distributed random number and cum.y keeps taking cumulative sum
# The objective is to return a vector cum.y with a conditional cumulative sum of previous iteration 
# When dt[i] matches any values in MD.Set AND corresponding cum.y[i] is also >= yp it changes the flag to true (the last if)
# At the start of the loop it checks if dt[i] matches any values in MD.Set AND flag is also true. If yes, then cum.y is reset to 0. 

for (i in 2:length(dt)){
  if (dt[i] %in% MD.set && flag == TRUE){
    cum.y[i] <- 0
    init.y <- 0
    flag <- FALSE
    next
  } else {
    cum.y[i] <- init.y + rgamma(n = 1, shape = a*time.dt, scale = b)
    init.y <- cum.y[i]
    if (dt[i] %in% MD.set && cum.y[i] >= yp){
      flag <- TRUE
    }
  }
}

res <- cbind(dt, cum.y)

Having no prior experience with C++, I am running into numerous problems when trying to do so. I need to do this conversion only to be able to use it in Rcpp package in R. Because the code runs slowly in R, specially when time.dt becomes smaller and I am guessing C++ would do the job faster. Can you help with this?

UPDATE 2: This is my proposal for the conversion with the help in comments and answer. However, I am not sure what is the equivalent of next in C++. If I use continue it keeps going through the rest of the codes (and execute the ones after else. If I use break then it comes out of the loop after the condition is true.

NumericVector cumy(double a, double b, double timedt, NumericVector dt, NumericVector MDSet, double yp){
  bool flag = false;
  int n = dt.size();
  double total = 0;
  NumericVector out(n);
  unordered_set<int> sampleSet(MDSet.begin(), MDSet.end());
  
  for (int i = 0; i < n; ++i){
    if (sampleSet.find(dt[i]) != sampleSet.end() && flag == true){
      out[i] = 0;
      total = 0;
      flag = false;
      continue;
    } else {
      out[i] = total + rgamma(1, a*timedt, b)[0];
      total = out[i];
      if (sampleSet.find(dt[i]) != sampleSet.end() && out[i] >= yp){
        flag = true;
      }
    }
  }
  return out;
}
7
  • 1
    What problems did you run into? Show your attempt, and what the problems were Commented May 3, 2021 at 11:11
  • Please check the update. I started with the part after "else" in my R code and added it in the question. Not sure how to convert this %in% of R to C++. I got a hint to use unordered_set and find() but not being able to correctly use it Commented May 3, 2021 at 11:21
  • 1
    Assuming NumericVector is just a vector of ints, then you'd do: unordered_set<int> sampleSet(MDSet.begin(), MDSet.end()); Commented May 3, 2021 at 11:27
  • 1
    This line is suspicious: sampleSet.find(dt[i]) == sampleSet.begin() as the order in unordered_set is not preserved (hence the name) so you do not know which one is the first. Didn't you mean sampleSet.end()? (possibly with !=) Commented May 3, 2021 at 11:48
  • 1
    @Rel_Ai Yes I mean something like that. No they are not the same: sampleSet.begin() is an iterator to an existing item of the unordered_set while sampleSet.end() is an iterator to a non-existing item returned by functions that does not find things in the container. Note that there are likely other items in between the two iterators, so not being the first does not mean that the item is not found (the returned iterator could be sampleSet.begin()+1 for example. Commented May 3, 2021 at 12:18

1 Answer 1

4

The error you are getting is simply because there's no automatic conversion from NumericVector to std::unordered_set<int>. You can fix this by doing:

std::unordered_set<int> sampleSet( MDSet.begin(), MDSet.end() )

This calls the unordered_set's constructor with beginning and end iterators of MDSet, which will populate the set with all values.

There's also another issue in your code:

if (sampleSet.find(dt[i]) == sampleSet.begin())

This will only be true if dt[i] is found at the first element of sampleSet. From your r code, I'm assuming you are just checking to see if the value dt[i] is within sampleSet, in which case, you want:

if (sampleSet.find(dt[i]) != sampleSet.end())

In C++, STL find methods generally return an iterator, and when the value is not found, it returns the end iterator, so if find's return value is not end, then the value was found within the set.

Sign up to request clarification or add additional context in comments.

3 Comments

So now I think I understand the use of unordered_set. I have updated the code in EDIT2 and I believe I am almost there except for translating the next into something equivalent for C++. I have found two options break and continue but they don¨t seem to be appropriate.
You shouldn't edit your question to ask another question. You can make a new question which is just specific to that new part. I don't really know what next does in r, but sounds like continue to me.
Actually it was not about the break statement, it was rather redundant. After I changed unordered_set into double, it seems to be working (unordered_set<double> sampleSet(MDSet.begin(), MDSet.end())).

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.