Using if statements with text strings in function - R

Question

I am attempting to split my data into three separate dataframes (train, test, validate) using a function but it is not returning the results I require.

This is my function:

splitData <- function(type) {
    set.seed(1337)
    rowTrain <- createDataPartition(y = cleaned.data$CHURN, p = 0.7, list = FALSE)
    bufferDF <- cleaned.data[-rowTrain,]
    rowTest <- createDataPartition(y = cleaned.data$CHURN, p = 0.50, list = FALSE)
    if(type == "train") {cdTrain <- cleaned.data[rowTrain,]}
    if(type == "train") {cdTrain}
    if(type == "test") {cdTest <- cleaned.data[rowTest,]}
    if(type == "test") {cdTest}
    if(type == "validate") {cdValidate <- bufferDF[-rowTest,]}
    if(type == "validate") {cdValidate}
}

Could you please shed some light on where I am going wrong?

Cheers

Why do you have two if statements for each type? Why not just do, e.g., if(type=="train") {cdTrain<-cleaned.data[rowTrain,]; cdTrain}?
– iod, Commented Mar 24, 2018 at 4:55
@doviod great point... I'm still reasonably new to R and learning bit by bit, so cheers for that tip. I'm running the function with the command cdTrain <- splitData(train), to no avail. I have also tried splitData("train") and splitData(type = "train"). Where am I going wrong?
– Nalhcal, Commented Mar 24, 2018 at 5:01

iod · Accepted Answer · 2018-03-24 05:13:28Z

The function missing() checks whether an argument was passed to the function it is called within. Passing it something like train=="y" is meaningless, because train=="y" is an expression, not an argument of the function splitData. If you're trying to make sure that the various variables were passed before you do something, it should be if(!missing(train)).

However, I'm not sure what your function hopes to achieve - it doesn't actually use any of the arguments it receives, other than to check if they exist or not...
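As a minimal sketch of how missing() behaves, here is a small base R function. The name describeArgs is hypothetical and only for illustration; it is not part of the question's code.

```r
# Hypothetical helper: reports whether the `train` argument was supplied.
describeArgs <- function(train) {
  if (missing(train)) {
    "train was not supplied"
  } else {
    paste("train was supplied:", train)
  }
}

describeArgs()     # "train was not supplied"
describeArgs("y")  # "train was supplied: y"
```

Note that missing() takes the bare argument name; it only answers "was this argument given?", not "what value does it have?".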

UPDATE:

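Try this, which calls return() inside each branch. In your version the function's value is that of its last expression, the final if statement, which is NULL unless type is "validate":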

splitData <- function(type) {
  set.seed(1337)
  rowTrain <- createDataPartition(y = cleaned.data$CHURN, p = 0.7, list = FALSE)
  bufferDF <- cleaned.data[-rowTrain,]
  rowTest <- createDataPartition(y = cleaned.data$CHURN, p = 0.50, list = FALSE)
  if (type == "train") {
    cdTrain <- cleaned.data[rowTrain, ]
    return(cdTrain)
  }
  if (type == "test") {
    cdTest <- cleaned.data[rowTest, ]
    return(cdTest)
  }
  if (type == "validate") {
    cdValidate <- bufferDF[-rowTest, ]
    return(cdValidate)
  }
}

Note that "validate" will give you very few rows, because you're applying -rowTest, which was created from the full data set, to the shorter bufferDF, which only includes 30% of the data set. You might want to replace the line defining rowTest with something like:

rowTest <- createDataPartition(y = bufferDF$CHURN, p = 0.50, list = FALSE)

This samples 50% of the held-out rows in bufferDF. Because rowTest then indexes bufferDF, the "test" branch should also use bufferDF[rowTest,] rather than cleaned.data[rowTest,].
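Putting the pieces together, a sketch of the whole function might look like the following. It assumes the caret package is installed, uses a made-up toy data frame in place of the real cleaned.data, and swaps the chain of if statements for base R's match.arg() and switch(). The data argument is also an addition for illustration.

```r
library(caret)  # provides createDataPartition()

# Toy stand-in for cleaned.data (assumption: the real data has a CHURN column).
set.seed(1)
cleaned.data <- data.frame(
  CHURN = factor(rep(c("yes", "no"), each = 100)),
  x = rnorm(200)
)

splitData <- function(type = c("train", "test", "validate"), data = cleaned.data) {
  type <- match.arg(type)  # errors on anything other than the three allowed strings
  set.seed(1337)           # same seed on every call, so the partitions line up
  rowTrain <- createDataPartition(y = data$CHURN, p = 0.7, list = FALSE)
  bufferDF <- data[-rowTrain, ]
  # Partition the held-out rows, so the indices refer to bufferDF
  rowTest <- createDataPartition(y = bufferDF$CHURN, p = 0.5, list = FALSE)
  switch(type,
    train    = data[rowTrain, ],
    test     = bufferDF[rowTest, ],
    validate = bufferDF[-rowTest, ]
  )
}

cdTrain    <- splitData("train")
cdTest     <- splitData("test")
cdValidate <- splitData("validate")

# The three pieces should together account for every row exactly once.
nrow(cdTrain) + nrow(cdTest) + nrow(cdValidate) == nrow(cleaned.data)
```

The type has to be passed as a quoted string: splitData(train) without quotes makes R look for an object named train instead.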
