How to split data into a train set (and test set) by taking every nth row in R?

Issue

I’ve got a classification problem with a huge DATASET containing 308,500 rows. I want to split the data into a train set and a test set in order to build a model.

But I want the train set to be sampled at regular intervals across the DATASET, for example one row every 1,000 rows, so that I know the train set is built from rows spread over the whole DATASET. Is there a way to do this?

For example I’d like something like this:

train = DATASET[take sample every 1000 rows]

Solution

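You can use seq to create the row indices to subset on. seq(1, nrow(DATASET), 1000) returns 1, 1001, 2001, and so on, and negative indexing with the same vector gives every remaining row for the test set. Note that with 308,500 rows and a step of 1,000, the train set has only 309 rows; use a smaller step if you need a larger train set.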

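# Row indices 1, 1001, 2001, ... up to nrow(DATASET)
train_inds <- seq(1, nrow(DATASET), by = 1000)
# Rows at those indices form the train set
train <- DATASET[train_inds, ]
# Negative indexing drops them, leaving the rest as the test set
test <- DATASET[-train_inds, ]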
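If the goal is a larger train set that still draws from every part of the DATASET, one possible variation (not part of the original answer) is to sample a fixed number of rows at random from each consecutive block of 1,000 rows. The sketch below is only an illustration under assumptions: the toy DATASET, the seed, and the names block_size and n_per_block are all hypothetical and should be adapted to your data.

```r
set.seed(42)  # assumption: fixed seed so the split is reproducible

# Hypothetical toy stand-in for DATASET (10,500 rows)
DATASET <- data.frame(id = 1:10500, x = rnorm(10500))

block_size  <- 1000  # rows per consecutive block
n_per_block <- 700   # assumption: rows drawn from each block

# Label each row with the block it falls in: 1, 1, ..., 2, 2, ...
block <- ceiling(seq_len(nrow(DATASET)) / block_size)

# Within each block, draw row indices at random without replacement.
# A short final block contributes at most its own number of rows.
train_inds <- unlist(
  lapply(split(seq_len(nrow(DATASET)), block), function(idx) {
    idx[sample.int(length(idx), min(n_per_block, length(idx)))]
  }),
  use.names = FALSE
)

train <- DATASET[train_inds, ]
test  <- DATASET[-train_inds, ]
```

Using sample.int on the block length, rather than sample on the index vector itself, avoids the surprising behaviour of sample when it is handed a single number.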

Answered By – Ronak Shah

Answer Checked By – Cary Denson (AngularFixing Admin)
