I was not able to find an answer to this anywhere. I have three months of data and would like to split it so that the first two months (‘Jan-19’, ‘Feb-19’) form the training set and the last month (‘Mar-19’) forms the test set.
Previously I have done random sampling with simple code like this:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=109)
Before that, I assigned y as the label and X as the columns used for prediction. I’m not sure how to assign the training and test sets to the months I want.
If your data is in a pandas DataFrame and X still contains the month column (here assumed to be named 'month'), you can use boolean subsetting like this:
X_train = X[X['month'] != 'Mar-19']
y_train = y[X['month'] != 'Mar-19']
X_test = X[X['month'] == 'Mar-19']
y_test = y[X['month'] == 'Mar-19']
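For a self-contained illustration of the same idea, here is a minimal sketch on made-up data. The toy values and the column names 'feature' and 'label' are assumptions for the example, not part of the question; only 'month' and the 'Mar-19' value come from it. It computes the mask once so X and y are filtered identically, then drops the month column on the assumption that you don't want the raw month string passed to the model.

```python
import pandas as pd

# Hypothetical toy data; 'feature' and 'label' are placeholder column names.
df = pd.DataFrame({
    "month": ["Jan-19", "Jan-19", "Feb-19", "Feb-19", "Mar-19", "Mar-19"],
    "feature": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "label": [0, 1, 0, 1, 0, 1],
})

X = df[["month", "feature"]]
y = df["label"]

# Build the boolean mask once and reuse it for both X and y.
is_test = X["month"] == "Mar-19"

X_train, y_train = X[~is_test], y[~is_test]
X_test, y_test = X[is_test], y[is_test]

# Assumption: the month string is only used for splitting, not as a feature.
X_train = X_train.drop(columns="month")
X_test = X_test.drop(columns="month")
```

This relies on X and y sharing the same index, which is the case when both are taken from the same DataFrame as above.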
Answered By – josephjscheidt