I have two .csv files: test.csv and train.csv. As you might expect, the test file does not have the target column ('y' in this case), while the train file does.
What I want to do is first use the train file to train the model on all of its rows, then use the test file only to get predictions.
I know I can use train_test_split() from sklearn.model_selection to create train and test examples, but it splits a single dataset (one file, in my case) only. I want to train the model using the train file first, then, when training has finished, load the test data from test.csv and make the predictions.
So first I tried the classic way, but with a tiny test size so that the file is effectively "used for training only":
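    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    dataset = pd.read_csv(r'path\train.csv', sep=",")
    # Separate the features from the target column 'y'
    X = dataset.drop(columns='y')
    y = dataset['y']
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.001, random_state=45)
    clf = SVC(kernel='rbf')
    clf.fit(X_train, y_train)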
But then, when it comes to the real test part (where I want to use the data in test.csv, which doesn't have target values), how can I import test.csv so that I can use the test data with the model trained above?
    # get data from test.csv somehow as X_test
    clfPredict = clf.predict(X_test)
If this is not possible using train_test_split(), what's the proper way to accomplish this task?
You need to load the train CSV and split it into the target and the features:
    # df1 is the DataFrame loaded from the train CSV
    y_train = df1['Y column']
    X_train = df1.drop('Y column', axis=1)
And regarding test:
X_test = df2
and the predicted values (your y_test) will be the result of clf.predict(X_test).
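Putting the steps above together, here is a minimal sketch rather than a definitive implementation. The helper name train_and_predict, the file paths, and the default target column name 'y' are assumptions for illustration. It also assumes all feature columns are numeric and that test.csv contains the same feature columns as train.csv.

```python
import pandas as pd
from sklearn.svm import SVC


def train_and_predict(train_path, test_path, target="y"):
    """Fit on the whole training file, then predict for the unlabeled test file.

    Hypothetical helper: the paths and the target column name are assumptions.
    """
    train_df = pd.read_csv(train_path)
    test_df = pd.read_csv(test_path)

    # Separate the target from the features in the training data.
    y_train = train_df[target]
    X_train = train_df.drop(columns=target)

    # Use every training row; train_test_split is not needed here.
    clf = SVC(kernel="rbf")
    clf.fit(X_train, y_train)

    # Put the test columns in the same order as the training features.
    X_test = test_df[X_train.columns]
    return clf.predict(X_test)
```

Calling train_and_predict('train.csv', 'test.csv') would then return one predicted label per row of the test file. Since test.csv has no labels, these predictions cannot be scored locally; if you need an accuracy estimate, that is where train_test_split() on the training data is useful.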
Answered By – gtomer