plotting Iris Classification

Issue

The code below classifies three groups of Iris through the Decision Tree classifier.

import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split, cross_val_score, KFold
from sklearn.tree import DecisionTreeClassifier
iris = datasets.load_iris()
dataset = pd.DataFrame(iris['data'], columns=iris['feature_names'])
dataset['target'] = iris['target']
X=dataset[[dataset.columns[1], dataset.columns[2]]]
y=dataset['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=42)
model = DecisionTreeClassifier(max_depth=3)
model.fit(X_train, y_train)

And For plotting this classification we can use these lines of code:

import numpy as np
from matplotlib.colors import ListedColormap
X_set, y_set = X_test.values, y_test.values
X1, X2 = np.meshgrid(np.arange(start = X_set[:, 0].min() - 1, stop = X_set[:, 0].max() + 1, step = 0.01),
                     np.arange(start = X_set[:, 1].min() - 1, stop = X_set[:, 1].max() + 1, step = 0.01))
plt.contourf(X1, X2, model.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('red', 'green','blue')))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1],
                c = ListedColormap(('red', 'green','blue'))(i), label = j)
plt.title('Classifier (Test set)')
plt.xlabel('sepal width (cm)')
plt.ylabel('petal length (cm)')
plt.legend()
plt.show()

the result would be like below:
Visualising the Test set results

But when I wanted to use more than two features for training,

X=dataset[[dataset.columns[1], dataset.columns[2], dataset.columns[3]]]
y=dataset['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=42)

I couldn’t visualize the results like the picture above! Could someone please explain to me how I can visualize the results?
Thank you

Solution

Since you’ve 3 data and its corresponding label, you can only show it in a 3D plot.
I’ve tried to do that in the following code:

%matplotlib notebook
from sklearn.linear_model import Ridge
X_set, y_set = X_test.values, y_test.values
X1, X2 = np.meshgrid(np.arange(start = X_set[:, 0].min() - 1, stop = X_set[:, 0].max() + 1, step = 0.01),
                     np.arange(start = X_set[:, 1].min() - 1, stop =X_set[:, 1].max() + 1, step = 0.01))
model = Ridge()
model.fit(np.array([X_set[:, 0],X_set[:, 1]]).T,X_set[:,2])
X3=model.predict(np.array([X1.flatten(),X2.flatten()]).T)
 
fig = plt.figure(figsize=(10,10))

ax = fig.add_subplot(111, projection='3d')
Dict={0:'red',1:'blue',2:'purple'}
ax.plot_surface(X1, X2, X3.reshape(X1.shape), cmap="YlGn", linewidth=0, antialiased=False, alpha=0.5)
for Id in range(X_set.shape[0]):
    ax.scatter3D(*X_set[Id,:],color=Dict[y_set[Id]],linewidths=10)
ax.set_xlabel("Data_1")
ax.set_ylabel('Data_2')
ax.set_zlabel("Data_3")

plt.show()

Also since ax.plot_surface wants given shapes as X1.shape=X2.shape=X3.shape, I have predicted X3 values with a linear model(If you use a tree model it gives a different shape).

enter image description here

One can ask why we haven’t used a meshgrid for the 3 data features and create a 3d plot with it. The reason for that is matplotlib plot_surface or 3dcountrp. just accepts 2d params and meshgrid with 3 features returns 3d data for each.

Hope that questions your answer.

Answered By – Hakan Akgün

Answer Checked By – Mary Flores (AngularFixing Volunteer)

Leave a Reply

Your email address will not be published.