I don’t often use the scikit-learn library, so I thought I’d do a quick demo just to refresh my memory. The scikit-learn library is a collection of Python code modules for machine learning tasks.
I like Python, but the language has a lot of moving parts. For example, at a minimum you need base Python, plus the NumPy library for arrays and matrices, plus the SciPy library for scientific and numeric routines, and so on. Managing all these components can be a real pain, so I usually use the Anaconda distribution, which bundles all these libraries together.

Anaconda comes with the Spyder IDE for Python, which I don’t really like that much. But it’s usable.
I culled demo code from various sources on the Internet. The idea is to create a classification model for the famous Fisher Iris Dataset. My demo script begins:
from sklearn import datasets
from sklearn import metrics
from sklearn.svm import SVC
# load the iris datasets
dataset = datasets.load_iris()
print(dataset.data)
print(dataset.target)
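Printing the raw arrays produces a lot of output. A more compact way to see what was loaded might look like the following minimal sketch; the variable names `iris`, `n_samples` and `n_features` are my own choices for illustration and are not part of the demo script.

```python
from sklearn import datasets

# Load the same Iris data used in the demo.
iris = datasets.load_iris()

# data is a 2-D NumPy array: one row per flower, one column per measurement.
n_samples, n_features = iris.data.shape
print(n_samples, n_features)

# Human-readable names for the columns and for the class labels 0, 1, 2.
print(iris.feature_names)
print(list(iris.target_names))
```

The `target` array holds the integer class label for each row, and `target_names` maps those integers back to species names.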
Next I create the model and make predictions:
# fit an SVM model to the data
model = SVC()
model.fit(dataset.data, dataset.target)
print(model)
# make predictions
expected = dataset.target
predicted = model.predict(dataset.data)
# summarize the fit of the model
print(metrics.classification_report(expected, predicted))
print(metrics.confusion_matrix(expected, predicted))
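Note that the demo predicts on the same 150 items the model was trained on, so the reported metrics are likely optimistic. A common alternative, which the demo does not use, is to hold out part of the data for testing. Here is a rough sketch, assuming a 70/30 split and a fixed `random_state` that I picked arbitrarily so the split is repeatable.

```python
from sklearn import datasets, metrics
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

dataset = datasets.load_iris()

# Hold out 30% of the items for testing (split size is an assumption).
# stratify keeps the three species equally represented in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    dataset.data, dataset.target,
    test_size=0.3, random_state=0, stratify=dataset.target)

# Fit on the training items only, then predict the held-out items.
model = SVC()
model.fit(X_train, y_train)
predicted = model.predict(X_test)

# Accuracy and confusion matrix on data the model has not seen.
accuracy = metrics.accuracy_score(y_test, predicted)
print(accuracy)
print(metrics.confusion_matrix(y_test, predicted))
```

The accuracy on the held-out items is usually a more honest estimate of how the model would do on new flowers.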
The last part of the demo creates a Principal Component Analysis graph:
print(__doc__)
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.decomposition import PCA
iris = datasets.load_iris()
X = iris.data
y = iris.target
target_names = iris.target_names
pca = PCA(n_components=2)
X_r = pca.fit(X).transform(X)
plt.figure()
colors = ['navy', 'turquoise', 'darkorange']
lw = 2
for color, i, target_name in zip(colors,
                                 [0, 1, 2], target_names):
    plt.scatter(X_r[y == i, 0],
                X_r[y == i, 1], color=color,
                alpha=.8, lw=lw,
                label=target_name)
plt.legend(loc='best', shadow=False,
           scatterpoints=1)
plt.title('PCA of IRIS dataset')
plt.show()
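The graph squeezes four measurements down to two axes, so it is reasonable to ask how much of the original variation survives. One way to check, not part of the demo itself, is the `explained_variance_ratio_` attribute of a fitted PCA object. A minimal sketch, where the name `ratios` is just my own:

```python
from sklearn import datasets
from sklearn.decomposition import PCA

X = datasets.load_iris().data

# Same two-component projection as in the plot; fit_transform is
# shorthand for fit(X).transform(X).
pca = PCA(n_components=2)
X_r = pca.fit_transform(X)

# Fraction of the total variance captured by each component,
# ordered from most to least.
ratios = pca.explained_variance_ratio_
print(ratios)
print(ratios.sum())
```

If the sum is close to 1, the 2-D picture is a fair summary of the 4-D data.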
Compared to my usual programming language and environment, C# and Visual Studio, Python and Spyder are very primitive. But Python has a much better set of ML libraries.