Unsupervised Learning in Python
Benjamin Wilson
Director of Research at lateral.io

PCA(n_components=2)
samples = array of iris measurements (4 features)
species = list of iris species numbers
from sklearn.decomposition import PCA
pca = PCA(n_components=2)
pca.fit(samples)
PCA(n_components=2)
transformed = pca.transform(samples)
print(transformed.shape)
(150, 2)
import matplotlib.pyplot as plt
xs = transformed[:,0]
ys = transformed[:,1]
plt.scatter(xs, ys, c=species)
plt.show()
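
The slides do not show how samples and species are built. As a minimal sketch, assuming the data comes from scikit-learn's bundled iris dataset (load_iris is an assumption here, not something the slides state), the same reduction from 4 features to 2 might look like this:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

# Assumption: the slides' `samples` and `species` correspond to the
# bundled iris data (150 samples, 4 measurements each).
iris = load_iris()
samples = iris.data      # shape (150, 4)
species = iris.target    # species numbers 0, 1, 2

# Keep only the 2 PCA features with the highest variance.
pca = PCA(n_components=2)
pca.fit(samples)
transformed = pca.transform(samples)

print(samples.shape)
print(transformed.shape)

# Columns of `transformed` are the two PCA features used for plotting.
xs = transformed[:, 0]
ys = transformed[:, 1]
```

The xs and ys arrays can then be passed to plt.scatter(xs, ys, c=species) exactly as in the plotting code above.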


Use scipy.sparse.csr_matrix instead of a NumPy array
csr_matrix remembers only the non-zero entries (saves space!)
PCA doesn't support csr_matrix
Use TruncatedSVD instead
from sklearn.decomposition import TruncatedSVD
model = TruncatedSVD(n_components=3)
model.fit(documents)  # documents is csr_matrix
transformed = model.transform(documents)
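
To make the csr_matrix workflow concrete, here is a small sketch with a made-up word-frequency array. The values, the 5 x 6 shape, and the random_state argument are illustrative assumptions and are not taken from the slides.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.decomposition import TruncatedSVD

# Hypothetical word-frequency array: 5 documents (rows) x 6 words (columns).
# Most entries are zero, which is what makes the array "sparse".
dense = np.array([
    [2, 0, 0, 1, 0, 0],
    [0, 3, 0, 0, 0, 1],
    [0, 0, 1, 0, 2, 0],
    [1, 0, 0, 4, 0, 0],
    [0, 1, 0, 0, 0, 2],
], dtype=float)

# csr_matrix stores only the non-zero entries.
documents = csr_matrix(dense)
print(documents.nnz, "non-zero entries out of", dense.size)

# TruncatedSVD accepts the sparse matrix directly.
# random_state is fixed only to make the sketch repeatable.
model = TruncatedSVD(n_components=3, random_state=0)
model.fit(documents)
transformed = model.transform(documents)

print(transformed.shape)
```

The result is an ordinary dense NumPy array with one row per document and one column per retained component.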