My Brain Cells

Easiest (and best) learning materials for anyone with a curiosity for machine learning and artificial intelligence, Deep learning, Programming, and other fun life hacks.

PCA (Principal Component Analysis) using sklearn

 

For a lot of machine learning applications, it helps to be able to visualize your data. Visualizing 2- or 3-dimensional data is not that challenging, but the data used here has four features (sepal length, sepal width, petal length, petal width), so it cannot be plotted directly. You can use PCA to reduce that 4-dimensional data to 2 or 3 dimensions so that you can plot it and hopefully understand it better.

Steps for the PCA:


1. Load the data
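
The loading code is not shown in this post, so here is a minimal sketch of one way to do it. It assumes the data is the Iris dataset (the sepal and petal feature names used in the next step match it) and reads scikit-learn's bundled copy with `load_iris`. The column names are chosen to match the ones used below, and storing the species names in the `target` column is an assumption.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Assumption: the post works on the Iris dataset bundled with scikit-learn
iris = load_iris()

# Column names chosen to match the ones used in the standardization step
features = ['sepal length', 'sepal width', 'petal length', 'petal width']
df = pd.DataFrame(iris.data, columns=features)

# Store the species name of each sample in a 'target' column
df['target'] = iris.target_names[iris.target]

print(df.head())
```

If your data lives in a CSV file instead, `pd.read_csv` with the same column names would give an equivalent `df`.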
 
2. Standardize the data
 
from sklearn.preprocessing import StandardScaler

features = ['sepal length', 'sepal width', 'petal length', 'petal width']
# Separating out the features
x = df.loc[:, features].values
# Separating out the target
y = df.loc[:,['target']].values
# Standardizing the features
x = StandardScaler().fit_transform(x)
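
To see what standardizing actually does, here is a small sketch on a made-up array (the numbers are purely illustrative, not from the dataset). `StandardScaler` subtracts each column's mean and divides by its standard deviation, so every feature ends up with mean 0 and unit variance. That matters for PCA because otherwise features measured on larger scales would dominate the components.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy data: two features on very different scales (illustrative values only)
x_toy = np.array([[1.0, 10.0],
                  [2.0, 20.0],
                  [3.0, 60.0]])

x_scaled = StandardScaler().fit_transform(x_toy)

# The same result computed by hand: (value - column mean) / column std
x_manual = (x_toy - x_toy.mean(axis=0)) / x_toy.std(axis=0)

print(np.allclose(x_scaled, x_manual))
print(x_scaled.mean(axis=0), x_scaled.std(axis=0))
```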

3. Import PCA and build model:

import pandas as pd
from sklearn.decomposition import PCA

# Project the standardized features onto the first two principal components
pca = PCA(n_components=2)
principalComponents = pca.fit_transform(x)
principalDf = pd.DataFrame(data=principalComponents,
                           columns=['principal component 1', 'principal component 2'])
principalDf.head()
# Concatenate the target labels to the dataframe
pca_df = pd.concat([principalDf, pd.DataFrame({'target': y.ravel()})], axis=1)
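
Since the whole point of reducing to two dimensions is to plot the data, here is a rough end-to-end sketch that goes from loading to a scatter plot. It assumes the Iris dataset from scikit-learn's `load_iris` and uses matplotlib for the figure; the output file name `iris_pca.png` and the variable names are hypothetical choices for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: Iris dataset bundled with scikit-learn
iris = load_iris()

# Standardize the four features, then project them onto two components
x = StandardScaler().fit_transform(iris.data)
principal_components = PCA(n_components=2).fit_transform(x)

pca_df = pd.DataFrame(
    principal_components,
    columns=['principal component 1', 'principal component 2'],
)
pca_df['target'] = iris.target_names[iris.target]

# One scatter series per species so each gets its own color and legend entry
fig, ax = plt.subplots(figsize=(6, 6))
for name, group in pca_df.groupby('target'):
    ax.scatter(group['principal component 1'],
               group['principal component 2'],
               label=name)
ax.set_xlabel('principal component 1')
ax.set_ylabel('principal component 2')
ax.set_title('2-component PCA')
ax.legend()
fig.savefig('iris_pca.png')  # hypothetical file name
```

In a notebook you would typically call `plt.show()` instead of saving the figure to a file.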

Anthony
