My Brain Cells

Easiest (and best) learning materials for anyone with a curiosity for machine learning and artificial intelligence, Deep learning, Programming, and other fun life hacks.

Naive Bayes using sklearn

 

What is the Naive Bayes algorithm?

It is a classification technique based on Bayes’ Theorem with an assumption of independence among predictors. In simple terms, a Naive Bayes classifier assumes that the presence of a particular feature in a class is unrelated to the presence of any other feature.

The Naive Bayes model is easy to build and particularly useful for very large data sets. Along with simplicity, Naive Bayes is known to outperform even highly sophisticated classification methods.

Bayes theorem provides a way of calculating posterior probability P(c|x) from P(c), P(x), and P(x|c). Look at the equation below:

naive bayes, bayes theorem

Step 1: Convert the data set into a frequency table

Step 2: Create a Likelihood table by finding the probabilities like Overcast probability = 0.29 and probability of playing is 0.64.

naive bayes, probability, example

Step 3: Now, use Naive Bayesian equation to calculate the posterior probability for each class. The class with the highest posterior probability is the outcome of prediction.

Naive Bayes using sklearn:

Code:
import pandas as pd
data=pd.read_csv(‘IRIS.csv’)
data
sepal_length sepal_width petal_length petal_width species
0 5.1 3.5 1.4 0.2 Iris-setosa
1 4.9 3.0 1.4 0.2 Iris-setosa
2 4.7 3.2 1.3 0.2 Iris-setosa
3 4.6 3.1 1.5 0.2 Iris-setosa
4 5.0 3.6 1.4 0.2 Iris-setosa
145 6.7 3.0 5.2 2.3 Iris-virginica
146 6.3 2.5 5.0 1.9 Iris-virginica
147 6.5 3.0 5.2 2.0 Iris-virginica
148 6.2 3.4 5.4 2.3 Iris-virginica
149 5.9 3.0 5.1 1.8 Iris-virginica

150 rows × 5 columns

x = data.iloc[:,0:4].values  
y = data.iloc[:,4].values
from sklearn.model_selection import train_test_split  
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.25
random_state = 1)
from sklearn.naive_bayes import GaussianNB  
classifier = GaussianNB()  
classifier.fit(x_train, y_train)  
>>GaussianNB(priors=None, var_smoothing=1e-09)

y_pred = classifier.predict(x_test)  
y_pred
my_test=[[5.13.51.40.2]]
print(classifier.predict(my_test))
>>[‘Iris-setosa’]

Anthony

Leave a Reply

Your email address will not be published. Required fields are marked *

Back to top