Wednesday, July 3, 2019

ML 5 - NAÏVE BAYESIAN CLASSIFIER

5. WRITE A PROGRAM TO IMPLEMENT THE NAÏVE BAYESIAN CLASSIFIER FOR A SAMPLE TRAINING DATA SET STORED AS A .CSV FILE. COMPUTE THE ACCURACY OF THE CLASSIFIER, CONSIDERING A FEW TEST DATASETS.

 SOLUTION 

tennisdata.csv


Outlook,Temperature,Humidity,Windy,PlayTennis
Sunny,Hot,High,FALSE,No
Sunny,Hot,High,TRUE,No
Overcast,Hot,High,FALSE,Yes
Rainy,Mild,High,FALSE,Yes
Rainy,Cool,Normal,FALSE,Yes
Rainy,Cool,Normal,TRUE,No
Overcast,Cool,Normal,TRUE,Yes
Sunny,Mild,High,FALSE,No
Sunny,Cool,Normal,FALSE,Yes
Rainy,Mild,Normal,FALSE,Yes
Sunny,Mild,Normal,TRUE,Yes
Overcast,Mild,High,TRUE,Yes
Overcast,Hot,Normal,FALSE,Yes
Rainy,Mild,High,TRUE,No
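Before handing the data to a library, it can help to see what a naive Bayes classifier computes from these 14 rows. The block below is only a rough sketch, not part of lab5.py: it counts the class prior and the per-feature conditional frequencies for one hypothetical query day (Sunny, Cool, High, TRUE) and multiplies them. The names CSV_TEXT, load_rows and score_classes are made up for illustration, and no smoothing is applied.

```python
import csv
import io
from collections import Counter

# Same rows as tennisdata.csv, inlined so the sketch runs without the file.
CSV_TEXT = """Outlook,Temperature,Humidity,Windy,PlayTennis
Sunny,Hot,High,FALSE,No
Sunny,Hot,High,TRUE,No
Overcast,Hot,High,FALSE,Yes
Rainy,Mild,High,FALSE,Yes
Rainy,Cool,Normal,FALSE,Yes
Rainy,Cool,Normal,TRUE,No
Overcast,Cool,Normal,TRUE,Yes
Sunny,Mild,High,FALSE,No
Sunny,Cool,Normal,FALSE,Yes
Rainy,Mild,Normal,FALSE,Yes
Sunny,Mild,Normal,TRUE,Yes
Overcast,Mild,High,TRUE,Yes
Overcast,Hot,Normal,FALSE,Yes
Rainy,Mild,High,TRUE,No
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def score_classes(rows, query, target="PlayTennis"):
    """Return the unnormalised score P(class) * product of P(value | class)."""
    class_counts = Counter(row[target] for row in rows)
    scores = {}
    for label, n_label in class_counts.items():
        score = n_label / len(rows)  # prior P(class)
        for feature, value in query.items():
            matches = sum(
                1 for row in rows
                if row[target] == label and row[feature] == value
            )
            score *= matches / n_label  # P(feature = value | class), unsmoothed
        scores[label] = score
    return scores


rows = load_rows(CSV_TEXT)
# Hypothetical query day, chosen only for illustration.
query = {"Outlook": "Sunny", "Temperature": "Cool",
         "Humidity": "High", "Windy": "TRUE"}
scores = score_classes(rows, query)
prediction = max(scores, key=scores.get)
print("Scores:", scores)
print("Prediction:", prediction)
```

For this query the score for No (about 0.0206) is larger than the score for Yes (about 0.0053), so the sketch predicts No. Because nothing is smoothed, any feature value that never occurs with a class drives that class's score to zero.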

lab5.py

import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.preprocessing import LabelEncoder

# Load the training data
data = pd.read_csv('tennisdata.csv')
print("The first 5 rows of the data are:\n", data.head())

# Features are every column except the last; copy so that the encoding
# below does not write into a view of data (avoids SettingWithCopyWarning)
X = data.iloc[:, :-1].copy()
print("\nThe first 5 rows of the train data are:\n", X.head())
# The target is the last column (PlayTennis)
y = data.iloc[:, -1]
print("\nThe first 5 values of the train output are:\n", y.head())

# Convert each categorical feature to integer codes
le_outlook = LabelEncoder()
X.Outlook = le_outlook.fit_transform(X.Outlook)
le_Temperature = LabelEncoder()
X.Temperature = le_Temperature.fit_transform(X.Temperature)
le_Humidity = LabelEncoder()
X.Humidity = le_Humidity.fit_transform(X.Humidity)
le_Windy = LabelEncoder()
X.Windy = le_Windy.fit_transform(X.Windy)

print("\nNow the train data is:\n", X.head())

# Encode the target: No -> 0, Yes -> 1
le_PlayTennis = LabelEncoder()
y = le_PlayTennis.fit_transform(y)
print("\nNow the train output is:\n", y)

# Hold out 20% of the rows for testing; the split is random on every run
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20)

# GaussianNB treats the integer codes as continuous values
classifier = GaussianNB()
classifier.fit(X_train, y_train)

# accuracy_score expects (y_true, y_pred)
print("Accuracy is:", accuracy_score(y_test, classifier.predict(X_test)))

STEPS & OUTPUT:

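Run the program 3-4 times and you'll get different accuracies. train_test_split shuffles the rows randomly on each run, so the output varies every time. With only 14 rows and test_size=0.20, the test set holds just 3 samples, so the accuracy can only be 0, 0.33, 0.67 or 1.0. Pass random_state to train_test_split if you want a repeatable split.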

To view the steps and output, click HERE
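To see how much a single accuracy number depends on which rows land in the test set, here is a rough pure-Python sketch that repeats the 11/3 split with fixed seeds and averages the result. It is not the lab5.py method: it swaps GaussianNB for a small hand-written categorical naive Bayes with Laplace smoothing, and the names ROWS, N_VALUES, predict and split_accuracy are assumptions made for this illustration.

```python
import math
import random
from collections import Counter

# Rows of tennisdata.csv: (Outlook, Temperature, Humidity, Windy, PlayTennis).
ROWS = [
    ("Sunny", "Hot", "High", "FALSE", "No"),
    ("Sunny", "Hot", "High", "TRUE", "No"),
    ("Overcast", "Hot", "High", "FALSE", "Yes"),
    ("Rainy", "Mild", "High", "FALSE", "Yes"),
    ("Rainy", "Cool", "Normal", "FALSE", "Yes"),
    ("Rainy", "Cool", "Normal", "TRUE", "No"),
    ("Overcast", "Cool", "Normal", "TRUE", "Yes"),
    ("Sunny", "Mild", "High", "FALSE", "No"),
    ("Sunny", "Cool", "Normal", "FALSE", "Yes"),
    ("Rainy", "Mild", "Normal", "FALSE", "Yes"),
    ("Sunny", "Mild", "Normal", "TRUE", "Yes"),
    ("Overcast", "Mild", "High", "TRUE", "Yes"),
    ("Overcast", "Hot", "Normal", "FALSE", "Yes"),
    ("Rainy", "Mild", "High", "TRUE", "No"),
]

# Number of distinct values per feature column, taken from the full table
# (assumes the set of categories is known in advance).
N_VALUES = [len({row[i] for row in ROWS}) for i in range(4)]


def predict(train, features, alpha=1.0):
    """Categorical naive Bayes with Laplace smoothing, scored in log space."""
    class_counts = Counter(row[-1] for row in train)
    best_label, best_score = None, -math.inf
    for label, n_label in sorted(class_counts.items()):
        score = math.log(n_label / len(train))  # log prior
        for i, value in enumerate(features):
            matches = sum(
                1 for row in train if row[-1] == label and row[i] == value
            )
            # Smoothed log likelihood of this feature value given the class
            score += math.log((matches + alpha) / (n_label + alpha * N_VALUES[i]))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def split_accuracy(seed, n_test=3):
    """Shuffle with a fixed seed, hold out n_test rows, return test accuracy."""
    rows = ROWS[:]
    random.Random(seed).shuffle(rows)
    test, train = rows[:n_test], rows[n_test:]
    hits = sum(predict(train, row[:-1]) == row[-1] for row in test)
    return hits / n_test


accuracies = [split_accuracy(seed) for seed in range(20)]
mean_accuracy = sum(accuracies) / len(accuracies)
print("Per-split accuracies:", [round(a, 2) for a in accuracies])
print("Mean accuracy over 20 seeded splits:", round(mean_accuracy, 3))
```

Each seed always produces the same split, so the numbers are repeatable, and the mean over many splits is a steadier estimate than any single 3-sample test set.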
