5. WRITE A PROGRAM TO IMPLEMENT THE NAÏVE BAYESIAN CLASSIFIER FOR A SAMPLE TRAINING DATA SET STORED AS A.CSV FILE. COMPUTE THE ACCURACY OF THE CLASSIFIER, CONSIDERING FEW TEST DATASETS.
SOLUTION
tennisdata.csv
lab5.py
import pandas as pd
from sklearn import tree
from sklearn.preprocessing import LabelEncoder
from sklearn.naive_bayes import GaussianNB
data = pd.read_csv('tennisdata.csv')
print("The first 5 values of data is :\n",data.head())
X = data.iloc[:,:-1]
print("\nThe First 5 values of train data is\n",X.head())
y = data.iloc[:,-1]
print("\nThe first 5 values of Train output is\n",y.head())
le_outlook = LabelEncoder()
X.Outlook = le_outlook.fit_transform(X.Outlook)
le_Temperature = LabelEncoder()
X.Temperature = le_Temperature.fit_transform(X.Temperature)
le_Humidity = LabelEncoder()
X.Humidity = le_Humidity.fit_transform(X.Humidity)
le_Windy = LabelEncoder()
X.Windy = le_Windy.fit_transform(X.Windy)
print("\nNow the Train data is :\n",X.head())
le_PlayTennis = LabelEncoder()
y = le_PlayTennis.fit_transform(y)
print("\nNow the Train output is\n",y)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X,y, test_size=0.20)
classifier = GaussianNB()
classifier.fit(X_train,y_train)
from sklearn.metrics import accuracy_score
print("Accuracy is:",accuracy_score(classifier.predict(X_test),y_test))
STEPS & OUTPUT:
Run 3-4 Times u'll get different accuracies - "The output varies every time"
to view steps & output click HERE
SOLUTION
tennisdata.csv
Outlook
|
Temperature
|
Humidity
|
Windy
|
PlayTennis
|
Sunny
|
Hot
|
High
|
FALSE
|
No
|
Sunny
|
Hot
|
High
|
TRUE
|
No
|
Overcast
|
Hot
|
High
|
FALSE
|
Yes
|
Rainy
|
Mild
|
High
|
FALSE
|
Yes
|
Rainy
|
Cool
|
Normal
|
FALSE
|
Yes
|
Rainy
|
Cool
|
Normal
|
TRUE
|
No
|
Overcast
|
Cool
|
Normal
|
TRUE
|
Yes
|
Sunny
|
Mild
|
High
|
FALSE
|
No
|
Sunny
|
Cool
|
Normal
|
FALSE
|
Yes
|
Rainy
|
Mild
|
Normal
|
FALSE
|
Yes
|
Sunny
|
Mild
|
Normal
|
TRUE
|
Yes
|
Overcast
|
Mild
|
High
|
TRUE
|
Yes
|
Overcast
|
Hot
|
Normal
|
FALSE
|
Yes
|
Rainy
|
Mild
|
High
|
TRUE
|
No
|
lab5.py
import pandas as pd
from sklearn import tree
from sklearn.preprocessing import LabelEncoder
from sklearn.naive_bayes import GaussianNB
data = pd.read_csv('tennisdata.csv')
print("The first 5 values of data is :\n",data.head())
X = data.iloc[:,:-1]
print("\nThe First 5 values of train data is\n",X.head())
y = data.iloc[:,-1]
print("\nThe first 5 values of Train output is\n",y.head())
le_outlook = LabelEncoder()
X.Outlook = le_outlook.fit_transform(X.Outlook)
le_Temperature = LabelEncoder()
X.Temperature = le_Temperature.fit_transform(X.Temperature)
le_Humidity = LabelEncoder()
X.Humidity = le_Humidity.fit_transform(X.Humidity)
le_Windy = LabelEncoder()
X.Windy = le_Windy.fit_transform(X.Windy)
print("\nNow the Train data is :\n",X.head())
le_PlayTennis = LabelEncoder()
y = le_PlayTennis.fit_transform(y)
print("\nNow the Train output is\n",y)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X,y, test_size=0.20)
classifier = GaussianNB()
classifier.fit(X_train,y_train)
from sklearn.metrics import accuracy_score
print("Accuracy is:",accuracy_score(classifier.predict(X_test),y_test))
STEPS & OUTPUT:
Run 3-4 Times u'll get different accuracies - "The output varies every time"
to view steps & output click HERE
No comments:
Post a Comment