
Naive Bayes Classifier: Learning Naive Bayes with Python

In a world where Machine Learning and Artificial Intelligence surround almost everything around us, classification and prediction are among the most important aspects of Machine Learning, and Naive Bayes is a simple but surprisingly powerful algorithm for predictive modeling. So, in this Naive Bayes tutorial, I'll be covering the following topics:

- What is Bayes' Theorem?
- Game Prediction using Bayes' Theorem
- Naive Bayes in the Industry
- Step-by-Step Implementation of Naive Bayes
- Naive Bayes with sklearn

What is Naive Bayes?

Naive Bayes is one of the simplest and most powerful algorithms for classification, based on Bayes' Theorem with an assumption of independence among predictors. A Naive Bayes model is easy to build and particularly useful for very large data sets. There are two parts to this algorithm: Naive and Bayes.

Naive

The Naive Bayes classifier assumes that the presence of a feature in a class is unrelated to any other feature. Even if these features depend on each other or upon the existence of the other features, all of these properties independently contribute to the probability that a particular fruit is an apple or an orange or a banana and that is why it is known as “Naive”.

Let’s move forward with our Naive Bayes Tutorial Blog and understand Bayes Theorem.

What is Bayes Theorem?

In statistics and probability theory, Bayes' theorem describes the probability of an event based on prior knowledge of conditions that might be related to the event. It serves as a way to figure out conditional probability.

Given a hypothesis H and evidence E, Bayes' Theorem states that the relationship between the probability of the hypothesis before getting the evidence, P(H), and the probability of the hypothesis after getting the evidence, P(H|E), is:


P(H|E) = P(E|H) × P(H) / P(E)

This relates the probability of the hypothesis before getting the evidence, P(H), to the probability of the hypothesis after getting the evidence, P(H|E). For this reason, P(H) is called the prior probability, while P(H|E) is called the posterior probability. The factor that relates the two, P(E|H) / P(E), is called the likelihood ratio. Using these terms, Bayes' theorem can be rephrased as:

“The posterior probability equals the prior probability times the likelihood ratio.”

Got a little confused? Don't worry.

Let's continue our Naive Bayes tutorial blog and understand this concept with a simple example.


Bayes’ Theorem Example

Let's suppose we have a deck of cards and we wish to find out the probability that a card we pick at random is a King, given that it is a face card. According to Bayes' Theorem, we can solve this problem. First, we need to find out the probabilities:

- P(King) is 4/52, as there are 4 Kings in a deck of 52 cards.
- P(Face|King) is equal to 1, as all Kings are face cards.
- P(Face) is equal to 12/52, as there are 3 face cards in each suit of 13 cards and 4 suits in total.
P(King|Face) = P(Face|King) × P(King) / P(Face) = 1 × (4/52) / (12/52) = 1/3

Now, putting all the values into Bayes' equation, we get the result 1/3.
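This calculation is easy to check in Python; here is a minimal sketch that uses exact fractions to avoid rounding error:

```python
from fractions import Fraction

# Prior: 4 Kings in a 52-card deck
p_king = Fraction(4, 52)
# Every King is a face card
p_face_given_king = Fraction(1)
# 12 face cards: Jack, Queen and King in each of the 4 suits
p_face = Fraction(12, 52)

# Bayes' Theorem: P(King|Face) = P(Face|King) * P(King) / P(Face)
p_king_given_face = p_face_given_king * p_king / p_face
print(p_king_given_face)  # 1/3
```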

Game Prediction using Bayes’ Theorem

Let's continue our Naive Bayes tutorial blog and predict whether a game will be played, using the weather data we have.

So here we have our data, which comprises the Day, the Outlook, Humidity and Wind conditions, and the final column, Play, which we have to predict.


[Table: the 14-day weather dataset with columns Day, Outlook, Humidity, Wind and Play]
First, we will create a frequency table using each attribute of the dataset.
[Table: frequency tables for each attribute of the dataset]
For each frequency table, we will generate a likelihood table.
[Table: likelihood tables derived from the frequency tables]
The likelihood of 'Yes' given 'Sunny' is:

P(c|x) = P(Yes|Sunny) = P(Sunny|Yes) × P(Yes) / P(Sunny) = (0.33 × 0.64) / 0.36 ≈ 0.59

Similarly, the likelihood of 'No' given 'Sunny' is:

P(c|x) = P(No|Sunny) = P(Sunny|No) × P(No) / P(Sunny) = (0.40 × 0.36) / 0.36 = 0.40
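The same computation can be sketched in Python, using the two-decimal values read off the likelihood tables (0.33 ≈ 3/9, 0.64 ≈ 9/14, 0.36 ≈ 5/14):

```python
# Values read off the likelihood tables above (rounded to two decimals)
p_sunny_given_yes, p_yes = 0.33, 0.64   # P(Sunny|Yes), P(Yes)
p_sunny_given_no, p_no = 0.40, 0.36     # P(Sunny|No), P(No)
p_sunny = 0.36                          # P(Sunny)

# Bayes' Theorem applied to each class
p_yes_given_sunny = p_sunny_given_yes * p_yes / p_sunny
p_no_given_sunny = p_sunny_given_no * p_no / p_sunny
print(round(p_yes_given_sunny, 2))  # 0.59
print(round(p_no_given_sunny, 2))   # 0.4
```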

Now, in the same way, we need to create the likelihood tables for the other attributes as well.
[Table: likelihood tables for the remaining attributes]

Suppose we have a day with the following values:

- Outlook = Rain
- Humidity = High
- Wind = Weak
- Play = ?

So, with this data, we have to predict whether we can play on that day or not.

Likelihood of 'Yes' on that day = P(Outlook=Rain|Yes) × P(Humidity=High|Yes) × P(Wind=Weak|Yes) × P(Yes)

= 2/9 × 3/9 × 6/9 × 9/14 ≈ 0.0317

Likelihood of 'No' on that day = P(Outlook=Rain|No) × P(Humidity=High|No) × P(Wind=Weak|No) × P(No)

= 2/5 × 4/5 × 2/5 × 5/14 ≈ 0.0457

Now we normalize the values:

P(Yes) = 0.0317 / (0.0317 + 0.0457) ≈ 0.41

P(No) = 0.0457 / (0.0317 + 0.0457) ≈ 0.59
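The whole prediction can be reproduced in Python with exact fractions, which avoids rounding error in the intermediate products:

```python
from fractions import Fraction as F

# Class likelihoods for Outlook=Rain, Humidity=High, Wind=Weak
likelihood_yes = F(2, 9) * F(3, 9) * F(6, 9) * F(9, 14)   # ~0.0317
likelihood_no = F(2, 5) * F(4, 5) * F(2, 5) * F(5, 14)    # ~0.0457

# Normalize so the two posteriors sum to 1
total = likelihood_yes + likelihood_no
p_yes = likelihood_yes / total
p_no = likelihood_no / total
print(round(float(p_yes), 2), round(float(p_no), 2))  # 0.41 0.59
```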

Our model predicts a roughly 59% chance that there will be no game on that day, so the prediction is Play = No.

Naive Bayes in the Industry

Now that you have an idea of what exactly Naive Bayes is and how it works, let's see where it is used in the industry.

News Categorization:

Our first industrial use case is news categorization, or, to broaden the spectrum of this algorithm, text classification. News on the web is growing rapidly, and each news site has its own layout and categorization for grouping news. Companies use a web crawler to extract the useful text from the HTML pages of news articles and construct a Full-Text RSS feed. The content of each article is then tokenized, and to achieve better classification results the less significant words, i.e. stop words, are removed from the document. Finally, a Naive Bayes classifier is applied to classify the news content based on its news code.
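A minimal sketch of such a pipeline with scikit-learn, where CountVectorizer handles tokenization and stop-word removal and MultinomialNB does the classification (the headlines and category labels below are invented purely for illustration):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy training set: headlines with invented category labels
headlines = [
    "Stocks rally as markets close higher",
    "Central bank holds interest rates steady",
    "Team wins championship in overtime thriller",
    "Star striker signs record transfer deal",
]
categories = ["business", "business", "sports", "sports"]

# Bag-of-words features (English stop words removed) feeding a Naive Bayes classifier
model = make_pipeline(CountVectorizer(stop_words="english"), MultinomialNB())
model.fit(headlines, categories)

print(model.predict(["Interest rates rise as markets rally"]))  # ['business']
```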

Spam Filtering:
