— A statistical technique for classification based on Bayes' Theorem. Supervised learning. It assumes independence between features.
2. Why do we need it?
— For its simplicity, for multi-class prediction problems, and when training data is limited.
3. When to use it?
— With limited training data, multi-class classification, high-dimensional data, and real-time applications.
4. Where is it commonly applied?
— Spam filtering, text classification, recommender systems.
5. How to apply it?
- If necessary, convert text data into numerical features using techniques like CountVectorizer, which tokenizes sentences and counts word frequency (a short sketch follows this list).
- Create a Naive Bayes classifier using MultinomialNB or GaussianNB, depending on the type of features.
- Fit the training data using fit().
- Make a prediction using predict().
- Evaluate with performance metrics. (A full pipeline sketch appears under "Code Explanation" below.)
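A minimal sketch of the vectorization step, assuming scikit-learn; the example sentences are invented for illustration:

```python
# Turn raw sentences into a matrix of word counts with CountVectorizer.
from sklearn.feature_extraction.text import CountVectorizer

docs = ["free prize inside", "meeting at noon", "claim your free prize"]  # made-up examples
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)         # sparse matrix: one row per sentence

print(vectorizer.get_feature_names_out())  # learned vocabulary
print(X.toarray())                         # word counts per sentence
```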
6. Keywords
Formula:
- Ck is the k-th class
- x = (x1, …, xn) is the feature vector
- P(Ck|x) is the posterior probability of class Ck given x
- P(Ck) is the prior probability of class Ck
- P(xi|Ck) is the likelihood of feature xi given class Ck
- P(x) is the evidence (which can be ignored, since it is constant for all classes)
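For reference, the standard Naive Bayes posterior and decision rule that these symbols describe:

```latex
P(C_k \mid x) = \frac{P(C_k)\,\prod_{i=1}^{n} P(x_i \mid C_k)}{P(x)},
\qquad
\hat{y} = \arg\max_{k}\; P(C_k) \prod_{i=1}^{n} P(x_i \mid C_k)
```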
Numerical Example:
Suppose we want to predict whether it will rain based on two features: temperature (Hot or Cool) and humidity (High or Normal). We have a small training dataset (reconstructed here from the counts used in the steps below):

Temperature | Humidity | Rain
Hot         | High     | Yes
Hot         | High     | Yes
Cool        | High     | Yes
Cool        | Normal   | No
Cool        | Normal   | No

We want to classify a new day with Cool temperature and High humidity.
Step 1: Calculate prior probabilities
P(Rain=Yes) = 3/5 = 0.6
P(Rain=No) = 2/5 = 0.4
Step 2: Calculate conditional probabilities
P(Cool|Rain=Yes) = 1/3
P(Cool|Rain=No) = 2/2 = 1
P(High|Rain=Yes) = 3/3 = 1
P(High|Rain=No) = 0/2 = 0
Step 3: Apply the Naive Bayes formula
P(Rain=Yes|Cool,High) ∝ P(Rain=Yes) * P(Cool|Rain=Yes) * P(High|Rain=Yes)
= 0.6 * (1/3) * 1 = 0.2
P(Rain=No|Cool,High) ∝ P(Rain=No) * P(Cool|Rain=No) * P(High|Rain=No)
= 0.4 * 1 * 0 = 0
Step 4: Normalize the probabilities
Total = 0.2 + 0 = 0.2
P(Rain=Yes|Cool,High) = 0.2 / 0.2 = 1
P(Rain=No|Cool,High) = 0 / 0.2 = 0
Therefore, the model predicts that it will rain with a probability of 100%. (Note that the zero count behind P(High|Rain=No) = 0 forces this certainty; in practice, Laplace smoothing is used to avoid zero probabilities.)
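The arithmetic above is easy to check in Python; the numbers below are copied directly from the steps, nothing is fitted:

```python
# Verify the hand calculation for the query (Cool, High).
p_yes, p_no = 3/5, 2/5                       # priors
p_cool_yes, p_cool_no = 1/3, 2/2             # P(Cool|Yes), P(Cool|No)
p_high_yes, p_high_no = 3/3, 0/2             # P(High|Yes), P(High|No)

score_yes = p_yes * p_cool_yes * p_high_yes  # unnormalized posterior for Yes
score_no = p_no * p_cool_no * p_high_no      # unnormalized posterior for No

total = score_yes + score_no
print(score_yes / total, score_no / total)   # -> 1.0 0.0
```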
Code Explanation:
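A minimal sketch of the five steps from section 5, assuming scikit-learn; the tiny spam dataset and its labels are invented for illustration:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score

# Hypothetical toy data: 1 = spam, 0 = not spam.
train_texts = ["win a free prize now", "claim your free reward",
               "meeting moved to noon", "lunch at the cafe"]
train_labels = [1, 1, 0, 0]

vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_texts)  # step 1: text -> word counts

clf = MultinomialNB()                            # step 2: NB for count features
clf.fit(X_train, train_labels)                   # step 3: fit()

test_texts = ["free prize waiting", "see you at lunch"]
X_test = vectorizer.transform(test_texts)
preds = clf.predict(X_test)                      # step 4: predict()

print(preds)                                     # expected: [1 0]
print(accuracy_score([1, 0], preds))             # step 5: a performance metric
```

MultinomialNB fits count features like those produced by CountVectorizer; GaussianNB would be the choice for continuous features.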
Questions That Arose:
- Bayes' Theorem
- CountVectorizer, vectorizers
- NB variants