The naive Bayes method is a classification method based on Bayes' theorem and the assumption of conditional independence among features. In short, a naive Bayes classifier assumes that each feature of a sample is independent of the other features. For instance, if a fruit is red, round, and about 4 inches in diameter, it can be judged to be an apple. Although these features may depend on each other, or some may be determined by others, the naive Bayes classifier treats them as contributing independently to the probability that the fruit is an apple. Despite these simple ideas and simplistic assumptions, the naive Bayes classifier can still achieve good results in many complex real-world situations. One advantage of the naive Bayes classifier is that it only needs to estimate the necessary parameters from a small amount of training data (for discrete variables, the prior probability and the class conditional probabilities; for continuous variables, the mean and variance of each variable).

1. Bayesian classification model

The Bayesian classification model is as follows:

P(Y|X) = P(X|Y) P(Y) / P(X)

where X represents an attribute set, Y represents the class variable, P(Y) is the prior probability, P(X|Y) is the class conditional probability, P(X) is the evidence, and P(Y|X) is the posterior probability. The Bayesian classification model expresses the posterior probability in terms of the prior probability P(Y), the class conditional probability P(X|Y), and the evidence P(X). When comparing the posterior probabilities of different values of Y, the evidence P(X) in the denominator is always the same, so it can be ignored. The prior probability P(Y) can be estimated easily by computing the fraction of training records that belong to each class. For the class conditional probability P(X|Y), different estimation strategies lead to different Bayesian classification methods; the common ones are naive Bayesian classification and Bayesian belief networks.
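The relationship between prior, likelihood, evidence, and posterior can be illustrated with a minimal numeric sketch (the probability values here are hypothetical, chosen only for illustration):

```python
# Bayes' theorem: P(Y|X) = P(X|Y) * P(Y) / P(X)
prior = 0.3        # P(Y), prior probability of the class
likelihood = 0.5   # P(X|Y), class conditional probability
evidence = 0.6     # P(X), probability of the attribute set

posterior = likelihood * prior / evidence
print(posterior)   # 0.25
```

Note that if we only want to rank classes for the same X, dividing by `evidence` changes every class's score by the same factor, which is why it can be dropped.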

2. Naive Bayesian classification model

A naive Bayesian classifier assumes that the attributes X = {X1, X2, ..., Xd} are conditionally independent given the class label, so the class conditional probability factorizes as P(X|Y) = P(X1|Y) × P(X2|Y) × ... × P(Xd|Y). Each factor P(Xi|Y) can then be estimated separately from the training data.

3. Example

The data set is as follows:

The prior probabilities, the class conditional probabilities of each discrete attribute, and the parameters of the class conditional distribution of the continuous attribute (sample mean and variance), all computed from the data set, are as follows:

Prior probabilities: P(Yes) = 0.3; P(No) = 0.7

P(Home owner = yes | No) = 3/7

P(Home owner = no | No) = 4/7

P(Home owner = yes | Yes) = 0

P(Home owner = no | Yes) = 1

P(Marital status = single | No) = 2/7

P(Marital status = divorced | No) = 1/7

P(Marital status = married | No) = 4/7

P(Marital status = single | Yes) = 2/3

P(Marital status = divorced | Yes) = 1/3

P(Marital status = married | Yes) = 0

Annual income:

If class = No: sample mean = 110, sample variance = 2975

If class = Yes: sample mean = 90, sample variance = 25

Record to be predicted: X = {Home owner = no, Marital status = married, Annual income = 120K}

P(No) × P(Home owner = no | No) × P(Marital status = married | No) × P(Annual income = 120K | No) = 0.7 × 4/7 × 4/7 × 0.0072 ≈ 0.0016

P(Yes) × P(Home owner = no | Yes) × P(Marital status = married | Yes) × P(Annual income = 120K | Yes) = 0.3 × 1 × 0 × 1.2×10⁻⁹ = 0

Because 0.0016 is greater than 0, the record is classified as No.
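The two class scores can be reproduced with a short Python sketch, assuming a Gaussian density for the continuous annual-income attribute as described above:

```python
import math

def gaussian_pdf(x, mean, var):
    """Gaussian class conditional density for a continuous attribute."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# Score for class = No: prior * P(home owner = no | No)
# * P(marital status = married | No) * density of income 120 given No.
p_no = 0.7 * (4 / 7) * (4 / 7) * gaussian_pdf(120, 110, 2975)

# Score for class = Yes: the zero factor P(married | Yes) = 0 kills the product.
p_yes = 0.3 * 1.0 * 0.0 * gaussian_pdf(120, 90, 25)

print(round(p_no, 4))  # 0.0016
print(p_yes)           # 0.0
```

Because the score for No exceeds the score for Yes, the record is assigned to class No; the evidence P(X) is omitted since it is the same for both classes.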

As can be seen from the above example, if the class conditional probability of even one attribute equals 0, the posterior probability of the whole class equals 0. Estimating class conditional probabilities purely from record fractions is therefore too fragile, especially when there are few training samples and many attributes. The solution to this problem is to use the m-estimate of the conditional probability:

P(xi | yj) = (nc + m·p) / (n + m)

where n is the number of training records belonging to class yj, nc is the number of those records that take the attribute value xi, m is a parameter called the equivalent sample size, and p is a user-specified prior estimate of the probability.
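A small sketch of the m-estimate applied to the zero-probability factor from the example; the choices m = 3 and p = 1/3 are hypothetical, picked only to show the smoothing effect:

```python
def m_estimate(nc, n, m, p):
    """m-estimate of a conditional probability: (nc + m*p) / (n + m).
    nc: records of the class with the attribute value, n: records of the
    class, m: equivalent sample size, p: prior estimate of the probability."""
    return (nc + m * p) / (n + m)

# P(marital status = married | Yes): nc = 0 out of n = 3 records.
# With m = 3 and p = 1/3 the estimate is no longer zero.
print(m_estimate(0, 3, 3, 1 / 3))  # 1/6 ≈ 0.167
```

With this smoothing, a single unseen attribute value no longer forces the whole class score to zero.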
