# Naive Bayes
> **Note:** Naive Bayes classifiers are a family of simple "probabilistic classifiers" based on applying Bayes' theorem with strong (naive) independence assumptions between the features.
## Discriminative and Generative Algorithms
Discriminative: try to learn $p(y \mid x)$ directly (a direct mapping from inputs $x$ to labels $y$).

Generative: try to learn $p(x \mid y)$ and $p(y)$, i.e., model how each class generates its data.

When predicting, use Bayes' rule:

$$p(y \mid x) = \frac{p(x \mid y)\, p(y)}{p(x)}$$

then, since $p(x)$ does not depend on $y$:

$$\arg\max_y p(y \mid x) = \arg\max_y p(x \mid y)\, p(y)$$
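A tiny numeric sketch (the probabilities below are made up for illustration) of why the denominator $p(x)$ can be ignored at prediction time:

```python
import numpy as np

# Hypothetical two-class example: p(y) is the class prior,
# p(x|y) the class-conditional likelihood of one observed x.
prior = np.array([0.7, 0.3])         # p(y=0), p(y=1)
likelihood = np.array([0.02, 0.10])  # p(x|y=0), p(x|y=1)

# Full posterior via Bayes' rule: p(y|x) = p(x|y) p(y) / p(x)
evidence = np.sum(likelihood * prior)      # p(x)
posterior = likelihood * prior / evidence  # p(y|x)

# The argmax is the same whether or not we divide by p(x),
# which is why generative classifiers can skip computing it.
assert np.argmax(posterior) == np.argmax(likelihood * prior)
print(posterior)  # [0.318... 0.681...] -> predict y = 1
```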
## Algorithm
Suppose $x = (x_1, \ldots, x_n) \in \{0, 1\}^n$ is a binary feature vector (e.g., which vocabulary words occur in an email) and $y \in \{0, 1\}$.

Naive Bayes independence assumption: the features are conditionally independent given the class:

$$p(x_1, \ldots, x_n \mid y) = \prod_{j=1}^{n} p(x_j \mid y)$$

Parameterized by:

$$\phi_{j|y=1} = p(x_j = 1 \mid y = 1), \qquad \phi_{j|y=0} = p(x_j = 1 \mid y = 0), \qquad \phi_y = p(y = 1)$$

Note: each $p(x_j \mid y)$ is a Bernoulli distribution.

We can write down the joint log likelihood of the data:

$$\ell(\phi_y, \phi_{j|y=0}, \phi_{j|y=1}) = \log \prod_{i=1}^{m} p(x^{(i)}, y^{(i)})$$

Maximum likelihood estimate result:

$$\phi_{j|y=1} = \frac{\sum_{i=1}^{m} 1\{x_j^{(i)} = 1 \wedge y^{(i)} = 1\}}{\sum_{i=1}^{m} 1\{y^{(i)} = 1\}}, \qquad \phi_{j|y=0} = \frac{\sum_{i=1}^{m} 1\{x_j^{(i)} = 1 \wedge y^{(i)} = 0\}}{\sum_{i=1}^{m} 1\{y^{(i)} = 0\}}, \qquad \phi_y = \frac{\sum_{i=1}^{m} 1\{y^{(i)} = 1\}}{m}$$
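The estimates above are just per-class frequency counts, so they can be computed with a few vectorized operations. Here is a minimal NumPy sketch of the unsmoothed maximum likelihood fit; the function names are illustrative, not from any library:

```python
import numpy as np

def fit_bernoulli_nb(X, y):
    """MLE for the parameters above; X is an (m, n) binary matrix, y is in {0, 1}."""
    phi_y = np.mean(y == 1)         # p(y = 1)
    phi_1 = X[y == 1].mean(axis=0)  # p(x_j = 1 | y = 1), one entry per feature
    phi_0 = X[y == 0].mean(axis=0)  # p(x_j = 1 | y = 0)
    return phi_0, phi_1, phi_y

def predict(X, phi_0, phi_1, phi_y):
    """argmax_y p(x | y) p(y), computed in log space for numerical stability."""
    log_p1 = np.log(phi_y) + X @ np.log(phi_1) + (1 - X) @ np.log(1 - phi_1)
    log_p0 = np.log(1 - phi_y) + X @ np.log(phi_0) + (1 - X) @ np.log(1 - phi_0)
    return (log_p1 > log_p0).astype(int)
```

Note that the unsmoothed estimates can be exactly 0 or 1, making the log terms blow up for feature values never seen in a class; the Laplace smoothing below fixes this.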
## Laplace Smoothing
To avoid assigning zero probability to feature values never seen in the training data, we add 1 to the numerator and 2 to the denominator:

$$\phi_{j|y=1} = \frac{1 + \sum_{i=1}^{m} 1\{x_j^{(i)} = 1 \wedge y^{(i)} = 1\}}{2 + \sum_{i=1}^{m} 1\{y^{(i)} = 1\}}, \qquad \phi_{j|y=0} = \frac{1 + \sum_{i=1}^{m} 1\{x_j^{(i)} = 1 \wedge y^{(i)} = 0\}}{2 + \sum_{i=1}^{m} 1\{y^{(i)} = 0\}}$$

This model is Bernoulli naive Bayes, implemented in scikit-learn as BernoulliNB. It assumes each feature is binary-valued; if handed any other kind of data, a BernoulliNB instance may binarize its input (depending on the `binarize` parameter).
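A short usage sketch of BernoulliNB on synthetic binary data (the random data here is only to show the API; `alpha=1.0` corresponds to the add-1/add-2 smoothing above):

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

rng = np.random.RandomState(0)
X = rng.randint(2, size=(100, 10))  # 100 samples, 10 binary features
y = rng.randint(2, size=100)

# alpha=1.0 is Laplace (add-one) smoothing; binarize=0.0 thresholds
# any non-binary input at 0 before fitting.
clf = BernoulliNB(alpha=1.0, binarize=0.0)
clf.fit(X, y)
print(clf.predict(X[:5]))
```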
## Multinomial Naive Bayes
MultinomialNB implements the naive Bayes algorithm for multinomially distributed data, and is one of the two classic naive Bayes variants used in text classification.
This distribution is parametrized by vectors $\theta_y = (\theta_{y1}, \ldots, \theta_{yn})$ for each class $y$, where $n$ is the number of features (in text classification, the size of the vocabulary) and $\theta_{yi}$ is the probability $p(x_i \mid y)$ of feature $i$ appearing in a sample belonging to class $y$.

The parameters $\theta_y$ are estimated by a smoothed version of maximum likelihood, i.e. relative frequency counting:

$$\hat{\theta}_{yi} = \frac{N_{yi} + \alpha}{N_y + \alpha n}$$

where $N_{yi} = \sum_{x \in T} x_i$ is the number of times feature $i$ appears in a sample of class $y$ in the training set $T$, and $N_y = \sum_{i=1}^{n} N_{yi}$ is the total count of all features for class $y$. Setting $\alpha = 1$ is called Laplace smoothing, while $\alpha < 1$ is called Lidstone smoothing.
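A minimal text-classification sketch using word counts (the four-document corpus and its spam/ham labels are made up for illustration):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

docs = ["free money now", "meeting at noon", "free offer now", "lunch meeting today"]
labels = [1, 0, 1, 0]  # 1 = spam, 0 = ham

# Word counts are the natural input representation for the multinomial model.
vec = CountVectorizer()
X_counts = vec.fit_transform(docs)

clf = MultinomialNB(alpha=1.0)  # alpha=1.0 -> Laplace smoothing
clf.fit(X_counts, labels)
print(clf.predict(vec.transform(["free offer today"])))
```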
## Gaussian Naive Bayes
When the features are continuous-valued, we can model $p(x_i \mid y)$ as a Gaussian:

$$p(x_i \mid y) = \frac{1}{\sqrt{2\pi\sigma_y^2}} \exp\!\left(-\frac{(x_i - \mu_y)^2}{2\sigma_y^2}\right)$$

Just like the binary case, the parameters $\sigma_y$ and $\mu_y$ are estimated using maximum likelihood. This is Gaussian naive Bayes, implemented in GaussianNB.
```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Load the iris dataset and hold out half of it for testing.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)

# Fit Gaussian naive Bayes and evaluate on the held-out split.
gnb = GaussianNB()
y_pred = gnb.fit(X_train, y_train).predict(X_test)
print("Number of mislabeled points out of a total %d points : %d"
      % (X_test.shape[0], (y_test != y_pred).sum()))
```