The naive Bayes classifier is designed for use when predictors are independent of one another within each class, but it appears to work well in practice even when that independence assumption is not valid. It classifies data in two steps:
Training step: Using the training data, the method estimates the parameters of a probability distribution, assuming predictors are conditionally independent given the class.
Prediction step: For any unseen test data, the method computes the posterior probability of that sample belonging to each class. The method then classifies the test data according to the largest posterior probability.
The class-conditional independence assumption greatly simplifies the training step, since you can estimate the one-dimensional class-conditional density for each predictor individually. Although class-conditional independence between predictors does not hold in general, research shows that this optimistic assumption works well in practice. It also allows the naive Bayes classifier to estimate the parameters required for accurate classification while using less training data than many other classifiers, which makes it particularly effective for data sets containing many predictors.
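As a minimal sketch of both steps, here is an example using the Fisher iris data set that ships with Statistics and Machine Learning Toolbox; the test observation values are made up for illustration:

    load fisheriris                  % meas: 150-by-4 predictors, species: class labels

    % Training step: estimate per-class, per-predictor distribution parameters.
    Mdl = fitcnb(meas, species);

    % Prediction step: classify a made-up observation by largest posterior.
    [label, posterior] = predict(Mdl, [5.8 2.8 5.1 1.8])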
The training step in naive Bayes classification is based on estimating P(X|Y), the probability or probability density of predictors X given class Y. The naive Bayes classification model ClassificationNaiveBayes and training function fitcnb provide support for normal (Gaussian), kernel, multinomial, and multivariate multinomial predictor conditional distributions. To specify distributions for the predictors, use the DistributionNames name-value pair argument of fitcnb. You can specify one type of distribution for all predictors by supplying the character vector or string scalar corresponding to the distribution name, or specify different distributions for the predictors by supplying a length-D string array or cell array of character vectors, where D is the number of predictors (that is, the number of columns of X).
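For example, the following sketch shows both forms of the DistributionNames argument on the Fisher iris data; the particular mix of 'normal' and 'kernel' in the second call is chosen arbitrarily for illustration:

    load fisheriris

    % One distribution name applied to all predictors:
    Mdl1 = fitcnb(meas, species, 'DistributionNames', 'kernel');

    % One name per predictor (meas has D = 4 columns); the mix is arbitrary:
    Mdl2 = fitcnb(meas, species, 'DistributionNames', ...
        {'normal','kernel','normal','kernel'});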
The normal distribution (specify using 'normal') is appropriate for predictors that have normal distributions in each class. For each predictor you model with a normal distribution, the naive Bayes classifier estimates a separate normal distribution for each class by computing the mean and standard deviation of the training data in that class.
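As a sketch, you can inspect these per-class estimates through the DistributionParameters property of the trained model:

    load fisheriris
    Mdl = fitcnb(meas, species);   % 'normal' is the default for numeric predictors

    % Each cell holds [mean; standard deviation] for one class (row)
    % and one predictor (column).
    Mdl.DistributionParameters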
The kernel distribution (specify using 'kernel') is appropriate for predictors that have a continuous distribution. It does not require a strong assumption such as normality, and you can use it in cases where the distribution of a predictor may be skewed or have multiple peaks or modes. It requires more computing time and more memory than the normal distribution. For each predictor you model with a kernel distribution, the naive Bayes classifier computes a separate kernel density estimate for each class based on the training data for that class. By default the kernel is the normal kernel, and the classifier selects a width automatically for each class and predictor. The software supports specifying different kernels for each predictor, and different widths for each predictor or class.
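For example, a minimal sketch that requests kernel densities for all predictors and overrides the default kernel; the box kernel here is an arbitrary choice for illustration:

    load fisheriris
    Mdl = fitcnb(meas, species, 'DistributionNames', 'kernel', ...
        'Kernel', 'box');          % default kernel is 'normal'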
The multivariate multinomial distribution (specify using 'mvmn') is appropriate for a predictor whose observations are categorical. Naive Bayes classifier construction using a multivariate multinomial predictor is described below. To illustrate the steps, consider an example where observations are labeled 0, 1, or 2, and a single predictor records the weather when the sample was collected. A code sketch follows the steps.
Record the distinct categories represented in the observations of the entire predictor. For example, the distinct categories (or predictor levels) might include sunny, rain, snow, and cloudy.
Separate the observations by response class. For example, segregate observations labeled 0 from observations labeled 1 and 2, and observations labeled 1 from observations labeled 2.
For each response class, fit a multinomial model using the category relative frequencies and total number of observations. For example, for observations labeled 0, the estimated probability that it was sunny is P(sunny | 0) = (number of sunny observations with label 0)/(number of observations with label 0), and similarly for the other categories and response labels.
The class-conditional, multinomial random variables comprise a multivariate multinomial random variable.
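The sketch below fits such a model to made-up weather observations; the data values are hypothetical:

    % Hypothetical categorical predictor and labels for illustration.
    weather = categorical({'sunny';'rain';'snow';'cloudy';'sunny'; ...
        'rain';'cloudy';'snow';'sunny';'rain'});
    label = [0;1;2;0;1;2;0;1;2;0];

    % fitcnb models categorical predictors with the 'mvmn' distribution.
    Mdl = fitcnb(table(weather, label), 'label', ...
        'DistributionNames', 'mvmn');

    % Estimated category probabilities, one cell per class and predictor.
    Mdl.DistributionParameters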
Here are some other properties of naive Bayes classifiers that use multivariate multinomial distributions.
For each predictor you model with a multivariate multinomial distribution, the naive Bayes classifier:
Records a separate set of distinct predictor levels for each predictor.
Computes a separate set of probabilities for the set of predictor levels for each class.
The software supports modeling continuous predictors as multivariate multinomial. In this case, the predictor levels are the distinct occurrences of a measurement. This can lead to a predictor having many levels. It is good practice to discretize such predictors.
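A minimal sketch of that practice, using made-up measurements and arbitrary bin edges:

    % Hypothetical continuous measurements, binned so that 'mvmn' sees a
    % small set of levels instead of many distinct raw values.
    x = randn(100,1);                   % made-up continuous predictor
    edges = [-Inf -1 0 1 Inf];          % arbitrary bin edges
    xLevels = discretize(x, edges);     % levels 1 through 4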
If an observation is a set of successes for various categories (represented by all of the predictors) out of a fixed number of independent trials, then specify that the predictors comprise a multinomial distribution.
The multinomial distribution (specify using 'DistributionNames','mn') is appropriate when, given the class, each observation is a multinomial random variable. That is, observation, or row, j of the predictor data X represents D categories, where xjd is the number of successes for category (that is, predictor) d in nj = xj1 + xj2 + ... + xjD independent trials. The training and prediction steps are outlined next, followed by a code sketch.
For each class, fit a multinomial distribution for the predictors given the class by:
Aggregating the weighted category counts over all observations. Additionally, the software implements additive smoothing [1].
Estimating the D category probabilities within each class using the aggregated category counts. These category probabilities compose the probability parameters of the multinomial distribution.
Let a new observation have a total count of m. Then, the naive Bayes classifier:
Sets the total count parameter of each multinomial distribution to m
For each class, estimates the class posterior probability using the estimated multinomial distributions
Classifies the observation into the class corresponding to the highest posterior probability
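Here is a minimal sketch with made-up count data; each row of X is one observation of D = 3 categories:

    % Hypothetical token counts: row j holds xjd, the successes for
    % category d in nj = sum(X(j,:)) independent trials.
    X = [3 0 1; 0 2 2; 4 1 0; 1 3 1; 0 0 5; 2 2 2];
    Y = [1; 2; 1; 2; 2; 1];

    % 'mn' must apply to all predictors.
    Mdl = fitcnb(X, Y, 'DistributionNames', 'mn');

    % Classify a new observation with total count m = 4.
    [label, posterior] = predict(Mdl, [2 1 1])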
Consider the so-called bag-of-tokens model, where there is a bag containing a number of tokens of various types and proportions. Each predictor represents a distinct type of token in the bag, an observation is n independent draws (that is, with replacement) of tokens from the bag, and the data is a vector of counts, where element d is the number of times token d appears.
A machine-learning application is the construction of an email spam classifier, where each predictor represents a word, character, or phrase (i.e., token), an observation is an email, and the data are counts of the tokens in the email. One predictor might count the number of exclamation points, another might count the number of times the word "money" appears, and another might count the number of times the recipient's name appears. This is a naive Bayes model under the further assumption that the total number of tokens (or the total document length) is independent of response class.
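A sketch of the token-counting step for such a classifier; the emails and vocabulary here are made up for illustration:

    % Hypothetical emails and token vocabulary.
    emails = {'send money now!!', 'meeting notes attached', 'money money money'};
    vocab  = {'money', '!'};

    % Build the count matrix: entry (i,j) is how often token j appears in email i.
    counts = zeros(numel(emails), numel(vocab));
    for i = 1:numel(emails)
        for j = 1:numel(vocab)
            counts(i,j) = count(lower(emails{i}), vocab{j});
        end
    end
    % counts is now predictor data suitable for 'DistributionNames','mn'.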
Other properties of naive Bayes classifiers that use multinomial observations include:
Classification is based on the relative frequencies of the categories. If nj = 0 for observation j, then classification is not possible for that observation.
The predictors are not conditionally independent since they must sum to nj.
Naive Bayes is not appropriate when nj provides information about the class. That is, this classifier requires that nj is independent of the class.
If you specify that the predictors are conditionally multinomial, then the software applies this specification to all predictors. In other words, you cannot include 'mn' in a cell array when specifying 'DistributionNames'.
If a predictor is categorical, i.e., is multinomial within a response class, then specify that it is multivariate multinomial.