This example shows how to train a Generalized Additive Model (GAM) for Binary Classification with optimal parameters and how to assess the predictive performance of the trained model. The example first finds the optimal parameter values for a univariate GAM (parameters for linear terms) and then finds the values for a bivariate GAM (parameters for interaction terms). Also, the example explains how to interpret the trained model by examining local effects of terms on a specific prediction and by computing the partial dependence of the predictions on predictors.
Load the 1994 census data stored in census1994.mat
. The data set consists of demographic data from the US Census Bureau to predict whether an individual makes over $50,000 per year. The classification task is to fit a model that predicts the salary category of people given their age, working class, education level, marital status, race, and so on.
load census1994
census1994
contains the training data set adultdata
and the test data set adulttest
. To reduce the running time for this example, subsample 500 training observations and 500 test observations by using the datasample
function.
rng(1) % For reproducibility NumSamples = 5e2; adultdata = datasample(adultdata,NumSamples,'Replace',false); adulttest = datasample(adulttest,NumSamples,'Replace',false);
Train a GAM with hyperparameters that minimize the cross-validation loss by using the OptimizeHyperparameters name-value argument.
You can specify OptimizeHyperparameters
as 'auto'
or 'all'
to find optimal hyperparameter values for both univariate and bivariate parameters. Alternatively, you can find optimal values for univariate parameters using the 'auto-univariate'
or 'all-univariate'
option, and then find optimal values for bivariate parameters using the 'auto-bivariate'
or 'all-bivariate'
option. This example uses 'auto-univariate'
and 'auto-bivariate'
.
Train a univariate GAM. Specify OptimizeHyperparameters
as 'auto-univariate'
so that fitcgam
finds optimal values of the InitialLearnRateForPredictors
and NumTreesPerPredictor
name-value arguments. For reproducibility, use the 'expected-improvement-plus'
acquisition function. Specify ShowPlots
as false
and Verbose
as 0 to disable plot and message displays, respectively.
Mdl_univariate = fitcgam(adultdata,'salary','Weights','fnlwgt', ... 'OptimizeHyperparameters','auto-univariate', ... 'HyperparameterOptimizationOptions',struct('AcquisitionFunctionName','expected-improvement-plus', ... 'ShowPlots',false,'Verbose',0))
Mdl_univariate = ClassificationGAM PredictorNames: {'age' 'workClass' 'education' 'education_num' 'marital_status' 'occupation' 'relationship' 'race' 'sex' 'capital_gain' 'capital_loss' 'hours_per_week' 'native_country'} ResponseName: 'salary' CategoricalPredictors: [2 3 5 6 7 8 9 13] ClassNames: [<=50K >50K] ScoreTransform: 'logit' Intercept: -1.3118 NumObservations: 500 HyperparameterOptimizationResults: [1×1 BayesianOptimization] Properties, Methods
fitcgam
returns a ClassificationGAM
model object that uses the best estimated feasible point. The best estimated feasible point indicates the set of hyperparameters that minimizes the upper confidence bound of the objective function value based on the underlying objective function model of the Bayesian optimization process. You can obtain the best point from the HyperparameterOptimizationResults
property or by using the bestPoint
function.
x = Mdl_univariate.HyperparameterOptimizationResults.XAtMinEstimatedObjective
x=1×2 table
InitialLearnRateForPredictors NumTreesPerPredictor
_____________________________ ____________________
0.02257 118
bestPoint(Mdl_univariate.HyperparameterOptimizationResults)
ans=1×2 table
InitialLearnRateForPredictors NumTreesPerPredictor
_____________________________ ____________________
0.02257 118
Train a bivariate GAM. Specify OptimizeHyperparameters
as 'auto-bivariate'
so that fitcgam
finds optimal values of the Interactions
, InitialLearnRateForInteractions
, and NumTreesPerInteraction
name-value arguments. Use the univariate parameter values in x
so that the software finds optimal parameter values for interaction terms based on the x values.
Mdl = fitcgam(adultdata,'salary','Weights','fnlwgt', ... 'InitialLearnRateForPredictors',x.InitialLearnRateForPredictors, ... 'NumTreesPerPredictor',x.NumTreesPerPredictor, ... 'OptimizeHyperparameters','auto-bivariate', ... 'HyperparameterOptimizationOptions',struct('AcquisitionFunctionName','expected-improvement-plus', ... 'ShowPlots',false,'Verbose',0))
Mdl = ClassificationGAM PredictorNames: {'age' 'workClass' 'education' 'education_num' 'marital_status' 'occupation' 'relationship' 'race' 'sex' 'capital_gain' 'capital_loss' 'hours_per_week' 'native_country'} ResponseName: 'salary' CategoricalPredictors: [2 3 5 6 7 8 9 13] ClassNames: [<=50K >50K] ScoreTransform: 'logit' Intercept: -1.4587 Interactions: [6×2 double] NumObservations: 500 HyperparameterOptimizationResults: [1×1 BayesianOptimization] Properties, Methods
Display the optimal bivariate hyperparameters.
Mdl.HyperparameterOptimizationResults.XAtMinEstimatedObjective
ans=1×3 table
Interactions InitialLearnRateForInteractions NumTreesPerInteraction
____________ _______________________________ ______________________
6 0.0061954 422
The model display of Mdl
shows a partial list of the model properties. To view the full list of the model properties, double-click the variable name Mdl
in the Workspace. The Variables editor opens for Mdl
. Alternatively, you can display the properties in the Command Window by using dot notation. For example, display the ReasonForTermination
property.
Mdl.ReasonForTermination
ans = struct with fields:
PredictorTrees: 'Terminated after training the requested number of trees.'
InteractionTrees: 'Terminated after training the requested number of trees.'
You can use the ReasonForTermination
property to determine whether the trained model contains the specified number of trees for each linear term and each interaction term.
Display the interaction terms in Mdl
.
Mdl.Interactions
ans = 6×2
5 12
1 6
6 12
1 12
7 9
2 6
Each row of Interactions
represents one interaction term and contains the column indexes of the predictor variables for the interaction term. You can use the Interactions
property to check the interaction terms in the model and the order in which fitcgam
adds them to the model.
Display the interaction terms in Mdl
using the predictor names.
Mdl.PredictorNames(Mdl.Interactions)
ans = 6×2 cell
{'marital_status'} {'hours_per_week'}
{'age' } {'occupation' }
{'occupation' } {'hours_per_week'}
{'age' } {'hours_per_week'}
{'relationship' } {'sex' }
{'workClass' } {'occupation' }
Matlabsolutions.com provides guaranteed satisfaction with a
commitment to complete the work within time. Combined with our meticulous work ethics and extensive domain
experience, We are the ideal partner for all your homework/assignment needs. We pledge to provide 24*7 support
to dissolve all your academic doubts. We are composed of 300+ esteemed Matlab and other experts who have been
empanelled after extensive research and quality check.
Matlabsolutions.com provides undivided attention to each Matlab
assignment order with a methodical approach to solution. Our network span is not restricted to US, UK and Australia rather extends to countries like Singapore, Canada and UAE. Our Matlab assignment help services
include Image Processing Assignments, Electrical Engineering Assignments, Matlab homework help, Matlab Research Paper help, Matlab Simulink help. Get your work
done at the best price in industry.
Desktop Basics - MATLAB & Simulink
Array Indexing - MATLAB & Simulink
Workspace Variables - MATLAB & Simulink
Text and Characters - MATLAB & Simulink
Calling Functions - MATLAB & Simulink
2-D and 3-D Plots - MATLAB & Simulink
Programming and Scripts - MATLAB & Simulink
Help and Documentation - MATLAB & Simulink
Creating, Concatenating, and Expanding Matrices - MATLAB & Simulink
Removing Rows or Columns from a Matrix
Reshaping and Rearranging Arrays
Add Title and Axis Labels to Chart
Change Color Scheme Using a Colormap
How Surface Plot Data Relates to a Colormap
How Image Data Relates to a Colormap
Time-Domain Response Data and Plots
Time-Domain Responses of Discrete-Time Model
Time-Domain Responses of MIMO Model
Time-Domain Responses of Multiple Models
Introduction: PID Controller Design
Introduction: Root Locus Controller Design
Introduction: Frequency Domain Methods for Controller Design
DC Motor Speed: PID Controller Design
DC Motor Position: PID Controller Design
Cruise Control: PID Controller Design
Suspension: Root Locus Controller Design
Aircraft Pitch: Root Locus Controller Design
Inverted Pendulum: Root Locus Controller Design
Get Started with Deep Network Designer
Create Simple Image Classification Network Using Deep Network Designer
Build Networks with Deep Network Designer
Classify Image Using GoogLeNet
Classify Webcam Images Using Deep Learning
Transfer Learning with Deep Network Designer
Train Deep Learning Network to Classify New Images
Deep Learning Processor Customization and IP Generation
Prototype Deep Learning Networks on FPGA
Deep Learning Processor Architecture
Deep Learning INT8 Quantization
Quantization of Deep Neural Networks
Custom Processor Configuration Workflow
Estimate Performance of Deep Learning Network by Using Custom Processor Configuration
Preprocess Images for Deep Learning
Preprocess Volumes for Deep Learning
Transfer Learning Using AlexNet
Time Series Forecasting Using Deep Learning
Create Simple Sequence Classification Network Using Deep Network Designer
Train Classification Models in Classification Learner App
Train Regression Models in Regression Learner App
Explore the Random Number Generation UI
Logistic regression create generalized linear regression model - MATLAB fitglm 2
Support Vector Machines for Binary Classification
Support Vector Machines for Binary Classification 2
Support Vector Machines for Binary Classification 3
Support Vector Machines for Binary Classification 4
Support Vector Machines for Binary Classification 5
Assess Neural Network Classifier Performance
Discriminant Analysis Classification
Train Generalized Additive Model for Binary Classification
Train Generalized Additive Model for Binary Classification 2
Classification Using Nearest Neighbors
Classification Using Nearest Neighbors 2
Classification Using Nearest Neighbors 3
Classification Using Nearest Neighbors 4
Classification Using Nearest Neighbors 5
Gaussian Process Regression Models
Gaussian Process Regression Models 2
Understanding Support Vector Machine Regression
Extract Voices from Music Signal
Align Signals with Different Start Times
Find a Signal in a Measurement
Extract Features of a Clock Signal
Filtering Data With Signal Processing Toolbox Software
Find Periodicity Using Frequency Analysis
Find and Track Ridges Using Reassigned Spectrogram
Classify ECG Signals Using Long Short-Term Memory Networks
Waveform Segmentation Using Deep Learning
Label Signal Attributes, Regions of Interest, and Points
Introduction to Streaming Signal Processing in MATLAB
Filter Frames of a Noisy Sine Wave Signal in MATLAB
Filter Frames of a Noisy Sine Wave Signal in Simulink
Lowpass Filter Design in MATLAB
Tunable Lowpass Filtering of Noisy Input in Simulink
Signal Processing Acceleration Through Code Generation
Signal Visualization and Measurements in MATLAB
Estimate the Power Spectrum in MATLAB
Design of Decimators and Interpolators
Multirate Filtering in MATLAB and Simulink