<table>
  <tr>
    <th>Model</th>
    <th>Common Usage</th>
    <th>Suggested Usage</th>
    <th>Suggested Scale</th>
    <th>Interpretability</th>
    <th>Common Concerns</th>
  </tr>
  <tr>
    <td>Linear Regression</td>
    <td>Supervised regression</td>
    <td>Simple and multiple linear regression</td>
    <td>Small to large datasets</td>
    <td>High</td>
    <td>Missing values, outliers, standardization, parameter tuning</td>
  </tr>
  <tr>
    <td>Polynomial Regression</td>
    <td>Supervised regression</td>
    <td>Modeling non-linear data with a linear model; often used when a plain linear model underfits; inspect the fitted curve near the edges of the data for signs of overfitting</td>
    <td>Small to large datasets</td>
    <td>High</td>
    <td>Missing values, outliers, overfitting, standardization, parameter tuning</td>
  </tr>
  <tr>
    <td>Logistic Regression</td>
    <td>Supervised classification</td>
    <td>Most commonly used for classification, where the dependent variable (target) is categorical, though it is itself a regression model for class probabilities</td>
    <td>Small to large datasets</td>
    <td>High</td>
    <td>Missing values, outliers, standardization, parameter tuning</td>
  </tr>
  <tr>
    <td>Penalized Regression</td>
    <td>Supervised regression, supervised classification</td>
    <td>Modeling linear or linearly separable phenomena; nonlinear and interaction terms must be specified manually and explicitly; well suited for N &lt;&lt; p problems, where the number of predictors exceeds the number of samples {cite:p}`brownlee2022bigp`; specific types include Ridge, Lasso, Elastic Net, and Bayesian linear regression</td>
    <td>Small to large datasets</td>
    <td>High</td>
    <td>Missing values, outliers, standardization, parameter tuning</td>
  </tr>
  <tr>
    <td>Naïve Bayes</td>
    <td>Supervised classification</td>
    <td>Modeling linearly separable phenomena in large datasets; well suited for extremely large datasets where more complex methods are intractable</td>
    <td>Small to extremely large datasets</td>
    <td>Moderate</td>
    <td>Strong conditional independence assumption, infrequent categorical levels</td>
  </tr>
  <tr>
    <td>Decision Trees</td>
    <td>Supervised regression, supervised classification</td>
    <td>Modeling nonlinear and nonlinearly separable phenomena in large, dirty data; interactions are considered automatically, but implicitly; missing values and outliers in input variables are handled automatically in many implementations; decision tree ensembles (e.g., random forests and gradient boosting) can increase prediction accuracy and decrease overfitting</td>
    <td>Medium to large datasets</td>
    <td>Moderate</td>
    <td>Instability with small training datasets, gradient boosting can be unstable with noise or outliers, overfitting, parameter tuning</td>
  </tr>
  <tr>
    <td>Support Vector Machines (SVM)</td>
    <td>Supervised regression, supervised classification</td>
    <td>Modeling linear or linearly separable phenomena with linear kernels; modeling nonlinear or nonlinearly separable phenomena with nonlinear kernels; anomaly detection with one-class SVM (OSVM)</td>
    <td>Small to large datasets for linear kernels; small to medium datasets for nonlinear kernels</td>
    <td>Low</td>
    <td>Missing values, overfitting, outliers, standardization, parameter tuning; accuracy versus deep neural networks depends on the choice of nonlinear kernel, and Gaussian and polynomial kernels are often less accurate {cite:p}`singh2019svm`</td>
  </tr>
  <tr>
    <td>k-Nearest Neighbors (kNN)</td>
    <td>Supervised regression, supervised classification</td>
    <td>Modeling nonlinearly separable phenomena; can match the accuracy of more sophisticated techniques with fewer tuning parameters</td>
    <td>Small to medium datasets</td>
    <td>Low</td>
    <td>Missing values, overfitting, outliers, standardization, curse of dimensionality</td>
  </tr>
  <tr>
    <td>Neural Networks (NN)</td>
    <td>Supervised regression, supervised classification</td>
    <td>Modeling nonlinear and nonlinearly separable phenomena; deep neural networks (i.e., deep learning) are well suited for state-of-the-art pattern recognition in images, video, and sound; all interactions are considered in fully connected, multilayer topologies; nonlinear feature extraction with autoencoder and restricted Boltzmann machine (RBM) networks</td>
    <td>Medium to large datasets</td>
    <td>Low</td>
    <td>Missing values, overfitting, outliers, standardization, hyperparameter tuning</td>
  </tr>
</table>