SOA Exam SRM (Statistics for Risk Modeling) Glossary

30 essential terms and definitions for SOA Exam SRM (Statistics for Risk Modeling). Each definition is written for exam preparation, covering the concepts as they are tested on the 2026 syllabus.

A

AIC (Akaike Information Criterion)
Akaike Information Criterion is a model selection metric that balances goodness of fit against model complexity by penalizing the number of estimated parameters, with lower values indicating a better balance of fit and complexity (a computational sketch under the BIC entry below covers both criteria).
\text{AIC} = -2\ln(L) + 2k
ARIMA
ARIMA (AutoRegressive Integrated Moving Average) is a time series forecasting model that combines autoregressive terms, differencing for stationarity, and moving average terms, specified by orders (p, d, q).
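As a rough illustration of the (p, d, q) specification, the sketch below fits an ARIMA(1, 1, 1) to a simulated random walk, assuming the statsmodels library is available; the series and order are made up for illustration.
```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
y = np.cumsum(rng.normal(size=200))    # random walk: one difference (d=1) makes it stationary

res = ARIMA(y, order=(1, 1, 1)).fit()  # p=1 AR term, d=1 difference, q=1 MA term
print(res.forecast(steps=5))           # five-step-ahead forecasts
```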
Autocorrelation
Autocorrelation is the correlation of a time series with a lagged version of itself, measured by the autocorrelation function (ACF), used to identify temporal patterns and determine the order of time series models.
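The sample ACF can be computed directly; below is a minimal NumPy sketch on a simulated AR(1)-style series, with all data made up for illustration.
```python
import numpy as np

def acf(x, max_lag=10):
    """Sample autocorrelation of x at lags 0..max_lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.sum(x ** 2)
    return np.array([np.sum(x[k:] * x[:len(x) - k]) / denom
                     for k in range(max_lag + 1)])

rng = np.random.default_rng(0)
x = np.zeros(200)
for t in range(1, 200):                  # AR(1)-style series with coefficient 0.7
    x[t] = 0.7 * x[t - 1] + rng.normal()
print(acf(x, max_lag=5))                 # decays roughly like 0.7**k
```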

B

Bagging
Bagging (bootstrap aggregating) is an ensemble method that trains multiple models on random bootstrap samples of the training data and averages their predictions, reducing variance and improving stability compared to a single model.
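A minimal bagging sketch in NumPy, using polynomial least squares as a stand-in base learner; the data, number of bootstrap samples, and degree are all made up for illustration.
```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0, 1, 100)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=100)

B = 50                                          # number of bootstrap samples
preds = np.zeros((B, len(x)))
for b in range(B):
    idx = rng.integers(0, len(x), size=len(x))  # resample with replacement
    coefs = np.polyfit(x[idx], y[idx], deg=5)   # fit base model on the bootstrap sample
    preds[b] = np.polyval(coefs, x)

bagged = preds.mean(axis=0)                     # average predictions across the ensemble
print(bagged[:5])
```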
Bias-Variance Tradeoff
Bias-variance tradeoff is the fundamental tension in predictive modeling: increasing model complexity reduces bias (systematic error) but increases variance (sensitivity to training data), with the optimal model minimizing total expected prediction error.
BIC (Bayesian Information Criterion)
Bayesian Information Criterion is a model selection criterion similar to AIC but with a stronger penalty for model complexity, tending to select simpler models, especially as sample size increases.
\text{BIC} = -2\ln(L) + k\ln(n)
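Both AIC and BIC follow directly from the maximized log-likelihood; the NumPy sketch below computes them for a Gaussian linear model fit by least squares, with the data made up for illustration.
```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
rss = np.sum((y - X @ beta) ** 2)

loglik = -0.5 * n * (np.log(2 * np.pi) + np.log(rss / n) + 1)  # maximized Gaussian log-likelihood
k = X.shape[1] + 1                 # coefficients plus the error variance
aic = -2 * loglik + 2 * k
bic = -2 * loglik + k * np.log(n)
print(aic, bic)                    # BIC penalizes harder whenever ln(n) > 2
```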

C

Classification Tree
Classification tree is a decision tree used for categorical response variables, recursively splitting the feature space into regions and assigning each region to the most frequent class, with splits chosen to maximize purity (minimize Gini impurity or entropy).
Clustering
Clustering is an unsupervised learning technique that groups observations into clusters such that observations within a cluster are more similar to each other than to those in other clusters, with common methods including k-means and hierarchical clustering.
Confusion Matrix
Confusion matrix is a table that summarizes the performance of a classification model by displaying the counts of true positives, true negatives, false positives, and false negatives for each class.
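A minimal sketch of the four counts for a binary classifier, with made-up labels and predictions:
```python
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])

tp = np.sum((y_pred == 1) & (y_true == 1))  # true positives
tn = np.sum((y_pred == 0) & (y_true == 0))  # true negatives
fp = np.sum((y_pred == 1) & (y_true == 0))  # false positives
fn = np.sum((y_pred == 0) & (y_true == 1))  # false negatives
print(np.array([[tn, fp],
                [fn, tp]]))                 # rows: actual class, columns: predicted class
```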
Cross-Validation
Cross-validation is a resampling technique that partitions data into complementary training and validation sets across multiple iterations (such as k-fold) to estimate out-of-sample prediction error and guard against overfitting.
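A minimal k-fold sketch in NumPy, using polynomial degree as a stand-in tuning parameter; the data and candidate degrees are made up for illustration.
```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(0, 1, 60)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=60)

k = 5
folds = np.array_split(rng.permutation(len(x)), k)  # random partition into k folds
for deg in (1, 3, 6):
    errs = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        coefs = np.polyfit(x[train], y[train], deg)
        errs.append(np.mean((y[test] - np.polyval(coefs, x[test])) ** 2))
    print(deg, np.mean(errs))  # CV error is typically smallest at a moderate degree
```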

D

Decision Tree
Decision tree is a nonparametric supervised learning method that recursively partitions the feature space using binary splits based on predictor variables, producing an interpretable tree structure for classification or regression.

E

Elastic Net
Elastic net is a regularization method that combines the L1 penalty of LASSO and the L2 penalty of ridge regression, controlled by a mixing parameter, useful when predictors are correlated and variable selection is desired.
\min \sum (y_i - \hat{y}_i)^2 + \lambda_1 \sum |\beta_j| + \lambda_2 \sum \beta_j^2

G

Generalized Linear Model
Generalized linear model (GLM) extends ordinary linear regression by allowing the response variable to follow any distribution in the exponential family and linking the mean to the linear predictor through a link function, accommodating count, binary, and continuous positive outcomes.
Gini Impurity
Gini impurity is a measure of node purity in classification trees, calculated as the probability of incorrectly classifying a randomly chosen element if it were labeled according to the distribution of classes in the node.
G = 1 - \sum_{k=1}^{K} p_k^2
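A minimal sketch of the formula above, with made-up class counts for a two-class node:
```python
import numpy as np

def gini(counts):
    p = np.asarray(counts, dtype=float)
    p = p / p.sum()                # class proportions in the node
    return 1.0 - np.sum(p ** 2)

print(gini([50, 50]))   # 0.5  -- maximally impure two-class node
print(gini([90, 10]))   # 0.18 -- nearly pure node
print(gini([100, 0]))   # 0.0  -- perfectly pure node
```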

K

K-Means Clustering
K-means clustering is a partitional clustering algorithm that assigns each observation to the cluster with the nearest centroid, iteratively updating centroids and reassigning observations to minimize total within-cluster sum of squares.
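A minimal sketch of the assign-then-update loop (Lloyd's algorithm) in NumPy; the two simulated clusters are made up for illustration.
```python
import numpy as np

rng = np.random.default_rng(4)
X = np.vstack([rng.normal(0, 1, (50, 2)),   # two well-separated clusters
               rng.normal(5, 1, (50, 2))])
k = 2

centroids = X[rng.choice(len(X), k, replace=False)]  # random initial centroids
for _ in range(20):
    # assignment step: nearest centroid for each observation
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = d.argmin(axis=1)
    # update step: move each centroid to the mean of its assigned points
    new = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    if np.allclose(new, centroids):
        break
    centroids = new
print(centroids)   # close to the true cluster centers (0, 0) and (5, 5)
```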

L

LASSO Regression
LASSO (Least Absolute Shrinkage and Selection Operator) regression adds an L1 penalty to the ordinary least squares objective, shrinking some coefficients exactly to zero and thus performing variable selection.
\min \sum (y_i - \hat{y}_i)^2 + \lambda \sum |\beta_j|
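A minimal sketch of LASSO's variable selection, assuming scikit-learn is available; the data are made up, with only 2 of 10 predictors truly relevant.
```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(5)
X = rng.normal(size=(200, 10))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(size=200)  # only the first two predictors matter

fit = Lasso(alpha=0.1).fit(X, y)
print(fit.coef_)   # most of the 8 irrelevant coefficients are shrunk exactly to zero
```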
Linear Regression
Linear regression models the relationship between a continuous response variable and one or more predictor variables by fitting a linear equation to the observed data, estimating coefficients that minimize the sum of squared residuals.
y = \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p + \varepsilon
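A minimal least-squares sketch in NumPy; the true coefficients are made up so the recovered estimates can be checked by eye.
```python
import numpy as np

rng = np.random.default_rng(6)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # intercept plus two predictors
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # minimizes the sum of squared residuals
print(beta)                                   # close to [1.0, 2.0, -0.5]
```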
Logistic Regression
Logistic regression models the probability of a binary outcome as a function of predictor variables using the logit link function, estimating coefficients via maximum likelihood.
\ln\!\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p

M

Mean Squared Error
Mean squared error is the average of the squared differences between predicted and observed values, serving as a standard loss function for regression models that penalizes larger errors more heavily.
\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
Multicollinearity
Multicollinearity occurs when two or more predictor variables in a regression model are highly correlated, inflating the variance of coefficient estimates and making individual predictor effects difficult to interpret. It is typically diagnosed with the variance inflation factor (VIF).

O

Overfitting
Overfitting occurs when a model learns the noise in the training data rather than the underlying pattern, resulting in excellent training performance but poor generalization to new data.

P

Principal Component Analysis
Principal component analysis (PCA) is a dimensionality reduction technique that transforms correlated variables into a smaller set of uncorrelated principal components, ordered by the amount of variance they explain.
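A minimal PCA sketch via eigendecomposition of the sample covariance matrix; the correlated two-dimensional data are made up for illustration.
```python
import numpy as np

rng = np.random.default_rng(7)
z = rng.normal(size=(200, 2))
X = z @ np.array([[2.0, 0.0],
                  [1.5, 0.5]])              # induce correlation between the two columns

Xc = X - X.mean(axis=0)                     # center before extracting components
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
order = eigvals.argsort()[::-1]             # sort components by variance explained
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

scores = Xc @ eigvecs                       # uncorrelated principal component scores
print(eigvals / eigvals.sum())              # proportion of variance explained by each
```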

R

R-Squared
R-squared (coefficient of determination) measures the proportion of variance in the response variable explained by the model, ranging from 0 to 1, with higher values indicating better fit.
R^2 = 1 - \frac{\text{SS}_{\text{res}}}{\text{SS}_{\text{tot}}}
Random Forest
Random forest is an ensemble method that builds many decision trees on bootstrap samples of the data, using a random subset of features at each split, and averages (regression) or votes (classification) across trees to improve prediction accuracy and reduce overfitting.
Regularization
Regularization is a technique that adds a penalty term to the model's loss function to constrain the size of the coefficients, reducing overfitting by trading a small increase in bias for a larger decrease in variance.
Ridge Regression
Ridge regression adds an L2 penalty (proportional to the sum of squared coefficients) to the ordinary least squares objective, shrinking coefficients toward zero without eliminating them, effective when predictors are multicollinear.
\min \sum (y_i - \hat{y}_i)^2 + \lambda \sum \beta_j^2
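A minimal ridge sketch using the closed-form solution beta = (X'X + lambda I)^{-1} X'y on centered data; the design, near-collinear column, and lambda grid are made up for illustration.
```python
import numpy as np

rng = np.random.default_rng(8)
X = rng.normal(size=(100, 3))
X[:, 2] = X[:, 0] + 0.01 * rng.normal(size=100)   # nearly collinear with column 0
y = X[:, 0] + rng.normal(size=100)

Xc, yc = X - X.mean(axis=0), y - y.mean()         # center so no intercept is penalized
for lam in (0.0, 1.0, 10.0):
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(3), Xc.T @ yc)
    print(lam, beta)   # larger lambda shrinks (but never zeroes) the coefficients
```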
ROC Curve
ROC (Receiver Operating Characteristic) curve is a plot of the true positive rate against the false positive rate at various classification thresholds, with the area under the curve (AUC) summarizing the model's ability to discriminate between classes.
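A minimal ROC sketch: sweep a threshold over predicted scores and trace the true positive rate against the false positive rate; the scores and labels are made up for illustration.
```python
import numpy as np

rng = np.random.default_rng(9)
y = np.concatenate([np.ones(100), np.zeros(100)])
scores = np.concatenate([rng.normal(1.0, 1, 100),   # positives score higher on average
                         rng.normal(0.0, 1, 100)])

thresholds = np.concatenate([[scores.max() + 1], np.sort(scores)[::-1]])
tpr = np.array([np.mean(scores[y == 1] >= t) for t in thresholds])
fpr = np.array([np.mean(scores[y == 0] >= t) for t in thresholds])

auc = np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2)  # trapezoidal area under the curve
print(round(auc, 3))                                   # roughly 0.76 for these score distributions
```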

S

Stationarity
Stationarity is the property of a time series whose statistical characteristics (mean, variance, autocorrelation) do not change over time. Many time series models require stationarity, which can often be achieved through differencing or transformation.

T

Time Series
Time series is a sequence of data points collected at successive, equally spaced points in time, analyzed to identify trends, seasonal patterns, and autocorrelation structure for forecasting.

V

Variance Inflation Factor
Variance inflation factor (VIF) quantifies the degree of multicollinearity for each predictor in a regression model, calculated as the reciprocal of one minus the R-squared from regressing that predictor on all other predictors.
\text{VIF}_j = \frac{1}{1 - R_j^2}
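A minimal VIF sketch in NumPy: regress each predictor on the others and apply 1 / (1 - R^2); the design matrix, with one near-duplicate column, is made up for illustration.
```python
import numpy as np

rng = np.random.default_rng(10)
X = rng.normal(size=(200, 3))
X[:, 2] = X[:, 0] + 0.1 * rng.normal(size=200)        # collinear with column 0

for j in range(X.shape[1]):
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(X)), others])    # regress predictor j on the rest
    coef, *_ = np.linalg.lstsq(A, X[:, j], rcond=None)
    resid = X[:, j] - A @ coef
    r2 = 1 - resid @ resid / np.sum((X[:, j] - X[:, j].mean()) ** 2)
    print(j, 1 / (1 - r2))   # columns 0 and 2 show sharply inflated VIFs
```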
