Classical Prediction Models with Scikit-learn MCQ Questions with Answers – Page 2 (Latest 2026)

Practice Classical Prediction Models with Scikit-learn MCQ questions with detailed explanations and clear answer validation. These MCQs help you revise core concepts, compare close options, and improve accuracy for interviews, certification exams, and technical screening rounds. Use this updated 2026 set to strengthen fundamentals and confidence.

Q51. class_weight='balanced' in classifiers helps by:

Answer: Adjusting class weights inversely to frequency

class_weight='balanced' sets each class weight inversely proportional to that class's frequency in the training labels, so minority classes have more influence during training.
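The weighting can be inspected directly. The snippet below is a minimal sketch, assuming scikit-learn and NumPy are installed; the 8-to-2 label split is made up for illustration.

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Toy labels (illustrative): 8 majority samples (class 0), 2 minority samples (class 1).
y = np.array([0] * 8 + [1] * 2)

# "balanced" computes n_samples / (n_classes * count_per_class) for each class.
weights = compute_class_weight(class_weight="balanced", classes=np.array([0, 1]), y=y)
print(dict(zip([0, 1], weights)))  # class 0 -> 0.625, class 1 -> 2.5
```

The minority class ends up with four times the weight of the majority class, mirroring the 8:2 imbalance.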

Q52. Which is a common way to address class imbalance?

Answer: Resampling or class weighting

Resampling (oversampling the minority class or undersampling the majority class) and class weighting both rebalance the training signal, which usually improves detection of the minority class.

Q53. SMOTE is available in:

Answer: imbalanced-learn package

SMOTE is provided by the imbalanced-learn package, not by scikit-learn itself. It is usually combined with scikit-learn estimators through the imbalanced-learn Pipeline class, which supports resampling steps.

Q54. PCA is primarily used for:

Answer: Dimensionality reduction

PCA projects the data onto a smaller set of orthogonal components that capture as much of the variance as possible.

Q55. Which PCA attribute shows variance explained per component?

Answer: explained_variance_ratio_

explained_variance_ratio_ gives the fraction of the total variance explained by each principal component.

Q56. Before PCA, scaling numeric features is usually:

Answer: Recommended

PCA is scale-sensitive: features measured on larger scales can dominate the components, so standardizing numeric features first is usually recommended.
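Scaling and PCA are commonly chained in one pipeline. The following is a minimal sketch, assuming scikit-learn is installed and using the bundled iris dataset purely as an example.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)

# Standardize first so that no feature dominates because of its units.
pca_pipe = make_pipeline(StandardScaler(), PCA(n_components=2))
X_2d = pca_pipe.fit_transform(X)

# Fraction of total variance captured by each retained component.
ratios = pca_pipe.named_steps["pca"].explained_variance_ratio_
print(X_2d.shape)
print(ratios)
```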

Q57. SelectKBest is used for:

Answer: Feature selection

SelectKBest keeps the k highest-scoring features according to a scoring function such as f_classif or mutual_info_classif.

Q58. mutual_info_classif estimates:

Answer: Dependency between features and class

mutual_info_classif estimates the mutual information between each feature and the class label, so it can capture nonlinear associations as well as linear ones.
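SelectKBest and mutual_info_classif are often used together. This is a minimal sketch, assuming scikit-learn is installed; the iris dataset and k=2 are arbitrary choices for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, mutual_info_classif

X, y = load_iris(return_X_y=True)

# Score every feature against the class label and keep the two best.
selector = SelectKBest(score_func=mutual_info_classif, k=2)
X_top = selector.fit_transform(X, y)

print(selector.scores_)         # one mutual information estimate per feature
print(selector.get_support())   # boolean mask of the selected features
```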

Q59. VarianceThreshold removes features with:

Answer: Low variance below threshold

VarianceThreshold drops low-variance features; with the default threshold it removes features that have the same value in every sample. Near-constant features often contribute little predictive power.
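A small example makes the behavior concrete. This is a sketch with made-up data, assuming scikit-learn and NumPy are installed; the threshold of 0.01 is chosen only to separate the three toy columns.

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold

# Toy matrix (illustrative): column 0 is constant, column 2 is nearly constant.
X = np.array([
    [0.0, 1.0, 5.0],
    [0.0, 2.0, 5.1],
    [0.0, 3.0, 4.9],
    [0.0, 4.0, 5.0],
])

selector = VarianceThreshold(threshold=0.01)
X_kept = selector.fit_transform(X)

print(selector.variances_)     # per-column variance seen during fit
print(selector.get_support())  # only the middle column survives
```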

Q60. Data leakage happens when:

Answer: Information from validation/test influences training

Leakage occurs when information from the validation or test data influences training. It inflates evaluation metrics and leads to worse performance on real-world data.

Q61. Why fit preprocessors on training data only?

Answer: To avoid leakage

Statistics from validation or test data must not affect fitted transforms such as scalers or imputers, so preprocessors are fitted on the training data and then applied to the other splits.

Q62. Pipelines help prevent leakage because they:

Answer: Apply fit/transform within each training fold correctly

A Pipeline bundles preprocessing with the estimator, so during cross-validation each preprocessing step is fitted on the training fold only and then applied to the held-out fold.
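The leakage-safe pattern is to pass the whole pipeline to the cross-validation helper. Below is a minimal sketch, assuming scikit-learn is installed; the dataset and estimator are arbitrary examples.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# The scaler is refitted inside every training fold, never on held-out data.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scores = cross_val_score(pipe, X, y, cv=5)

print(scores.mean())
```

Scaling X once up front and then cross-validating only the classifier would let held-out statistics influence the scaler, which is the leakage the pipeline avoids.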

Q63. SimpleImputer(strategy='mean') is used to:

Answer: Fill missing numeric values with mean

SimpleImputer(strategy='mean') replaces missing values in each numeric column with that column's mean, a basic approach for continuous features.

Q64. For skewed numeric data, which imputation may be more robust?

Answer: Median

The median is less affected by extreme values than the mean, so it is a more robust fill value for skewed data.

Q65. For categorical missing values, a common SimpleImputer strategy is:

Answer: most_frequent

strategy='most_frequent' replaces missing values with the most common value in each column, and it works with string data as well as numeric data.
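The three strategies can be compared side by side. This is a sketch on made-up columns (the names income and colors are hypothetical), assuming scikit-learn and NumPy are installed.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Skewed numeric column (illustrative) with one missing value and one outlier.
income = np.array([[1.0], [2.0], [np.nan], [100.0]])
mean_filled = SimpleImputer(strategy="mean").fit_transform(income)
median_filled = SimpleImputer(strategy="median").fit_transform(income)
print(mean_filled[2, 0], median_filled[2, 0])  # mean is pulled up by the outlier

# Categorical column (illustrative) with one missing value.
colors = np.array([["red"], ["blue"], ["red"], [np.nan]], dtype=object)
mode_filled = SimpleImputer(strategy="most_frequent").fit_transform(colors)
print(mode_filled[3, 0])
```

The mean fill is about 34.3 while the median fill is 2.0, which shows why the median is preferred for skewed data.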

Q66. IterativeImputer in sklearn is:

Answer: An experimental multivariate imputer

IterativeImputer models each feature with missing values as a function of the other features. It is still experimental, so it must be enabled explicitly by importing enable_iterative_imputer from sklearn.experimental.
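The explicit enabling import looks like this. The snippet is a minimal sketch with made-up data, assuming scikit-learn and NumPy are installed; the filled values depend on the fitted model, so none are quoted here.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (must precede the next import)
from sklearn.impute import IterativeImputer

# Toy data (illustrative): the second column is roughly twice the first.
X = np.array([
    [1.0, 2.0],
    [2.0, 4.0],
    [3.0, 6.0],
    [4.0, np.nan],
    [np.nan, 10.0],
])

imputer = IterativeImputer(max_iter=10, random_state=0)
X_filled = imputer.fit_transform(X)
print(X_filled)
```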

Q67. Which scaler is generally robust to outliers?

Answer: RobustScaler

RobustScaler centers on the median and scales by the interquartile range (IQR), so outliers have little effect on the fitted statistics.

Q68. RobustScaler centers/scales using:

Answer: Median and IQR

By default RobustScaler subtracts the median and divides by the IQR, so the fitted center and scale are not driven by extreme values.
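The fitted statistics can be checked on a tiny example. This is a sketch with made-up values, assuming scikit-learn and NumPy are installed.

```python
import numpy as np
from sklearn.preprocessing import RobustScaler

# Toy column (illustrative) with one extreme value.
X = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])

scaler = RobustScaler()
X_scaled = scaler.fit_transform(X)

print(scaler.center_)    # median: 3.0
print(scaler.scale_)     # IQR: 4.0 - 2.0 = 2.0
print(X_scaled.ravel())  # (x - 3) / 2 for every value
```

The outlier 100 does not change the median or the IQR, so the other four values stay within [-1, 0.5].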

Q69. PolynomialFeatures helps linear models by:

Answer: Adding interaction/nonlinear terms

PolynomialFeatures adds powers and interaction terms of the original features, which lets a linear model fit nonlinear relationships.

Q70. A sign of overfitting is:

Answer: Low train error, high test error

The model fits the training data closely, including its noise, but generalizes poorly to unseen data.

Q71. A sign of underfitting is:

Answer: High train and high test error

When error is high on both training and test data, the model is too simple to capture the structure in the data.

Q72. Regularization is mainly used to:

Answer: Reduce overfitting

Regularization penalties constrain model complexity, which reduces overfitting and improves generalization.

Q73. Bias-variance tradeoff describes balance between:

Answer: Underfitting and overfitting

High bias leads to underfitting and high variance leads to overfitting; the tradeoff is about finding a model complexity that balances the two.

Q74. A learning curve plots performance against:

Answer: Number of training samples

A learning curve shows training and validation scores as the number of training samples grows, which helps diagnose bias, variance, and whether more data would help.

Q75. A validation curve typically varies:

Answer: One hyperparameter value

A validation curve shows training and validation scores across a range of values for a single hyperparameter.
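Both curves have helper functions in sklearn.model_selection. This is a minimal sketch, assuming scikit-learn is installed; the dataset, estimator, training sizes, and depth grid are arbitrary choices for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import learning_curve, validation_curve
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(random_state=0)

# Learning curve: vary the number of training samples.
sizes, lc_train, lc_val = learning_curve(
    tree, X, y, train_sizes=[0.3, 0.6, 1.0], cv=5, shuffle=True, random_state=0
)

# Validation curve: vary a single hyperparameter (here max_depth).
depths = [1, 2, 3, 5]
vc_train, vc_val = validation_curve(
    tree, X, y, param_name="max_depth", param_range=depths, cv=5
)

print(sizes, lc_val.mean(axis=1))
print(depths, vc_val.mean(axis=1))
```

Each returned score array has one row per setting and one column per cross-validation fold.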

Q76. Which estimator supports early stopping via incremental optimization options?

Answer: SGDClassifier

SGDClassifier accepts early_stopping=True, which holds out part of the training data (validation_fraction) and stops training when the validation score stops improving.

Q77. For logistic behavior in SGDClassifier, commonly use loss=

Answer: log_loss

loss='log_loss' makes SGDClassifier optimize the logistic regression objective, which also makes predict_proba available. Older scikit-learn releases named this option 'log'.

Q78. Perceptron in sklearn is best described as:

Answer: Linear classifier trained with perceptron rule

Perceptron is a linear classifier trained online with the perceptron update rule.

Q79. PassiveAggressiveClassifier is useful for:

Answer: Online large-scale learning

PassiveAggressiveClassifier supports fast incremental updates through partial_fit, which suits large or streaming datasets.

Q80. partial_fit enables:

Answer: Incremental learning on mini-batches

partial_fit updates the model on one mini-batch at a time without retraining from scratch, which supports out-of-core and streaming workflows.

Q81. Setting warm_start=True generally allows:

Answer: Reusing previous fitted state for further training

With warm_start=True, a later call to fit reuses the solution from the previous call as its starting point instead of training from scratch.

Q82. sklearn.base.clone() is used to:

Answer: Create new estimator with same hyperparameters only

clone returns a new, unfitted estimator with the same constructor parameters; learned attributes are not copied.
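The difference between a fitted estimator and its clone can be sketched like this, assuming scikit-learn is installed; the estimator and the value C=0.5 are arbitrary examples.

```python
from sklearn.base import clone
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

fitted = LogisticRegression(C=0.5, max_iter=1000).fit(X, y)
fresh = clone(fitted)

print(fresh.get_params()["C"])                          # hyperparameter is preserved
print(hasattr(fitted, "coef_"), hasattr(fresh, "coef_"))  # learned state is not
```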

Q83. How do you set nested pipeline parameters?

Answer: Using step__param syntax

Nested parameters are addressed as the step name, two underscores, and the parameter name. For example, model__C sets C on the pipeline step named model.

Q84. get_params(deep=True) returns:

Answer: Estimator parameters including nested ones

With deep=True, get_params also returns the parameters of nested estimators, such as the steps of a Pipeline.
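The double-underscore syntax and get_params can be tried together. This is a minimal sketch, assuming scikit-learn is installed; the step names scale and model are illustrative choices.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe = Pipeline([("scale", StandardScaler()), ("model", LogisticRegression())])

# <step name>__<parameter> reaches into the nested estimator.
pipe.set_params(model__C=0.1)

deep_params = pipe.get_params(deep=True)
shallow_params = pipe.get_params(deep=False)

print(deep_params["model__C"])
print("model__C" in shallow_params)
```

The same model__C key is what a GridSearchCV parameter grid would use for this pipeline.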

Q85. For model persistence in sklearn workflows, common tools are:

Answer: joblib or pickle

joblib and pickle can both serialize a fitted estimator to disk and load it back later.
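A save-and-reload round trip can be sketched as follows, assuming scikit-learn and joblib are installed; the file name model.joblib is a hypothetical example.

```python
import os
import tempfile

import joblib
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Write the fitted estimator to a temporary file and read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.joblib")
    joblib.dump(model, path)
    restored = joblib.load(path)

same = bool((restored.predict(X) == model.predict(X)).all())
print(same)
```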

Q86. Tree-based ensembles often expose global feature importance via:

Answer: feature_importances_

Many tree-based models expose impurity-based importances through the feature_importances_ attribute.

Q87. Permutation importance is useful because it is:

Answer: Model-agnostic

Permutation importance measures how much a model's score drops when one feature's values are shuffled, so it works with any fitted estimator.
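Impurity-based and permutation importances can be computed for the same model. This is a minimal sketch, assuming scikit-learn is installed; the dataset and forest size are arbitrary.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0, stratify=y)

forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)

# Impurity-based importances come from the training process itself.
impurity_importance = forest.feature_importances_

# Permutation importances are measured on held-out data by shuffling each feature.
perm = permutation_importance(forest, X_test, y_test, n_repeats=5, random_state=0)

print(impurity_importance)
print(perm.importances_mean)
```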

Q88. CalibratedClassifierCV is used to:

Answer: Calibrate predicted probabilities

Calibration adjusts a classifier's predicted probabilities so that they match observed frequencies more closely.

Q89. Brier score evaluates:

Answer: Probability prediction quality

The Brier score is the mean squared difference between predicted probabilities and actual outcomes; lower values indicate better probabilistic predictions.
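The definition can be checked by hand against the library function. This is a sketch with made-up labels and probabilities, assuming scikit-learn and NumPy are installed.

```python
import numpy as np
from sklearn.metrics import brier_score_loss

# Toy binary outcomes and predicted probabilities of the positive class.
y_true = np.array([0, 1, 1, 0])
y_prob = np.array([0.1, 0.9, 0.8, 0.3])

# Mean squared difference: (0.01 + 0.01 + 0.04 + 0.09) / 4 = 0.0375
manual = np.mean((y_prob - y_true) ** 2)
library = brier_score_loss(y_true, y_prob)

print(manual, library)
```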

Q90. To adapt single-output estimators for multiple targets, use:

Answer: MultiOutputRegressor / MultiOutputClassifier

These wrappers fit one independent estimator per target.

Q91. OneVsRestClassifier strategy trains:

Answer: One binary model per class vs all others

One-vs-rest decomposes a multiclass problem into one binary problem per class.

Q92. OneVsOneClassifier strategy trains roughly:

Answer: k*(k-1)/2 pairwise models

One-vs-one fits one classifier for each pair of classes, giving k*(k-1)/2 models for k classes.
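The number of fitted sub-models can be counted directly. This is a minimal sketch on a synthetic 4-class problem, assuming scikit-learn is installed; the base estimator is an arbitrary choice.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier

# Synthetic data (illustrative) with k = 4 classes.
X, y = make_classification(
    n_samples=200, n_features=6, n_informative=4, n_redundant=0,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)

base = LogisticRegression(max_iter=1000)
ovr = OneVsRestClassifier(base).fit(X, y)
ovo = OneVsOneClassifier(base).fit(X, y)

print(len(ovr.estimators_))  # k = 4
print(len(ovo.estimators_))  # k * (k - 1) / 2 = 6
```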

Q93. For target labels in multiclass problems, a common utility is:

Answer: LabelBinarizer

LabelBinarizer converts class labels into a binary indicator (one-vs-all) format.

Q94. R² score in regression can be:

Answer: Negative on poor models

An R² below zero means the model predicts worse than always predicting the mean of the target.
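A negative R² is easy to produce. This is a sketch with made-up targets and predictions, assuming scikit-learn and NumPy are installed.

```python
import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([3.0, 3.0, 3.0])  # a poor constant prediction

# R^2 = 1 - SS_res / SS_tot = 1 - 5 / 2 = -1.5
ss_res = np.sum((y_true - y_pred) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
manual = 1 - ss_res / ss_tot

print(manual, r2_score(y_true, y_pred))
```

Predicting the mean (2.0) for every sample would give exactly 0, so any score below that is worse than the mean baseline.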

Q95. MAE compared to MSE is generally:

Answer: Less sensitive to outliers

Absolute error grows linearly with the size of a deviation, while squared error grows quadratically, so MAE is less dominated by outliers.

Q96. RMSE emphasizes large errors because it:

Answer: Squares residuals before root

Squaring magnifies large errors relative to small ones before the mean and square root are taken.

Q97. Median absolute error is especially robust when:

Answer: Data has outliers

The median of the absolute errors is resistant to a small number of extreme values.
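The three error metrics react very differently to a single outlier. This is a sketch with made-up numbers, assuming scikit-learn and NumPy are installed.

```python
import numpy as np
from sklearn.metrics import (
    mean_absolute_error,
    mean_squared_error,
    median_absolute_error,
)

# Toy predictions (illustrative): three errors of 1 and one outlier error of 9.
y_true = np.array([10.0, 10.0, 10.0, 10.0])
y_pred = np.array([11.0, 11.0, 11.0, 19.0])

mae = mean_absolute_error(y_true, y_pred)           # (1 + 1 + 1 + 9) / 4 = 3.0
rmse = np.sqrt(mean_squared_error(y_true, y_pred))  # sqrt(84 / 4) = sqrt(21)
medae = median_absolute_error(y_true, y_pred)       # median of [1, 1, 1, 9] = 1.0

print(mae, rmse, medae)
```

RMSE is pulled furthest toward the outlier, MAE less so, and the median absolute error ignores it entirely.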

Q98. For time-ordered data, preferred cross-validation splitter is:

Answer: TimeSeriesSplit

TimeSeriesSplit keeps temporal order: each training set precedes its test set, which avoids leaking future information into training.
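The ordering guarantee can be seen by printing the splits. This is a minimal sketch on eight made-up time steps, assuming scikit-learn and NumPy are installed.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(8).reshape(-1, 1)  # eight time-ordered observations

splitter = TimeSeriesSplit(n_splits=3)
splits = [(train.tolist(), test.tolist()) for train, test in splitter.split(X)]

for train_idx, test_idx in splits:
    print(train_idx, test_idx)
# [0, 1] [2, 3]
# [0, 1, 2, 3] [4, 5]
# [0, 1, 2, 3, 4, 5] [6, 7]
```

The training window only grows forward in time, and every test index comes after every training index.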

Q99. When splitting time series data, shuffle should usually be:

Answer: False

Shuffling mixes later observations into the training set, which leaks future information into training.

Q100. What improves reproducibility besides random_state?

Answer: Pinning library versions and data snapshots

Pinning library versions and keeping data snapshots makes experiments repeatable, since results can change across library releases and data updates.