Question 1

What type of task predicts a continuous numeric value?

Accepted Answer

Regression. Regression models estimate continuous targets such as price or demand, whereas classification predicts discrete labels.

Question 2

What type of task predicts a category label?

Accepted Answer

Classification. Classification assigns each input to one of a set of predefined classes.

Question 3

In scikit-learn, what does fit() do?

Accepted Answer

Trains the model on data. fit() learns the model's parameters from the training features and, for supervised estimators, the targets.

Question 4

In scikit-learn, what does predict() do?

Accepted Answer

Returns model outputs for new inputs. predict() applies the parameters learned during fit() to infer target values (regression) or class labels (classification).
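As a minimal sketch of the fit()/predict() pattern from Questions 3 and 4, the example below trains a LinearRegression on a toy dataset. The data values and variable names are invented for illustration.

```python
from sklearn.linear_model import LinearRegression

# Hypothetical toy data: the target is exactly 2 * x.
X_train = [[1.0], [2.0], [3.0], [4.0]]
y_train = [2.0, 4.0, 6.0, 8.0]

model = LinearRegression()
model.fit(X_train, y_train)          # learns coef_ and intercept_ from the training data

X_new = [[5.0], [6.0]]
predictions = model.predict(X_new)   # applies the learned parameters to unseen inputs
print(predictions)
```

The same two calls apply to classifiers; only the type of the returned values changes.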

Question 5

Which utility is commonly used to split data into train and test sets?

Accepted Answer

train_test_split. train_test_split, from sklearn.model_selection, shuffles the data by default and holds out a test subset for evaluation.

Question 6

Why set random_state in experiments?

Accepted Answer

For reproducibility. Fixing random_state seeds the random number generator, so shuffles, splits, and randomized estimators give the same result on every run.
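A short sketch of Questions 5 and 6 together: calling train_test_split twice with the same random_state should produce identical splits. The data and the seed value 42 are arbitrary choices for illustration.

```python
from sklearn.model_selection import train_test_split

# Hypothetical toy data: ten samples with alternating labels.
X = [[i] for i in range(10)]
y = [i % 2 for i in range(10)]

# Same random_state, same shuffle, so both calls return the same split.
X_train_a, X_test_a, y_train_a, y_test_a = train_test_split(
    X, y, test_size=0.2, random_state=42
)
X_train_b, X_test_b, y_train_b, y_test_b = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print(len(X_train_a), len(X_test_a))
print(X_test_a == X_test_b)
```

Leaving random_state unset gives a different shuffle on each run, which makes scores harder to compare between experiments.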

Question 7

Which scaler standardizes features to zero mean and unit variance?

Accepted Answer

StandardScaler. StandardScaler subtracts each feature's mean and divides by its standard deviation, giving mean 0 and variance 1 on the data it was fitted to.

Question 8

Which scaler maps values to a fixed range like [0,1]?

Accepted Answer

MinMaxScaler. MinMaxScaler rescales each feature linearly using the minimum and maximum observed during fitting; the default target range is [0, 1].
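The two scalers from Questions 7 and 8 can be compared on the same column. This is a minimal sketch on invented data, using default settings for both scalers.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Hypothetical single-feature data.
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])

standardized = StandardScaler().fit_transform(X)   # (x - mean) / std, per column
rescaled = MinMaxScaler().fit_transform(X)         # (x - min) / (max - min), per column

print(standardized.mean(), standardized.std())
print(rescaled.min(), rescaled.max())
```

In practice the scaler is fitted on the training data only and then applied to the test data with transform().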

Question 9

How should nominal categorical features usually be encoded?

Accepted Answer

One-hot encoding. One-hot encoding gives each category its own binary column, which avoids implying a false order among categories.

Question 10

Why is LabelEncoder usually not ideal for input categorical columns?

Accepted Answer

It imposes ordinal meaning. Integer labels can suggest an artificial order and distance between categories. LabelEncoder is intended for encoding target labels; for input columns, OneHotEncoder (or OrdinalEncoder when a real order exists) is the usual choice.
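To make the contrast in Questions 9 and 10 concrete, the sketch below encodes the same invented color column both ways. The column values are hypothetical.

```python
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

colors = ["red", "green", "blue", "green"]

# One-hot: one binary column per category, no implied order.
one_hot = OneHotEncoder().fit_transform([[c] for c in colors]).toarray()

# Integer codes: categories are sorted alphabetically and numbered,
# so "red" (2) looks "larger" than "blue" (0) to a model.
codes = LabelEncoder().fit_transform(colors)

print(one_hot)
print(codes)
```

A linear model given the integer codes would treat the gap between blue and red as twice the gap between blue and green, which has no meaning for nominal data.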

Question 11

What is the main benefit of Pipeline in scikit-learn?

Accepted Answer

Combines preprocessing and model steps safely. A Pipeline chains transformers and a final estimator into one object. Because the transformers are fitted only on the data passed to fit(), it helps prevent data leakage during cross-validation.

Question 12

Which tool applies different preprocessing to different column groups?

Accepted Answer

ColumnTransformer. ColumnTransformer routes specified columns through separate transformers and concatenates the results.
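Questions 11 and 12 are often used together. The following is one possible arrangement, assuming an invented numeric array in which columns 0 and 1 are continuous and column 2 holds a nominal category code; the step names are arbitrary.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data: columns 0-1 are numeric, column 2 is a category code.
X = np.array([
    [25.0, 40000.0, 0.0],
    [32.0, 52000.0, 1.0],
    [47.0, 81000.0, 2.0],
    [51.0, 90000.0, 1.0],
    [23.0, 38000.0, 0.0],
    [44.0, 76000.0, 2.0],
])
y = np.array([0, 0, 1, 1, 0, 1])

# Route each column group through its own transformer.
preprocess = ColumnTransformer([
    ("scale", StandardScaler(), [0, 1]),
    ("onehot", OneHotEncoder(handle_unknown="ignore"), [2]),
])

# Chain preprocessing and the model so they are fitted together.
pipeline = Pipeline([
    ("preprocess", preprocess),
    ("model", LogisticRegression()),
])

pipeline.fit(X, y)
predictions = pipeline.predict(X)
print(predictions)
```

Passing the whole pipeline to a cross-validation utility refits the scaler and encoder inside each training fold, which is what keeps validation data out of the preprocessing statistics.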

Question 13

LinearRegression minimizes which quantity by default?

Accepted Answer

Residual sum of squares. LinearRegression fits ordinary least squares: it chooses the coefficients that minimize the sum of squared residuals between observed and predicted targets.
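A small check of the least-squares idea: the fitted line should have a lower residual sum of squares than a nearby line. The data points and the helper name rss are made up for this sketch.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical toy data that does not lie exactly on a line.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 2.0, 5.0])

model = LinearRegression().fit(X, y)

def rss(slope, intercept):
    """Residual sum of squares for the line y = slope * x + intercept."""
    residuals = y - (slope * X[:, 0] + intercept)
    return float(np.sum(residuals ** 2))

rss_fitted = rss(model.coef_[0], model.intercept_)
rss_perturbed = rss(model.coef_[0] + 0.3, model.intercept_)

print(rss_fitted, rss_perturbed)
```

Any change to the fitted slope or intercept increases the RSS, which is what "minimizes" means here.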

Question 14

Ridge regression adds what type of penalty?

Accepted Answer

L2 penalty. Ridge adds the squared L2 norm of the coefficients to the loss, which shrinks coefficients toward zero without usually making them exactly zero.

Question 15

Lasso regression adds what type of penalty?

Accepted Answer

L1 penalty. Lasso adds the L1 norm of the coefficients to the loss, which can drive some coefficients exactly to zero and so produce sparse models.
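The difference between the L2 and L1 penalties can be seen by counting nonzero coefficients. This sketch uses synthetic data in which only the first two of ten features matter; the alpha values are arbitrary choices, not recommendations.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.RandomState(0)
X = rng.randn(200, 10)
# Only the first two features influence the target; the rest are noise.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.randn(200)

ridge = Ridge(alpha=1.0).fit(X, y)   # L2: shrinks, but keeps all coefficients
lasso = Lasso(alpha=0.5).fit(X, y)   # L1: can set coefficients exactly to zero

print("nonzero ridge coefficients:", np.count_nonzero(ridge.coef_))
print("nonzero lasso coefficients:", np.count_nonzero(lasso.coef_))
```

On data like this, Lasso tends to keep only the informative features, while Ridge keeps small nonzero weights on the noise features.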

Question 16

ElasticNet combines which penalties?

Accepted Answer

L1 and L2. ElasticNet mixes both penalties (the l1_ratio parameter sets the mix), balancing variable selection and shrinkage.

Question 17

LogisticRegression in scikit-learn is mainly used for:

Accepted Answer

Classification. Despite its name, logistic regression is a classification algorithm: it models class probabilities and predicts the most probable class.

Question 18

Binary logistic regression models class probability using:

Accepted Answer

Sigmoid function. The sigmoid maps a linear score z to a probability 1 / (1 + exp(-z)), which lies strictly between 0 and 1.

Question 19

For binary classification, a common default probability threshold is:

Accepted Answer

0.5. Many workflows use 0.5 by default, though tuning the threshold to the relative costs of false positives and false negatives is common.
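Questions 18 and 19 combine into two steps: map a linear score to a probability, then compare it with a threshold. The scores below are invented, and the plain-Python sigmoid is a sketch rather than scikit-learn's internal implementation.

```python
import math

def sigmoid(score):
    """Map a real-valued linear score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-score))

# Hypothetical linear scores (w . x + b) for three samples.
scores = [-2.0, 0.5, 3.0]
probabilities = [sigmoid(s) for s in scores]

threshold = 0.5
labels = [1 if p >= threshold else 0 for p in probabilities]

print(probabilities)
print(labels)
```

Raising the threshold trades recall for precision; lowering it does the opposite.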

Question 20

For multiclass logistic regression, which strategy can be used?

Accepted Answer

Multinomial. Multinomial (softmax) logistic regression models all classes jointly in a single model; one-vs-rest, which fits one binary model per class, is the main alternative.

Question 21

In KNeighborsClassifier, which hyperparameter controls neighborhood size?

Accepted Answer

n_neighbors. n_neighbors sets how many nearby training points vote on each prediction.

Question 22

Increasing k in KNN often causes:

Accepted Answer

Higher bias, lower variance. Larger neighborhoods average over more points, which smooths the decision boundary.

Question 23

Which hyperparameter limits tree growth depth?

Accepted Answer

max_depth. max_depth caps the depth of the tree, which constrains model complexity and can reduce overfitting.

Question 24

A common split criterion in decision trees for classification is:

Accepted Answer

Gini impurity. Trees choose the split that most reduces an impurity measure such as Gini or entropy.
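Gini impurity for a node is 1 minus the sum of squared class proportions. The sketch below computes it by hand for an invented parent node and a perfect split; the function name gini_impurity is hypothetical.

```python
from collections import Counter

def gini_impurity(labels):
    """Return 1 - sum(p_k^2) over the class proportions p_k in labels."""
    n = len(labels)
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())

# Hypothetical node with two balanced classes, and a split that separates them.
parent = ["a", "a", "a", "b", "b", "b"]
left = ["a", "a", "a"]
right = ["b", "b", "b"]

# Impurity after the split is the size-weighted average of the child impurities.
weighted_children = (
    len(left) / len(parent) * gini_impurity(left)
    + len(right) / len(parent) * gini_impurity(right)
)

print(gini_impurity(parent), weighted_children)
```

The split is scored by the drop from the parent impurity to the weighted child impurity; here the drop is the largest possible for a balanced two-class node.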

Question 25

RandomForest is primarily based on:

Accepted Answer

Bagging many trees. A random forest trains each tree on a bootstrap sample, considers a random subset of features at each split, and averages the resulting decorrelated trees for robustness.

Question 26

ExtraTrees increases randomness by:

Accepted Answer

Choosing random split thresholds. ExtraTrees draws split thresholds at random for each candidate feature and keeps the best of those, rather than searching for the optimal threshold as a standard random forest does.

Question 27

GradientBoosting builds trees:

Accepted Answer

Sequentially to correct errors. Each new tree is fitted to the errors of the current ensemble (more precisely, to the negative gradient of the loss, which equals the residuals for squared error).

Question 28

AdaBoost focuses learning by:

Accepted Answer

Reweighting misclassified samples. After each round, AdaBoost increases the weights of misclassified points so that the next weak learner focuses on them.

Question 29

Which statement about XGBoost and scikit-learn is correct?

Accepted Answer

XGBoost is an external library with sklearn-style API. XGBoost is a separate package, not part of scikit-learn, but its estimator wrappers follow the fit()/predict() conventions and interoperate with scikit-learn tools.

Question 30

In SVM, larger C generally means:

Accepted Answer

Weaker regularization and stricter fitting. C is the inverse of regularization strength: a high C penalizes misclassified training points more heavily, so the model fits the training data more closely and may overfit.

Question 31

Kernel trick in SVM enables:

Accepted Answer

Nonlinear decision boundaries in transformed space. Kernels compute inner products in a high-dimensional feature space without constructing the mapping explicitly, so a boundary that is linear in that space can be nonlinear in the original inputs.

Question 32

In RBF SVM, gamma controls:

Accepted Answer

Influence radius of training points. A higher gamma gives each training point a narrower region of influence, which allows more complex boundaries and raises the risk of overfitting.

Question 33

Naive Bayes assumes features are:

Accepted Answer

Conditionally independent given class. This simplifying assumption lets the class-conditional likelihood factor into per-feature terms, which makes training and prediction fast.

Question 34

GaussianNB is appropriate when features are roughly:

Accepted Answer

Normally distributed continuous values. GaussianNB models the likelihood of each feature within each class as a Gaussian distribution.

Question 35

MultinomialNB is commonly used for:

Accepted Answer

Count-based text features. MultinomialNB works well with nonnegative counts such as term frequencies.

Question 36

BernoulliNB is designed for:

Accepted Answer

Binary/boolean features. BernoulliNB models the presence or absence of each feature.

Question 37

What does a confusion matrix summarize?

Accepted Answer

Prediction vs actual class counts. A confusion matrix tabulates how many samples of each actual class received each predicted class; in the binary case these are the true/false positives and negatives.
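A minimal sketch of a binary confusion matrix using sklearn.metrics.confusion_matrix. The label vectors are invented for illustration.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical actual and predicted labels for eight samples.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Rows are actual classes, columns are predicted classes (labels sorted: 0, then 1).
cm = confusion_matrix(y_true, y_pred)

# For binary labels, ravel() yields the counts in this order.
tn, fp, fn, tp = cm.ravel()

print(cm)
print(tn, fp, fn, tp)
```

Precision, recall, and accuracy can all be derived from these four counts.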

Question 38

Why can accuracy be misleading on imbalanced data?

Accepted Answer

Majority class can dominate score. A model that always predicts the majority class can score high accuracy while detecting none of the minority class.
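A worked example of the problem, assuming an invented dataset with 95 negatives and 5 positives and a trivial model that always predicts the negative class:

```python
# Hypothetical imbalanced labels: 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5
# A trivial "model" that always predicts the majority class.
y_pred = [0] * 100

correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
accuracy = correct / len(y_true)

true_positives = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
actual_positives = sum(y_true)
recall = true_positives / actual_positives

print(accuracy, recall)
```

The accuracy is 0.95 even though recall on the positive class is 0, which is why metrics such as recall, precision, and F1 are checked on imbalanced data.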

Question 39

Precision is defined as:

Accepted Answer

TP/(TP+FP). Precision is the fraction of predicted positives that are actually positive.

Question 40

Recall is defined as:

Accepted Answer

TP/(TP+FN). Recall is the fraction of actual positives that the model found.

Question 41

F1-score is the:

Accepted Answer

Harmonic mean of precision and recall. F1 = 2 * precision * recall / (precision + recall), so it is high only when both precision and recall are high.
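The three definitions from Questions 39 to 41 can be computed directly from confusion-matrix counts. The counts below are invented for illustration.

```python
# Hypothetical counts: 8 true positives, 2 false positives, 8 false negatives.
tp, fp, fn = 8, 2, 8

precision = tp / (tp + fp)   # correctness among predicted positives
recall = tp / (tp + fn)      # coverage of actual positives
f1 = 2 * precision * recall / (precision + recall)   # harmonic mean

arithmetic_mean = (precision + recall) / 2

print(precision, recall, f1, arithmetic_mean)
```

The harmonic mean sits closer to the smaller of the two values than the arithmetic mean does, so a model cannot reach a high F1 by excelling at only one of them.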

Question 42

ROC-AUC mainly evaluates:

Accepted Answer

Ranking quality across thresholds. ROC-AUC summarizes the tradeoff between true-positive rate and false-positive rate over all thresholds; it equals the probability that a randomly chosen positive is scored above a randomly chosen negative.

Question 43

For heavy class imbalance, which curve is often more informative?

Accepted Answer

Precision-Recall curve. The PR curve focuses on positive-class performance and ignores true negatives, so it is not inflated by a large negative class.

Question 44

cross_val_score is used to:

Accepted Answer

Evaluate model across folds. cross_val_score returns one validation score per fold, each from a model trained on the remaining folds.

Question 45

KFold cross-validation splits data into:

Accepted Answer

k train-test partitions. The data is divided into k folds; each fold serves once as the validation set while the other k - 1 folds are used for training.

Question 46

StratifiedKFold is useful because it:

Accepted Answer

Preserves class proportions in each fold. StratifiedKFold keeps the target class distribution approximately the same in every fold, which matters most for imbalanced classification.
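A brief sketch of stratification, assuming an invented dataset with 15 negatives and 5 positives split into 5 folds. Each validation fold should receive the same share of positives.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Hypothetical data: features are placeholders, labels are 75% class 0 and 25% class 1.
X = np.zeros((20, 1))
y = np.array([0] * 15 + [1] * 5)

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

fold_sizes = []
positives_per_fold = []
for train_idx, test_idx in skf.split(X, y):
    fold_sizes.append(len(test_idx))
    positives_per_fold.append(int(y[test_idx].sum()))

print(fold_sizes)
print(positives_per_fold)
```

With a plain KFold on the same data, a fold could end up with no positives at all, which makes per-fold metrics such as recall undefined or unstable.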

Question 47

GridSearchCV performs:

Accepted Answer

Exhaustive search over given param grid. GridSearchCV evaluates every combination of the specified hyperparameter values using cross-validation.

Question 48

RandomizedSearchCV differs by:

Accepted Answer

Sampling limited random combinations. RandomizedSearchCV evaluates only n_iter randomly sampled combinations, so it can find good settings faster in large search spaces.
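The difference in candidate counts between Questions 47 and 48 can be illustrated with the ParameterGrid and ParameterSampler helpers from sklearn.model_selection, which enumerate candidates in the exhaustive and the sampled style respectively. The grid itself is an invented example.

```python
from sklearn.model_selection import ParameterGrid, ParameterSampler

# Hypothetical search space: 3 values of C times 2 kernels.
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}

# Exhaustive: every combination in the grid.
all_combinations = list(ParameterGrid(param_grid))

# Randomized: only n_iter combinations drawn from the same space.
sampled = list(ParameterSampler(param_grid, n_iter=3, random_state=0))

print(len(all_combinations), len(sampled))
```

Each candidate is then fitted once per cross-validation fold, so the total cost scales with the number of candidates times the number of folds.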

Question 49

Setting n_jobs=-1 generally means:

Accepted Answer

Use all available CPU cores. Many scikit-learn estimators and search utilities parallelize their work through n_jobs; -1 requests all available processors.

Question 50

The scoring parameter in model selection controls:

Accepted Answer

Optimization metric for selecting best model. The search ranks hyperparameter settings by the scoring metric, and the best-scoring setting is selected.

Classical Prediction Models with Scikit-learn MCQ Questions with Answers (Latest 2026)
