Question 1

Which option best describes a neuron?

Accepted Answer

A unit computing a weighted sum followed by a nonlinearity. The neuron is the building block of neural nets. Eliminating the partially true options helps confirm this choice.
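As a rough illustration of this answer, the sketch below computes a weighted sum and passes it through a nonlinearity. The bias term and the choice of sigmoid are assumptions made for the example; the function name `neuron` is hypothetical.

```python
import math


def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through a sigmoid."""
    # Pre-activation: the weighted sum (bias is an assumption of this sketch).
    pre_activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Nonlinearity: sigmoid squashes the pre-activation into (0, 1).
    return 1.0 / (1.0 + math.exp(-pre_activation))


# 0.5 * 1.0 + (-0.25) * 2.0 + 0.0 = 0.0, and sigmoid(0.0) = 0.5
output = neuron([1.0, 2.0], [0.5, -0.25], 0.0)
```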

Question 2

What is the primary purpose of a neuron?

Accepted Answer

A unit computing a weighted sum followed by a nonlinearity. Its purpose is to serve as the building block of neural nets. Eliminating the partially true options helps confirm this choice.

Question 3

Which statement about a neuron is most accurate?

Accepted Answer

A unit computing a weighted sum followed by a nonlinearity. This statement is the most accurate because it names both parts of the computation; the neuron is the building block of neural nets. Eliminating the partially true options helps confirm this choice.

Question 4

How is a neuron best characterized?

Accepted Answer

A unit computing a weighted sum followed by a nonlinearity. Characterized this way, the neuron is the building block of neural nets. Eliminating the partially true options helps confirm this choice.

Question 5

Which option best describes an activation function?

Accepted Answer

Nonlinearity applied to a neuron's pre-activation. Common examples are ReLU, GELU, and sigmoid. Eliminating the partially true options helps confirm this choice.
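To make the three named examples concrete, here is a minimal sketch that applies each one to a scalar pre-activation. The GELU shown is the exact erf-based form; some libraries use a tanh approximation instead, so treat the choice as an assumption.

```python
import math


def relu(z):
    # max(0, z)
    return max(0.0, z)


def sigmoid(z):
    # 1 / (1 + e^-z), output in (0, 1)
    return 1.0 / (1.0 + math.exp(-z))


def gelu(z):
    # z * Phi(z), where Phi is the standard normal CDF (exact erf form)
    return 0.5 * z * (1.0 + math.erf(z / math.sqrt(2.0)))


pre_activation = 1.0
activations = {
    "relu": relu(pre_activation),
    "sigmoid": sigmoid(pre_activation),
    "gelu": gelu(pre_activation),
}
```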

Question 6

What is the primary purpose of an activation function?

Accepted Answer

Nonlinearity applied to a neuron's pre-activation. Its purpose is to make the neuron's output a nonlinear function of its weighted sum; common examples are ReLU, GELU, and sigmoid. Eliminating the partially true options helps confirm this choice.

Question 7

Which statement about an activation function is most accurate?

Accepted Answer

Nonlinearity applied to a neuron's pre-activation. This is the most accurate statement; common examples are ReLU, GELU, and sigmoid. Eliminating the partially true options helps confirm this choice.

Question 8

How is an activation function best characterized?

Accepted Answer

Nonlinearity applied to a neuron's pre-activation. This characterization covers common examples such as ReLU, GELU, and sigmoid. Eliminating the partially true options helps confirm this choice.

Question 9

Which option best describes ReLU?

Accepted Answer

max(0, x); piecewise linear activation. ReLU is a cheap and effective default. Eliminating the partially true options helps confirm this choice.
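The "piecewise linear" part of this answer can be seen in a short sketch: the function is 0 for negative inputs and the identity for positive ones, so its slope is either 0 or 1. Using slope 0 at exactly x = 0 is a convention assumed here, and `relu_grad` is a hypothetical helper name.

```python
def relu(x):
    # max(0, x): zero for negative inputs, identity for positive inputs
    return max(0.0, x)


def relu_grad(x):
    # Slope of each linear piece; the value at x == 0 is a convention (0 here).
    return 1.0 if x > 0 else 0.0


values = [-2.0, 0.0, 3.0]
outputs = [relu(v) for v in values]
slopes = [relu_grad(v) for v in values]
```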

Question 10

What is the primary purpose of ReLU?

Accepted Answer

max(0, x); piecewise linear activation. ReLU serves as a cheap and effective default activation. Eliminating the partially true options helps confirm this choice.

Question 11

Which statement about ReLU is most accurate?

Accepted Answer

max(0, x); piecewise linear activation. This is the most accurate statement; ReLU is cheap to compute and an effective default. Eliminating the partially true options helps confirm this choice.

Question 12

How is ReLU best characterized?

Accepted Answer

max(0, x); piecewise linear activation. Characterized this way, ReLU is a cheap and effective default. Eliminating the partially true options helps confirm this choice.

Question 13

Which option best describes softmax?

Accepted Answer

Converts logits to a probability distribution over classes. Softmax is used in multi-class classification heads. Eliminating the partially true options helps confirm this choice.
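A minimal sketch of the conversion from logits to probabilities follows. Subtracting the maximum logit before exponentiating is a common numerical-stability trick and is an addition of this example, not something the answer states.

```python
import math


def softmax(logits):
    """Map a list of logits to non-negative values that sum to 1."""
    # Shifting by the max does not change the result but avoids overflow.
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])
```

The largest logit receives the largest probability, and the ordering of the logits is preserved.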

Question 14

What is the primary purpose of softmax?

Accepted Answer

Converts logits to a probability distribution over classes. That is its purpose in multi-class classification heads. Eliminating the partially true options helps confirm this choice.

Question 15

Which statement about softmax is most accurate?

Accepted Answer

Converts logits to a probability distribution over classes. This is the most accurate statement; softmax is used in multi-class classification heads. Eliminating the partially true options helps confirm this choice.

Question 16

How is softmax best characterized?

Accepted Answer

Converts logits to a probability distribution over classes. Characterized this way, softmax is the function used in multi-class classification heads. Eliminating the partially true options helps confirm this choice.

Question 17

Which option best describes backpropagation?

Accepted Answer

Algorithm to compute gradients via chain rule through the graph. Backpropagation is what enables training deep nets. Eliminating the partially true options helps confirm this choice.
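As a sketch of the chain rule at work, the example below backpropagates through a tiny graph: z = w*x + b, y = sigmoid(z), loss = (y - t)^2. The graph, the squared-error loss, and all names are assumptions chosen for illustration; the result is compared against a finite-difference estimate.

```python
import math


def forward(w, b, x, t):
    z = w * x + b                      # linear node
    y = 1.0 / (1.0 + math.exp(-z))     # sigmoid node
    loss = (y - t) ** 2                # squared-error node
    return z, y, loss


def backward(w, b, x, t):
    """Return (dloss/dw, dloss/db) by applying the chain rule node by node."""
    _, y, _ = forward(w, b, x, t)
    dloss_dy = 2.0 * (y - t)           # derivative of the squared error
    dy_dz = y * (1.0 - y)              # derivative of the sigmoid
    dloss_dz = dloss_dy * dy_dz        # chain rule through the sigmoid
    return dloss_dz * x, dloss_dz      # dz/dw = x, dz/db = 1


grad_w, grad_b = backward(0.5, 0.1, 2.0, 1.0)

# Finite-difference check of dloss/dw
eps = 1e-6
numeric_w = (forward(0.5 + eps, 0.1, 2.0, 1.0)[2]
             - forward(0.5 - eps, 0.1, 2.0, 1.0)[2]) / (2 * eps)
```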

Question 18

What is the primary purpose of backpropagation?

Accepted Answer

Algorithm to compute gradients via chain rule through the graph. Its purpose is to supply the gradients that make training deep nets possible. Eliminating the partially true options helps confirm this choice.

Question 19

Which statement about backpropagation is most accurate?

Accepted Answer

Algorithm to compute gradients via chain rule through the graph. This is the most accurate statement; backpropagation enables training deep nets. Eliminating the partially true options helps confirm this choice.

Question 20

How is backpropagation best characterized?

Accepted Answer

Algorithm to compute gradients via chain rule through the graph. Characterized this way, backpropagation is what enables training deep nets. Eliminating the partially true options helps confirm this choice.

Question 21

Which option best describes a feedforward network?

Accepted Answer

Layers connected sequentially with no cycles. This is the most basic deep architecture. Eliminating the partially true options helps confirm this choice.
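The "sequential, no cycles" idea can be sketched as a loop that feeds each layer's output into the next layer exactly once. The dense layers, the ReLU between them, and the example weights are all assumptions for illustration.

```python
def dense(inputs, weights, biases):
    # One output per weight row: weighted sum of the inputs plus a bias.
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def feedforward(x, layers):
    """Apply layers in order; data only flows forward, never back."""
    for index, (weights, biases) in enumerate(layers):
        x = dense(x, weights, biases)
        if index < len(layers) - 1:
            # ReLU on hidden layers only (an assumption of this sketch).
            x = [max(0.0, a) for a in x]
    return x


layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),  # 2 inputs -> 2 hidden units
    ([[1.0, 1.0]], [0.5]),                    # 2 hidden units -> 1 output
]
out = feedforward([2.0, 1.0], layers)
```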

Question 22

What is the primary purpose of a feedforward network?

Accepted Answer

Layers connected sequentially with no cycles. This definition is the accepted answer; a feedforward network is the most basic deep architecture. Eliminating the partially true options helps confirm this choice.

Question 23

Which statement about a feedforward network is most accurate?

Accepted Answer

Layers connected sequentially with no cycles. This is the most accurate statement; a feedforward network is the most basic deep architecture. Eliminating the partially true options helps confirm this choice.

Question 24

How is a feedforward network best characterized?

Accepted Answer

Layers connected sequentially with no cycles. Characterized this way, a feedforward network is the most basic deep architecture. Eliminating the partially true options helps confirm this choice.

Question 25

Which option best describes a CNN?

Accepted Answer

Network using convolutional layers for spatial data. CNNs excel on images. Eliminating the partially true options helps confirm this choice.
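To show what a convolutional layer does with spatial data, here is a minimal sketch of a single 2D filter slid over a small image with no padding and stride 1. As in most deep learning code, the operation is technically cross-correlation; the image and the edge-detecting kernel are made-up examples.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            # Weighted sum over the local patch under the kernel.
            for di in range(kernel_h):
                for dj in range(kernel_w):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        output.append(row)
    return output


image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
kernel = [[-1, 1]]  # responds where intensity rises from left to right
feature_map = conv2d_valid(image, kernel)
```

The same small kernel is reused at every position, which is why the filter responds to the vertical edge wherever it appears in the image.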

Question 26

What is the primary purpose of a CNN?

Accepted Answer

Network using convolutional layers for spatial data. Its purpose is to process spatial data, and it excels on images. The other options are either incomplete or incorrect in context.

Question 27

Which statement about a CNN is most accurate?

Accepted Answer

Network using convolutional layers for spatial data. This is the most accurate statement; CNNs excel on images. The other options are either incomplete or incorrect in context.

Question 28

How is a CNN best characterized?

Accepted Answer

Network using convolutional layers for spatial data. Characterized this way, a CNN is the kind of network that excels on images. The other options are either incomplete or incorrect in context.

Question 29

Which option best describes an RNN?

Accepted Answer

Network processing sequences via recurrent state. RNNs represent an earlier sequence-modeling paradigm. The other options are either incomplete or incorrect in context.
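The "recurrent state" in this answer can be sketched with a scalar hidden state that is updated once per sequence element. The tanh update rule and the weight values are assumptions for illustration; `run_rnn` is a hypothetical name.

```python
import math


def rnn_step(x, h_prev, w_x, w_h, b):
    # New state depends on the current input and the previous state.
    return math.tanh(w_x * x + w_h * h_prev + b)


def run_rnn(sequence, w_x=0.5, w_h=0.8, b=0.0):
    """Process a sequence left to right, carrying the hidden state along."""
    h = 0.0
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states


states = run_rnn([1.0, 0.0, 0.0])
```

After the first input, later states stay non-zero even though the remaining inputs are zero, which shows the state carrying information forward.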

Question 30

What is the primary purpose of an RNN?

Accepted Answer

Network processing sequences via recurrent state. Its purpose is to process sequences; RNNs represent an earlier sequence-modeling paradigm. The other options are either incomplete or incorrect in context.

Question 31

Which statement about an RNN is most accurate?

Accepted Answer

Network processing sequences via recurrent state. This is the most accurate statement; RNNs represent an earlier sequence-modeling paradigm. The other options are either incomplete or incorrect in context.

Question 32

How is an RNN best characterized?

Accepted Answer

Network processing sequences via recurrent state. Characterized this way, the RNN belongs to an earlier sequence-modeling paradigm. The other options are either incomplete or incorrect in context.

Question 33

Which option best describes an LSTM?

Accepted Answer

RNN variant with gating to mitigate vanishing gradients. LSTM stands for long short-term memory. The other options are either incomplete or incorrect in context.
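A scalar sketch of one LSTM step may help show what "gating" means here. The parameter names (wf, uf, bf, and so on) and values are hypothetical, and this follows the commonly used forget/input/output gate formulation rather than any specific library's implementation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One LSTM step on scalars; p maps parameter names to floats."""
    forget = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])
    write = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])
    read = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])
    candidate = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])
    # The cell state is updated additively, scaled by the gates.
    c = forget * c_prev + write * candidate
    h = read * math.tanh(c)
    return h, c


# Forget gate near 1 and input gate near 0: the cell state is kept.
params = {name: 0.0 for name in
          ["wf", "uf", "wi", "ui", "wo", "uo", "wg", "ug", "bo", "bg"]}
params["bf"] = 10.0
params["bi"] = -10.0
h, c = lstm_step(0.0, 0.0, 2.0, params)
```

When the forget gate stays close to 1, the cell state passes through almost unchanged, which is the path that helps gradients survive over many steps.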

Question 34

What is the primary purpose of an LSTM?

Accepted Answer

RNN variant with gating to mitigate vanishing gradients. Mitigating vanishing gradients is the purpose of the gating; LSTM stands for long short-term memory. The other options are either incomplete or incorrect in context.

Question 35

Which statement about an LSTM is most accurate?

Accepted Answer

RNN variant with gating to mitigate vanishing gradients. This is the most accurate statement; LSTM stands for long short-term memory. The other options are either incomplete or incorrect in context.

Question 36

How is an LSTM best characterized?

Accepted Answer

RNN variant with gating to mitigate vanishing gradients. This characterization matches the name: LSTM stands for long short-term memory. The other options are either incomplete or incorrect in context.

Question 37

Which option best describes a Transformer?

Accepted Answer

Architecture based on self-attention. The Transformer is the backbone of modern LLMs. The other options are either incomplete or incorrect in context.

Question 38

What is the primary purpose of a Transformer?

Accepted Answer

Architecture based on self-attention. This definition is the accepted answer; in practice the Transformer serves as the backbone of modern LLMs. The other options are either incomplete or incorrect in context.

Question 39

Which statement about a Transformer is most accurate?

Accepted Answer

Architecture based on self-attention. This is the most accurate statement; the Transformer is the backbone of modern LLMs. The other options are either incomplete or incorrect in context.

Question 40

How is a Transformer best characterized?

Accepted Answer

Architecture based on self-attention. Characterized this way, the Transformer is the backbone of modern LLMs. The other options are either incomplete or incorrect in context.

Question 41

Which option best describes self-attention?

Accepted Answer

Each token attends to others to mix context. Self-attention is the key mechanism in Transformers. The other options are either incomplete or incorrect in context.
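The sketch below illustrates "each token attends to others" with scaled dot-product attention over a few token vectors. For brevity the tokens themselves are used as queries, keys, and values; real Transformer layers apply learned projections first, so this is a simplification and the function names are hypothetical.

```python
import math


def softmax(scores):
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Return one mixed vector per token (no learned projections)."""
    dim = len(tokens[0])
    mixed = []
    for query in tokens:
        # Similarity of this token to every token, scaled by sqrt(dim).
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in tokens]
        weights = softmax(scores)
        # Weighted average of all token vectors: the "context mixing".
        mixed.append([sum(w * value[j] for w, value in zip(weights, tokens))
                      for j in range(dim)])
    return mixed


tokens = [[1.0, 0.0], [0.0, 1.0]]
context = self_attention(tokens)
```

Each output vector is a blend of all inputs, weighted most heavily toward the tokens that are most similar to the query.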

Question 42

What is the primary purpose of self-attention?

Accepted Answer

Each token attends to others to mix context. Mixing context across tokens is its purpose, and it is the key mechanism in Transformers. The other options are either incomplete or incorrect in context.

Question 43

Which statement about self-attention is most accurate?

Accepted Answer

Each token attends to others to mix context. This is the most accurate statement; self-attention is the key mechanism in Transformers. The other options are either incomplete or incorrect in context.

Question 44

How is self-attention best characterized?

Accepted Answer

Each token attends to others to mix context. Characterized this way, self-attention is the key mechanism in Transformers. The other options are either incomplete or incorrect in context.

Question 45

Which option best describes multi-head attention?

Accepted Answer

Multiple attention heads run in parallel. Running several heads lets the model capture diverse relationships. The other options are either incomplete or incorrect in context.
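As a rough sketch of heads running side by side, the example below splits each token vector into equal slices, runs attention independently on each slice, and concatenates the results. Real implementations use learned per-head projections and a final output projection; slicing is a simplification assumed here, and all names are hypothetical.

```python
import math


def softmax(scores):
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(vectors):
    """Scaled dot-product self-attention without learned projections."""
    dim = len(vectors[0])
    result = []
    for query in vectors:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in vectors]
        weights = softmax(scores)
        result.append([sum(w * v[j] for w, v in zip(weights, vectors))
                       for j in range(dim)])
    return result


def multi_head_self_attention(tokens, num_heads):
    dim = len(tokens[0])
    assert dim % num_heads == 0, "dimension must divide evenly across heads"
    head_dim = dim // num_heads
    head_outputs = []
    for head in range(num_heads):
        start, stop = head * head_dim, (head + 1) * head_dim
        # Each head sees only its own slice of every token.
        head_outputs.append(attention([t[start:stop] for t in tokens]))
    # Concatenate the per-head outputs back to the original width.
    return [[x for out in head_outputs for x in out[i]]
            for i in range(len(tokens))]


tokens = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
combined = multi_head_self_attention(tokens, num_heads=2)
```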

Question 46

What is the primary purpose of multi-head attention?

Accepted Answer

Multiple attention heads run in parallel. The purpose of running several heads is to capture diverse relationships. The other options are either incomplete or incorrect in context.

Question 47

Which statement about multi-head attention is most accurate?

Accepted Answer

Multiple attention heads run in parallel. This is the most accurate statement; the parallel heads capture diverse relationships. The other options are either incomplete or incorrect in context.

Question 48

How is multi-head attention best characterized?

Accepted Answer

Multiple attention heads run in parallel. Characterized this way, multi-head attention captures diverse relationships. The other options are either incomplete or incorrect in context.

Question 49

Which option best describes positional encoding?

Accepted Answer

Encodes order info for attention models. It is required because self-attention on its own has no notion of token order (strictly, it is permutation-equivariant: reordering the input tokens just reorders the outputs). The other options are either incomplete or incorrect in context.
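One common way to encode order is the fixed sinusoidal scheme, sketched below. The answer does not say which scheme is meant, so this choice (and the base of 10000) is an assumption; learned position embeddings are another option.

```python
import math


def sinusoidal_encoding(position, d_model):
    """Return a d_model-length vector that depends only on the position."""
    encoding = []
    for i in range(0, d_model, 2):
        # Each pair of dimensions uses a different wavelength.
        angle = position / (10000 ** (i / d_model))
        encoding.append(math.sin(angle))
        if i + 1 < d_model:
            encoding.append(math.cos(angle))
    return encoding


first = sinusoidal_encoding(0, 4)
second = sinusoidal_encoding(1, 4)
```

Because each position maps to a different vector, adding these vectors to the token embeddings lets attention distinguish the same token at different positions.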

Question 50

What is the primary purpose of positional encoding?

Accepted Answer

Encodes order info for attention models. That is its purpose: self-attention on its own has no notion of token order (strictly, it is permutation-equivariant), so order must be supplied explicitly. The other options are either incomplete or incorrect in context.

AI Deep Learning Basics MCQ Questions with Answers (Latest 2026)
