Question:medium

State whether Selection Bias occurs during data collection or during model deployment.

Hint:

Remember: Selection Bias = a problem in sampling/data collection.
Updated On: Feb 23, 2026

Solution and Explanation

Occurrence of Selection Bias:

Selection Bias occurs primarily during data collection. It arises when the sampled data is not representative of the overall population, leading to biased or skewed results in statistical analysis or machine learning models.
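The effect of a non-representative sample can be sketched with a few lines of Python. The population below is entirely hypothetical (the regions, group sizes, and internet-access rates are made-up numbers chosen for illustration); the point is only that a statistic computed on a sample that reaches one group can differ sharply from the population value.

```python
from statistics import mean

# Hypothetical population of 1,000 people, stored as (region, has_internet).
# Assumed for illustration: 400 urban residents (90% with internet)
# and 600 rural residents (40% with internet).
population = (
    [("urban", 1)] * 360 + [("urban", 0)] * 40
    + [("rural", 1)] * 240 + [("rural", 0)] * 360
)

# Population value we would like the survey to estimate.
true_rate = mean([has_net for _, has_net in population])

# Biased data collection: the survey only reaches urban residents.
urban_only = [has_net for region, has_net in population if region == "urban"]
biased_rate = mean(urban_only)

print(f"true rate:  {true_rate:.2f}")    # true rate:  0.60
print(f"urban-only: {biased_rate:.2f}")  # urban-only: 0.90
```

The gap between the two numbers comes entirely from who was sampled, not from any later processing step.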

Key Points:
1️⃣ During Data Collection:
• If certain groups or categories are overrepresented or underrepresented in the collected dataset relative to their share of the target population, selection bias occurs.
• Examples: Surveying only urban populations, excluding a demographic group, or relying on a self-selected sample (e.g., voluntary respondents).

2️⃣ During Model Deployment:
• While selection bias originates from the dataset, its effects can propagate during model deployment.
• The model may perform poorly for underrepresented groups if the training data was biased.
• However, the bias itself is not created during deployment; it stems from the data used to train the model.
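How a bias introduced at collection time surfaces at deployment can be illustrated with a toy model. Everything here is an assumption made for the sketch: the two groups "A" and "B", the rule that their true decision thresholds differ (5 versus 8), and the single-threshold "model". It is not a real training procedure, only a minimal stand-in for one.

```python
def label(group, x):
    # Hypothetical ground truth: the two groups follow different rules.
    return int(x >= (5 if group == "A" else 8))

def make_data(n_a, n_b):
    # x cycles through 0..9, so the data is fully deterministic.
    rows = [("A", i % 10) for i in range(n_a)] + [("B", i % 10) for i in range(n_b)]
    return [(g, x, label(g, x)) for g, x in rows]

def fit_threshold(data):
    # "Training": pick the single cut-off with the fewest training errors.
    def errors(t):
        return sum(int(x >= t) != y for _, x, y in data)
    return min(range(11), key=errors)

def accuracy(data, t, group):
    # Share of correct predictions for one group at threshold t.
    rows = [(x, y) for g, x, y in data if g == group]
    return sum(int(x >= t) == y for x, y in rows) / len(rows)

train = make_data(n_a=190, n_b=10)   # group B is only 5% of the training sample
test = make_data(n_a=100, n_b=100)   # at deployment both groups appear equally

threshold = fit_threshold(train)
acc_a = accuracy(test, threshold, "A")
acc_b = accuracy(test, threshold, "B")

print(threshold, acc_a, acc_b)  # 5 1.0 0.7
```

The fitted threshold matches group A's rule because group A dominates the training sample, so the accuracy drop for group B only becomes visible at deployment even though its cause lies in the collected data.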

Example:
• A hiring algorithm trained on historical data in which predominantly male candidates were hired can learn to favor male applicants. The bias arises from the non-representative training data (selection bias during data collection), not from the model deployment itself.

Conclusion:
Selection Bias occurs during data collection when the dataset is not representative of the population, though its effects may only become visible during model deployment. Ensuring representative sampling and balanced datasets helps mitigate this bias.
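One common way to keep a sample representative is stratified sampling, where each group is sampled in proportion to its size. The sketch below is one possible approach, not the only one; the helper name `stratified_sample`, the population, and the 10% sampling fraction are hypothetical choices made for illustration.

```python
import random

def stratified_sample(records, key, fraction, seed=0):
    """Draw the same fraction from every group (stratum) defined by key."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    strata = {}
    for rec in records:
        strata.setdefault(key(rec), []).append(rec)
    sample = []
    for members in strata.values():
        # Proportional allocation, with at least one record per stratum.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 400 urban and 600 rural records.
population = [("urban", i) for i in range(400)] + [("rural", i) for i in range(600)]

sample = stratified_sample(population, key=lambda rec: rec[0], fraction=0.1)
counts = {g: sum(1 for rec in sample if rec[0] == g) for g in ("urban", "rural")}

print(counts)  # {'urban': 40, 'rural': 60}
```

The sample keeps the population's 40/60 split by construction, which a convenience sample of whoever is easiest to reach would not guarantee.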