Today I am going to talk about hypothesis testing, which data scientists frequently use to:

• Test a particular idea
• Construct an experiment to answer a particular question

Outline

• Definition
• Significance and p-values
• Type of Errors
• Exercise
• Z-test
• Strengths & Weaknesses of Z-test
• Student's t-test
• Exercise
• Conclusion

📣 Definition

The goal of hypothesis testing is to rule out the null hypothesis. A hypothesis test has two possible outcomes:

• Reject the null hypothesis (so something happened)
• Fail to reject the null hypothesis

Examples

Step 1: We have some idea about a situation:

• The drug cures the common cold.
• The evidence proves that you are guilty.
• The samples come from different populations

Step 2: Formulate a null hypothesis H0:

• The drug has no effect.
• You are innocent
• The samples come from the same population

Step 3: Formulate an alternative hypothesis Ha ≠ H0:

• The drug has an effect
• You are guilty
• The samples come from different populations

For the drug example, the null hypothesis H0 says that any observed effect is due to random chance. However, ruling out the null hypothesis does not by itself confirm that the effect was caused by the ‘treatment’!

📚 Significance and p-values

• A small p-value indicates strong evidence against the null hypothesis: you measured a value that would be improbable if the null hypothesis were true, so the null hypothesis can be rejected.
• A large p-value indicates weak evidence against the null hypothesis, so you fail to reject it.
• Marginal p-values (usually in the range 0.01 to 0.1) are generally inconclusive. This usually means you need to collect more data.

Example

• Experiment: Coin flipping
• Null Hypothesis H0: The coin is fair: P(heads) = P(tails) = ½
• Test-statistic = number of heads
• Result of 5 flips: HHHHH
P(HHHHH | H0) = (1/2)^5 = 1/32 ≈ 0.03 = p-value
• Biased coins are rare, so we should set a strict significance threshold (a small ⍺) before declaring this coin biased.
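
The arithmetic in this example can be double-checked in Python. The sketch below uses only the standard library; `p_at_least` is a helper name made up for illustration, and it assumes a one-sided question (how likely is a result with at least this many heads if H0 is true?).

```python
from math import comb

def p_at_least(k, n, p=0.5):
    """P(at least k heads in n flips) for a coin with P(heads) = p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# 5 heads out of 5 flips under H0 (fair coin)
p_value = p_at_least(5, 5)
print(p_value)  # 0.03125, i.e. 1/32
```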

Remember

The power of a statistical test (the probability of rejecting a false null hypothesis) depends on:

• Effect size: Easier to detect large effects!
• Sample size: Statistical tests get more powerful with more data
• Significance level ⍺: A larger ⍺ increases the chance of rejecting the null hypothesis, at the cost of more false positives.
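
These three dependencies can be made concrete for the simplest case. The sketch below assumes a one-sided z-test with a known standard deviation, with the effect size expressed in units of that standard deviation. `z_test_power` is an illustrative helper, not a library function, and the numbers fed to it are arbitrary.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect_size, n, alpha=0.05):
    """Power of a one-sided z-test.

    effect_size is (true mean - null mean) / sigma, n is the sample size.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha)  # rejection threshold under H0
    return 1 - NormalDist().cdf(z_crit - effect_size * sqrt(n))

# vary one ingredient at a time relative to the first row
scenarios = [
    (0.2, 25, 0.05),   # baseline
    (0.5, 25, 0.05),   # larger effect
    (0.2, 100, 0.05),  # more data
    (0.2, 25, 0.10),   # looser significance level
]
for effect, n, alpha in scenarios:
    print(effect, n, alpha, round(z_test_power(effect, n, alpha), 3))
```

With no effect at all (effect size 0) the power equals ⍺, which is just the false positive rate.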

❌ Type of Errors

There are two main ways to be wrong with significance testing.

• Type 1 error = false positive. Reject a true null hypothesis.
• Type 2 error = false negative. Fail to reject a false null hypothesis

Example 1

• H0: the defendant is innocent.
• Ha: the defendant is guilty.
• A false positive: imprison an innocent person.
• A false negative: let a guilty person go free.

Example 2

• H0: there is no wolf in the valley.
• Ha: there is a wolf in the valley.
• A false positive: we thought there was a wolf when there was not.
• A false negative: we thought there was no wolf when there was.
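
Both error rates can be estimated by simulation. The sketch below is a toy under stated assumptions: data are drawn from a normal distribution with known standard deviation 1, H0 says the mean is 0, and the "false null" scenario uses a made-up true mean of 0.3. All constants and the helper `rejects_null` are chosen for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(0)  # fixed seed so the sketch is reproducible
ALPHA, N, TRIALS = 0.05, 30, 10_000
z_crit = NormalDist().inv_cdf(1 - ALPHA / 2)  # two-sided rejection threshold

def rejects_null(true_mean):
    """Draw N values from Normal(true_mean, 1) and z-test H0: mean = 0."""
    sample = [random.gauss(true_mean, 1) for _ in range(N)]
    z = (sum(sample) / N) * sqrt(N)  # (sample mean - 0) / (1 / sqrt(N))
    return abs(z) > z_crit

# Type 1: H0 is true (mean really is 0) but we reject it
type1_rate = sum(rejects_null(0.0) for _ in range(TRIALS)) / TRIALS
# Type 2: H0 is false (mean is 0.3) but we fail to reject it
type2_rate = sum(not rejects_null(0.3) for _ in range(TRIALS)) / TRIALS
print(type1_rate, type2_rate)
```

The Type 1 rate should land close to ⍺, while the Type 2 rate depends on the effect size and sample size.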

🎳 Exercise

In a coin flipping experiment we perform 7 flips and get HHHTHHT. Is the coin biased?

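Biased coins are rare, so we need strong evidence before calling this one biased. Under H0, the probability of getting at least 5 heads in 7 flips is 29/128 ≈ 0.23. This is quite high, so we fail to reject the null hypothesis: there is no evidence that the coin is biased.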
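
One way to put a number on that probability, assuming we ask how likely at least 5 heads in 7 flips are under a fair coin, is scipy's binomial distribution. The variable names are illustrative.

```python
from scipy.stats import binom

flips = "HHHTHHT"
n = len(flips)            # 7 flips
heads = flips.count("H")  # 5 heads

# sf(k) is P(X > k), so sf(heads - 1) is P(X >= heads) under a fair coin
p_value = binom.sf(heads - 1, n, 0.5)
print(p_value)
```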

📈 Z-test

How to standardize data:

• Subtract the mean
• Divide by the standard deviation
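
As a small illustration of these two steps, the sketch below standardizes a list of made-up heights using the standard library. After standardizing, the values have mean 0 and standard deviation 1.

```python
from statistics import mean, pstdev

heights_cm = [162, 170, 175, 181, 187]  # made-up values, for illustration only

mu = mean(heights_cm)
sigma = pstdev(heights_cm)  # standard deviation of this list, treated as a population

# subtract the mean, then divide by the standard deviation
standardized = [(h - mu) / sigma for h in heights_cm]
print(standardized)
```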

The standard score of a male individual who is 170cm tall, drawn from a general population with a mean of 175cm and a standard deviation of 7cm, is:

z = (x - μ) / σ = (170 - 175) / 7 ≈ -0.71

If we have n measurements, the standard score of their mean x̄ is:

z = (x̄ - μ) / (σ / √n)

Let's now find the standard score of a basketball team (5 players) with a mean height of 200cm:

• H0: Basketball players are a random sample of the general population.
• Ha: Basketball players are tall, i.e. average height > 175cm
• ⍺ = 0.01

z = (200 - 175) / (7 / √5) ≈ 7.99

This is a very large deviation! The corresponding p-value is far below ⍺, so we reject H0: basketball players are a significantly different population.
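
The two standard scores in this section can be reproduced with the standard library. This is a minimal sketch using the numbers from the text (mean 175cm, standard deviation 7cm) and assuming a one-sided test, since Ha only looks for taller players.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 175, 7  # general population (cm), from the text

# one individual who is 170cm tall
z_individual = (170 - mu) / sigma

# mean of n = 5 players: the standard error shrinks by sqrt(n)
n, team_mean = 5, 200
z_team = (team_mean - mu) / (sigma / sqrt(n))

# one-sided p-value P(Z >= z_team), computed as cdf(-z) for numerical accuracy
p_value = NormalDist().cdf(-z_team)

alpha = 0.01
print(round(z_individual, 2), round(z_team, 2), p_value < alpha)  # -0.71 7.99 True
```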

Exercise

Given a list x containing the heights of the basketball players, find the Z-score and its associated p-value in Python, where the population mean is 1.75m.

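from scipy.stats import norm
from statsmodels.stats.weightstats import ztest

# heights of the basketball players in metres
x = [2.06,2.08,1.88,1.91,
2.06,2.01,1.98,2.13,2.01,2.06,2.01,
2.13,2.11,2.01,2.06,2.01]

# two-sided z-test of H0: mean height = 1.75m
# (ztest estimates the standard deviation from the sample)
z_stat, p_value = ztest(x, value=1.75)
print(z_stat, p_value)

# the same two-sided p-value computed directly from the z-score
print( 2*norm.sf(abs(z_stat)) )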

🥊 Strengths & Weaknesses of Z-test

Strengths of z-test:

• Intuitive

Weaknesses of z-test:

• Need the true population mean
• Need the true population standard deviation

Usually, we only have access to the sample mean and standard deviation. In this case, a Student's t-test is appropriate.
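
As a rough sketch of that alternative, scipy's one-sample t-test needs only the sample and the hypothesised population mean. The heights are the ones from the z-test exercise above, and 1.75m is again taken as the population mean.

```python
from scipy import stats

# heights of the basketball players in metres
x = [2.06, 2.08, 1.88, 1.91,
     2.06, 2.01, 1.98, 2.13, 2.01, 2.06, 2.01,
     2.13, 2.11, 2.01, 2.06, 2.01]

# H0: the players come from a population with mean height 1.75m
result = stats.ttest_1samp(x, popmean=1.75)
print(result.statistic, result.pvalue)
```

The standard deviation is estimated from the sample, and the p-value comes from a t-distribution with n - 1 degrees of freedom instead of the normal distribution.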

🧩 Student's t-test

The two-sample t-test, often called “Student’s t-test”, compares the means of two populations. In the simple form used here:

• We have an equal number of measurements from the two populations
• It tests the null hypothesis that the means of the populations from which the two samples were taken are equal.

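A high-level overview of the maths involved, for two samples of equal size n with similar variances:

t = (x̄1 - x̄2) / (s_p · √(2 / n))

where x̄1 and x̄2 are the sample means, s_p = √((s1² + s2²) / 2) is the pooled standard deviation, and s1² and s2² are the sample variances. The p-value is then read from a t-distribution with 2n - 2 degrees of freedom.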
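
A minimal sketch of that calculation, assuming equal sample sizes and a pooled standard deviation (the standard textbook form, not necessarily the exact formula the author had in mind). The two samples are hypothetical, and the result is compared against scipy's own implementation.

```python
from math import sqrt
from statistics import mean, variance
from scipy import stats

a = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3]  # hypothetical measurements
b = [4.8, 4.6, 5.0, 5.2, 4.7, 5.1]  # hypothetical measurements

n = len(a)  # both samples have the same size
pooled_sd = sqrt((variance(a) + variance(b)) / 2)  # valid for equal n
t_manual = (mean(a) - mean(b)) / (pooled_sd * sqrt(2 / n))

df = 2 * n - 2  # degrees of freedom
p_manual = 2 * stats.t.sf(abs(t_manual), df)  # two-tailed p-value

t_scipy, p_scipy = stats.ttest_ind(a, b, equal_var=True)
print(t_manual, t_scipy)
print(p_manual, p_scipy)
```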

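There are two-tailed and one-tailed versions of the test. A two-tailed test asks whether the two means differ in either direction, while a one-tailed test asks only whether one mean is greater (or smaller) than the other.

Usually we use two-tailed tests, but when we only care about a difference in one direction (as is common with positive-valued quantities such as heights or weights) we prefer a one-tailed test.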
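
In scipy the choice between the two can be made with the `alternative` argument of `ttest_ind`. The sketch below assumes scipy 1.6 or newer, where that argument is available, and uses hypothetical samples.

```python
from scipy import stats

a = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3]  # hypothetical measurements
b = [4.8, 4.6, 5.0, 5.2, 4.7, 5.1]  # hypothetical measurements

# Ha: the two means differ (default, two-tailed)
two_tailed = stats.ttest_ind(a, b)
# Ha: the mean of a is greater than the mean of b (one-tailed)
one_tailed = stats.ttest_ind(a, b, alternative="greater")

print(two_tailed.pvalue, one_tailed.pvalue)
```

When the observed difference points in the hypothesised direction, as it does here, the one-tailed p-value is half the two-tailed one.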

🎮 Exercise

Use scipy's t-test to compare the heights of the Golden State Warriors x with the Cleveland Cavaliers y, i.e. are they significantly different populations?

from scipy import stats

x = [2.06,2.08,1.88,1.91,
2.06,2.01,1.98,2.13,2.01,2.06,2.01,
2.13,2.11,2.01,2.06,2.01]

y = [1.91,1.96,2.06,1.91,
1.96,2.03,2.03,2.01,2.08,2.06,2.03,
2.08,1.88,1.98,2.06,2.11]

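# Student's t-test: assumes both populations have the same variance
print( stats.ttest_ind(x, y, equal_var=True) )
# Welch's t-test: does not assume equal variances
print( stats.ttest_ind(x, y, equal_var=False) )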

🤖 Conclusion

This brings us to the end of this article. I hope you now have a basic understanding of how a hypothesis test is used.

Thanks for reading; if you liked this article, please consider subscribing to my blog. That way I know that my work is valuable to you, and you get notified about future articles.