Today I am going to speak about Hypothesis Testing which is frequently used by data scientists to:

  • Test a particular idea
  • Construct an experiment to answer a particular question

Table of Contents

  • Definition
  • Significance and p-values
  • Type of Errors
  • Exercise
  • Z-test
  • Strengths & Weaknesses of Z-test
  • Student's t-test
  • Exercise
  • Conclusion

Definition

The goal of hypothesis testing is to try to rule out the null hypothesis. A hypothesis test has two possible outcomes:

  • Reject the null hypothesis (so something happened)
  • Fail to reject the null hypothesis

Examples

Step 1: We have some idea about a situation:

  • The drug cures the common cold.
  • The evidence proves that you are guilty.
  • The samples come from different populations

Step 2: Formulate a null hypothesis H0:

  • The drug has no effect.
  • You are innocent
  • The samples come from the same population

Step 3: Formulate an alternative hypothesis Ha != H0:

  • The drug has an effect
  • You are guilty
  • The samples come from different populations

For the drug example, the null hypothesis H0 is that any observed effect is due to random chance. However, ruling out the null does not by itself confirm that the effect was caused by the ‘treatment’!

I urge you to have a look at this book for more examples of hypothesis testing.

The Elements of Statistical Learning (Springer Series in Statistics)

For people who prefer video courses, have a look at this online course:

Programming for Data Science

Significance and p-values

  • A small p-value indicates strong evidence against the null hypothesis: you measured a value that would be improbable if the null were true, so the null hypothesis can be rejected.
  • A large p-value indicates weak evidence against the null hypothesis, so you fail to reject the null hypothesis.
  • Marginal p-values (usually in the range 0.01 to 0.1) are generally inconclusive. This usually means you need to collect more data.

Example

  • Experiment: Coin flipping
  • Null Hypothesis H0: The coin is fair: P(heads) = P(tails) = ½
  • Test-statistic = number of heads
  • Result of 5 flips: HHHHH
    P(HHHHH | H0) = (1/2)^5 = 1/32 ≈ 0.03 = p-value
  • Biased coins are rare, so we should set a strict significance threshold (a small ⍺) before calling a coin biased.

Remember

The power of a statistical test depends on:

  • Effect size: Easier to detect large effects!
  • Sample size: Statistical tests get more powerful with more data
  • Significance level: A larger ⍺ increases the chance of rejecting the null hypothesis, at the cost of more false positives.
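
To make these three dependencies concrete, here is a minimal sketch that computes the power of a one-sided z-test analytically. It assumes normally distributed data with a known standard deviation. The function name `z_test_power` and the example numbers are illustrative assumptions, not part of the examples in this post.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(effect, sigma, n, alpha):
    """Power of a one-sided z-test for a true mean shift of `effect`.

    Assumes normally distributed data with known standard deviation `sigma`.
    """
    std_normal = NormalDist()
    # Critical value: H0 is rejected when the z-score exceeds it.
    z_crit = std_normal.inv_cdf(1 - alpha)
    # Under the alternative, the z-score is centred on effect * sqrt(n) / sigma.
    shift = effect * sqrt(n) / sigma
    return 1 - std_normal.cdf(z_crit - shift)


# Hypothetical scenario: a 5cm effect, standard deviation 7cm, 5 measurements.
baseline = z_test_power(effect=5, sigma=7, n=5, alpha=0.05)
larger_effect = z_test_power(effect=10, sigma=7, n=5, alpha=0.05)
more_data = z_test_power(effect=5, sigma=7, n=20, alpha=0.05)
stricter_alpha = z_test_power(effect=5, sigma=7, n=5, alpha=0.01)

print(f"baseline:       {baseline:.2f}")
print(f"larger effect:  {larger_effect:.2f}")
print(f"more data:      {more_data:.2f}")
print(f"stricter alpha: {stricter_alpha:.2f}")
```

In this sketch, power goes up with a larger effect or more data, and goes down when ⍺ is made stricter.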

Type of Errors

There are two main ways to be wrong with significance testing.

  • Type 1 error = false positive. Reject a true null hypothesis.
  • Type 2 error = false negative. Fail to reject a false null hypothesis

Example 1

  • H0: the defendant is innocent.
  • Ha: the defendant is guilty.
  • A false positive: imprison an innocent person.
  • A false negative: let a guilty person go free.

Example 2

  • H0: there is no wolf in the valley.
  • Ha: there is a wolf in the valley.
  • A false positive: we thought there was a wolf when there was not.
  • A false negative: we thought there was no wolf when there was.
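
The Type 1 error rate can also be checked empirically. The sketch below repeatedly draws samples from a population in which the null hypothesis is true and counts how often a one-sided z-test at ⍺ = 0.05 rejects it anyway. The population parameters, sample size, seed and function name are all hypothetical choices made for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist


def false_positive_rate(alpha, n=10, trials=20_000, seed=0):
    """Fraction of simulated experiments that reject H0 although H0 is true."""
    rng = random.Random(seed)
    mu, sigma = 175.0, 7.0  # hypothetical population mean and std
    z_crit = NormalDist().inv_cdf(1 - alpha)
    rejections = 0
    for _ in range(trials):
        # The sample comes from the null population, so every
        # rejection counted here is a Type 1 error.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        z = (sum(sample) / n - mu) / (sigma / sqrt(n))
        if z > z_crit:
            rejections += 1
    return rejections / trials


rate = false_positive_rate(alpha=0.05)
print(f"observed Type 1 error rate: {rate:.3f}")
```

The observed rate should land close to ⍺, which is what the significance level is meant to control.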

Exercise

In a coin flipping experiment we perform 7 flips and get HHHTHHT. Is the coin biased?

Biased coins are rare. The probability of getting at least 5 heads in 7 flips of a fair coin is quite high (29/128 ≈ 0.23), so we fail to reject the null hypothesis that the coin is fair.
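
The probabilities for both coin examples can be computed exactly. This is a minimal sketch of a one-sided p-value (the chance of seeing at least as many heads as observed under a fair coin). The helper name `p_at_least` is an assumption made for illustration.

```python
from fractions import Fraction
from math import comb


def p_at_least(heads, flips):
    """P(at least `heads` heads in `flips` flips of a fair coin)."""
    favourable = sum(comb(flips, k) for k in range(heads, flips + 1))
    return Fraction(favourable, 2 ** flips)


p_five_of_five = p_at_least(5, 5)   # the HHHHH example
p_five_of_seven = p_at_least(5, 7)  # HHHTHHT contains 5 heads

print(p_five_of_five, float(p_five_of_five))
print(p_five_of_seven, float(p_five_of_seven))
```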


Z-test

How to standardize data:

  • Subtract the mean
  • Divide by the standard deviation

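The standard score for a male individual who is 170cm tall, from a general population with a mean of 175cm and a standard deviation of 7cm, is:

    z = (x - μ) / σ = (170 - 175) / 7 ≈ -0.71

If we have n measurements with sample mean x̄, the standard score of that mean is:

    z = (x̄ - μ) / (σ / √n)

Let's now find the standard score of a basketball team (5 players) with a mean height of 200cm: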

  • H0: Basketball players are a random sample of the general population.
  • Ha: Basketball players are tall, i.e. average height > 175cm
  • ⍺ = 0.01

    z = (200 - 175) / (7 / √5) ≈ 7.99

This is a very large deviation! It implies that basketball players are a significantly different population.
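
The two standard scores above can be reproduced with the Python standard library. This is a sketch under the stated population values (mean 175cm, standard deviation 7cm). The helper `z_score` and the constant names are illustrative, and the p-value is one-sided to match Ha.

```python
from math import sqrt
from statistics import NormalDist

POP_MEAN = 175.0  # cm
POP_STD = 7.0     # cm


def z_score(sample_mean, n=1):
    """Standard score of the mean of n measurements from the population."""
    return (sample_mean - POP_MEAN) / (POP_STD / sqrt(n))


z_individual = z_score(170)
z_team = z_score(200, n=5)

# One-sided p-value: probability of a z-score at least this large under H0.
# By symmetry of the normal distribution, P(Z >= z) = P(Z <= -z).
p_team = NormalDist().cdf(-z_team)
alpha = 0.01

print(f"individual z = {z_individual:.2f}")
print(f"team z = {z_team:.2f}, reject H0: {p_team < alpha}")
```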

Exercise

Given a list x containing the heights of the basketball players, use Python to find the p-value associated with the Z-score when the population mean is 1.75m.

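from scipy.stats import norm
from statsmodels.stats.weightstats import ztest

# Heights of the basketball players, in metres
x = [2.06,2.08,1.88,1.91,
2.06,2.01,1.98,2.13,2.01,2.06,2.01,
2.13,2.11,2.01,2.06,2.01]

# z-test of H0: mean height = 1.75m (the standard deviation is estimated from x)
z, p = ztest(x, value=1.75)
print(z, p)

# The same two-sided p-value, computed directly from the z-score
print(2 * norm.sf(abs(z)))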

Strengths & Weaknesses of Z-test

Strengths of z-test:

  • Intuitive

Weaknesses of z-test:

  • Need the true population mean
  • Need the true population standard deviation

Usually, we only have access to the sample mean and standard deviation. In this case, a Student's t-test is appropriate.


Student's t-test

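The two-sample t-test, often called “Student’s t-test”, compares the means of two populations. The two-sample test:

  • Takes measurements from two independent samples and, in its classic form, assumes the two populations have equal variances (it is most robust when both samples contain an equal number of measurements)
  • Tests the null hypothesis that the means of the populations from which the two samples were taken are equal.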

A high-level overview of the maths involved:

There are two-tailed and one-tailed versions of the test.

Usually we use two-tailed tests, but when we only care about a difference in one direction (e.g. whether one group is taller or heavier than the other) a one-tailed test is preferred.
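
As a rough sketch of the maths, the classic equal-variance t statistic can be computed by hand and turned into a two-tailed or one-tailed p-value. The samples `a` and `b` below are hypothetical numbers made up for illustration. The pooled-variance formula is the standard one for Student's t-test, and the result is cross-checked against `scipy.stats.ttest_ind`.

```python
from math import sqrt
from statistics import mean, variance

from scipy import stats

# Hypothetical measurements, for illustration only.
a = [2.1, 2.5, 2.3, 2.8, 2.6]
b = [1.9, 2.2, 2.0, 2.4, 2.1]

n_a, n_b = len(a), len(b)
df = n_a + n_b - 2  # degrees of freedom

# Pooled variance: weighted average of the two sample variances.
pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
t_manual = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))

# Two-tailed: a difference in either direction counts as evidence.
p_two_tailed = 2 * stats.t.sf(abs(t_manual), df)
# One-tailed: only "a is larger than b" counts as evidence.
p_one_tailed = stats.t.sf(t_manual, df)

# Cross-check the statistic and the two-tailed p-value against scipy.
result = stats.ttest_ind(a, b, equal_var=True)
print(t_manual, result.statistic)
print(p_two_tailed, result.pvalue)
print(p_one_tailed)
```

When the observed difference lies in the hypothesised direction, as it does here, the one-tailed p-value is half the two-tailed one.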


Exercise

Use scipy's t-test to compare the heights of the Golden State Warriors (x) with those of the Cleveland Cavaliers (y), i.e. are they significantly different populations?

from scipy import stats

x = [2.06,2.08,1.88,1.91,
2.06,2.01,1.98,2.13,2.01,2.06,2.01,
2.13,2.11,2.01,2.06,2.01]

y = [1.91,1.96,2.06,1.91,
1.96,2.03,2.03,2.01,2.08,2.06,2.03,
2.08,1.88,1.98,2.06,2.11]

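# Student's t-test (assumes equal population variances)
print(stats.ttest_ind(x, y, equal_var=True))
# Welch's t-test (does not assume equal variances)
print(stats.ttest_ind(x, y, equal_var=False))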

Another really good introductory book on machine learning with Python is:

Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems

For people who prefer video courses, have a look at this online course:

Machine Learning Engineer

Conclusion

This brings us to the end of this article. I hope you now have a basic understanding of how a hypothesis test is used.

If you liked this article, please consider subscribing to my blog. That way I know that my work is valuable to you, and you get notified about future articles.
Thanks for reading and I am looking forward to hearing your questions :)
Stay tuned and Happy Machine Learning.