Today I am going to talk about hypothesis testing, which data scientists frequently use to:

- Test a particular idea
- Construct an experiment to answer a particular question

**Table of Contents**

- Definition
- Significance and p-values
- Type of Errors
- Exercise
- Z-test
- Strengths & Weaknesses of Z-test
- Student's t-test
- Exercise
- Conclusion

## Definition

The goal of hypothesis testing is to **rule out** the null hypothesis. A hypothesis test has two possible outcomes:

- Reject the null hypothesis (so something happened)
- Fail to reject the null hypothesis

**Examples**

*Step 1*: We have some idea about a situation:

- The drug cures the common cold.
- The evidence proves that you are guilty.
- The samples come from different populations

*Step 2*: Formulate a null hypothesis H_{0}:

- The drug has no effect.
- You are innocent
- The samples come from the same population

*Step 3*: Formulate an alternative hypothesis H_{a} != H_{0}

- The drug has an effect
- You are guilty
- The samples come from different populations

For the drug example, the null hypothesis H_{0} is that any observed effect is due to random chance. However, ruling out the null does not by itself confirm that the effect was caused by the ‘treatment’!

I urge you to have a look at this book for more examples on hypothesis testing:

The Elements of Statistical Learning (Springer Series in Statistics)

For people who prefer a video course, have a look at this online course:

Programming for Data Science

## Significance and p-values

- A small p-value indicates strong evidence against the null hypothesis: you measured a value that would be improbable if the null were true, so the null hypothesis can be rejected.
- A large p-value indicates weak evidence against the null hypothesis, so you fail to reject the null hypothesis.
- Marginal p-values (usually in the range 0.01 to 0.1) are generally inconclusive. This usually means you need to collect more data.

**Example**

- Experiment: Coin flipping
- Null Hypothesis H_{0}: The coin is fair: P(heads) = P(tails) = ½
- Test-statistic = number of heads
- Result of 5 flips: HHHHH
- P(HHHHH | H_{0}) = (1/2)^5 = 1/32 ≈ 0.03 = p-value
- Biased coins are rare, so we should use a strict significance threshold.

**Remember**

The power of a test statistic depends on:

- Effect size: Easier to detect large effects!
- Sample size: Statistical tests get more powerful with more data
- Significance level: A larger ⍺ increases the chance of rejecting the null hypothesis (at the cost of more false positives).
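The effect of these three factors can be sketched numerically. The snippet below is a minimal illustration, assuming a one-sided z-test with a known standard deviation; the function name `z_test_power` and the example numbers are made up for this post and do not come from any library.

```python
from statistics import NormalDist


def z_test_power(effect_size, n, alpha=0.05):
    """Approximate power of a one-sided z-test (illustrative helper).

    effect_size: true shift of the mean, in units of the standard deviation
    n:           number of measurements
    alpha:       significance level
    """
    std_normal = NormalDist()
    # Critical value: we reject the null when the z-score exceeds this
    critical = std_normal.inv_cdf(1 - alpha)
    # Under the alternative, the z-score is centred on effect_size * sqrt(n)
    return 1 - std_normal.cdf(critical - effect_size * n ** 0.5)


# Hypothetical scenarios: vary one factor at a time
print("baseline       :", z_test_power(0.5, 10, alpha=0.05))
print("larger effect  :", z_test_power(1.0, 10, alpha=0.05))
print("more data      :", z_test_power(0.5, 40, alpha=0.05))
print("stricter alpha :", z_test_power(0.5, 10, alpha=0.01))
```

A larger effect, more data, or a looser significance level each raise the power compared with the baseline.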

## Type of Errors

There are two main ways to be wrong with significance testing.

- Type 1 error = false positive. Reject a true null hypothesis.
- Type 2 error = false negative. Fail to reject a false null hypothesis

**Example 1**

- H_{0}: the defendant is innocent.
- H_{a}: the defendant is guilty.
- A false positive: imprison an innocent person.
- A false negative: let a guilty person go free.

**Example 2**

- H_{0}: there is no wolf in the valley.
- H_{a}: there is a wolf in the valley.
- A false positive: we thought there was a wolf when there was not.
- A false negative: we thought there was no wolf when there was.
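For a test with a fixed decision rule, both error rates can be computed exactly. The sketch below goes back to coin flipping and assumes a hypothetical rule (reject the null if we see at least 9 heads in 10 flips) and a hypothetical biased coin with P(heads) = 0.8; neither number comes from the examples above.

```python
from math import comb


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


N_FLIPS = 10    # assumed number of flips
THRESHOLD = 9   # assumed rule: reject H0 if heads >= THRESHOLD
BIASED_P = 0.8  # assumed P(heads) of a biased coin

# Type 1 error: the coin is fair, but we reject the null anyway
type_1_rate = binomial_tail(N_FLIPS, THRESHOLD, 0.5)

# Type 2 error: the coin is biased, but we fail to reject the null
type_2_rate = 1 - binomial_tail(N_FLIPS, THRESHOLD, BIASED_P)

print(f"Type 1 error rate: {type_1_rate:.4f}")
print(f"Type 2 error rate: {type_2_rate:.4f}")
```

With this strict rule false positives are rare, but a coin that really is biased often goes undetected. Loosening the threshold trades one kind of error for the other.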

## Exercise

In a coin flipping experiment we perform 7 flips and get HHHTHHT. Is the coin biased?

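Biased coins are rare. The probability of seeing a result at least this extreme with a fair coin is quite high, so we fail to reject the null hypothesis: there is no evidence that the coin is biased.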
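To put a number on "quite high", here is a minimal sketch that counts how many of the 2^7 equally likely outcomes are at least as extreme as 5 heads, assuming a fair coin. The variable names are only for illustration.

```python
from math import comb

flips = "HHHTHHT"
n = len(flips)            # 7 flips
heads = flips.count("H")  # 5 heads

# One-sided p-value: P(at least 5 heads | fair coin) = 29/128
p_one_sided = sum(comb(n, k) for k in range(heads, n + 1)) / 2**n

# Two-sided p-value: outcomes at least as far from n/2 as the observed one
distance = abs(heads - n / 2)
p_two_sided = sum(
    comb(n, k) for k in range(n + 1) if abs(k - n / 2) >= distance
) / 2**n

print(f"one-sided p-value: {p_one_sided:.3f}")
print(f"two-sided p-value: {p_two_sided:.3f}")
```

Both values are far above the usual significance thresholds, so the data give no reason to reject the fair-coin hypothesis.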

## Z-test

How to standardize data:

- Subtract the mean
- Divide by the standard deviation

The standard score for a male individual who is 170cm, from a general population with a mean of 175cm and a standard deviation of 7cm, is:

z = (x - μ) / σ = (170 - 175) / 7 ≈ -0.71

If we have n measurements, we standardize the sample mean x̄ using the standard error σ / √n:

z = (x̄ - μ) / (σ / √n)

Let's now find the standard score of a basketball team (5 players) with a mean height of 200cm:

- H_{0}: Basketball players are a random sample of the general population.
- H_{a}: Basketball players are tall, i.e. average height > 1.75m
- ⍺ = 0.01

z = (200 - 175) / (7 / √5) ≈ 7.99

This is a very large deviation! It implies that basketball players are a significantly different population.
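The standard scores in this section can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the helper names `z_score` and `upper_tail_p` are made up for illustration.

```python
from math import erfc, sqrt


def z_score(value, pop_mean, pop_std, n=1):
    """Standard score of a single value (n=1) or of the mean of n values."""
    return (value - pop_mean) / (pop_std / sqrt(n))


def upper_tail_p(z):
    """One-sided p-value P(Z > z) for a standard normal Z."""
    return 0.5 * erfc(z / sqrt(2))


ALPHA = 0.01

# A single individual of 170cm (population mean 175cm, std 7cm)
z_individual = z_score(170, 175, 7)   # about -0.71

# A team of 5 players with a mean height of 200cm
z_team = z_score(200, 175, 7, n=5)    # about 7.99
p_team = upper_tail_p(z_team)

print(f"z (individual) = {z_individual:.2f}")
print(f"z (team)       = {z_team:.2f}")
print(f"reject H0 at alpha={ALPHA}: {p_team < ALPHA}")
```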

**Exercise**

Given a list `x` containing the heights of the basketball players, find the p-value associated with the Z-score in Python, where the population mean is 1.75m.

```
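import statsmodels.stats.weightstats as sms
from scipy.stats import norm

# Heights (in metres) of the basketball players
x = [2.06, 2.08, 1.88, 1.91,
     2.06, 2.01, 1.98, 2.13, 2.01, 2.06, 2.01,
     2.13, 2.11, 2.01, 2.06, 2.01]

# One-sample z-test against a population mean of 1.75m;
# returns the z statistic and the two-sided p-value
z_stat, p_value = sms.ztest(x, value=1.75)
print(z_stat, p_value)

# The same two-sided p-value computed by hand from the z statistic
print(2 * norm.sf(abs(z_stat)))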
```

## Strengths & Weaknesses of Z-test

Strengths of z-test:

- Intuitive

Weaknesses of z-test:

- Need the true population mean
- Need the true population standard deviation

Usually, we only have access to the sample mean and standard deviation. In this case, a Student's t-test is appropriate.

## Student's t-test

The two-sample t-test, often called “Student’s t-test”, compares the means of two populations:

- In its simplest form it assumes an equal number of measurements from each of the two populations
- It tests the null hypothesis that the means of the populations from which the two samples were taken are equal.

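A high-level overview of the maths involved: the test statistic is the difference between the two sample means divided by the standard error of that difference, and it is compared against a t-distribution.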
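One common form of the statistic, assuming equal sample sizes n and similar variances, is t = (x̄₁ - x̄₂) / √(s² · 2 / n), where s² is the average of the two sample variances and the test has 2n - 2 degrees of freedom. Here is a minimal sketch of that calculation; the function name `students_t` and the two small samples are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def students_t(sample_a, sample_b):
    """Two-sample t statistic, simplified to equal sample sizes."""
    if len(sample_a) != len(sample_b):
        raise ValueError("this simplified form assumes equal sample sizes")
    n = len(sample_a)
    # Pooled variance: the average of the two sample variances
    pooled_var = (variance(sample_a) + variance(sample_b)) / 2
    # Difference of the means divided by its standard error
    t = (mean(sample_a) - mean(sample_b)) / sqrt(pooled_var * 2 / n)
    dof = 2 * n - 2  # degrees of freedom
    return t, dof


# Hypothetical samples, chosen so the arithmetic is easy to check by hand
a = [1, 2, 3, 4, 5]
b = [3, 4, 5, 6, 7]

t_stat, dof = students_t(a, b)
print(f"t = {t_stat:.2f} with {dof} degrees of freedom")
```

In practice you would let a library compute the statistic and look up the p-value from the t-distribution for you.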

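There are two-tailed and one-tailed versions of the test. A two-tailed test looks for a difference in either direction, while a one-tailed test looks for a difference in one specified direction only.

Usually we use two-tailed tests, but when the alternative hypothesis is directional (e.g. basketball players are taller than the general population) we prefer a one-tailed test.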
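The choice between the two can change the conclusion. The sketch below uses the standard normal distribution as a stand-in for the t-distribution (a reasonable approximation only for large samples) and a made-up test statistic of 1.8.

```python
from math import erfc, sqrt


def normal_upper_tail(z):
    """P(Z > z) for a standard normal Z."""
    return 0.5 * erfc(z / sqrt(2))


ALPHA = 0.05
statistic = 1.8  # hypothetical value, for illustration only

# One-tailed: only large positive values count as evidence against the null
p_one_tailed = normal_upper_tail(statistic)

# Two-tailed: large values in either direction count
p_two_tailed = 2 * normal_upper_tail(abs(statistic))

print(f"one-tailed p = {p_one_tailed:.3f}, reject: {p_one_tailed < ALPHA}")
print(f"two-tailed p = {p_two_tailed:.3f}, reject: {p_two_tailed < ALPHA}")
```

The same statistic is significant at ⍺ = 0.05 under the one-tailed test but not under the two-tailed one, which is why the direction should be decided before looking at the data.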

## Exercise

Use scipy's t-test to compare the heights of the Golden State Warriors `x` with the Cleveland Cavaliers `y`, i.e. are they significantly different populations?

```
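from scipy import stats

# Heights (in metres) of the Golden State Warriors
x = [2.06, 2.08, 1.88, 1.91,
     2.06, 2.01, 1.98, 2.13, 2.01, 2.06, 2.01,
     2.13, 2.11, 2.01, 2.06, 2.01]
# Heights (in metres) of the Cleveland Cavaliers
y = [1.91, 1.96, 2.06, 1.91,
     1.96, 2.03, 2.03, 2.01, 2.08, 2.06, 2.03,
     2.08, 1.88, 1.98, 2.06, 2.11]

# Student's t-test (assumes equal population variances)
print(stats.ttest_ind(x, y, equal_var=True))
# Welch's t-test (does not assume equal variances)
print(stats.ttest_ind(x, y, equal_var=False))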
```

Another really good introductory Python book on machine learning is:

Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems

For people who prefer a video course, have a look at this online course:

Machine Learning Engineer

## Conclusion

This brings us to the end of this article. I hope you now have a basic understanding of how a hypothesis test is used.

If you liked this article, please consider subscribing to my blog. That way I get to know that my work is valuable to you, and you get notified about future articles.

Thanks for reading and I am looking forward to hearing your questions :)

*Stay tuned and Happy Machine Learning.*