Hypothesis testing simplified

Wiki says - A statistical hypothesis test is a method of statistical inference used to decide whether the data at hand sufficiently support a particular hypothesis. In this blog we will try to understand this definition more intuitively with an example.

In this section we will try to understand following things:
    1. Why do we need hypothesis testing?
    2. Understanding hypothesis testing
    3. Problem on hypothesis testing


1. Why do we need hypothesis testing?

In the real world we do a lot of observations or experiments and we would like to validate those experiments.
For example:
    -whether the new drug is better than the previous one.
    -whether the drug had an impact on the growth rate of a population.
    -whether the new refrigerator significantly reduces the power consumption
    -whether market sentiment increased after the new budget announcement.


2. Understanding hypothesis testing

Let's take an example to understand hypothesis testing more intuitively: Apple has released a new version of the app. They have 2 distributions of user's-spend-time, corresponding to old and new versions respectively. They want to know whether a new version made any significant improvement over the user's spend time.
We will try to understand this problem in 5 steps:

step1: Following diagram shows the distribution for user spend time on both the versions of the app individually.

As you can see there are 2 distributions
    a) Version 1.0 user-spend-time distribution with mean = µ0
    b) Version 2.0 user-spend-time distribution with mean = µ1



Step2: From the problem statement we can understand that we want to see whether there is a significant improvement in the mean users-spend-time in the latest version. From this we already know that µ1 > µ0 but whether this improvement is significant or fluke is the question that we want to address.



Step3: Lets plot version 2.0 mean on version 1.0 distribution. It could be anywhere above µ0 but lets randomly select a place to plot it, which helps us to understand further concepts better

What does the above diagram or distribution implies?
    - We are trying to see where exactly the mean of version 2 is situated in version 1 distribution.
    - To put it in another words what is the probability of seeing µ1 in version 1.0 distribution
    - Mathematically
       P(x==µ1 in version 1.0 distribution)
As we can remember normal distributions are continuous distributions.. Probability at any point in normal distribution is always 0. So, instead of finding the probability of seeing µ1 in version 1 distribution as follows:
        P( x==µ1 in version 1.0 distribution)
we can find the probability of seeing µ1 or any greater value in version 1 distribution:
       P( x>=µ1 in version 1.0 distribution)
Why x>=µ1 ? To answer this question let's have a look at the above distribution. If µ1 is a significant improvement then anything above that would also make it a significant improvement too right? Hence, we are looking at this area:



Step4: We already know that the area under probability distribution is 1. In the version 1 distribution we really don't know where exactly µ1 is situated.

What it implies version 2 mean i.e µ1 being very close to version 1 mean i.e µ0 in version 1 distribution ?
Ans: It says the probability of seeing x>=µ1 in version 1 distribution is very high. In other words the if area under normal distribution corresponding to x=>µ1 is more then there is no significant change or difference Hence, version 2 is not a significant improvement over version 1

What it implies version 2 mean i.e µ1 being very far from version 1 mean i.e µ0 in version 1 distribution ?
Ans: Mathematically it says the probability of seeing x>=µ1 in version 1 distribution is very low. In other words the if area under normal distribution corresponding to x=>µ1 is less then there is a significant change or difference Hence, version 2 is a significant improvement over version 1



Step5: How do we know whether the area is more or less ?
For this we need a threshold. This threshold is usually called as significance value.
if area under normal distribution corresponding to x=>µ1 is more than threshold area we can say version 2 is not a significant improvement over version 1
else area under normal distribution corresponding to x=>µ1 is less than threshold area we can say version 2 is a significant improvement over version 1

area under normal distribution corresponding to o x=>µ1 is referred as p-value


3. Problem on hypothesis testing


Prerequisite alert: Following problem requires Understanding of "features of distributions like mean, median, mode, range, z-score, variance, standard deviation etc.." If you are not familiar with these please take a look at this blog - Features of any distribution

Example Problem: A teacher claims that the mean score of students in his class is greater than 82 with a standard deviation of 20. If a sample of 81 students was selected with a mean score of 90 then check if there is enough evidence to support this claim at a 0.05 significance level.

Solution: This is one tailed test. Since teacher is interested in whether observed mean is grater than population mean

    Null Hypothesis : Population mean = Sample mean
    Alt Hypothesis: Population mean < Sample mean

    population statistics: μ=82 σ=20
    sample statistics: μm=82 n=81
    significance level: α=0.05

    Calculating z-score: z = (x – μ) / (σ / √n)
             = ( 90-82) / ( 20/√81)
             z-score=3.6
    Hence observed mean is 3.6 standard error of mean above population mean

    Calculating p-value: We want to calculate the area under normal distribution which
           is above 3.6 SD away from mean
            p-value = Total area above mean - Total area till 3.6 SD above mean
               = 0.50 - ztable(3.6)
               = 0.50 - 0.4998
            p-value = 0.0002
    Hence p(x>= μm | Null hypothesis is true ) or p-value is 0.0002

    Conclusion:
        Since significance level or α > p-value we can reject null hypothesis
        Hence there is a significant improvement in the mean of score


References

https://www.youtube.com/watch?v=KS6KEWaoOOE
https://www.youtube.com/watch?v=5ABpqVSx33I
https://www.youtube.com/watch?v=-FtlH4svqx4
https://math.stackexchange.com/questions/1796478/sample-standard-deviation-given-population-standard-deviation
https://www.scribbr.com/methodology/population-vs-sample/#:~:text=A%20population%20is%20the%20entire,t%20always%20refer%20to%20people.
https://towardsdatascience.com/hypothesis-testing-z-scores-337fb06e26ab
https://onlinestatbook.com/2/sampling_distributions/samp_dist_mean.html
https://www.statisticshowto.com/probability-and-statistics/z-score/
https://www.analyticsvidhya.com/blog/2020/06/statistics-analytics-hypothesis-testing-z-test-t-test/
https://math.stackexchange.com/questions/504288/what-situation-calls-for-dividing-the-standard-deviation-by-sqrt-n
hhttps://www.cuemath.com/data/z-test/