PSDA Project 2

This is my reflection for PSDA Project 2.

Hypothesis Testing

For the hypothesis test, we used the rating variable from our dataset to decide whether a product is satisfactory. A product counts as satisfactory if its rating is more than 50%. The null hypothesis is that 20% of the products are satisfactory. The alternative hypothesis is that more than 20% of the products are satisfactory.

    H0 : p = 0.20

    H1 : p > 0.20

Based on the dataset, only 21 of the 77 products have a rating higher than 50%.

N = 77

p hat = 21/77 = 0.2727

p = 0.20

z = (p hat - p) / sqrt(pq/N), where q = 1 - p = 0.80

   = (0.2727 - 0.20) / sqrt[(0.20)(0.80)/77]

   = 1.59

P(z > 1.59) = 0.0559
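
The calculation above can be checked with a short Python sketch. This is only an illustration and not the method we used in the project (we worked with Excel and the z-table). The function name one_sample_proportion_z is my own, and the upper-tail probability is taken from the standard normal distribution through math.erfc instead of a printed table.

```python
import math

def one_sample_proportion_z(successes, n, p0):
    """Upper-tailed one-sample z-test for a proportion (H1: p > p0)."""
    p_hat = successes / n                      # sample proportion
    std_error = math.sqrt(p0 * (1 - p0) / n)   # standard error under H0
    z = (p_hat - p0) / std_error               # test statistic
    # P(Z > z) for a standard normal variable, via the complementary error function
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return p_hat, z, p_value

# Values from the project: 21 satisfactory products out of 77, H0: p = 0.20
p_hat, z, p_value = one_sample_proportion_z(21, 77, 0.20)
print(f"p hat = {p_hat:.4f}, z = {z:.2f}, P-value = {p_value:.4f}")
```

Because the code does not round z before looking up the probability, its P-value comes out slightly smaller than the table value of 0.0559 (about 0.055). It is still greater than 0.05.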

Using a 5% significance level (alpha = 0.05, which corresponds to a 95% confidence level), we compare the P-value with the alpha value.

    P-value | alpha value

    0.0559  > 0.05

    Fail to reject H0

Since the P-value is greater than the alpha value, we fail to reject H0. There is not sufficient evidence to support the claim that more than 20% of the products are satisfactory. This does not prove that the null hypothesis is true; it only means that our sample is consistent with it. This might be because the minimum satisfactory rating of 50% is high, so that only a proportion close to 20% of the products meet the condition. Alternatively, it might be because our confidence level is strict: the P-value of 0.0559 is only slightly above 0.05, so H0 would have been rejected if the confidence level had been 90% (alpha = 0.10) instead.
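
To see how the decision depends on the chosen level, the comparison of the P-value with alpha can be sketched in Python. This is a small illustration under my own assumptions: the helper name decide is hypothetical, and the rule used is the usual one (reject H0 when the P-value is less than alpha).

```python
def decide(p_value, alpha):
    """Return the test decision for a given P-value and significance level."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

p_value = 0.0559  # P(z > 1.59) from the z-table, as calculated above

# Compare the 95% confidence level (alpha = 0.05) with 90% (alpha = 0.10)
for alpha in (0.05, 0.10):
    print(f"alpha = {alpha:.2f}: {decide(p_value, alpha)}")
```

With the same P-value, the decision changes between the two levels, so the result of this test is sensitive to the significance level we chose.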

My Reflection

I have learned many things throughout the process of conducting Project 2. I learned how to apply the knowledge I acquired from Chapter 5 (such as the one-sample hypothesis test) to Chapter 8 to a real-life dataset, and how to consider which test statistic to use during data analysis. On that note, it is important to identify which variable in the dataset is the most useful for evaluation. In our case, the contents of each product are not as significant as the rating of the product, since we are conducting a one-sample test. To simplify the process, I learned to use Excel to automatically highlight and count the products that meet the tested requirement. From this, I learned that data preprocessing is important for achieving a better evaluation. For example, our dataset had widely varied values for the rating variable, and it is important to take note of where most of the values lie while also finding the outliers. This allowed me to choose a reasonable satisfactory level of more than 50% rating so that the results would be more reliable. In my opinion, the dataset that my groupmate, Syahir, chose is not the best dataset to conduct tests on, but we continued with it due to a lack of time and the lack of help from our third member. Even so, this project has made me realise when each test that we have learned throughout the course is suitable and when it is the proper time to use it. Overall, I am satisfied with the work and effort that we have given to complete this project.