I learned many important things while carrying out the individual project (Project 2) as part of my coursework for the subject Probability & Statistical Data Analysis (PSDA). The project was assigned to us in place of our final examination because of the COVID-19 pandemic. Since it was an individual project, it took considerable time and effort to understand the data and interpret it into meaningful information. The project specification was to acquire secondary data from an online source and then use that data to carry out inferential statistics. The topic I chose for this project is "Students Performance in Exams".
There were many new concepts that I learned and implemented throughout this project. The concepts I described in my project are regression, hypothesis testing, the chi-square test and correlation. At first, I felt overwhelmed by the task at hand, but I managed to tackle each task and in the end finished the project with ease. This would not have been possible without the help and guidance of my PSDA lecturer, Dr. Chan Weng Howe.
I faced many hardships and challenges while conducting this project. One of the major problems was choosing a suitable dataset from the vast number of datasets on the internet. I also had a hard time choosing suitable variables for the tests that I would be conducting on the dataset. Once those steps were done, the next challenge was plotting the graphs using R Studio. It took quite some time to complete all the graphs and calculations, but YouTube and Stack Overflow made the process easier. I chose 40 random samples from my dataset to conduct this analysis. The dataset was retrieved from Kaggle, an online open data source.
After completing all the graphs, I began writing the report, explaining the information I obtained from the data (inferential statistics). I then went through the entire report, fixing the formatting and correcting grammatical errors. After that, I moved on to the third component of the project, which was the presentation. Due to the Movement Control Order (MCO) in force at the time, I was unable to present in front of the class and opted to present through a video instead.
From this project, I can conclude a couple of things regarding my topic.
Based on the hypothesis test, we fail to reject the null hypothesis: there is insufficient evidence to support the claim that the average maths score of students is not equal to 76.66 marks. Next, the analysis found a strong positive relationship between the average writing score and the average reading score, with a correlation coefficient (r) of 0.9339426. The estimated regression model is ŷ = 5.5995 + 0.9037x, which is helpful for predicting the average reading score (ŷ) from the average writing score (x). Finally, for the chi-square test, since the p-value is greater than 0.05, we fail to reject the null hypothesis. In conclusion, there is insufficient evidence of a relationship between gender and test preparedness.
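To illustrate how the regression model above can be used, here is a minimal sketch in Python. It is not the R Studio code used for the project. The intercept, slope and r come from the results reported above, while the writing score of 70 and the function names are made up purely for illustration. It also squares r to get the coefficient of determination, which in simple linear regression gives the proportion of variation in reading scores explained by writing scores.

```python
# Values reported in the conclusions above.
INTERCEPT = 5.5995   # b0 of the fitted line
SLOPE = 0.9037       # b1 of the fitted line
R = 0.9339426        # correlation between writing and reading scores


def predict_reading(writing_score: float) -> float:
    """Predicted average reading score: y_hat = 5.5995 + 0.9037 * x."""
    return INTERCEPT + SLOPE * writing_score


def r_squared(r: float) -> float:
    """Coefficient of determination for a simple linear regression."""
    return r ** 2


if __name__ == "__main__":
    # 70 is a hypothetical writing score, not a value from the dataset.
    print(f"Predicted reading score: {predict_reading(70):.2f}")
    print(f"r squared: {r_squared(R):.4f}")
```

For the hypothetical writing score of 70, the model predicts a reading score of about 68.86, and r squared is about 0.8722, which is consistent with the strong positive relationship described above.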