Statistics Probability and Data Analysis

Reflection on Project 2

Due to the Covid-19 pandemic, we were assigned an individual project as an alternative way to earn extra credit. The project was quite similar to the first group project that we had done. In this project, we had to do a statistical analysis on secondary data.

It was very hard to find a good raw dataset, and I kept changing my mind about which one I wanted to analyse. In the end, I found a dataset that was suitable for analysis because it has many variables (16) and more than 40 samples. The dataset that I used was about the nutritional values of cereals. I got the dataset from the website Kaggle, https://www.kaggle.com/crawford/80-cereals . At first I was reluctant to analyse this dataset because it didn't sound like a serious issue to discuss. However, after doing some research I managed to get some ideas on how to analyse it.

The statistical tests that I did were a one-sample hypothesis test, a correlation test, a regression analysis and a chi-square goodness-of-fit test. During the analysis, I changed some of the variables I had chosen. For example, in the proposal I initially chose fibre content as one of the variables for the correlation test, but I chose sugar content instead because I felt there were more things that I could discuss. I took about two days to do the data analysis in RStudio because I found many alternative ways to do the tests. I also had to make sure the numbers I got from RStudio were the same as the ones I calculated manually. Other than that, I had a problem with importing the dataset. Since the dataset that I got was a .csv file, I had to convert it to an .xlsx file so I could import it easily without having to use another function.
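To illustrate what checking the numbers manually involves, here is a minimal sketch of the test statistics behind the four tests. It is written in Python using only the standard library, not the R code used in the project, and the function names and the small numbers in the example are made up for illustration; they are not taken from the cereal dataset.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """t statistic for testing whether the population mean equals mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    return (mean - mu0) / (sd / math.sqrt(n))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def least_squares_line(x, y):
    """Slope and intercept of the simple linear regression of y on x."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


def chi_square_gof(observed, expected):
    """Chi-square goodness-of-fit statistic from observed and expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    sugar = [6, 8, 5, 0, 8, 10]
    rating = [68, 34, 59, 94, 34, 30]
    print("t statistic:", one_sample_t(sugar, mu0=7))
    print("correlation:", pearson_r(sugar, rating))
    print("regression line:", least_squares_line(sugar, rating))
    print("chi-square:", chi_square_gof([10, 20, 30], [20, 20, 20]))
```

Each function returns only the test statistic. Getting a p-value still needs the matching t or chi-square distribution, which is the part a statistics package normally handles.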

 

Writing the report was the most challenging part because I'm not very good at writing, especially the discussion section. After I finished writing the discussion, I referred to the rubric to get more ideas on what I could add to it. I realised I had made a lot of mistakes by using phrases that should not be used, like “It is proven…”. After that, I erased my discussion and started again from the top. I tried my very best to write a clear discussion and explanation without drawing unwarranted conclusions.

For the video presentation, I decided to record a normal video with a camera instead of making a PowerPoint slide video. After the video was recorded, I edited it in Sony Vegas Pro to add some graphical explanations.

From this project, I learned how important it is to ask your lecturer if you are not sure about something. I asked the lecturer questions so many times that I felt guilty about it. If I hadn't asked him which datasets could be used, I might have chosen the wrong one and ended up having to do the proposal all over again. I also learned the concepts behind the statistical tests that I did. Just by reading the slides, I couldn't quite understand the concepts and how they can be used in our daily lives. After doing this project, I finally understand the purpose and practicality of the tests.