SECI2143 - 02 PROBABILITY & STATISTICAL DATA ANALYSIS

Reflection on Project 2

After completing my part of the second project for the Probability & Statistical Data Analysis course, I had the chance to reflect on how we conducted and presented the project and on what I learned along the way. For this project, my teammates and I were required to conduct an inferential statistical analysis on a dataset of our choice. We chose two datasets, obtained from two different websites: Data World and Science Direct.

The first dataset is titled ‘Instagram Most Followed’. It is a secondary data source that we retrieved from the Data World website, and it lists the top 100 Instagram accounts, including brands and celebrities. It contains eight variables: Rank, Brand, Categories 1, Categories 2, Followers (Millions), Engagement Rates, iPosts on Hashtag, and Media Posted. Before we proceeded with this dataset, one of my group members volunteered to ask our lecturer, Dr Nor Azizah Ali, whether it was suitable for our analysis, and she approved it. Dr Nor Azizah Ali guided us throughout the project and was a great influence on all of us. After receiving her approval, my teammates and I held several online discussions to decide how we would conduct the analysis and how to divide the tasks. Each of us was assigned one type of data analysis, and I was responsible for the regression test.

The second dataset, titled ‘Johor Violent Crimes by Years’, is a secondary data source retrieved from the Science Direct website. It shows the types of violent crimes that occurred in Johor from 2006 to 2017. We needed this additional dataset mainly because the first one was not suitable for the Chi Square Test of Independence. After completing the analysis, we presented our findings in a short video in which we explained our results.

At the start of this project, I misunderstood several aspects of how the analysis should be conducted, but with the guidance of my lecturer and friends I gained a clearer understanding. My teammates and I frequently asked each other questions so that we stayed on the right track, and we helped one another whenever any of us ran into problems. The most important skill I developed while conducting the analysis was problem-solving. I also had the opportunity to learn how to use R to complete this project. R is an effective, freely available tool for data analysis.

To conclude my reflection, I can happily say that this project has helped me improve my problem-solving skills. Most importantly, I improved my ability to code in R.