PSDA Project 2 Reflection
This is the second PSDA project given to us, and this time it is an individual task. We were allowed to choose our own topics and datasets, which had to be secondary data, for use in inferential statistics. We were given more freedom and control in this project, but at the same time we, the students, had to take more aspects into consideration throughout the whole project.
First of all, choosing the dataset. The dataset I am using for this project (Statistik Jenayah Malaysia 2019) was not my first choice; in fact, it is the third dataset I chose. Before anything else, I made sure to go through the lesson slides and look online to understand more about inferential statistics and how each test should be carried out, so as to avoid unnecessary mistakes as much as possible. At first I was overwhelmed by all the interesting datasets that can be found online, but when I took a closer look I found that they did not fulfil all the requirements for this project. The common issues were that a dataset was either too organised, with limited data I could make use of, or the other way around, without enough information or data for the tests in this project. Finally I settled on this dataset, which I find interesting and which has plenty of data I can make use of. Also, once I really got into doing the tests, I made several changes to the selection of tests, as I found that some of the tasks I had proposed earlier were not suitable to be tested.
Regarding the coding, this is no longer my first time using RStudio, and I had already become comfortable with R during the first project. Hence the coding for the tests this time was fairly easy. R is indeed a very convenient open-source language for statistics, as it has a lot of packages and a huge community, which make the work easier. Still, this project gave me an opportunity to use R once again and learn more about it, as I usually just code in C++.
In brief, I have benefited a lot from this project, and it once again proves that we experience things differently when we actually do them ourselves. We really need a firm understanding of what each test is about in order to pick the right data for the right test. This is definitely an effective alternative to conducting an examination on these topics; I personally think it is a better way to assess students' understanding of the material. Lastly, I would like to take this opportunity to thank my lecturer, Dr. Suhaila Mohamad Yusuf, for all her guidance and patience throughout not only this project but the whole semester.