Probability and Statistical Data Analysis (SCSI2143) was taught by Dr. Suhailah binti Mohamad Yusuf during the second semester of my first year. Throughout the semester, we were given two projects that required us to work with statistics, interpret data and carry out statistical tests.
For the first project, we were required to collect our own data, known as primary data, according to our group topic, which was the favorite types of sports among UTM students. Briefly speaking, before we started looking for respondents, we first had to construct the questions that we wanted to ask them. The questions, and the data they produce, fall into four types: nominal, ordinal, interval and ratio data. The type of data determines how it should be represented, whether in tabular or graphical form, ranging from a simple bar chart to a stem-and-leaf plot or a box plot.
To convert the data to graphical form, we were introduced to RStudio, an application built for statistical work. Using this software, we can produce bar charts, pie charts, histograms, stem-and-leaf plots, box plots and many more just by entering the data and calling the related function. Based on the diagrams that represent the data we collected, we could then proceed with interpreting the data. This is where we could see how many students play sports in a day, what types of sports they usually play, at which college they play, and how they rate the sports facilities provided by their colleges.
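To make the idea of a stem-and-leaf plot concrete, here is a minimal sketch of how such a display can be built by hand. It is written in Python rather than R and is not the code we used in the project; the function name stem_and_leaf and the data values are made up for illustration only.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Build stem-and-leaf rows for non-negative whole numbers.

    The stem is the tens part of each value and the leaf is the units digit.
    """
    groups = defaultdict(list)
    for value in sorted(values):
        groups[value // 10].append(value % 10)

    rows = []
    # Include empty stems so that gaps in the data stay visible.
    for stem in range(min(groups), max(groups) + 1):
        leaves = "".join(str(leaf) for leaf in groups.get(stem, []))
        rows.append(f"{stem} | {leaves}")
    return rows


# Made-up minutes of sport per day for ten hypothetical respondents.
minutes = [15, 20, 22, 30, 30, 35, 41, 45, 60, 18]

for row in stem_and_leaf(minutes):
    print(row)
```

Each row keeps the individual values readable while still showing the overall shape of the data, which is why the plot suits small samples like a class survey.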
For the second project, we needed to carry out statistical tests on secondary data that we found through the internet or reading materials. My group and I selected two data sets: the causes of death in France and the effect of body fat on weight. Different types of data call for different testing procedures. For example, for one-sample and two-sample data, we can do hypothesis testing to check whether the data lead us to reject the null hypothesis in favor of the alternative hypothesis that we want to study, or fail to reject it. Next, we can examine the relationship between two variables using correlation analysis and a linear regression model, to identify whether the variables are related or independent and whether the relationship is linear, especially for the effect of body fat on weight. Other than that, we also carried out the chi-square test on a two-way contingency table for the causes of death in France in 2001.
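As a rough illustration of the chi-square test on a two-way contingency table, the sketch below computes the test statistic and degrees of freedom by hand. It is written in Python rather than R, the function name chi_square_independence is my own, and the counts are invented for the example; they are not the France data from our project.

```python
def chi_square_independence(table):
    """Return (statistic, degrees of freedom) for a two-way table of counts.

    Under the null hypothesis that rows and columns are independent, the
    expected count in each cell is row total * column total / grand total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Invented 2x2 table of counts (two groups by two outcomes).
observed = [[30, 20],
            [20, 30]]

statistic, dof = chi_square_independence(observed)

# Chi-square critical value at the 5% significance level for 1 degree of freedom.
CRITICAL_5PCT_DF1 = 3.841

print(f"chi-square = {statistic:.3f}, df = {dof}")
if statistic > CRITICAL_5PCT_DF1:
    print("Reject the null hypothesis of independence at the 5% level.")
else:
    print("Fail to reject the null hypothesis of independence at the 5% level.")
```

The statistic measures how far the observed counts are from the counts we would expect if the two variables had nothing to do with each other, so a large value is evidence that they are associated.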
Overall, if I must evaluate myself, I would say that I understand which tests and techniques can be applied to a given set of data and I know how to carry them out. However, what I am lacking is an understanding of what each test is for, other than to see whether the null hypothesis holds or not. For example, I am not always sure what I am trying to find out by running a particular test, or what the objective of the test is in terms of the data. There are many things that I need to pay attention to in order to improve my understanding of this subject.
That is all from me. Thank you for reading my reflection ^u^
