Introduction
In this project, my team members and I conducted an inferential statistical analysis on a dataset provided by our lecturer. The dataset is secondary data describing the countries of the world. The raw dataset contains 198 rows, two of which are duplicate entries for China and India. During data cleaning we removed these two duplicates, located at row 52 and row 114, leaving a total of 196 countries for this project. The original dataset is made up of 5 variables: population (country population), growth (growth rate of the country), under 15 (percentage of the population below the age of 15), life expectancy (average life span of the population in that country) and mortality (death rate of the country). Of these 5 variables, we chose 2 for this project: life expectancy and mortality.
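The cleaning step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the script we ran: our analysis was done in R, while this sketch uses plain Python, and the rows, the column names (such as `life_expectancy`) and the helper functions `drop_duplicate_countries` and `select_columns` are all hypothetical. It also assumes that the first entry of each duplicated country is the one to keep.

```python
# Hypothetical rows standing in for the raw dataset. The country names and
# values are placeholders, not figures from the real data.
raw_rows = [
    {"country": "Alpha", "population": 10, "growth": 1.0,
     "under15": 30, "life_expectancy": 70, "mortality": 8},
    {"country": "Beta", "population": 20, "growth": 2.0,
     "under15": 35, "life_expectancy": 65, "mortality": 10},
    {"country": "Alpha", "population": 10, "growth": 1.0,
     "under15": 30, "life_expectancy": 70, "mortality": 8},   # duplicate
    {"country": "Gamma", "population": 5, "growth": 0.5,
     "under15": 20, "life_expectancy": 80, "mortality": 6},
    {"country": "Beta", "population": 20, "growth": 2.0,
     "under15": 35, "life_expectancy": 65, "mortality": 10},  # duplicate
]


def drop_duplicate_countries(rows):
    """Keep the first row seen for each country and drop later repeats."""
    seen = set()
    cleaned = []
    for row in rows:
        if row["country"] not in seen:
            seen.add(row["country"])
            cleaned.append(row)
    return cleaned


def select_columns(rows, columns):
    """Return new rows that contain only the requested columns."""
    return [{name: row[name] for name in columns} for row in rows]


cleaned = drop_duplicate_countries(raw_rows)
analysis_rows = select_columns(
    cleaned, ["country", "life_expectancy", "mortality"]
)

print(len(raw_rows), "rows before cleaning,", len(cleaned), "after")
```

Applied to the real dataset, the same idea would reduce the 198 raw rows to 196 countries and keep only the life expectancy and mortality columns for the analysis.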
Technology is becoming more and more advanced. Because medical technology is better, the risk of giving birth is lower, so more births succeed and fewer deaths occur during childbirth. Hence, the population of each country tends to increase over time as death rates fall. The growth rate of a country is usually picked as the reference for its population distribution. However, many other factors can be used to describe a country's population distribution, so we should also consider factors that may cause the population to fluctuate. The purpose of this project is to investigate the relationship between life expectancy and mortality in the world population. By carrying out this project, we are able to understand the relationship between these two variables and how they influence each other.
Course Lecturer
Dr. Sharin Hazlin Huspi
sharin@utm.my
Group - The Admirals
Members:
1. Lai Yee Jen (That's me!)
2. Tham Chuan Yew
3. Chong Tung Han
4. Zhao Xin
Reflection
First of all, I am very grateful to my groupmates. We managed to work as a team and carried the project through to a successful completion. In this project, I have learnt tons of new things, especially in using the R programming language. We managed to pick a suitable dataset and processed the raw data without any hurdles. By referring to the slides and videos shared by my lecturer, I have learnt how to use R to calculate the test statistics needed in our analysis. I have gained skills in these analyses that I can apply in future assignments and projects. The analyses we did in this project are hypothesis testing, correlation analysis, regression analysis and the chi-square test of independence. While completing the project, I have also learnt the importance of time management. We were able to distribute the tasks evenly and complete them before the deadline. Last but not least, I would like to convey my sincere gratitude and thanks to my lecturer, Dr. Sharin Hazlin Huspi, for guiding us throughout this project as well as the semester.