SECI2143-10 PROBABILITY & STATISTICAL DATA ANALYSIS

Project 2

This page presents the report and reflection for Project 2.

Title: Cancer Cases and Deaths

Introduction

“Cancer is the second leading cause of death globally and is responsible for an estimated 9.6 million deaths in the year of 2018. Globally, about 1 in 6 deaths is due to cancer,” reports the World Health Organization (WHO). Cancer is a generic term for a large group of diseases that can affect any part of the body. It is a genetic disease caused by changes to the genes that control the way our cells function, especially how they grow and divide. Genetic changes that cause cancer can be inherited from our parents. They may also arise during a person’s lifetime as a result of errors that occur as cells divide, or because of damage to DNA caused by environmental exposures. Cancer-causing environmental exposures include substances such as the chemicals in tobacco smoke and radiation such as ultraviolet (UV) rays from the sun. Alcohol use and an unhealthy diet are further risk factors. Hence, I decided to conduct a study on cancer cases and deaths to investigate the trend of cancer between 1999 and 2016. This study may also alert people to the seriousness of cancer so that they avoid its risk factors.

 

Reflection

I learned many things from this project. When I first started, it was quite difficult because I did not know which data set to choose while writing the proposal. I struggled with it for a few days and finally decided on the topic of cancer. As I mentioned in the project report, cancer is the second leading cause of death globally after heart disease. However, many cancers have a high chance of cure if diagnosed and treated early. Thus, I hoped to alert people to cancer through the study, so that they know more about it and how to avoid it.

The statistical analysis was done using RStudio. It was challenging because I had to learn it on my own. I searched for many references and videos online to learn the R language. For example, t.test() performs one- and two-sample hypothesis tests, cor.test() tests the association between paired samples, lm() fits a linear regression, chisq.test() performs chi-squared contingency table tests, and aov() fits an ANOVA model.
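As a minimal sketch of how the functions listed above are called, the snippet below runs each of them on small made-up numbers. Every value and variable name in it (new_cases, deaths, counts, risk) is invented for illustration and is not taken from the project's data set, so the output says nothing about the actual study.

```r
# Hypothetical toy data: all numbers are invented for illustration only
# and do NOT come from the project's data set.
new_cases <- c(28500, 31200, 29800, 30500, 32100, 27900, 30900, 29400)
deaths    <- c(10100, 11300, 10600, 10900, 11800,  9900, 11200, 10400)

# One-sample t-test: H0 is that the mean number of new cases equals 30,000.
t_res <- t.test(new_cases, mu = 30000)

# Pearson correlation test between the paired samples.
cor_res <- cor.test(new_cases, deaths)

# Simple linear regression of deaths on new cases.
fit <- lm(deaths ~ new_cases)

# Chi-squared test of independence on a 2 x 3 contingency table.
counts <- matrix(c(120,  90, 60,
                   100,  95, 75),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(gender = c("male", "female"),
                                 group  = c("A", "B", "C")))
chi_res <- chisq.test(counts)

# One-way ANOVA: compare mean deaths across three risk-factor groups.
risk <- data.frame(
  deaths      = c(510, 495, 530, 505, 520, 498, 515, 502, 525),
  risk_factor = factor(rep(c("alcohol", "tobacco", "obesity"), each = 3))
)
aov_res <- aov(deaths ~ risk_factor, data = risk)

# Print the results; compare each p-value with the chosen significance level.
print(t_res$p.value)
print(cor_res$estimate)
print(coef(fit))
print(chi_res$p.value)
print(summary(aov_res))
```

Each result object stores its p-value (for example t_res$p.value), which is compared with the 5% significance level to decide whether to reject the null hypothesis.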

Due to the COVID-19 pandemic, all learning activities had to be conducted online, but I am lucky that I was still able to keep up with the syllabus. This project was beneficial as it helped me to sum up all the topics that I learned in inferential statistics. It gave me a better understanding of the use of correlation, regression and ANOVA. Finally, I hope that everyone pays more attention to cancer. Life is too short to wake up in the morning with regrets. So, keep a healthy lifestyle and appreciate the world around you.

 

Conclusion

In conclusion, the number of new cancer cases each year is a serious concern. Based on the hypothesis test at the 5% significance level, the data are consistent with the claim that the mean number of new cancer cases by area in the United States was 30,000. Besides, there is a positive linear relationship between the annual number of cancer deaths and the annual number of new cancers (1999-2016) in the United States, and the annual rate of cancer deaths is affected by the annual rate of new cancers. In some cases, even though the rate is going down, the number of new cases and deaths is going up. This may happen because the population is growing and aging each year. Other than that, I have concluded that the number of cancer deaths by gender (male, female) and the number of cancer deaths by race (White, Black, American Indian/Alaska Native, Asian/Pacific Islander, Hispanic) were dependent. Finally, the ANOVA test did not show a significant difference in the mean number of cancer deaths across the risk factors (alcohol, tobacco, obesity), so I cannot reject the claim that the means were the same. Based on the study, cancer cases are affected by changes in exposure to risk factors, and some cancer rates are going down. However, to maintain this trend, strategies and precautionary steps have to be taken early to avoid the risk factors. Besides that, the cancer burden can also be reduced through early detection of cancer and the management of patients who develop cancer. Many cancers have a high chance of cure if diagnosed early and treated adequately.

Presentation video