SECI12143-05 PROBABILITY & STATISTICAL DATA ANALYSIS

Welcome to this page, full of statistical values and projects implemented on real-life datasets :) This subject highlights the importance of mathematics in statistics and how it helps engineers predict data trends in order to obtain a significant profit :)

INTRODUCTION

This course consists of 8 subtopics, which are:

1. Introduction to Statistics
2. Data Description
3. Descriptive Statistics
4. Probability
5. Hypothesis Testing (Point estimation, 1 Sample and 2 Sample)
6. Chi-Square Test & Contingency Analysis
7. Correlation and Regression Analysis
8. Analysis of Variance (ANOVA)

There are 2 projects throughout the course: Project 1 implements the concepts from Chapters 1 to 4, while Project 2 implements the concepts from Chapters 5 to 8.

PROJECT 2 VIDEO

SUMMARY

This course has taught me many important things related to computer science, especially the use of statistical graphs. Each graph represents a different aspect of the analysis, and its use depends on the level of measurement of the data.

In Project 1, I learned how to implement different graphs on the same dataset. This allowed us to analyze the data further and helped us as students to understand the objectives of the research. Moreover, I also learned what questions need to be asked in a survey for data collection, because every question has a different level of measurement, which determines which graph should represent the data.
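The link between a question's level of measurement and the choice of graph can be sketched in a few lines. The course work itself was done in R; the Python below is only an illustration. The mapping `CHARTS_BY_LEVEL`, the function `suggest_charts` and the sample survey questions are all hypothetical, and the pairings follow common textbook conventions rather than the actual choices made in the project.

```python
from typing import Dict, List

# Hypothetical mapping from level of measurement to commonly used charts.
# These pairings are a general convention, not the project's actual choices.
CHARTS_BY_LEVEL: Dict[str, List[str]] = {
    "nominal": ["bar chart", "pie chart"],
    "ordinal": ["bar chart"],
    "interval": ["histogram", "box plot"],
    "ratio": ["histogram", "box plot", "scatter plot"],
}


def suggest_charts(level: str) -> List[str]:
    """Return chart types that usually suit the given level of measurement."""
    key = level.strip().lower()
    if key not in CHARTS_BY_LEVEL:
        raise ValueError(f"unknown level of measurement: {level!r}")
    # Return a copy so callers cannot modify the shared mapping.
    return list(CHARTS_BY_LEVEL[key])


# Made-up survey questions, each tagged with an assumed level of measurement.
survey = {
    "gender": "nominal",
    "satisfaction (1-5)": "ordinal",
    "age in years": "ratio",
}

for question, level in survey.items():
    print(f"{question} ({level}): {', '.join(suggest_charts(level))}")
```

A lookup table like this keeps the decision explicit: once each survey question is classified, the candidate graphs follow directly from the classification.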

Project 2 focused more on testing the relationships between different variables in a dataset. This is done to determine whether all the variables affect one another or only some of them. The statistical tests conducted consisted of hypothesis testing, correlation, regression, chi-square and analysis of variance (ANOVA). For example, our group handled a dataset of clinical records of patients diagnosed with heart failure. Different factors affect the patients, such as their serum creatinine level and whether their history records show that they were also diagnosed with diabetes. From this, we could draw conclusions about the objectives of the research, which helped us as students to understand the dataset better.
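As an illustration of one of these tests, here is a minimal sketch of a chi-square test of independence on a 2x2 contingency table. The project was coded in R; this Python version uses only the standard library. The counts are made up for demonstration and are NOT taken from the heart failure dataset, the function names are hypothetical, and no continuity correction is applied.

```python
import math


def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


def p_value_one_dof(statistic):
    """Upper-tail p-value, valid only for 1 degree of freedom (a 2x2 table)."""
    return math.erfc(math.sqrt(statistic / 2))


# Made-up counts for illustration only (not real patient data).
# Rows: diabetes yes / no. Columns: outcome A / outcome B.
observed = [
    [20, 30],
    [30, 20],
]

statistic, dof = chi_square_independence(observed)
p_value = p_value_one_dof(statistic)

print(f"chi-square = {statistic:.3f}, dof = {dof}, p = {p_value:.4f}")
if p_value < 0.05:
    print("Reject H0: the two variables appear to be related.")
else:
    print("Fail to reject H0: no evidence of a relationship.")
```

The null hypothesis here is that the two categorical variables are independent; a small p-value suggests that they are related, which is the kind of question Project 2 asked about the variables in the dataset.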

In short, I have certainly gained a great deal of crucial knowledge that will help me understand graphical representations in my future career. I am also very fortunate because all of my teammates worked very hard and we kept helping one another to accomplish the tasks given to us. We always held meetings throughout the weekend to discuss any issues that any of us encountered, especially when coding the graphs in R. I hope that the knowledge I have gained will benefit me and society in the next semester and during my working career.