Project 2 Probability and Statistical Data Analysis by Robby Samudra A19EC0279
Introduction
Education is a very important aspect of life. To advance civilization and improve the quality of life, education must be prioritized. Therefore, I want to learn about the quality of education in every country in the Southeast Asia region. This research focuses mainly on each government's commitment to education in financial terms. However, I also examine the pattern in the number of students and teachers.
Methodology
The dataset was collected from the data.un.org platform and from http://statweb.stanford.edu/~sabatti/data.html. The data provide the number of students enrolled at three different levels, the number of teachers at three different levels, and the expenditure on education as a percentage of GDP and as a percentage of government expenditure. The target is all 10 countries in the Southeast Asia region. The statistical analysis in this project uses a 2-sample hypothesis test, correlation, regression, and the chi-square test of independence.
Hypothesis testing 2-sample
The number of students registered at the primary level is greater than at the secondary level. I want to know whether, with more students enrolled at the primary level, there are also more teachers at the primary level, even though the secondary level covers more areas of teaching expertise. Therefore, I want to determine whether there are more teachers at the primary level than at the secondary level, at the 5% significance level. The data cover 10 countries over 4 years, giving 40 observations for each variable (primary and secondary teachers).
Let µ1 = mean number of teachers at the primary level
Let µ2 = mean number of teachers at the secondary level
H0: µ1 - µ2 = 0
H1: µ1 > µ2
Figure 1: R calculation hypothesis 2-sample
H0 will be rejected if Z0 > Z0.05, where Z0.05 = 1.645.
Variable X is the number of teachers at the primary level, and variable Y is the number of teachers at the secondary level. Since Z0 > Z0.05, we reject H0 at the 0.05 significance level. There is sufficient evidence to conclude that the number of teachers at the primary level is greater than the number at the secondary level.
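The decision rule above can be sketched in a few lines of Python. This is only an illustration of a one-sided two-sample z-test under a large-sample assumption, not the R calculation shown in Figure 1. The function name `two_sample_z` and the numbers in `primary` and `secondary` are hypothetical and are not the project data.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def two_sample_z(x, y, alpha=0.05):
    """One-sided two-sample z-test of H0: mu1 - mu2 = 0 against H1: mu1 > mu2."""
    n1, n2 = len(x), len(y)
    # Sample variances stand in for the population variances
    # (large-sample approximation, an assumption of this sketch).
    se = sqrt(variance(x) / n1 + variance(y) / n2)
    z0 = (mean(x) - mean(y)) / se
    z_crit = NormalDist().inv_cdf(1 - alpha)  # about 1.645 when alpha = 0.05
    p_value = 1 - NormalDist().cdf(z0)
    return z0, z_crit, p_value, z0 > z_crit


# Hypothetical values for illustration only (not the project data).
primary = [52.0, 48.5, 60.1, 55.3, 49.8, 58.2, 51.7, 57.4]
secondary = [41.2, 39.8, 47.5, 44.1, 40.3, 45.9, 42.6, 43.8]

z0, z_crit, p_value, reject = two_sample_z(primary, secondary)
print(f"Z0 = {z0:.3f}, Z0.05 = {z_crit:.3f}, p = {p_value:.4f}, reject H0: {reject}")
```

With the real data, `x` and `y` would each hold the 40 observations described above, and H0 is rejected when the last returned value is `True`.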
Correlation
Figure 2: R calculation for correlation
The output above shows the correlation between expenditure on education as a percentage of GDP and as a percentage of government expenditure. The correlation coefficient is 0.69926635, which indicates a moderately strong positive relationship between the two measures: a change in one variable tends to be associated with a change in the same direction in the other.
Graph 1: Scatter plot of % expenditure on education based on GDP and Government expenditure.
The graph plots expenditure on education as a percentage of GDP against expenditure on education as a percentage of government expenditure. Here I am using a scale of 1 to 6, from low to high. The points show the level of both measures and the correlation between them.
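For reference, the Pearson correlation coefficient reported in this section can be computed by hand as in the sketch below. The function name `pearson_r` and the two short lists are hypothetical, chosen only to show the arithmetic on a 1 to 6 scale. They are not the values behind Figure 2.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical values for illustration only (not the project data).
gdp_share = [1, 2, 3, 4, 5, 6]
gov_share = [2, 1, 4, 3, 6, 5]

r = pearson_r(gdp_share, gov_share)
print(f"r = {r:.4f}")
```

A value of r close to 1 indicates a strong positive linear relationship, a value close to 0 indicates little linear relationship, and a value close to -1 indicates a strong negative one.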
Regression
Figure 3: R calculation for regression
Graph 2: Scatter plot of students in secondary level and students in primary level
Regression analysis is used to model the relationship between a response variable and one or more predictor variables. The graph above shows a simple linear regression model with the response variable y (students at the secondary level) and the predictor variable x (students at the primary level). The fitted equation is ŷ = 32.4515 - 0.6809x. The intercept b0 = 32.4515 is the predicted value of y when x = 0. The slope b1 = -0.6809 is the estimated average change in the number of students at the secondary level for each one-unit change in the number of students at the primary level, across all countries. R2 = 0.9673013 means that about 96.7% of the variation in y is explained by x, which indicates a strong linear relationship between students at the secondary level and students at the primary level.
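The quantities discussed above (intercept, slope, and R2) come from an ordinary least-squares fit. The sketch below shows how they are obtained for one predictor. The function name `fit_line` and the sample values are hypothetical, and the code is an illustration rather than the R calculation in Figure 3.

```python
def fit_line(x, y):
    """Least-squares fit of y = b0 + b1 * x; returns (b0, b1, r_squared)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    b1 = sxy / sxx                 # slope
    b0 = mean_y - b1 * mean_x      # intercept
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return b0, b1, 1 - ss_res / ss_tot


# Hypothetical values for illustration only (not the project data).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1, r_squared = fit_line(x, y)
print(f"y_hat = {b0:.4f} + {b1:.4f}x, R2 = {r_squared:.4f}")
```

The sign of `b1` gives the direction of the relationship, and `r_squared` gives the share of the variation in y that the fitted line explains.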
Chi-square test of independence
The chi-square test of independence is a procedure for testing whether two categorical variables are related. Using the 40 samples taken, I want to know whether there is any relationship between the number of students and the number of teachers in 5 different years.
H0: There is no relationship between the variables
H1: There is a relationship between the variables
Figure 5: R calculation for analysis chi-square 2
Based on the output, the chi-square statistic is X2 = 64.9964 and the critical value at the 0.05 significance level is 7.814728, which is the chi-square critical value for 3 degrees of freedom. Since X2 = 64.9964 > 7.814728, we reject the null hypothesis and conclude that there is a relationship between students and teachers across the 5 different years.
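The test statistic compared above is the sum of (observed - expected)^2 / expected over all cells of a contingency table. The sketch below computes it together with the degrees of freedom. The function name `chi_square_independence` and the 2 x 4 table are hypothetical and are not the project data. The critical value 7.814728 is the one quoted in this report for the 0.05 level.

```python
def chi_square_independence(table):
    """Chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


# Hypothetical 2 x 4 table for illustration only (not the project data).
table = [
    [20, 30, 25, 25],
    [30, 20, 25, 25],
]
CRITICAL_0_05_DF3 = 7.814728  # critical value quoted in the report

stat, df = chi_square_independence(table)
print(f"X2 = {stat:.3f}, df = {df}, reject H0: {stat > CRITICAL_0_05_DF3}")
# prints: X2 = 4.000, df = 3, reject H0: False
```

In this made-up table the statistic is below the critical value, so H0 would not be rejected. In the report the statistic of 64.9964 is far above it.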
Conclusion
From the findings of this analytical study, I can draw several conclusions. First, the numbers of students and teachers have a strong relationship. For example, the number of students at the primary level is greater than at the secondary level, and the number of teachers at the primary level is likewise greater than at the secondary level. Second, expenditure on education as a percentage of GDP and as a percentage of government expenditure have a moderately strong positive relationship: when one increases, the other tends to increase as well. Third, the number of students decreases considerably from the primary to the tertiary level. This shows that many students are still not fortunate enough to continue their studies, and each country needs to take action on this problem. Last, when expenditure on education increases, the number of students enrolled also increases, which means we need to spend more on education in order to raise its quality.