KHOR YONG XIN's Reflection

PSDA Project 2

The title of my Project 2 is A Study on Death in Malaysia 2018. The secondary data come from the Department of Statistics Malaysia publication “Statistics on Causes of Death” for the year 2018. The sample consists of the numbers of deaths in 8 states of Malaysia, while the population is the citizens of Malaysia. This topic is interesting because it lets me test several claims: whether the mean number of male deaths differs from the mean number of female deaths; whether the classification of death (by cause of death) is independent of the state; whether all types of death are equally probable or at least one probability differs from the others; whether age and the number of deaths have a positive linear relationship; and whether that relationship can be described by a linear regression.

In this project, I have learned many methods for testing claims. They are very useful when I am dealing with a very large population, because I can perform the tests on the sample data collected to estimate the population parameters. The estimates become more accurate as the sample grows, because the standard error decreases; a sample size greater than 30 is the usual rule of thumb for a large sample.
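A minimal sketch of why a larger sample helps, using the standard error of the mean (standard deviation divided by the square root of the sample size). The standard deviation below is a hypothetical value chosen only for illustration; it is not taken from the project data.

```python
import math


def standard_error(sample_sd: float, n: int) -> float:
    """Standard error of the sample mean: s / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    return sample_sd / math.sqrt(n)


# Hypothetical standard deviation, for illustration only.
sd = 12.0

# The standard error shrinks as the sample size grows.
for n in (8, 30, 100, 1000):
    print(n, round(standard_error(sd, n), 3))
```

The loop prints one line per sample size, so the decrease can be read off directly.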

From the results, some claims were rejected and others had insufficient evidence to support them. Whether a claim is supported is decided by comparing the test statistic with the critical value. When the calculated test statistic lies in the critical region, the null hypothesis is rejected; when it does not lie in the critical region, we fail to reject the null hypothesis. Some of the tests need the degrees of freedom in order to look up the chi-square value or the t-value. The degrees of freedom play a vital role, because a wrong value changes the critical value and makes the result inaccurate. Thus, when calculating the degrees of freedom in my project, I had to be very sure that I used the correct parameters in the formula.
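The decision rule and the usual degrees-of-freedom formulas can be sketched in Python as follows. This is an illustration under stated assumptions, not the exact procedure used in the report: the function names are made up, the t-test is assumed to be the pooled-variance two-sample version, and the test statistic is a hypothetical number. The critical value 2.145 is the standard table value for 14 degrees of freedom at a 5% two-tailed significance level.

```python
def df_two_sample_t(n1: int, n2: int) -> int:
    """Degrees of freedom for a pooled-variance two-sample t-test."""
    return n1 + n2 - 2


def df_independence(rows: int, cols: int) -> int:
    """Degrees of freedom for a chi-square test of independence."""
    return (rows - 1) * (cols - 1)


def df_goodness_of_fit(categories: int) -> int:
    """Degrees of freedom for a chi-square goodness-of-fit test."""
    return categories - 1


def reject_two_tailed(test_statistic: float, critical_value: float) -> bool:
    """Reject H0 when the statistic falls in either tail of the critical region."""
    return abs(test_statistic) > critical_value


def reject_right_tailed(test_statistic: float, critical_value: float) -> bool:
    """Reject H0 when the statistic exceeds the critical value (e.g. chi-square tests)."""
    return test_statistic > critical_value


# Hypothetical example: two groups of 8 observations each.
df = df_two_sample_t(8, 8)
t_critical = 2.145   # table value for df = 14, alpha = 0.05, two-tailed
t_statistic = 1.30   # made-up test statistic, for illustration only

decision = "reject H0" if reject_two_tailed(t_statistic, t_critical) else "fail to reject H0"
print(df, decision)
```

Using the wrong degrees of freedom would mean looking up a different critical value, which can change the decision even though the test statistic is unchanged.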

Furthermore, the sample size differs between the tests because each test looks at the data in a different way. For instance, in hypothesis testing the sample size is 8, because the data are collected from the 8 states mentioned in the report. For the rest of the tests, however, the sample size is 76704, which is the sample size from Malaysia.

For the correlation and regression analysis, I found that when the conclusion is a positive relationship between the two variables, I can provide more evidence and validate the conclusion by calculating the correlation coefficient as well as the coefficient of determination. Both of them support the conclusion, and hence the results can be trusted. Below are the graphs for correlation and regression respectively.

                      c.png                   r.png
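A minimal sketch of how the correlation coefficient, the coefficient of determination and the regression line can be computed from paired data. The age and death-count values below are hypothetical and are not the project data; the function names are also only illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def least_squares_line(xs, ys):
    """Slope and intercept of the simple linear regression of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: age (x) and number of deaths (y).
ages = [10, 20, 30, 40, 50]
deaths = [3, 5, 8, 9, 12]

r = pearson_r(ages, deaths)
r_squared = r ** 2  # for simple linear regression, R^2 equals r squared
slope, intercept = least_squares_line(ages, deaths)
print(round(r, 4), round(r_squared, 4), round(slope, 4), round(intercept, 4))
```

A value of r close to +1 indicates a strong positive linear relationship, and r squared gives the share of the variation in y explained by the regression line.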

When working with RStudio, I referred to the tutorial slides given by our lecturer, Dr. Chan Weng Howe. The tutorial slides were very useful, as they helped me and saved my time when coding. However, for the regression part, the slides do not give enough information about the names of the parameters shown in the R console. The output was complicated for me, and the slides do not mention what each parameter represents, so I found out by referring to tutorials on YouTube. Lastly, for my presentation, I used PowerPoint to record the slides and added some animation to make my video more interesting and to guide the viewer to what I was presenting. After finishing my presentation, I exported it to MP4 so that our lecturer could view it easily.

In conclusion, from the results of the tests on these claims, we can conclude that the mean numbers of deaths for males and females are the same, that the states in Malaysia and the classification of death are dependent, that the types of death do not all occur in the same proportion, and that age and the number of deaths have a strong positive relationship.

 

PSDA Project 1

Probability and Statistical Data Analysis, PSDA in short, is a course about statistics and probability. There are two types of statistics: descriptive and inferential. In our first project, we used descriptive statistics to present, organize and summarize our data. Data can be divided into two types, primary data and secondary data. The data that we obtained for our Project 1 (A Study on Shopping Preference of UTM Students (Online or Offline)) are primary data, as we collected all of them ourselves.

To collect the data from respondents, we used Google Forms. By setting and organizing the questions in Google Forms, I learned how to match the types of questions to the types of data required by our task, for instance nominal, ordinal, interval and ratio. After finalizing the questions, the next thing we needed to consider was the types of graph used to represent our data, such as pie charts, bar charts and histograms. This process took a lot of time because we had to think about whether each graph was suitable for presenting our data. Here, we are very grateful to our lecturer, Dr. Chan Weng Howe, as he spent his time explaining and giving us his opinion on our project.

Through this project, I have learned how to generate data using RStudio. RStudio is a very useful application, especially for a Data Engineering student, for summarizing all the data needed and running code to produce a plot. The problem I faced was that the R tutorial alone was not enough for me, so I needed to explore other resources, especially YouTube. Exploring YouTube helped me write and run the code more quickly and smoothly when doing my project.

Furthermore, because of the Movement Control Order (MCO), our presentation could not be done in class. We could only do it online or as a recorded video. What I found out is that we can record our video presentation in PowerPoint! When I heard about this from my teammate, it surprised me, as I have used PowerPoint for so many years, from secondary school until university. Thus, this method was chosen for our video presentation.

In conclusion, through Project 1, I have learned many things that I did not know before. I will practise this knowledge more, especially RStudio, because statistics and data analysis are the core for me as a Data Engineering student. I hope that this subject, PSDA, will help me to improve myself and become a successful Data Engineer in the future.

Dinamika Malaysia

Dinamika Malaysia is a course that introduces Malaysia from the political, economic and social aspects. The course explains the historical development of Malaysia from the era of the sultanates to the modern era. Imperialism and colonialism disrupted the civilization that the people of our country had built up over a long time. As a result, groups arose to resist and to fight for the independence of the homeland. This shows the strong spirit of patriotism in their hearts.

Consequently, by studying this course, I can see how important it is for us to love our country, Malaysia, so that the effort and sacrifices of those who fought for it are not wasted.

Programming Technique 1

Programming is fundamental for all computer science students. It trains students to think critically when they face problems and need to solve them. By learning Programming Technique, I can become a programmer, and this may also help me to become a Data Engineer in the future.

This course first teaches us to develop pseudocode and flowcharts and to trace results, and later to write programs that display the results. Besides, I have gained knowledge of functions, arrays, pointers, structured data and so on. There are two types of function, predefined functions and user-defined functions, while the two types of array are 1D arrays and 2D arrays.
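The distinction between predefined and user-defined functions, and between 1D and 2D arrays, can be sketched briefly. The sketch uses Python lists as a stand-in for arrays, which is an assumption made for illustration and may differ from the language used in the course; the marks are made-up values.

```python
def row_totals(table):
    """User-defined function: return the sum of each row of a 2D list."""
    # sum() is a predefined (built-in) function.
    return [sum(row) for row in table]


# 1D array: a single row of values (hypothetical marks).
marks_1d = [70, 85, 90]

# 2D array: rows and columns, accessed as table[row][column].
marks_2d = [
    [70, 85, 90],
    [60, 75, 80],
]

print(max(marks_1d))         # predefined function applied to a 1D array
print(marks_2d[1][2])        # element in the second row, third column
print(row_totals(marks_2d))  # user-defined function applied to a 2D array
```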

Technology and Information System (Design Thinking)

My dream with regard to my course is to become a successful and professional data engineer who can help my company to improve its analytics infrastructure, solve critical problems and so on. By using my advanced analytical and problem-solving skills, I will find the simplest and easiest ways for others to resolve their difficulties.

Design thinking is significant for me because it helps to build higher-order thinking skills, communication skills, computer skills and so on. During the project discussion, we needed to think about the reasons for, and solutions to, the problems students face when programming. This enhances our critical thinking and helps us to think creatively. Besides, it also trains our communication skills as we try to accept different opinions among members. Next, we can practise computer skills such as using Microsoft Office and editing videos.

In these 4 years of studying for the Bachelor of Computer Science (Data Engineering), I will take part in many activities or campaigns that may enhance my critical thinking, communication and other soft skills. With these skills, I believe that I can be employed and become a white-collar data engineer in the future. Furthermore, I will practise the 5 processes of design thinking, which are empathize, define, ideate, prototype and test, when solving problems. As a result, I can solve problems effectively and efficiently to meet the users’ needs.

Discrete Structure

Discrete Structure is a course that covers sets, functions, probability, graph theory and Boolean algebra. In this subject I have made use of set theory when solving problems in programming. Besides, functions and relations are also fundamental for me to study. A relation is a way to associate objects of various sets, while a function is a type of relation with a specific characteristic: each element of the domain is associated with exactly one element of the codomain. Other than that, probability is important when we face problems that involve calculating how many ways something can be done. This may help me a lot in my future as a Data Engineer. As computer science students, we also need to understand and study graphs, as they are used in many situations to describe relationships and algorithms. Lastly, many elements in electrical and computer systems have an output of either 1 or 0, where 1 means true or on and 0 means false or off. This is widely used in electronic circuits to produce the respective output.
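Several of these ideas can be tried out directly in Python. The sketch below is only an illustration with made-up sets and a hypothetical helper named is_function; it shows basic set operations, a check of whether a relation is a function, and counting with permutations and combinations.

```python
import math


def is_function(relation, domain):
    """Return True if every domain element is related to exactly one value."""
    images = {}
    for a, b in relation:
        images.setdefault(a, set()).add(b)
    return all(len(images.get(a, ())) == 1 for a in domain)


# Set operations on two small example sets.
a = {1, 2, 3, 4}
b = {3, 4, 5}
print(a | b, a & b, a - b)  # union, intersection, difference

# A relation is a set of ordered pairs; only some relations are functions.
domain = {1, 2}
print(is_function({(1, "x"), (2, "y")}, domain))            # each input has one output
print(is_function({(1, "x"), (1, "y"), (2, "y")}, domain))  # input 1 has two outputs

# Counting the number of ways: choose or arrange 2 items out of 5.
print(math.comb(5, 2), math.perm(5, 2))
```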

Digital Logic

Digital logic is the study of logic gates and memory elements. In this course, I have learned the types of gates and their respective functions: the basic gates, which are the OR, NOT and AND gates; the universal gates, which are the NAND and NOR gates; and the XOR and XNOR gates. A universal gate is a combination of basic gates (NAND is AND followed by NOT, and NOR is OR followed by NOT) and can be used on its own to implement any other gate. The use of universal gates can reduce cost, as we have to stock only one type of gate, which can be reused to implement the other gates.
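The idea that a single gate type can be reused to implement other gates can be checked with a short simulation. The sketch below builds NOT, AND, OR and XOR from NAND alone and verifies them against their truth tables; the function names are illustrative only.

```python
def nand(a: int, b: int) -> int:
    """NAND gate: output is 0 only when both inputs are 1."""
    return 0 if (a and b) else 1


def not_(a: int) -> int:
    """NOT built from one NAND with both inputs tied together."""
    return nand(a, a)


def and_(a: int, b: int) -> int:
    """AND built from NAND followed by NOT."""
    return not_(nand(a, b))


def or_(a: int, b: int) -> int:
    """OR built from NAND of the inverted inputs (De Morgan's law)."""
    return nand(not_(a), not_(b))


def xor_(a: int, b: int) -> int:
    """XOR built from four NAND gates."""
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))


# Print the truth table: inputs, then AND, OR, XOR built from NAND only.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, and_(a, b), or_(a, b), xor_(a, b))
```

The same construction works with NOR as the only gate type, which is why NAND and NOR are called universal.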

Besides, for the memory elements, there are two types of temporary storage device: the latch and the flip-flop. There are three types of latch, namely the S-R, gated S-R and gated D latches, and four types of flip-flop, namely the S-R, D, J-K and T flip-flops. The difference between a latch and a flip-flop is the method used for changing their state: a latch is level-sensitive, while a flip-flop changes state only on a clock edge. Meanwhile, their similarity is that both of them are bistable.
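The difference in how the state changes can be sketched with a small simulation, under the usual textbook assumption that a gated D latch follows its input for as long as the enable signal is high, while a D flip-flop samples its input only on the rising edge of the clock. The class names and the input sequence are made up for illustration.

```python
class GatedDLatch:
    """Level-sensitive: Q follows D whenever enable is 1, otherwise holds."""

    def __init__(self):
        self.q = 0

    def update(self, d: int, enable: int) -> int:
        if enable:
            self.q = d
        return self.q


class DFlipFlop:
    """Edge-triggered: Q takes the value of D only on a 0 -> 1 clock transition."""

    def __init__(self):
        self.q = 0
        self._prev_clk = 0

    def update(self, d: int, clk: int) -> int:
        if clk == 1 and self._prev_clk == 0:
            self.q = d
        self._prev_clk = clk
        return self.q


# The same (D, clock/enable) sequence is applied to both devices.
sequence = [(1, 1), (0, 1), (0, 0), (0, 1)]

latch = GatedDLatch()
flip_flop = DFlipFlop()
latch_outputs = [latch.update(d, en) for d, en in sequence]
ff_outputs = [flip_flop.update(d, clk) for d, clk in sequence]
print(latch_outputs, ff_outputs)
```

In the second step D changes while the clock is still high: the latch output follows it immediately, whereas the flip-flop keeps its value until the next rising edge. Both devices hold their last value when nothing triggers a change, which is the bistable behaviour they share.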

Science and Technology Thinking

In terms of the main content of the Science and Technology Thinking course, my impression is that this course plays a significant role in advancing human thinking as well as science and technology. The course also explains the development of science and technology from ancient times to the present era of globalization. In my view, a person should hold firmly to his or her own religion in order to gain beneficial knowledge.

The Science and Technology Thinking course is closely related to my field of study, Data Engineering, because my field requires knowledge of science and technology that I can apply in solving problems related to programming, analysis and so on.

I expect that this course can produce undergraduates who have values and ethics once they understand the importance of the Science and Technology Thinking course. Apart from that, this subject can train undergraduates to use technology while thinking critically and rationally when facing serious problems.

One contribution of science and technology is that creative and innovative people can be produced who create things that are easier for users to access. As a result, tasks can be completed effectively and efficiently. In addition, science and technology can advance the information systems that are widely used by people around the world, regardless of age, occupation and so on.