Monday, December 9, 2019

Probability and Statistics Educators - Free Samples to Students

Question: Discuss the Probability and Statistics Educators.

Answer:

Introduction: For the sample of 200 drawn, simple random sampling has been used because it is believed to eliminate bias in selection and gives every individual in the population an equal chance of being represented in the sample. Moreover, random sampling is one of the most widely used and popular methods of choosing a sample (Bryman & Bell, 2015). In my opinion it is the best method here because it is suitable for proper representation of large samples and is easy to use (Johnson & Wichern, 2014). In addition, it is based on the mathematical concept of probability and does not require extensive detailed information.

Statistic                   Alcohol     Meals       Fuel        Phone
Mean                        1092.39     1067.13     1790.73     1348.785
Mode                        0           0           0           1200
Median                      522         720         1440        1080
Standard Deviation          1372.434    1332.688    1587.192    1245.331
Standard Error              97.04572    94.23528    112.2314    88.05819
Sample Variance             1883574     1776057     2519179     1550849
Range                       10428       9600        10320       8400
Minimum                     0           0           0           0
Maximum                     10428       9600        10320       8400
Skewness                    2.360175    3.060723    1.82877     3.11862
Kurtosis                    10.24997    12.4977     5.042528    13.6592
Confidence Level (95.0%)    191.3699    185.8279    221.3155    173.6469
Sum                         218478      213426      358146      269757

Table 1.1: Descriptive Statistics of Variables

The appropriate measure of variation in this case is the standard deviation, because all four variables are expenditures incurred. Moreover, the standard deviation helps in studying the variability in the data (Cressie, 2015). As per the data, fuel shows the maximum deviation, 1587.19 AUD of annual expenditure, followed by alcohol (1372.43 AUD) and meals (1332.69 AUD), with the least deviation in annual expenditure on phone, 1245.33 AUD. The standard deviation expresses how far observations typically lie from the mean (Ravid, 2014). A smaller deviation therefore indicates less fluctuation in the amount spent, and vice versa.
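The Standard Error and Confidence Level (95.0%) rows of Table 1.1 follow from the standard deviation and the sample size of 200. The sketch below assumes the usual formulas, SE = s / sqrt(n) and margin = t x SE. The critical value 1.972 (Student t, 199 degrees of freedom) is an approximation supplied here, not a figure from the source, and the function names are illustrative only.

```python
import math

N = 200  # sample size stated in the text

# Standard deviations copied from Table 1.1 (annual expenditure, AUD)
STD_DEV = {
    "Alcohol": 1372.434,
    "Meals": 1332.688,
    "Fuel": 1587.192,
    "Phone": 1245.331,
}

# Assumption: approximate two-sided 95% critical value of Student's t
# with N - 1 = 199 degrees of freedom (not given in the source).
T_CRIT_95 = 1.972


def standard_error(sd, n):
    """Standard error of the mean: sd / sqrt(n)."""
    return sd / math.sqrt(n)


def margin_95(sd, n, t_crit=T_CRIT_95):
    """Half-width of the 95% confidence interval for the mean."""
    return t_crit * standard_error(sd, n)


for name, sd in STD_DEV.items():
    se = standard_error(sd, N)
    print(f"{name:8s} SE = {se:9.4f}   95% margin = {margin_95(sd, N):9.4f}")
```

The printed values should agree with the corresponding rows of Table 1.1 up to rounding of the critical value. The 95% interval for each mean is then the mean plus or minus the margin.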
The box plots given in part 2 of the question highlight that variability is greatest for fuel, followed by alcohol, judging from the upper quartile and median. Moreover, the data are more compressed in the lower quartile than in the upper quartile (Hahs-Vaughn & Lomax, 2013). Also, the descriptive statistics in Table 1.1 show that the mean exceeds the median for all four variables (alcohol, meals, fuel and phone), and the median exceeds the mode for all except phone, whose mode of 1200 lies above its median of 1080. This indicates that the data have a longer upper tail (positive skew, visible in the upper quartile of the box plot). It also underlines that annual expenditure on these four variables is high.

Classes           Frequency   Percentage   Cumulative %
0-400             25          12.50%       12.50%
400-800           51          25.50%       38.00%
800-1200          54          27.00%       65.00%
1200-1600         28          14.00%       79.00%
1600-2000         13          6.50%        85.50%
2000-2400         13          6.50%        92.00%
2400-2800         6           3.00%        95.00%
2800-3200         5           2.50%        97.50%
More than 3200    5           2.50%        100.00%

Table 2.1: Frequency distribution of the expenditures on Utilities

The interpretation of the histogram emphasizes that the sample of annual expenditures on utilities is not normally distributed, because in a normal distribution mean = mode = median (Mendenhall, Beaver & Beaver, 2012). On the contrary, this household data on expenditure can be summarized as:

Measure                Value
Mean on Utilities      1233
Median on Utilities    1000
Mode on Utilities      1000

Table 2.2: Central Tendency Application on Utilities

Therefore, mean > median = mode, which indicates positive skewness (Corder & Foreman, 2014). Consistent with this, the data accumulate on the left side of the histogram (refer to Task 2 of the Excel sheet). The same can be illustrated through a histogram.

The relationship of ln(texp) against ln(ataxinc) is shown using a scatter plot in Figure 3.1. The scatter is upward sloping and concentrated in one region.
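The strength of a log-log relationship like the one in Figure 3.1 can be quantified with the Pearson correlation coefficient of the two logged series. The survey data are not reproduced in this post, so the sketch below uses made-up values purely for illustration. The function `pearson_r` and the sample lists are hypothetical and are not the author's calculation.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical illustration data (AUD), NOT taken from the survey:
ataxinc = [30000, 45000, 52000, 61000, 78000, 95000]  # after-tax income
texp = [27000, 39000, 47000, 52000, 70000, 81000]     # total expenditure

# Take natural logs, as in the ln(texp) vs ln(ataxinc) scatter plot.
ln_ataxinc = [math.log(v) for v in ataxinc]
ln_texp = [math.log(v) for v in texp]

r = pearson_r(ln_ataxinc, ln_texp)
print(f"r = {r:.4f}")
```

A value of r close to 1 corresponds to the tight, upward-sloping scatter described above.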
The correlation between the logs of after-tax annual income and total expenditure is r = 0.994513. Because r lies between 0.5 and 1, the relationship between the two variables is strong and positive.

Highest Degree    Female   Male
Bachelor          20       15
Intermediate      18       32
Master            15       16
Primary           16       23
Secondary         23       22
Total             92       108

Table 4.1: Contingency Table

The contingency table shows the level of education for males as well as females. Higher education comprises the bachelor's degree and the master's degree. As per the data, the count for females is 35 (20 + 15), whereas for males it is 31 (15 + 16). This shows a difference in higher education between the males and females in the households sampled.

Highest Degree    Female   Male     Total
Bachelor          0.100    0.075    0.175
Intermediate      0.090    0.160    0.250
Master            0.075    0.080    0.155
Primary           0.080    0.115    0.195
Secondary         0.115    0.110    0.225
Total             0.460    0.540    1.000

With A = {Gender = Male} and B = {Level of Education = Master's Degree}, it can be seen that Pr(A) x Pr(B) ≠ Pr(A ∩ B): 0.540 x 0.155 = 0.0837, whereas Pr(A ∩ B) = 0.080. It can therefore be concluded that the two events Gender = Male and Level of Education = Master's Degree are not independent; their probabilities not being equal makes them dependent.

References

Bryman, A., & Bell, E. (2015). Business research methods. Oxford University Press, USA.
Corder, G. W., & Foreman, D. I. (2014). Nonparametric statistics: A step-by-step approach. John Wiley & Sons.
Cressie, N. (2015). Statistics for spatial data. John Wiley & Sons.
Hahs-Vaughn, D. L., & Lomax, R. G. (2013). An introduction to statistical concepts. Routledge.
Johnson, R. A., & Wichern, D. W. (2014). Applied multivariate statistical analysis (Vol. 4). New Jersey: Prentice-Hall.
Mendenhall, W., Beaver, R. J., & Beaver, B. M. (2012). Introduction to probability and statistics. Cengage Learning.
Ravid, R. (2014). Practical statistics for educators. Rowman & Littlefield.
