

Statistical Inferences and Creative Thinking

Measures Of Central Tendency

Syed Imtiaz Ahmad

17/09/2002

We begin by raising issues and asking questions about phenomena in nature: things that are not necessarily man-made, although man may have contributed positively or negatively to them. For our purpose, phenomena are things that we observe in time and space, e.g. the amount of rainfall, deaths from accidents, the incidence of certain diseases, the performance of students in classes, and so on. A thoughtful question to ask is whether there is an observable pattern, or perceived nature of things, in the phenomena. Why should we look for such patterns? From the thinking perspective, it is simply to gain insight into what happens. For example, it may be comforting to know the pattern that rainfall follows in a given region as we move in time through the year, or the pattern of rainfall as we move in space from region to region.

Knowledge brings comfort. Knowing what may happen next, in time or space, is at least comforting even if we are not able to do much about it. Knowledge is also a potential source of power. We can use it to be in the right place at the right time, to improve the quality of life, to make financial gains, or to avoid financial losses. In addition to discovering useful features of phenomena and making use of these features in time of need, we may also examine whether certain phenomena show a cause and effect relationship.

Once a pattern is discovered, it can be improved further with repeated observations, and used as a basis to make inferences about new observations.

Experience has shown that the observed values of features in many phenomena have a certain tendency, i.e. the values tend to cluster around what may be called a mid-point and gradually disperse away from this mid-point in decreasing number. This is called the central tendency. Of course, we assume that what we are observing is random, i.e. it is based on the nature of things that happen on their own rather than being driven by some systematic force aligning the values in its favor. The average of all observed values, also called the mean, is a measure of central tendency in that it identifies a mid-point. However, without knowing how the values disperse away from this mid-point, the mean value itself may not serve much useful purpose. The dispersion, or variability, of the values is measured by a statistic called the variance. Mean and variance are the common measures of central tendency and variability in observed values. These are the statistics of choice when the observed values behave as discussed in this paragraph, and the sample of values is randomly drawn so as to make it representative of the population at large.

Let the random sample of n data values be denoted as x1, x2, x3, ..., xn. Then the mean value is calculated as:

mean = (x1 + x2 + x3 + ... + xn)/n       (8)

We may also denote the mean as

mean = (1/n) Σ xi, for i = 1, 2, 3, ..., n       (9)
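
As a minimal sketch of equations (8) and (9), the mean can be computed directly from a list of values. The function name `mean` and the scores used here are illustrative assumptions; they are not the data of Table 1.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence: the sum of the values divided by n."""
    n = len(values)
    if n == 0:
        raise ValueError("the mean of an empty sample is undefined")
    return sum(values) / n


# Hypothetical student scores, chosen only for illustration.
scores = [40, 55, 60, 65, 80]
print(mean(scores))  # 60.0
```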

Applying equation (8) or (9) to the data in Table 1, the calculated value is:

mean = 55.723, or

mean = 55.73 (approximate but a better choice)       (10)

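Equations (8) and (9) can be easily adapted for the summarized (frequency distribution) data in Table 2, as described below.

mean = (f1x1 + f2x2 + f3x3 + ... + fkxk)/(f1 + f2 + f3 + ... + fk)
= (f1x1 + f2x2 + f3x3 + ... + fkxk)/n       (11)

or
mean = (1/n) Σ fj xj, for j = 1, 2, 3, ..., k       (12)

where fj is the frequency of the j-th value xj, k is the number of distinct values or classes, and n is the total of the frequencies.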

Using (11) or (12), the calculated value is:

mean = 55       (13)

Note that the value in (13) differs only marginally from the value in (10). Generally, classifying raw data into frequency distribution form has only a marginal effect on the calculated statistics.
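
The effect of grouping can be sketched as follows. The raw scores, the class boundaries, and the use of class mid-points as the xj values are all assumptions made for illustration; they do not reproduce Table 1 or Table 2.

```python
def grouped_mean(midpoints, frequencies):
    """Frequency-weighted mean: sum of f_j * x_j divided by the total frequency n."""
    if len(midpoints) != len(frequencies):
        raise ValueError("midpoints and frequencies must have the same length")
    n = sum(frequencies)
    if n == 0:
        raise ValueError("total frequency must be positive")
    return sum(f * x for f, x in zip(frequencies, midpoints)) / n


# Hypothetical raw scores.
raw = [42, 47, 51, 53, 58, 58, 64, 66, 71, 78]

# The same scores grouped into assumed classes 40-49, 50-59, 60-69, 70-79,
# each represented by its mid-point.
midpoints = [44.5, 54.5, 64.5, 74.5]
frequencies = [2, 4, 2, 2]

raw_mean = sum(raw) / len(raw)
print(raw_mean)                              # 58.8
print(grouped_mean(midpoints, frequencies))  # 58.5
```

In this small example the grouped mean differs from the raw mean by 0.3, which is in line with the remark that grouping changes the statistic only marginally.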

What is important here is not simply to calculate the mean but to determine how usefully it characterizes the sample values and the population that the sample represents. Computational aids for statistics are within easy reach via a calculator or through software packages in the Microsoft Windows environment. However, the relative ease with which these statistics can be derived should not lead to their thoughtless and misleading usage.

For the data in Table 1, the calculated mean in (10) is 55.73. Let us simply use mean=55 in our discussion. How well does this mean value characterize the data shown in Table 1? Can we use this mean value to say that the majority of students are scoring at or near 55? Of course, we can answer this question by using the methods discussed in the previous section. Here, we would like to draw inferences by using the mean and the related statistic of variance. In addition to using the mean to characterize the sample, we are interested in finding out whether we can generalize. After all, the purpose of statistics is not simply to calculate a statistic. It is more to see whether we can draw broader inferences from what we have calculated. In addition to asking whether a sample mean characterizes the population from which the sample is drawn, we should also ask whether the mean we have calculated would repeat its value, or very nearly repeat its value, if we recalculated it from many more samples.

We know that for data on the entire population, the calculated mean value is exact. For sample data from the population, the calculated mean is only an estimate of the population mean. It is a trustworthy estimate of the population mean if the sample is random in character. A sample is called random if every member of the population has an equal chance of being included in the sample and each selection of data is made independently of all others.

How good is a sample mean in terms of the likelihood that most values encountered in the population would be in proximity of this mean? In order to find an answer, we develop a measure of variation from the mean value. Let us start by determining how sample values deviate from the calculated mean. We calculate the difference of each sample value from the mean. The differences, or deviations, take both positive and negative values depending on whether a sample value is larger or smaller than the calculated mean. If we simply sum the differences, the positive values cancel the negative values and the total is always zero, which tells us nothing about the accumulated difference from the mean. The accumulated difference can instead be found by squaring each difference from the mean, summing the squares, taking the average, and finally taking the square root to obtain a value that can be compared to the mean. This is called the standard deviation. Squaring makes every difference positive, so nothing cancels in the sum.

Dividing the accumulated squared difference by the number of values, n, gives the variance of the sample values themselves. To estimate the variance of the population from the sample, however, we divide by n-1. Why use n-1 and not n? We may view it as compensating for the use of sample data as opposed to data for the entire population. Given n sample data values, n-1 denotes the degrees of freedom. The degrees of freedom are derived by taking the total number of values in the sample and reducing it by the number of restrictions placed on calculations from the sample data values. Here there is one restriction: the differences are taken from the mean of the same sample, so they must sum to zero. Once we have found n-1 differences, the nth difference cannot vary freely; it is fixed by the others.

The calculation of standard deviation may now be described as follows:

Variance (of differences from the mean),
var = (1/(n-1)) Σ (xi - mean)^2, for i = 1, 2, 3, ..., n       (14)

The variance is often calculated more easily, in a single pass over the data, as:

var = (sumsq - sum*sum/n)/(n-1)       (15)

where sumsq is the sum of the squares of the given data values, sum is the sum of the given data values, and n is the number of data values. Equation (15) is algebraically identical to (14), but it can be more sensitive to round-off error when the mean is large compared with the spread of the values, because it subtracts two nearly equal numbers.

Equations (14) and (15) can be easily modified for data represented in the form of a frequency distribution.

Standard deviation is then calculated by taking the square root of the variance, i.e.

stdev = sqrt (var)        (16)
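
Equations (14), (15) and (16) can be sketched in a few lines. The function names and the scores are illustrative assumptions, not the data of Table 1; the example also checks that the deviations from the mean sum to zero, which is the restriction behind the n-1 divisor.

```python
import math


def variance_from_deviations(values):
    """Equation (14): sum of squared deviations from the mean, divided by n-1."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are needed")
    m = sum(values) / n
    return sum((x - m) ** 2 for x in values) / (n - 1)


def variance_from_sums(values):
    """Equation (15): the same quantity from the sum and the sum of squares."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are needed")
    total = sum(values)
    sumsq = sum(x * x for x in values)
    return (sumsq - total * total / n) / (n - 1)


def stdev(values):
    """Equation (16): square root of the variance."""
    return math.sqrt(variance_from_deviations(values))


# Hypothetical scores, chosen only for illustration.
scores = [40, 55, 60, 65, 80]
m = sum(scores) / len(scores)

print(sum(x - m for x in scores))        # 0.0, the deviations cancel
print(variance_from_deviations(scores))  # 212.5
print(variance_from_sums(scores))        # 212.5
print(stdev(scores))
```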

The calculated value of standard deviation for the above data is 19.28. Standard deviation is an indicator of how much the sample values may vary from the mean. How do we use this value of standard deviation in speaking about the mean value we have calculated? Can we say that the mean values we may find when we take more and more samples from the population would be near the one we have calculated? In other words, would the mean value remain stable from sample to sample? Can we come up with a statement of confidence about the calculated mean value versus the mean values from other samples? In order to answer these questions, we introduce a measure of variation for the sample mean. It is called the standard error of the mean and is calculated as follows:

standard error of the mean,
stderr = stdev/sqrt(n)        (17)
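
A minimal sketch of equation (17) follows. The function name `standard_error` and the scores are illustrative assumptions; the standard deviation uses the n-1 divisor of equation (14).

```python
import math


def standard_error(values):
    """Standard error of the mean: sample standard deviation divided by sqrt(n)."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are needed")
    m = sum(values) / n
    var = sum((x - m) ** 2 for x in values) / (n - 1)
    return math.sqrt(var) / math.sqrt(n)


# Hypothetical scores, chosen only for illustration.
scores = [40, 55, 60, 65, 80]
print(standard_error(scores))
```

Because n appears under a square root, a sample four times as large with the same standard deviation has a standard error only half as large.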

A small value for the standard error of the mean implies that the calculated mean will not differ much from means calculated with other samples from the given population. This is a significant advance, in that we can generalize the mean calculated from the given sample to the mean values of other samples drawn from the same population. If we imagine a very large, almost infinite, number of such samples whose mean values are all very nearly the same as the one just calculated, then this calculated mean may in fact be claimed as very nearly the true mean of the entire population.
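
This claim can be explored with a small simulation, sketched here under stated assumptions: the population is artificial (normally distributed values with mean 55 and standard deviation 19, loosely echoing the figures above), and the sample size and number of repetitions are arbitrary choices. The spread of many sample means is compared with the standard error estimated from a single sample.

```python
import math
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population; not the data behind Table 1.
population = [rng.gauss(55, 19) for _ in range(100_000)]

n = 50  # size of each sample

# Draw many random samples and record the mean of each one.
sample_means = [
    statistics.mean(rng.sample(population, n)) for _ in range(2000)
]

# Observed variation of the mean from sample to sample.
spread_of_means = statistics.stdev(sample_means)

# Standard error estimated from one sample alone, as in equation (17).
one_sample = rng.sample(population, n)
estimated_stderr = statistics.stdev(one_sample) / math.sqrt(n)

print("population mean:        ", statistics.mean(population))
print("mean of sample means:   ", statistics.mean(sample_means))
print("spread of sample means: ", spread_of_means)
print("stderr from one sample: ", estimated_stderr)
```

Under these assumptions the spread of the sample means should come out close to 19/sqrt(50), roughly 2.7, and the single-sample standard error should be of similar size, although it will vary from one sample to the next.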

1. Introduction

2. Creative Thinking and Statistics

3. Raw Data And Data Aggregations By Categories

4. Measures Of Central Tendency

5. Assessing Sample Values On The Basis Of Sample Statistics

6. Conclusions

7. Cited References  
