Statistical Inferences and Creative Thinking

Raw Data And Data Aggregations By Categories

Syed Imtiaz Ahmad

17/09/2002

Sometimes the raw data is simply overwhelming. We may not be able to deal with it without suitable aggregation by categories, even if those categories are not obvious, as may be seen in (1) and (5) described previously. As before, these aggregations can help us see underlying patterns more easily. Consider the example of 40 data values shown in Table 1. One may say that 40 values need not be overwhelming in many situations. That is very true. What matters is not the number 40 but whatever number is found to be overwhelming in a given situation. Sometimes it is also a question of effectiveness. If a reduced set of things is likely to lead us to the same conclusions as a large set, then why not conserve our resources in measuring, handling, and computing by using the reduced set? We may also be more efficient in our work and more effective in conveying it.

Figure 3: Frequency distribution of heights shown as a curve indicating the peak and tapering of the values away from the peak.

Table 1: Raw Sample Data Values

67 53 83 69 33 17 39 74 85 41 35 70 58 49 42 43 62 40 63 21

66 79 76 27 60 51 70 96 60 71 90 37 56 45 48 54 25 75 55 44

This data represents examination scores in a class. However, the same data could equally represent a random sample of annual incomes (in thousands) for some population, or people's ages, or the number of highway accidents in various regions or over a period of time, or the number of people with AIDS in various cities, etc.
Let us summarize this data into categories or classes as shown in Table 2.

Table 2: Summarization of Raw Data into Intervals and Frequencies (Frequency Distribution)

No. (i)   Class Interval   Midpoint (mi)   Frequency (f)
1         11-20            15              1
2         21-30            25              3
3         31-40            35              5
4         41-50            45              7
5         51-60            55              8
6         61-70            65              7
7         71-80            75              5
8         81-90            85              3
9         91-100           95              1

This summarized data, or the frequency distribution of the raw data values, is more meaningful in form. It shows that most values (32 of the 40) fall within the range 31 to 80, that the most frequently occurring values lie in the interval 51-60, and that very few values lie at the low or high end.
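The summarization from Table 1 to Table 2 can be sketched in a few lines of Python. This is only an illustration: the function name `frequency_distribution` and its parameters are assumptions introduced here, while the scores and the class intervals (width 10, starting at 11) are taken from the tables.

```python
# Raw scores from Table 1.
scores = [
    67, 53, 83, 69, 33, 17, 39, 74, 85, 41, 35, 70, 58, 49, 42, 43, 62, 40, 63, 21,
    66, 79, 76, 27, 60, 51, 70, 96, 60, 71, 90, 37, 56, 45, 48, 54, 25, 75, 55, 44,
]

def frequency_distribution(values, low=11, width=10, classes=9):
    """Count how many values fall in each class interval [start, start + width - 1]."""
    table = []
    for i in range(classes):
        start = low + i * width
        end = start + width - 1
        count = sum(1 for v in values if start <= v <= end)
        table.append((start, end, count))
    return table

distribution = frequency_distribution(scores)

# Print each class with a row of stars as a crude text histogram.
for start, end, count in distribution:
    print(f"{start:3d}-{end:<3d} {count:2d} {'*' * count}")
```

The counts produced this way agree with the frequency column of Table 2, and the rows of stars give a rough preview of the bar graph of the distribution.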

The frequency distribution is often displayed pictorially as shown in Figure 4.

The frequency distribution graph in Figure 4 shows the pattern of distribution for the data values in a more noticeable form. This point has also been discussed in the preceding section. In Figure 4, we see that most of the values are clustered in the middle, and that the central class, in this case, is the interval 51-60 (despite any apparent misalignment of the horizontal scale in the figure as drawn). The value at this central point is the midpoint of 51-60, i.e. 55. Because the frequencies in Table 2 are symmetric about this class, 55 is also the mean value computed from the class midpoints and frequencies.
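As a quick check on the value 55, the mean can be computed both from the grouped data of Table 2 and from the raw scores of Table 1. The sketch below uses the midpoints exactly as listed in Table 2; the variable names are illustrative assumptions.

```python
from statistics import mean

# Raw scores from Table 1.
scores = [
    67, 53, 83, 69, 33, 17, 39, 74, 85, 41, 35, 70, 58, 49, 42, 43, 62, 40, 63, 21,
    66, 79, 76, 27, 60, 51, 70, 96, 60, 71, 90, 37, 56, 45, 48, 54, 25, 75, 55, 44,
]

# Midpoints and frequencies as listed in Table 2.
midpoints = [15, 25, 35, 45, 55, 65, 75, 85, 95]
frequencies = [1, 3, 5, 7, 8, 7, 5, 3, 1]

# Grouped mean: each midpoint weighted by its class frequency.
grouped_mean = sum(m * f for m, f in zip(midpoints, frequencies)) / sum(frequencies)

# Mean of the 40 raw values, for comparison.
raw_mean = mean(scores)

print(f"grouped mean: {grouped_mean:.3f}")  # 55.000
print(f"raw mean:     {raw_mean:.3f}")      # 55.725
```

The two values differ by less than one point, which illustrates how little is lost here by working with the nine classes instead of the forty raw values.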

What is the significance of this mean value? How can we use it to think about and articulate the characteristics of the measured scores for the sample and for the general population? We will deal with these questions in the next section. For now, we may make the following observations. If the frequency data plot takes the form of a sharp bell curve, then the sample mean value is a very good estimator of the population mean value. If the bell is stretched out, then the mean value is likely to change considerably from sample to sample, and thus the mean value by itself cannot be a very good indicator of the population mean. If the bell is not symmetrical on its sides, then the sample median may become a better indicator of the center of the population than the sample mean. We say that the bell curve is negatively skewed if it rises slowly from the start to its peak (a long tail on the left) and positively skewed if it falls slowly from the peak to its right (a long tail on the right).
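The remark about sharp and stretched-out bells can be illustrated with a small simulation. It is a sketch under stated assumptions: the populations are hypothetical normal distributions centered at 55, with standard deviations of 5 (sharp) and 20 (stretched out) chosen only for illustration, and the helper `spread_of_sample_means` is not part of the original work.

```python
import random
from statistics import mean, stdev

def spread_of_sample_means(sigma, sample_size=40, trials=500, mu=55, seed=0):
    """Draw many samples from a normal population and measure how much
    the sample mean varies from one sample to the next."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    sample_means = []
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        sample_means.append(mean(sample))
    return stdev(sample_means)

sharp_spread = spread_of_sample_means(sigma=5)    # hypothetical sharp bell
wide_spread = spread_of_sample_means(sigma=20)    # hypothetical stretched-out bell

print(f"variation of sample mean, sharp bell: {sharp_spread:.2f}")
print(f"variation of sample mean, wide bell:  {wide_spread:.2f}")
```

With the same sample size, the sample means drawn from the stretched-out population scatter more widely around 55, so any single sample mean is a less dependable guide to the population mean.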

As mentioned earlier, the shape of the frequency distribution for many phenomena takes the form of a bell curve, i.e. a frequency distribution with a peak in the middle and tapering off symmetrically on both sides. This is called a normal distribution.
The question is: how normal is a situation that looks quite normal? We may visualize that the top points of the bars in Figure 4, when connected together, produce a good bell shape, although in this case it appears to be a little stretched out. We will also examine how the shape of this bell may influence us in making abstractions or generalizations from the given values of data.

Figure 4: Frequency Distribution Graph for Data in Table 1 and 2  

The bibliographic references on statistics as well as the Qur’an and Sunnah provide very useful material on discerning patterns, and reflecting on patterns to develop insights on situations that may appear to be different on the surface but largely rest on the same basic foundations. This is discussed elsewhere in the work related to this paper [Ahmad et al, 1997].  

Returning now to the data in Table 2, we may notice that the reduced form of the data lacks the precision seemingly contained in the original data as listed in Table 1. However, in most practical cases this loss of precision does not affect the resulting statistics in any significant way. The question is whether not expressing something precisely is undesirable. The answer in many cases is no. First, the precision used in recording measurements may itself be misleading. For the data in Table 1, when a number such as 67 is recorded, how sure are we that it could not be 65 or 69 or something like it? For this particular set of score values, there may have been limitations which make the score of 67 no more a true reflection of a student's performance than, say, 65 or 69. We may be on much safer ground if we make the score somewhat fuzzy, indicative of a range of values rather than a single value. This is possibly what happens when we assign letter grades to the numeric scores recorded for students.

Let us say that we are recording temperature values in a particular place in our home. The measuring instrument may not be very precise, so that any single value we record is suspect, but we may be more certain if we record a range, given the known limitations of the measuring instrument used. The problem is further compounded by the fact that different instances of the same type of measuring instrument may not record exactly the same value. These remarks apply equally to a student's work being graded by different instructors.

It is highly unlikely that different instructors, even when teaching the same material to different sections of a course, would give the same numeric score for a given piece of work.

Making fuzzy statements about situations may often be more accurate and meaningful than making precise statements. Fuzzy values, and how to draw inferences from them, form an area that we will take up elsewhere.

An equally important consideration is the judicious use of resources. Let us say that we are given a thousand items, and somehow we are able to select a representative sample of, say, 20 or fewer items. If we are able to work with this small sample to draw whatever inferences we wish as well as we could with all one thousand, then working with a few is both efficient and effective. Efficient, because fewer resources and less energy will be used in working with a small sample. Effective, because it is a lot easier to grasp and articulate a small number of items than a large number. Small may turn out to be quite beautiful in many situations.
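The idea of working with 20 items instead of a thousand can be tried out on made-up data. In the sketch below the population of 1000 values is hypothetical (generated from a normal distribution with mean 55 and standard deviation 18, purely as an assumption), and a simple random sample stands in for the "representative sample" mentioned above.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population of 1000 items (an assumption, not data from this paper).
population = [rng.gauss(55, 18) for _ in range(1000)]

# Simple random sample of 20 items, drawn without replacement.
sample = rng.sample(population, 20)

population_mean = mean(population)
sample_mean = mean(sample)

print(f"population mean (1000 items): {population_mean:.1f}")
print(f"sample mean (20 items):       {sample_mean:.1f}")
```

How close the two means come out depends on the spread of the population and on the sample drawn, which is why the shape of the distribution matters when deciding how far a small sample can be trusted.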

