Statistical Inferences and Creative Thinking
Creative Thinking And Statistics
Syed Imtiaz Ahmad | 17/09/2002

How do we use statistics to think more creatively about the things we
see or record? After all, one may say that statistics is about
calculating, describing, manipulating, and interpreting mathematical
attributes of sets or populations. This may be viewed more as
formalizing than as cultivating creative thinking. Before we
raise more questions about statistics, let us say a little
about creative thinking. Let us look not only into creative
thinking but also into constructive thinking, critical thinking, and
so on. What is creative thinking? This question is asked and answered
in various ways every so often. We raise it again to refresh our
thoughts and to examine the role of statistics in creative thinking.

Creative thinking is about looking at possibilities in understanding
objects or phenomena. It generally involves describing objects or
phenomena, making projections about what is likely to happen to them
as they move through time and space, and taking actions to move them
in a desired direction. Here, the word objects refers to things that
interest us. Phenomena are what we experience about things that
interest us, not necessarily derived from what those things may be
within. The word creative in creative thinking implies emphasis on
looking at possibilities. The word constructive in constructive
thinking implies emphasis on building solutions from the
possibilities, and the word critical in critical thinking implies
emphasis on analyzing the possibilities and questioning the
interpretations based on them. Often, we may treat creative thinking,
constructive thinking, and critical thinking as one and the same,
keeping in mind the emphasis if there is one.

Let us do a little experiment. After all, statistics is often about
experiments and drawing conclusions from data gathered in
experiments. In that sense, we may say that statistics is about
discovering stories that numbers may tell. Consider, for example, a
business that is interested in knowing how its employees feel about a
corporate value that it espouses. Let us say that this value is,
“We take pride in our teamwork.” There are twenty employees.
They are asked to rate this value statement in one of the following
categories:
Strongly Disagree,
Disagree,
Neither Agree nor Disagree,
Agree, or
Strongly Agree.
In order to ‘simplify’ matters, the above choices are assigned a
numeric value ranging from 1 to 5, i.e.

| Choice | Numeric Value |
| Strongly Disagree | 1 |
| Disagree | 2 |
| Neither Agree nor Disagree | 3 |
| Agree | 4 |
| Strongly Agree | 5 |

The responses from the 20 employees in the business are recorded below
as numeric scores for convenience:

5 3 5 5 2 3 3 5 1 5
4 4 2 4 5 4 1 4 2 3    (1)

What can we discern from these responses? Do they support, “We
take pride in our teamwork,” the value espoused by the business?
We can draw some conclusions by browsing the above scores directly.
However, we can try to be a little more creative and arrange the
scores in a way that makes it easy to see the pattern and draw
conclusions. Here are the same scores arranged according to the
frequency of their occurrence:

Score       1  2  3  4  5
Frequency   2  3  4  5  6    (2)

We have opened some new possibilities in looking at the scores. We
can see easily that more than half the people (11 of 20) agree or
strongly agree with the stated value, “We take pride in our teamwork.”
One fifth (4 of 20) neither agree nor disagree with the stated value,
and one fourth (5 of 20) disagree or strongly disagree with it. In
re-arranging the scores as shown, we reduced the twenty values under
study to five groups.
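
The tally in (2) and the shares quoted above can be reproduced with a few lines of code. What follows is only a minimal sketch in Python; the variable names are our own, chosen for illustration.

```python
from collections import Counter

# Scores from (1), in the order they were recorded.
scores = [5, 3, 5, 5, 2, 3, 3, 5, 1, 5,
          4, 4, 2, 4, 5, 4, 1, 4, 2, 3]

# Count how often each score occurs, reported in score order 1..5.
counts = Counter(scores)
frequency = {score: counts[score] for score in range(1, 6)}

# Shares of the three broad groups discussed in the text.
total = len(scores)
share_agree = (frequency[4] + frequency[5]) / total      # scores 4 and 5
share_neutral = frequency[3] / total                     # score 3
share_disagree = (frequency[1] + frequency[2]) / total   # scores 1 and 2

print(frequency)  # {1: 2, 2: 3, 3: 4, 4: 5, 5: 6}
print(share_agree, share_neutral, share_disagree)
```

The twenty raw values collapse into five counts, which is exactly the reduction the re-arrangement in (2) achieves by hand.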

Handling five things in drawing conclusions is a lot easier than
handling twenty things. This is simply in the observed nature of human
beings. This observation is based on an empirical
study of individuals and organizations by a team consisting of a
psychologist and a mathematician. Their study produced a heuristic
called the “7 plus or minus 2” rule. It means that our mental skills
work best when we deal with five to nine things that require our
attention at the same time.

Attending to fewer than five things at a time, our mental skills may
be underutilized, and attending to more than nine things at a time, we
may not be able to attend to all of them well. The numbers five to nine
presumably are related to the capacity of short-term memory in
humans, that part of our memory where we retain things for ready
recall when engaged in a task. Creating piles of things and
sorting things into categories is what we do all the time. For
example, we may categorize fruits into apples, oranges, and so on
rather than talking about them as a single category of all fruits,
just as we have tried to do in the above example of scores. Rather
than working with twenty individual scores, we arranged them into
five categories corresponding to the five categories of responses.
It then became easier to talk about what the scores may tell us
regarding the espoused corporate value, “We take pride in our
teamwork.”

We could keep on analyzing the scores further in their numeric form.
However, it is often helpful to visualize what is happening, or the
implied patterns of things, by showing them in a pictorial form.
We may turn the numeric data into a graph. One such pictorial
representation is shown in Figure 1, corresponding to the numeric
data in (1).

Figure 1: A histogram for scores in (1). This is a bar graph in which
the area (height) of each bar is proportional to the frequency of
values for a particular category of data. The horizontal scale shows
the categorization scheme.
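
A picture of this kind can be approximated even in plain text. The sketch below is not the figure itself; it prints one row per response category, with a bar whose length equals the frequency. The row layout and the use of "#" for the bars are our own illustrative choices.

```python
from collections import Counter

# Scores from (1) and the category labels assigned to the values 1..5.
scores = [5, 3, 5, 5, 2, 3, 3, 5, 1, 5,
          4, 4, 2, 4, 5, 4, 1, 4, 2, 3]
labels = {
    1: "Strongly Disagree",
    2: "Disagree",
    3: "Neither Agree nor Disagree",
    4: "Agree",
    5: "Strongly Agree",
}

counts = Counter(scores)

# One row per category: label, a bar of '#' marks, then the count.
rows = [f"{labels[s]:<26} {'#' * counts[s]} {counts[s]}" for s in range(1, 6)]
print("\n".join(rows))
```

Here the bars run horizontally rather than vertically, but the reading is the same: the longer the bar, the more employees chose that category.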

The graphical view often allows us to perceive the patterns contained
in numeric data more easily. For example, looking at Figure 1, the staff
members in this business are not divided evenly between those who
strongly disagree with the value statement, "We take pride in our
teamwork," and those who strongly agree. Also, one may wonder why
as many as four persons out of twenty, a sizable share, neither
agree nor disagree with this statement. Is it because they do not
quite understand what the statement says, or that they are simply not
interested in the state of affairs? There is some food for further
thought. Five out of twenty persons disagree, two of them
strongly, that the statement, "We take pride in our
teamwork," reflects the true state of affairs in the business.

Taking the raw data shown in (1) and arranging it as shown in (2) or
Figure 1 opened up possibilities for dealing with the value statement,
"We take pride in our teamwork," for this business. It
opened the doors to creative thinking, it allowed critical thinking by
facilitating analysis, and in at least some small ways, it allowed the
business to think constructively about the possibilities.

The data in (1) includes all employees. Therefore, we can be quite
confident that our conclusions reflect the true state of affairs for
the entire business as seen from the perspective of employees. What if
the number of employees was very large, and it was considered neither
necessary nor practical to engage all employees in responding to the
value statement, "We take pride in our teamwork"? What if we
simply took the responses from a reasonable sample of employees and
drew conclusions as if they were based on the responses of all
employees? This is a new possibility worthy of consideration. However,
would we place a lot of confidence in our conclusions if we worked with
this new possibility? Well, it all depends on how representative of
all employees the chosen sample is. To raise questions
when something new enters into what we may have been doing
unquestioningly before is part of creative and critical thinking. We
will deal with the issues of taking a sample, and of drawing
conclusions about the population from which the sample is derived, in
later sections. For now, we will assume that the sample of values is a
random selection of values from the population, i.e. it is free of bias
in favor of or against any specific segment of the population.
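
To make the idea of a random selection concrete, here is a small sketch. It is only an illustration under stated assumptions: the twenty scores from (1) stand in for a much larger population, and the sample size of 8 and the seed are arbitrary choices of ours, not values from the discussion above.

```python
import random

# Hypothetical population: the 20 scores from (1) stand in for the
# responses of a much larger workforce.
population = [5, 3, 5, 5, 2, 3, 3, 5, 1, 5,
              4, 4, 2, 4, 5, 4, 1, 4, 2, 3]

rng = random.Random(2002)  # arbitrary fixed seed so the sketch is repeatable
sample_size = 8            # arbitrary illustrative choice

# Simple random sample without replacement: every employee has the
# same chance of being picked, so no segment is favored.
sample = rng.sample(population, sample_size)

# Share responding Agree (4) or Strongly Agree (5).
share_agree_population = sum(s >= 4 for s in population) / len(population)
share_agree_sample = sum(s >= 4 for s in sample) / len(sample)

print(sample)
print(share_agree_population, share_agree_sample)
```

The share computed from the sample will generally differ somewhat from the share in the whole population, and it changes from one random draw to the next. How much confidence to place in it is the question taken up in later sections.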

Sometimes, we may be interested in very simple categorizations. For
example, which way did the majority respond? Did they tend to disagree
or strongly disagree, or to agree or strongly agree? If we assume a
center between those who agree and those who disagree, then which side
weighed more heavily? We can learn this and other interesting features
of the scores by simply arranging them in increasing or decreasing
order of magnitude. The sorted scores in increasing order are:

1 1 2 2 2 3 3 3 3 4 4 4 4 4 5 5 5 5 5 5    (3)

Separating at the mid point (marked here with a vertical bar), we may
rewrite (3) as follows:

1 1 2 2 2 3 3 3 3 4 | 4 4 4 4 5 5 5 5 5 5    (4)

More than half the persons responded with agree or strongly agree. The
values on either side of the mid point are both 4, so the value at this
mid point is 4. In statistics, it is called the median.
For an even number of observations, the median is the average of the
two values in the middle. The median of a set of observations is an
indication of central tendency. When the observed values are
arranged in increasing order of magnitude, at least half of the
observations are less than or equal to this value and at least half
are greater than or equal to it. The median value of 4 for the scores
means that at least half of the persons agreed or strongly agreed with
the statement, "We take pride in our teamwork."
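
The rule for the median can be written out step by step. The sketch below follows the description above (sort, then average the two middle values when the count is even) and checks the result against the `statistics.median` function from Python's standard library.

```python
import statistics

# Scores from (1).
scores = [5, 3, 5, 5, 2, 3, 3, 5, 1, 5,
          4, 4, 2, 4, 5, 4, 1, 4, 2, 3]

ordered = sorted(scores)
n = len(ordered)

if n % 2 == 0:
    # Even number of observations: average the two middle values.
    median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2
else:
    # Odd number of observations: take the single middle value.
    median = ordered[n // 2]

# The standard library gives the same result.
assert median == statistics.median(scores)
print(median)  # 4.0
```

With twenty scores, the two middle values are the 10th and 11th in sorted order, and both are 4.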

There is another statistic, called the mode, that notes something of
interest about observed values. It is a singular categorization that
picks the value occurring most frequently. For the observations listed
in (1), the mode is 5, meaning that more persons responded by strongly
agreeing with the statement, "We take pride in our teamwork," than
responded in any other single category.
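
Finding the mode amounts to picking the most frequent value from a tally. A minimal sketch, using `Counter.most_common` from Python's standard library:

```python
from collections import Counter

# Scores from (1).
scores = [5, 3, 5, 5, 2, 3, 3, 5, 1, 5,
          4, 4, 2, 4, 5, 4, 1, 4, 2, 3]

# most_common(1) returns a one-element list holding the most frequent
# value together with its count.
mode, mode_count = Counter(scores).most_common(1)[0]
print(mode, mode_count)  # 5 6
```

In this data a single value has the highest count. When two or more values tie for the highest count, a data set has more than one mode, and this sketch would report only one of them.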

Both median and mode provide insights into what we observe. They are
statistics, and they require calculations or manipulations in order to
gain insights into what we observe. The focus is not on calculations
or manipulations. They have to be done. The focus is on gaining
insights into what we observe, and this, in some small ways, is always
likely to assist in thinking creatively, constructively, or critically
about what we observe.

The sorted arrangement also helps us spot two other simple statistics
in the given data, i.e. the minimum and maximum values (here, 1 and 5).

Now let us turn to another possibility. What if the recorded values were
not from a discrete set of possible integers such as 1, 2, 3, 4, and 5
in the preceding example? Rather, the values may take on fractions such
as 1.3 or 2.5. Consider the following example of height measurements in
feet and fractions of feet for a sample of twenty people:

5.5, 6.5, 6, 5.8, 6.1, 6.2, 5.6, 6.6, 5.4, 5.9,
5, 6.2, 6, 6.7, 4.8, 7.1, 5.3, 5.8, 6, 5.9    (5)

How do we put these values into a small number of categories? As an
example, we may define them as follows:

Category #:   1    2    3    4    5    6
Height        5    5.5  6    6.5  7    7.5    (6)

The first category gathers all values less than or equal to 5 in
height. We may say that this covers the range from 4.5 to 5, with mid
point at 4.75. The second category gathers values over 5 and less than
or equal to 5.5, with mid point at 5.25, and so on.

The values listed in (5) and categorized according to (6) produce the
following results:

Height   5    5.5  6    6.5  7    7.5
Freq.    2    3    8    4    2    1    (7)
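
The categorization can be sketched in code as follows. Each height in (6) is treated as the upper limit of its category, as described above, and `bisect_left` from Python's standard library finds the category for each measurement. The variable names are our own.

```python
from bisect import bisect_left

# Height measurements from (5), in feet.
heights = [5.5, 6.5, 6, 5.8, 6.1, 6.2, 5.6, 6.6, 5.4, 5.9,
           5, 6.2, 6, 6.7, 4.8, 7.1, 5.3, 5.8, 6, 5.9]

# Upper limits of the six categories in (6).
upper_limits = [5, 5.5, 6, 6.5, 7, 7.5]

frequencies = [0] * len(upper_limits)
for h in heights:
    # bisect_left returns the index of the first upper limit >= h, so a
    # value equal to a limit (e.g. 5.5) falls in that limit's category.
    # A height above the last limit (7.5) would need an extra category;
    # none occurs in this sample.
    frequencies[bisect_left(upper_limits, h)] += 1

print(frequencies)  # [2, 3, 8, 4, 2, 1]
```

Choosing different limits, or a different number of categories, would give a different frequency table for the same twenty measurements.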

The graphical representation of these results is shown in Figure 2.

Figure 2: Graphical representation of data listed in (5) using
categories (6)

In evaluating the results shown in (7) and Figure 2, we may draw many
conclusions as before. But these conclusions apply directly only to the
sample. Their generalization to the population from which the sample
is drawn is warranted only for a good sample, and even then these
generalizations may have to be qualified with the degree of confidence
we are able to associate with them. For example, the sample
indicates clearly that the majority of people measured are nearly six
feet or taller: 15 of the 20 heights exceed 5.5 feet. How confident are
we that this represents the true state of
affairs for the population from which the sample measurements have
been taken? This leads to questions about the sample itself. For
example, does the sample appear to be typical or normal for the kind
of thing we are measuring? We know that for a measurement like the
heights of people, there is some perceived middle or average value that
typifies most people. As we move away from this middle or average
value, the number of people having those heights diminishes. The
pattern of diminishing frequencies away from the middle value
represents a tendency often observed in nature. The resulting
pattern is called a bell curve.

The frequency distribution data in (7) and Figure 2 is redrawn in
Figure 3, showing a curve that results from connecting the top points
of all frequency value bars in Figure 2. It is indicative of a bell
shape, although it is not a very smooth bell curve. The ‘quality’ of
this bell shape determines, in some ways, the quality of our
generalizations. A better understanding of this issue will be
developed in later sections.

Figure 3: Frequency distribution of heights shown as a curve indicating
the peak and tapering of the values away from the peak.
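
One crude way to check for a peak with tapering on both sides is to confirm that the frequencies rise up to their largest value and fall after it. The sketch below does only that. It is our own rough illustration, not a measure of the quality of the bell shape and not a test of whether the sample is normal.

```python
# Frequencies from (7), in category order.
frequencies = [2, 3, 8, 4, 2, 1]

# Position of the tallest bar (first occurrence if there is a tie).
peak = frequencies.index(max(frequencies))

# Frequencies should not decrease on the way up to the peak ...
rises = all(a <= b for a, b in zip(frequencies[:peak], frequencies[1:peak + 1]))
# ... and should not increase on the way down from it.
falls = all(a >= b for a, b in zip(frequencies[peak:], frequencies[peak + 1:]))

single_peaked = rises and falls
print(peak, single_peaked)  # 2 True
```

The peak is at index 2, which is the third category (heights over 5.5 and up to 6 feet), and the counts taper off on both sides of it.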

1. Introduction
2. Creative Thinking and Statistics
3. Raw Data And Data Aggregations By Categories
4. Measures Of Central Tendency
5. Assessing Sample Values On The Basis Of Sample Statistics
6. Conclusions
7. Cited References
