Likert scale

SCALES and LEVELS of MEASUREMENT

The measurement of variables provides the means by which a researcher collects and categorizes the
attributes, attitudes, behaviors, and other phenomena for each unit of analysis into meaningful and
manageable numerical indices. These basic data provide the raw material for descriptive analyses
and serve as the basis for inferential statistics. The types of statistics used to describe and summarize
data as well as the methods used to make inferences are, in part, a function of the type of data. Two
interrelated aspects of data are briefly discussed here: their scalar properties, and whether the data are
continuous or discrete. Scalar properties concern the levels of measurement associated with different
types of data. In the social sciences, these levels are called nominal, ordinal, interval and ratio, and
basically concern the relationships of the assigned numbers to what is being measured.

Nominal scales

Arrange the data into mutually exclusive categories. Each category is separate and does not have any
numerical relationship to any other category. This type of measurement merely classifies objects,
events, people, behavior, etc. Examples are: racial groups, types of crime (felony-misdemeanor,
victim-victimless) and sex.

Ordinal scales

Arrange the data in terms of “greater than” and “less than.” Here measurement (the numbers assigned
to observations, events, people) signifies an ordering. The observations, etc. can be rank ordered in
terms of one or more quantities. For example, different categories of crime (murder, assault with a
deadly weapon, burglary, drug use, speeding) can be ordered with regard to judged seriousness.

Interval scales

Arrange the data not only in terms of more or less, but also with equal distances between adjacent
numbers on the scale. Thus, the distance between any two adjacent numbers assigned to events,
observations, people, etc. is the same at every point along the scale with regard to the
characteristic(s) being measured.

Ratio scales

Include all the properties of the scales above and add an absolute zero at the origin of the scale.
This allows the numbers assigned to what is being measured to be related to each other as ratios, a
property which the other types of measurement do not possess. Thus, a score of 20 on a ratio scale
would be twice as much as a score of 10, or one-third of a score of 60, in terms of the units of
measurement. Ratio scales are not often found in social sciences, although there are scaling
techniques that may be employed to construct them. Measurements dealing with costs may be
considered to have properties of ratio scales.

There are techniques of developing measures which have ordinal, interval or ratio properties using
human judgments of the material to be scaled and statistical analysis of the judgmental data. The
important aspect of the various types of measurement discussed concerns the type of statistical
analyses that are relevant. In general, the progression from nominal to ratio data allows for more
statistical and mathematical analysis. Another important aspect of measures with regard to statistical
manipulation is whether the measures are continuous or discrete. Continuous measures are those in
which there are theoretically an infinite or very large number of data points that can be applied, at
least at the ordinal level–that is, what is being measured has continuous gradations throughout the
range of numbers assigned. Discrete measures deal with a small number of categories or
classifications of what is being measured with no finer distinctions. The most common discrete data
are in the form of two categories (a dichotomy). Time, ability, attitudes, and crime seriousness may be
considered continuous data. Race, sex, and the management style of a police or probation department
may be considered discrete data. At times, continuous data may be collapsed into a few discrete
categories. For statistical purposes, however, they can still be treated as a continuous measure. On the
other hand, discrete data sometimes may be considered to represent an underlying continuous
measure and may be treated as such statistically.
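
As a rough illustration of how the level of measurement limits the statistics that make sense, the sketch below summarizes four small made-up data sets, one per level. All variable names and values are hypothetical and are not taken from any real study.

```python
from statistics import mean, median, mode

# Nominal: categories only, so the mode is the only sensible "average".
crime_type = ["felony", "misdemeanor", "felony", "felony", "misdemeanor"]
nominal_summary = mode(crime_type)

# Ordinal: values can be ranked, so the median is also meaningful.
seriousness_rank = [1, 2, 2, 3, 5]  # 1 = least serious, 5 = most serious
ordinal_summary = median(seriousness_rank)

# Interval: equal distances between points, so differences and means make sense.
temperature_c = [10.0, 20.0, 30.0]
interval_summary = mean(temperature_c)

# Ratio: a true zero, so ratios of values are meaningful as well.
cost_dollars = [10.0, 20.0, 60.0]
ratio_of_costs = cost_dollars[1] / cost_dollars[0]  # a cost of 20 is twice a cost of 10

print(nominal_summary, ordinal_summary, interval_summary, ratio_of_costs)
```

Each step up the hierarchy keeps the statistics allowed at the lower levels and adds new ones.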

MEASURING ATTITUDES/OPINIONS

An attitude is a person’s feeling toward and evaluation of some object or event. Attitudes have two
important aspects:

Direction (positive/negative, for or against); and
Intensity (strength of feeling).

For example, you might like horses – thus, your attitude towards horses has a positive direction. If
you are crazy about horses, your attitude toward them has a high level of intensity. You would be
intensely positive toward horses. Because attitudes are so much a part of human behavior,
researchers have spent a great deal of time figuring out ways to measure attitudes.

There are numerous attitude scales. One of the more common approaches is the Likert scale.

Here’s the bottom line for the busy reader. Over time and in common usage, the term “Likert scale”
has come to be applied to things far removed from its original meaning.

Most importantly: A Likert scale is a multi-item scale, not a single item. A single item, regardless
of its format, should not be called a Likert scale.

A Likert item, which is a single item or question, should adhere to certain format requirements. An
item that is merely ordered-categorical, even if it is combined with similar items in a composite
scale, should not be called a Likert item or a Likert-type item.

People have gradually come to use the term “Likert scale” in very different ways. It is variously
applied to both groups of items and single items, and in either case there is disagreement about
what specific formats apply.

This is not good in general, since we would like mutually agreed-on definitions. Otherwise, if a
researcher says, “We used a Likert scale,” it isn’t clear what is meant.

Further, there is lack of consensus about what statistical methods are appropriate for this class of
variables. This is an important matter because such variables are often used in serious applications
like clinical trials. To make headway on the issue of what statistical methods are appropriate for such
variables (and this is the subject of some degree of controversy), we must first agree on terms.

The Origin of Likert Scales

Likert scales were originally developed by Rensis Likert, a social psychologist who introduced the
method in 1932 and worked at the University of Michigan from 1946 to 1970. Likert was concerned with
measuring psychological attitudes, and wished to do
this in a “scientific” way. Specifically, he sought a method that would produce attitude measures that
could reasonably be interpreted as measurements on a proper metric scale, in the same sense that we
consider inches or degrees Celsius true measurement scales.

Other social scientists, such as Thurstone, had already developed sophisticated methods for
measurement of psychological phenomena, but these were unsuited for Likert’s attitude research.
Likert, after trying various alternatives, gradually developed what we now call Likert scales.

Likert used a number of specific techniques to first generate items, and then select from among them
those that were valid, unidimensional (all measuring a common trait), and well discriminating. For
example, he sometimes used judges to rate items’ quality or content. All of these methods
collectively go into what is formally called Likert scaling. Without failing to appreciate Likert’s
contributions to the science of scaling, we use the term “Likert scale” in a somewhat broader sense
here to include basically any scale composed of Likert or Likert-type items. That is, we distinguish
between Likert scaling and Likert scales, the former term being more specific. (We thereby avoid
having to introduce yet another category, Likert-type scales.)

Once constructed, Likert’s scales have a format like this:

Please indicate how much you agree or disagree with each of these statements:

Example 1. A Likert scale

                        Strongly   Somewhat   Neither agree   Somewhat   Strongly
                        disagree   disagree   nor disagree    agree      agree
---------------------------------------------------------------------------------
The president is           1          2            3             4          5
doing a good job.

The Congress is            1          2            3             4          5
doing a good job.

The Secretary of           1          2            3             4          5
Defense is doing a
good job.

Here the construct being measured might be attitude towards American politics.

By Likert’s method, a person’s attitude is measured by combining (adding or averaging) their
responses across all items. For Likert, this summing or averaging across several items was essential
to producing genuine measurement.

There are several characteristics or features that define a Likert scale:

1. The scale contains several items.
2. Response levels are arranged horizontally.
3. Response levels are anchored with consecutive integers.
4. Response levels are also anchored with verbal labels which connote more-or-less evenly-spaced
   gradations.
5. Verbal labels are bivalent and symmetrical about a neutral middle.
6. In Likert’s usage, the scale always measures attitude in terms of level of agreement/disagreement
   to a target statement (but see below).

Criterion 5 usually means there is an odd number of response levels. Typically the number is 5,
though sometimes 7, 9, or 11 levels are used.

Only a scale with all these characteristics might qualify as a genuine Likert scale. We probably don’t
want to be that strict, however. In particular, it seems reasonable to apply Likert’s methodology to
domains other than attitude measurement. The view recommended here is that features 1-4 above
comprise the main requirements for what can be accurately termed a Likert scale.

This much said, we now address two of the biggest and most common confusions people make.

Common Error 1:

A Likert scale is never an individual item; it is always a set of several items, with specific format
features, the responses to which are added or averaged to produce an overall score or measurement.

A single item, even if formatted exactly as one of Likert’s items, is not a Likert scale.

The confusion is understandable, however, since each item of a Likert scale itself has scale-like
appearance. But these are definitely to be distinguished from the Likert scale proper, which is made
up of the entire set of items.

Likert items

The question then arises: so what *should* we call single items of this kind? Is there a term by which
we may distinguish them from other kinds of items, such as ordinary multiple choice ones?

If features 2 through 5 above are all present, we may justifiably call them Likert items. If only 2
through 4 are present, we might call them Likert-type items instead.

Example 2: A Likert item

How do you feel about the President’s performance in domestic affairs?

Strongly      Somewhat                   Somewhat    Strongly
disapprove    disapprove    Neutral      approve     approve

    1             2            3            4           5

Consider the example above. Here we meet all criteria 2 – 5. It seems fair to call this a Likert item,
even though it doesn’t refer to agreement/disagreement to a target statement.

How far can we legitimately broaden the definition? This is a judgment call on the part of the
researcher. It is this writer’s opinion that only criterion 5–that the anchor labels be bivalent
(distinctly two-directional) and symmetrical–may be relaxed, and then this produces a Likert-type
item. Without conditions 2-4 the item is basically not Likert-type in any sense.

The following, then, would be considered a Likert-type item.

Example 3: A Likert-type item – How often do you go out to see a movie?

                                               Very
Never     Sometimes     Average     Often      often
  1           2            3          4          5

Here the response levels are not bivalent: the lower terminus is merely “Never”. There is no exact
opposite of “Very often”. Yet the categories are reasonably interpretable as evenly spaced, especially
when associated with consecutive integers in an evenly-spaced printed format. The label for level
3, “Average” clearly denotes centrality of this response category.

Similarly, we seem reasonably justified in permitting an even number of response levels, and, along
with this, that there not be an exact middle or neutral category–provided the other criteria are
maintained.

There are shades of gray here, so it is difficult to provide a universally applicable set of rules. But
certainly the further away an item is from the criteria shown, the less inclined one should be to refer
to it as a Likert item or Likert-type item. In particular, most regrettable is the tendency, not
uncommon today, to refer to any ordered category item as Likert-type. We may treat this kind of
error rather summarily, as follows:

Common Error 2:
This is not a Likert scale, a Likert item, or a Likert-type item:

How often do you smoke cigarettes?
1. Never
2. Once in a while
3. 1-5 per day
4. More than 5 per day.

This is simply an item with ordered response levels, or an ordered-category item. This is so even
if the item is one of several that will be combined to form an aggregate scale. In this case, one simply
has a summated rating scale composed of several ordered-category variables.

Constructing a Likert scale
The first step is to specify the attitude to be measured. In this example we will use attitude toward
mathematics.

Step 1. Collect statements
Generate as many statements as possible covering all aspects of the issue (both pro and con). Do
in-depth interviews on the topic, ask colleagues, survey the literature. Here are some examples:

• Math is one of my worst subjects.
• I like doing math.
• Math is a science.
• Men are better at math than women.
• I am no good in math.
• I got good grades in math.
• I need mathematics for my future career.
• Math is difficult for me.
• I am confident that I can learn math.
• Math is an important subject to learn.
• I can handle most subjects, but not math.
• I will not need much math when I get out of school.
• These days math instruction at the high school level is of poor quality.

Step 2. Judge direction
For this step you need to recruit some judges (20 or more people). You will ask them to rate the
direction of the statement. Does the statement reflect a positive or negative attitude toward math?
You do NOT want their opinion on the item — this sometimes takes a bit of convincing.
Present the collected items in the following format:

Instructions: Please rate each of the following items with regard to its favorability toward math
(circle the appropriate number). Do not respond in terms of your own agreement or disagreement
with the statements; rather, respond in terms of the judged degree of favorableness or
unfavorableness.

                                           Very                                            Very
Item                                       unfavorable  Unfavorable  Neutral  Favorable    favorable

1. Math is one of my worst subjects.            1            2          3         4            5
2. Math is a science.                           1            2          3         4            5
3. I am confident that I can learn math.        1            2          3         4            5

Step 3. Discard neutral (or unable to judge) statements
Keep only the items where at least 90% of the judges agree as to direction (favorability rating).
Eliminate the statements rated as Neutral/Unable to judge, or those for which judges differ in their
opinions (less than 90% agreement). The following statements are not directly for or against math
and would be eliminated:

• Math is a science.
• These days math instruction at the high school level is of poor quality.
• Men are better at math than women.
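
The 90% agreement rule in Step 3 can be sketched in a few lines of code. This is only an illustration under stated assumptions: the judge ratings are invented, and the helper names `direction` and `keep_item` are hypothetical. Ratings of 1-2 are treated as unfavorable, 4-5 as favorable, and 3 as neutral.

```python
from collections import Counter

def direction(rating):
    """Map a 1-5 favorability rating to a direction label."""
    if rating < 3:
        return "unfavorable"
    if rating > 3:
        return "favorable"
    return "neutral"

def keep_item(ratings, threshold=0.90):
    """Keep an item only if enough judges agree on a non-neutral direction."""
    counts = Counter(direction(r) for r in ratings)
    label, n = counts.most_common(1)[0]
    return label != "neutral" and n / len(ratings) >= threshold

# Hypothetical ratings from 20 judges for three statements.
judge_ratings = {
    "Math is one of my worst subjects.": [1] * 12 + [2] * 7 + [3],
    "Math is a science.": [3] * 15 + [4] * 3 + [2] * 2,
    "I am confident that I can learn math.": [5] * 10 + [4] * 9 + [2],
}
kept = [item for item, ratings in judge_ratings.items() if keep_item(ratings)]
print(kept)
```

With these made-up ratings, the two statements with 95% agreement on direction are kept and “Math is a science.” is dropped as neutral.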

Step 4. Format items to measure intensity
NOTE the different instructions and labels at the top of the columns. This is how the final attitude
scale will be presented to respondents.

Instructions: Please indicate your level of agreement with each of the following items (circle the
appropriate number).

                                           Strongly                                 Strongly
Item                                       disagree   Disagree   Neutral   Agree    agree

1. Math is one of my worst subjects.           1          2         3        4        5
2. I am confident that I can learn math.       1          2         3        4        5
3. Math is difficult for me.                   1          2         3        4        5

Counterbalance the items by alternating positive and negative statements. It is OK to have more of
one type than the other, but be sure to mix them up on the form.

Step 5. Pilot test (pre-test)
Before printing the final version, pretest the form on a few people who will not be in your final
sample. There will ALWAYS be something that needs to be corrected – unclear directions, an
ambiguous item, incorrect numbering, typos, etc. It is best to find them before you print hundreds
of copies.

Scoring
After the respondent fills out the attitude survey, the researcher must reverse score the negative
items (determined in Step 2 above) so that all of the individual item scores lie on the same scale with
regard to direction. In reverse scoring, the 5 becomes 1, 4 becomes 2, 3 stays the same, 2 becomes
4 and 1 becomes 5. The reason is that we want to obtain a single score reflecting the intensity in a
single direction – that is, we want a high overall score to reflect a positive attitude and a low overall
score to indicate a negative attitude. If someone strongly agrees with “Math is difficult for me,” the
attitude toward math is negative. Although the person has circled 5 on the form, that item (being
negative) is scored as a 1.

After the scores on the negative items are reversed, sum the individual ratings. Either a total score
or the average is used to characterize the individual’s attitude.
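
The scoring procedure above can be sketched as follows. The respondent’s answers are invented for illustration, and the function names `reverse_score` and `score_respondent` are hypothetical; the items and their direction follow the Step 4 example.

```python
def reverse_score(response, low=1, high=5):
    """Reverse a rating: on a 1-5 scale, 5 -> 1, 4 -> 2, 3 -> 3, 2 -> 4, 1 -> 5."""
    return low + high - response

def score_respondent(responses, negative_items):
    """Return (total, average) after reverse scoring the negative items."""
    adjusted = [
        reverse_score(r) if item in negative_items else r
        for item, r in responses.items()
    ]
    total = sum(adjusted)
    return total, total / len(adjusted)

# Hypothetical answers to the three items shown in Step 4.
responses = {
    "Math is one of my worst subjects.": 4,
    "I am confident that I can learn math.": 2,
    "Math is difficult for me.": 5,
}
negative_items = {
    "Math is one of my worst subjects.",
    "Math is difficult for me.",
}
total, average = score_respondent(responses, negative_items)
print(total, round(average, 2))
```

Here the two negative items are reversed to 2 and 1, so the total is 5 out of a possible 15, a clearly negative attitude toward math.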

After the questionnaire is completed, each item may be analyzed separately or item responses may
be summed to create a score for a group of items. Hence, Likert scales are often called summative
scales.

Responses to a single Likert item are normally treated as ordinal data, because, especially when using
only five levels, one cannot assume that respondents perceive the difference between adjacent levels
as equidistant. When treated as ordinal data, Likert responses can be analyzed using non-parametric
tests, such as the Mann-Whitney test, the Wilcoxon signed-rank test, and the Kruskal-Wallis test.
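
To give a feel for what a rank-based test does with ordinal responses, here is a minimal sketch that computes the Mann-Whitney U statistic by direct pair counting. The two groups and their responses are made up. The sketch stops at the statistic itself; a significance level would come from published tables or a statistics package.

```python
def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: pairs where a > b, with ties counted as 0.5."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u

# Hypothetical responses to one 5-point Likert item from two groups.
engineering = [4, 5, 4, 3, 5]
art_history = [2, 3, 1, 3, 2]

u_eng = mann_whitney_u(engineering, art_history)
u_art = mann_whitney_u(art_history, engineering)
print(u_eng, u_art)
```

Only the ordering of the responses matters, not the distances between them, which is why the approach suits ordinal data. The two U values always sum to the number of pairs (here 5 x 5 = 25).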

When responses to several Likert items are summed, they may be treated as interval data measuring
a latent variable. If the summed responses fulfil the relevant assumptions, parametric statistical tests
such as the analysis of variance can be applied. As a rule of thumb, this is done only when more than
five items are summed.

Data from Likert scales are sometimes reduced to the nominal level by combining all agree
responses into an “accept” category and all disagree responses into a “reject” category. The chi-square
test, Cochran’s Q test, and McNemar’s test are common statistical procedures used after this
transformation.
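
A minimal sketch of this reduction, followed by a chi-square statistic for the resulting 2 x 2 table, is shown below. The responses are invented. Dropping neutral responses (3) is an assumption made here for illustration, since the text does not say how they should be handled.

```python
def dichotomize(responses):
    """Collapse 1-5 ratings into (accept, reject) counts.

    Assumption: neutral responses (3) are dropped; the text does not say
    how to handle them.
    """
    accept = sum(1 for r in responses if r >= 4)
    reject = sum(1 for r in responses if r <= 2)
    return accept, reject

def chi_square_2x2(table):
    """Pearson chi-square statistic (no continuity correction) for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

# Hypothetical responses to one item from two groups of ten people.
group_1 = [5, 4, 4, 5, 2, 4, 3, 5, 1, 4]
group_2 = [1, 2, 2, 4, 1, 3, 2, 5, 2, 1]

table = [dichotomize(group_1), dichotomize(group_2)]
statistic = chi_square_2x2(table)
print(table, round(statistic, 2))
```

With counts this small the chi-square approximation is rough, so the example is meant only to show the mechanics of the transformation.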

Note: The five response categories represent an ordinal level of measurement. The categories
represent an inherent order (more to less, stronger to weaker, bigger to smaller), but the numbers
assigned to the categories do not indicate the magnitude of difference between the categories in the
way that an interval or ratio scale would.

Validating the scale
There are 3 ways to demonstrate that a Likert scale is valid, that is, that it measures the attitude that
it purports to measure in a credible way.

1. Item/whole score comparison
The form with all the statements is given to at least 100 respondents. For the final scale, keep only
those statements that differentiate between the highest-scoring 25% (most positive toward math) and
the lowest-scoring 25% (most negative toward math) of respondents. A drawback of this approach
is that it requires generating a lot of items, in order to be sure to have some that differentiate the two
extreme attitude groupings.
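
The item/whole score comparison can be sketched as follows. The data set is a tiny made-up one (8 respondents rather than the 100 or more recommended above), and the function name `discrimination` is hypothetical. For each item, the sketch compares the mean response in the highest-scoring quarter with the mean in the lowest-scoring quarter.

```python
def discrimination(scores, item, fraction=0.25):
    """Mean item score in the top group minus the mean in the bottom group."""
    ranked = sorted(scores, key=sum)          # order respondents by total score
    k = max(1, int(len(ranked) * fraction))   # size of each extreme group
    low, high = ranked[:k], ranked[-k:]
    mean_low = sum(r[item] for r in low) / k
    mean_high = sum(r[item] for r in high) / k
    return mean_high - mean_low

# Hypothetical reverse-scored responses: one row per respondent, one column per item.
scores = [
    [5, 4, 3], [4, 5, 3], [4, 4, 2], [3, 3, 3],
    [3, 2, 3], [2, 2, 3], [1, 2, 2], [1, 1, 3],
]
gaps = [discrimination(scores, i) for i in range(3)]
print(gaps)
```

In this example the first two items separate the extreme groups well, while the third barely differs between them and would be a candidate for removal.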

2. External criteria
Locate groups of people likely to have strong attitudes for and against the issue, for example
engineering versus art history majors with regard to the importance of mathematics. Collect their
opinions, and, as above, for the final attitude scale keep only the statements that differentiated the
engineers from the art historians.

3. Factor analysis
Factor analysis is a statistical technique for identifying items that hang together. It requires a large
sample, and knowledge of the statistical procedure.

General points regarding Likert scales
Respondents rate their degree of agreement with the statement. Their response shows both the
direction (for or against) and intensity (strength) of their attitude. Each statement on the scale must
be clearly positive or clearly negative. The respondents may feel neutral about the statement, but the
statement itself cannot be neutral. Wording on the alternatives can vary, as can the direction. The
numbers can go in either direction, but keep the direction the same for all the items.

1                     2                3                     4                 5
Strongly disapprove   Disapprove       Uncertain             Approve           Strongly Approve

Strongly agree        Agree            Neutral               Disagree          Strongly Disagree

Favor                 Slightly favor   Not favor or oppose   Slightly oppose   Oppose

If the neutral or uncertain category is omitted, the item becomes a forced choice option. Some
researchers assume that everyone has an opinion on everything and therefore should not be allowed
to avoid making a choice. Others feel that the respondent may indeed feel neutral towards something,
and should be given that response option.

How many alternatives? The consensus is that 5-7 works best, but the rule is not rigid. Sometimes
having 3 alternatives (Agree, Neutral, Disagree) may be sufficient. In other cases the number can
be extended to 9 or 11. Increasing the number is useful when respondents are likely to avoid
checking the extreme options.

Limitations
All items, regardless of intensity, are given the same weight. With regard to the final attitude score,
strongly agreeing with “I am confident I can learn math” carries the same weight as strongly agreeing
with “Math is an important subject to learn.”

The format does not lend itself to dealing with mixed or complex attitudes. For example, “Math is
an important skill for computer programming, but of less use in politics.” Statements that fit the
requirements for a Likert scale may not be getting at more complex attitudes and feelings.

Attitude scales are of limited validity. They don’t predict behavior very well. Words on a printed
page or computer screen bear little resemblance to actual situations. Opinions on topics such as
marijuana use or hate speech restrictions are complex and multidimensional. They might not be
reducible to a series of one dimensional items. As with questionnaires in general, self-reports of
attitudes and behavior are strongly influenced by the context, format, and wording of items.

A scale can be useful when included on a questionnaire along with additional items for increasing
validity, for example, using an attitudes toward marijuana scale, along with open-ended questions
about when marijuana use might be OK, who should have access, or how its use affects
communities.

Thurstone scale
In an attempt to approximate an interval level of measurement, psychologist Louis Thurstone
developed the method of equal-appearing intervals. This technique for developing an attitude
scale compensates for the limitation of the Likert scale in that the strength of the individual items
is taken into account in computing the attitude score. It also can accommodate neutral statements.

Constructing the scale

Step 1. Collect statements on the topic from people holding a wide range of attitudes, from
extremely favorable to extremely unfavorable. For this example, we will use attitude toward the
use of marijuana. Example statements are:

• It has its place.
• Its use by an individual could be the beginning of a sad situation.
• It is perfectly healthy; it should be legalized.

Step 2. Duplicates and irrelevant statements are omitted. The rest are typed on 3 x 5 cards and given
to a group of people who will serve as judges.

Step 3. Originally, judges were asked to sort the statements into eleven (11) stacks representing the
entire range of attitudes from extremely unfavorable (1) to extremely favorable (11).

The middle stack is for statements which are neither favorable nor unfavorable (6). Only the end
points (extremely favorable and extremely unfavorable) and the midpoint are labeled. The
assumption is the intervening stacks will represent equal steps along the underlying attitude
dimension. With a large number of judges, for example, using a class or some other group to do the
preliminary ratings, it is easier to create a paper-and-pencil version.

Rate each of the following statements indicating the degree to which the statement is unfavorable
or favorable to marijuana use. Do not respond in terms of your own agreement or disagreement with
the statements; rather, respond in terms of the judged degree of favorableness or unfavorableness.
Place an X in the interval that best reflects your judgment.

For example:
1. Marijuana is OK for most people, but a few people may have problems with it.

2. If marijuana is taken safely, its effect can be quite enjoyable.

3. I think it is horrible and corrupting.

4. It is usually the drug people start on before addiction.

Remind the judges to rate favorability with regard to the target (marijuana), not to give their opinion
as to whether they agree or disagree with the statement.

Step 4. Each statement will have a numerical rating (1 to 11) from each judge, based on the stack
in which it was placed. The number or weight assigned to the statement is the average of the ratings
it received from the judges.

Statement                                                          Average rating from 20 judges
                                                                   (11 = extremely favorable)

If marijuana is taken safely, its effect can be quite enjoyable.   8.9
I think it is horrible and corrupting.                             1.6
It is usually the drug people start on before addiction.           4.9

If the judges cannot rate the item on its favorability or show a high degree of variability in their
judgments, the item is eliminated. For example, the statement “Marijuana use should be taxed
heavily” was rejected because it was ambiguous. Some judges thought it was pro-marijuana as it
implied legalization; others thought it was anti-marijuana because it advocated a heavy tax.
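
Step 4 can be sketched in code as below. The judges’ ratings are invented. Using the standard deviation as the measure of variability, with a cutoff of 2.0, is an assumption made purely for illustration, since the text does not specify how variability should be judged. The names `scale_value` and `is_ambiguous` are hypothetical.

```python
from statistics import mean, pstdev

def scale_value(ratings):
    """Weight of a statement: the average of the judges' 1-11 ratings."""
    return round(mean(ratings), 1)

def is_ambiguous(ratings, max_spread=2.0):
    """Flag a statement whose ratings vary too much across judges.

    Assumption: spread is measured by the standard deviation and 2.0 is an
    arbitrary illustrative cutoff; the text does not specify either.
    """
    return pstdev(ratings) > max_spread

# Hypothetical ratings from 10 judges (11 = extremely favorable).
ratings = {
    "I think it is horrible and corrupting.": [1, 2, 1, 2, 2, 1, 2, 2, 1, 2],
    "Marijuana use should be taxed heavily.": [9, 2, 10, 3, 9, 2, 8, 3, 9, 2],
}
weights = {s: scale_value(r) for s, r in ratings.items() if not is_ambiguous(r)}
print(weights)
```

The first statement gets a weight of 1.6 because the judges agree closely. The second is dropped because the judges split between favorable and unfavorable readings.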

Administering the scale
Here is the final form. The respondents check only the statements with which they agree. The
average ratings by the judges are shown in parentheses. These would not be included on the actual
form given to respondents. Note that the more positive statements have a higher weight.

This is a scale to measure your attitude toward marijuana. It does not deal with any other drug, so
please consider that the items pertain to marijuana exclusively. We want to know how students feel
about this topic. In order to get honest answers, the questionnaires are to be filled out anonymously.
Do not sign your name.

Please check all those statements with which you agree.

___ 1. I don’t approve of something that puts you out of a normal state of mind. (3.0)
___ 2. It has its place. (7.1)
___ 3. It corrupts the individual. (2.2)
___ 4. Marijuana does some people a lot of good. (7.9)
___ 5. Having never tried marijuana, I can’t say what effects it would have. (6.0)
___ 6. If marijuana is taken safely, its effect can be quite enjoyable. (8.9)
___ 7. I think it is horrible and corrupting. (1.6)
___ 8. It is usually the drug people start on before addiction. (4.9)
___ 9. It is perfectly healthy and should be legalized. (10.0)
___ 10. Its use by an individual could be the beginning of a sad situation. (4.1)

Scoring
The weights (i.e., favorability ratings) for the checked statements are summed and divided by the
number of statements checked. A respondent who selected #3, #7, and #8 would have an attitude
score of (2.2 + 1.6 + 4.9) / 3 = 8.7 / 3 = 2.9. Dividing by the number of statements checked (3) puts the
score on the 1-11 scale. A score of 2.9 indicates an attitude that is definitely unfavorable to
marijuana.
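
The scoring rule can be sketched as a short function. The weights are the judges’ average ratings from the final form above; the function name `thurstone_score` is hypothetical.

```python
# Judges' average ratings for the ten statements on the final form.
weights = {1: 3.0, 2: 7.1, 3: 2.2, 4: 7.9, 5: 6.0,
           6: 8.9, 7: 1.6, 8: 4.9, 9: 10.0, 10: 4.1}

def thurstone_score(checked):
    """Average the weights of the statements the respondent agreed with."""
    if not checked:
        raise ValueError("respondent checked no statements")
    return sum(weights[i] for i in checked) / len(checked)

score = thurstone_score([3, 7, 8])  # the example respondent in the text
print(round(score, 1))
```

Rounded to one decimal place, the result matches the worked example. A respondent who checks nothing has no defined score, so the sketch raises an error in that case rather than returning a number.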
