Writing Multiple-Choice Questions for Higher-level Thinking

We eLearning developers are used to the question, “Which is better, eLearning or classroom instruction?” The answer is, “It depends.” It’s the same answer if one asks, “Which are better, multiple-choice or essay questions?” Either question type is useful for assessing a variety of levels of thinking, depending on how well the designer crafts the questions. Designing multiple-choice questions that assess higher-level thinking is not as daunting a task as one might think.

What is higher-level thinking?

What do we mean by higher-level thinking? Benjamin Bloom described six levels of cognitive behavior, listed here from the most basic – Knowledge – at the bottom to the most complex – Evaluation – at the top:

Evaluation
Synthesis
Analysis
Application
Comprehension
Knowledge

Bloom’s taxonomy offers one way of looking at increasingly complex cognitive abilities. For example, Knowledge and Comprehension mean a person can recall facts or paraphrase a concept. Synthesis, on the other hand, means a person can create something new, such as an essay or a painting. (Please see the list of References at the end of this article for the sources of ideas presented here.)

J. P. Guilford offered another way of looking at cognition with his description of convergent and divergent production. Convergent thinking means someone is working with knowledge, processes, concepts, etc., that already exist; it has a certain correctness about it. When applied to test questions, convergent thinking means there is a preexisting correct answer. Verbs for convergent thinking include select, identify, calculate, label, and diagnose. Conversely, divergent thinking means there is not a preexisting correct answer. The person must take existing knowledge and create new knowledge. As Marie Hoepfl explains, verbs for divergent thinking objectives include create (a poem or story), compose (a song), etc.

Mapping Guilford’s concepts onto Bloom’s taxonomy, convergent thinking applies to Bloom’s first four levels of cognitive behavior, that is, up through Analysis, and divergent thinking applies to Bloom’s top two levels, Synthesis and Evaluation. See Table 1.

Table 1. Bloom’s taxonomy with representative verbs

Taxonomy Level	Representative Verbs
Evaluation	Critique, Summarize
Synthesis	Organize, Design
Analysis	Compare, Categorize
Application	Organize, Solve
Comprehension	Distinguish, Match
Knowledge	Identify, Label

This combination thus suggests that the designer can write multiple-choice questions for Bloom’s first four levels of cognitive behavior (Knowledge, Comprehension, Application, and Analysis) since they require a predictable or calculable answer.

On the other hand, Bloom’s top two levels – Synthesis and Evaluation – being divergent thinking, are best tested with fill-in or essay questions since a predetermined correct answer does not exist.
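As a minimal sketch of the mapping described above, the six levels can be tagged with Guilford's thinking type and then checked for multiple-choice suitability. The names BLOOM_LEVELS and suits_multiple_choice are my own illustrations, not part of Bloom's or Guilford's work.

```python
# Bloom's six levels, most basic first, each tagged with the Guilford
# thinking type assigned to it in the text above.
BLOOM_LEVELS = {
    "Knowledge": "convergent",
    "Comprehension": "convergent",
    "Application": "convergent",
    "Analysis": "convergent",
    "Synthesis": "divergent",
    "Evaluation": "divergent",
}


def suits_multiple_choice(level: str) -> bool:
    """Return True when the level has a predictable or calculable answer.

    Hypothetical helper: it only encodes the rule of thumb from the text,
    namely that convergent levels can be tested with multiple-choice items.
    """
    return BLOOM_LEVELS[level] == "convergent"


for level in BLOOM_LEVELS:
    print(level, suits_multiple_choice(level))
```

A lookup like this is only a planning aid for sorting learning objectives; it does not say anything about how well a given question is written.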

It starts with the objectives

Before we look at specific techniques, let’s be clear about one thing. We’re not talking about making multiple-choice tests artificially difficult. Rather, when the learning objectives dictate assessments at higher levels, we need the tools to meet that requirement. In the eLearning world, we are pretty much confined to multiple-choice or similar selected-response questions. Even those instructors who conduct classroom sessions may want to augment essay questions with multiple-choice in order to take advantage of some of the latter’s efficiencies. For example, compared to essay questions, multiple-choice questions can be graded faster and more reliably by people other than the instructor, and by the computer. They can also cover a broader scope of the subject in the same amount of time it would take a student to complete one essay question.

Writing higher-order multiple-choice questions

Let’s look at the way thinking skills progress, using the cold and flu for context (Table 2). At the Knowledge level we are asking the learner to merely identify or select symptoms of a cold. At the Comprehension level we might want the learner to match symptoms with their respective ailment. At the Application level the learner must do something (or determine what they would do in real life) with the knowledge they possess. Notice that even though we’re talking about diagnosis and interpretation, there is still a predetermined correct answer. That is, this still represents convergent thinking.

Table 2. Sample behaviors for each of Bloom’s levels

Taxonomy Level	Sample behavior
Evaluation	Assess the effectiveness of that protocol
Synthesis	Develop a new protocol for treating the cold
Analysis	Compare and contrast progression of cold and flu, or Determine if a patient has a cold or the flu
Application	Describe the standard process for determining if a patient has a cold or the flu
Comprehension	Match symptoms with their associated ailments
Knowledge	Identify three symptoms of a cold

Now consider Bloom’s two highest levels: Synthesis and Evaluation. These are divergent thinking. At the Synthesis level we would be asking a person to develop a new protocol for treating the cold, and at the Evaluation level we would ask them to assess the effectiveness of that protocol. Neither of those outcomes can be predetermined. Thus they are not suitable for multiple-choice questions; later I’ll suggest a way multiple-choice questions can support a pseudo-assessment of those levels.

Specific techniques

Here are some specific techniques gleaned from the literature and my own experience.

Transform existing items

You can transform existing items that were written for lower cognitive levels such as recall of facts, according to guidelines from Penn State’s Schreyer Institute. One note of caution: even if your question is written at a higher level of knowledge, if you use statements or examples that were mentioned in reading assignments or the presentation, then the student may be doing nothing more than recall.

One way to move up from the knowledge level to the comprehension level is to ask the learner to distinguish whether statements are consistent with a principle, concept, or rule. For example, say you had a knowledge-level question that merely asked the learner to select the common symptoms of the flu from a list. You could transform the question by describing a patient who presents with certain symptoms and asking the learner to determine whether those symptoms are consistent with the flu or not. Scenarios or situations like this are good ways to set up questions to assess higher-level thinking.

You could raise the question another notch by having the learner compare and contrast symptoms. For example, rather than just determining if the patient appears to have the flu, you could have the learner determine whether the patient is likely to have a cold, the flu, or severe allergies. Obviously a question like this requires careful selection of terminology so the question truly distinguishes between those learners with complete vs. partial knowledge.

So, to generalize, if you have an existing question that states the rule and then asks the learner to identify one characteristic of that rule or concept, you can often flip it by presenting the characteristic in the question stem, and then asking the learner to identify the rule or concept.

I used this technique a lot in compliance training. Compliance dilemmas do not present themselves in the real world with their label. They present themselves through people’s actions and words, and then we have to recognize what kind of situation is developing. So in our compliance training we described workplace scenarios and then asked the learner to identify what kind of compliance issue was developing and what the appropriate response was. This not only raised the questions to higher-level thinking, it also made the training much more realistic than merely categorizing or labeling terms.

Use plausible distractors and new examples

Another way to transform existing questions is to ensure you are using plausible distractors. You can often do this with anticipated wrong answers. Here is an example:

Calculate the median of the following numbers: 15, 27, 27, 44, 67, 75, and 81.

The student must recall the definition of median and then apply that definition to the list of numbers. You will recall that the median is the number at the midpoint of a distribution. That number is 44 in this question. A common mistake is to confuse the definitions of median and mean (average). Hopefully our instruction will have helped learners understand the difference, but to be sure, the mean (48) is one of the distractors. So is the mode, the number with the most instances, or 27 in this case. So we have one correct choice and four distractors, two of which are plausible if the learner is not clear on the definition of these terms.
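If you want to double-check the correct answer and the two anticipated wrong answers before publishing a question like this, a few lines of Python will do it. This is just a quick sketch using the standard statistics module; the variable names are mine.

```python
from statistics import mean, median, mode

# The list of numbers from the question stem.
numbers = [15, 27, 27, 44, 67, 75, 81]

correct_answer = median(numbers)   # middle value of the sorted list: 44
mean_distractor = mean(numbers)    # arithmetic average, 336 / 7: 48
mode_distractor = mode(numbers)    # most frequent value: 27

print("median:", correct_answer)
print("mean:", mean_distractor)
print("mode:", mode_distractor)
```

Computing the distractors this way confirms that each one is the answer a learner would reach by applying the wrong definition correctly, which is what makes it plausible.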

Interpret charts and graphs

We have probably all experienced questions on standardized tests, such as the SAT or GRE in the U.S., which showed two or three charts and graphs and asked questions that required us to interpret the meaning. If the subject matter allows, this is a good way to increase the level of thinking.

Premise-choice or Multi-logic thinking

Aiken described a premise-choice technique, and Morrison and Free described multi-logical thinking. I think they are roughly the same technique.

In this kind of question the stem contains two premises and the student must select the correct conclusion or solution. For example, let’s say we need to assess the learner’s knowledge of team-building processes. One premise could be a team development model consisting of four phases, and the second premise could be different ways of communicating in each of those phases.

A knowledge- or comprehension-level question might name a phase and ask the learner to select characteristics of that phase from a list. A higher-level question could describe the observed behaviors of team members and ask the learner to identify the preferred communication process. To answer a question like this, the learner has to first classify the team’s stage and then apply the communication rule.

Premise-choice or multi-logical questions should require a high level of discriminating judgment. These questions often use words in the stem such as best, most important, first, or most correct.

Bury the verb!

Recently I recognized a rather simple way to write multiple-choice questions for higher-level thinking. This method is totally contrary to what my English teacher taught me. Since I live in Texas, I’ll call this the Texas two-step of higher-level assessment. As the name implies, it consists of two steps:

1. Identify the unconstrained verb you would like to use, e.g., Describe, Infer, or Evaluate.
2. Then bury that verb by changing it to a noun and putting a convergent verb in front of it.

Often this will mean changing the verb to a “-tion” derivative. Here are three examples using this technique:

Select the best description.
Identify the most accurate interpretation.
Select a correctly constructed sentence.

Depending on whether the verb you bury is convergent or divergent, this technique may be a pseudo measure, but if you must use multiple-choice questions, or if you want to increase the span of their capability, this is a practical way to do it.
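The two steps above can be sketched as a small helper. Treat this as an illustration only: the NOUN_FORMS lookup, the bury_verb function, and its default wording are all hypothetical names I made up for the example.

```python
# Hypothetical lookup: each unconstrained verb mapped to its noun form.
NOUN_FORMS = {
    "describe": "description",
    "interpret": "interpretation",
    "evaluate": "evaluation",
}


def bury_verb(verb: str, convergent_verb: str = "Select",
              qualifier: str = "best") -> str:
    """Turn an unconstrained verb into a selected-response instruction.

    Step 1 is choosing the verb; step 2 swaps it for its noun form and
    puts a convergent verb (Select, Identify, ...) in front of it.
    """
    noun = NOUN_FORMS[verb.lower()]
    return f"{convergent_verb} the {qualifier} {noun}."


print(bury_verb("describe"))
print(bury_verb("interpret", "Identify", "most accurate"))
```

The choices you write still carry the real weight; the rewritten stem only changes what the learner is asked to do with them.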

Don’t give away the farm!

After going to the trouble of crafting multiple-choice questions for higher levels of thinking, be careful you don’t give away the farm. In my research for this article, I was surprised by the number of poorly written multiple-choice questions I found while randomly searching for ideas among online multiple-choice tests. It’s out of scope for this article, but I urge you to review guidelines for basic multiple-choice item construction. It is easy to find such resources on the Web.

Use higher-order tests for teaching

Finally, don’t overlook the value of higher-level multiple-choice questions for teaching. In areas where the target audience has some degree of prior knowledge, or where their life experience is relevant, I often make online courses denser by using multiple-choice exercises instead of the more traditional present-and-test format. This technique is also useful when there is room for judgment, or the preferred choice is conditional and you want the student to understand how different circumstances can affect the preferred action.

For example, a few years ago I developed a scenario-based online course on preventing sexual harassment. One of the company’s tenets was that a person should try to resolve an issue with another employee directly rather than elevating it to management. We wanted to reinforce that expectation while honoring those who may not feel comfortable taking such a direct route in a sensitive situation. So the course accepted both the preferred and other acceptable choices, with feedback that was supportive and instructive.

Remember to analyze results

Your best intentions notwithstanding, you don’t really know how well a question is going to perform until you have data to analyze after learners have taken the test. You don’t need to do a sophisticated analysis, but at a minimum you should tally how many times each choice was selected and what proportion of the respondents got the question right. These data can reveal questions that are too easy or too difficult, and whether the distractors are working the way you intended. For questions that appear to be too difficult, you should investigate further to determine whether the question is faulty or the instruction itself needs improvement.
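Here is a minimal sketch of that tally for a single question. The response data are made up for illustration, and analyze_item is a hypothetical helper name, not a function from any testing tool.

```python
from collections import Counter


def analyze_item(responses, correct_choice):
    """Tally how often each choice was picked and the proportion correct."""
    counts = Counter(responses)
    total = len(responses)
    proportion_correct = counts[correct_choice] / total if total else 0.0
    return counts, proportion_correct


# Made-up responses from eight learners to one question with choices a-d;
# "c" is assumed to be the correct choice.
responses = ["c", "c", "a", "c", "b", "c", "a", "c"]
counts, proportion_correct = analyze_item(responses, "c")

for choice in "abcd":
    print(choice, counts[choice])
print("proportion correct:", proportion_correct)
```

In this made-up data nobody picked choice d, which is the kind of signal that a distractor may not be plausible enough to do its job.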

References

Aiken, Lewis R. (1982). Writing multiple-choice items to measure higher-order educational objectives. Educational and Psychological Measurement, Vol. 42, pp. 803-806.

Bloom, B. S., Engelhart, M. D., Furst, E. J., Hill, W. H., & Krathwohl, D. R. (1956). Taxonomy of educational objectives: The classification of educational goals. Handbook I: Cognitive domain. New York: David McKay.

Bloom’s Taxonomy, downloaded from Wikipedia 11/8/2011, https://en.wikipedia.org/wiki/Bloom’s_Taxonomy

Guilford, J. P. (1967). The nature of human intelligence. New York: McGraw-Hill.

Hoepfl, Marie C. (1994). Developing and evaluating multiple-choice tests. The Technology Teacher, April 1994, pp. 25-26.

Morrison, Susan, and Kathleen Walsh Free (2001). Writing multiple-choice test items that promote and measure critical thinking. Journal of Nursing Education, January 2001, Vol. 40, No. 1, pp. 17-24.

Schreyer Institute for Teaching Excellence at Penn State, Writing multiple-choice items to assess higher order thinking. Downloaded Nov. 1, 2011.
