Your car has been stalling and you take it to a mechanic who completes the work at the time specified for the estimated amount. That’s great news, but are these facts enough to determine if the mechanic is competent? Well, of course not. You also need to know if your car is actually fixed.
One of the hallmarks of competence in any field is the ability to produce needed results. In other words, does the result “work?” How does an observer determine this? By measuring the actual results against the needed results. For example, a competent carpenter making custom built-in bookshelves determines if the product fits, and does what is needed to assure that it does. A competent grant writer determines whether her efforts are succeeding, and adjusts her efforts to make good results more likely. Likewise, if your car mechanic is competent, he checks to make sure the fix works and doesn’t call you to pick it up until it does.
We build instruction so that “works” means that people learn. Adequate learning assessments tell us whether what we build “works,” and provide data to help us adjust our efforts.
How do people who build instruction most commonly measure the results of their efforts? For many, the most common measure is project completion. Others determine whether they delivered the project on time and within budget, or if the person requesting the project is satisfied with the results. Some determine whether the learner “likes” the instruction. All of these results have merit, but they aren’t enough. Too often the most critical result is missing.
One of the most (if not the most) critical results is whether what we build actually helps people achieve the learning objectives. Finished projects and satisfied stakeholders may be necessary results, but they are insufficient. Just as someone building an office building can’t be considered competent simply because the project is finished (even if on time and within budget) and the building owner is satisfied, we instructional designers cannot be considered competent if we can’t show that what we built helps learners learn.
Unfortunately, many trainers, instructors, instructional designers, and multimedia developers don’t know that they should design adequate learning assessments to accompany classroom, online, or blended instruction. In fact, many do not know how to design such assessments. Inadequate assessments fall short of measuring what’s needed, and can even cause significant problems for designers, learners, instructors, and organizations. In some cases, inadequate assessments can result in legal or other problems.
Let’s consider this scenario so we can refer to it throughout this discussion of assessment mistakes. An online Introduction to Workplace Safety course has the following learning objectives: Learners will be able to:
1. Identify the benefits of workplace safety to the company and staff
2. Interpret applicable federal and state laws and company policies related to workplace safety
3. Analyze and prevent common safety problems
3.1. Analyze and prevent ergonomics problems
3.2. Analyze and prevent accidents
3.3. Analyze and prevent electrical hazards
3.4. Analyze and prevent hazardous materials problems
4. Work effectively as part of a departmental safety team
Learners complete the two-hour course and then utilize a Jeopardy-type game to assess their skills. So far so good? We’ll see....
Learning assessments in the scheme of evaluation
Instruction is the process of guiding learners through content, activities, and practice, and monitoring their progress toward specific outcomes. Building instruction is the creation of learning experiences toward this end. Assessment helps us determine whether we achieved the desired end.
According to Benjamin Bloom (of Bloom’s Taxonomy fame), we expect well-designed instruction to help most or all learners achieve the desired results. Good instruction is specifically designed to provide the instruction, practice, feedback, and remediation needed for those results to occur. If they don’t occur, that’s an indication that the instruction is lacking.
Evaluation is the whole set of practices used to determine whether our efforts are efficient and effective. Donald Kirkpatrick’s four levels are a common way of discussing evaluation types. (See Table 1 below.) Assessment is a subset of evaluation. It asks specifically whether the instruction helped learners achieve the desired objectives, both during instruction (level 2) and, we hope, in the real world (level 3, the true goal of instruction). (Note: We are assuming that the objectives are well considered and well written, which is a huge assumption because many aren’t, but that is a related rant for another time.)
Table 1. Kirkpatrick’s four levels of evaluation
|Level|Measure|Question to be answered|
|---|---|---|
|1|Reaction|How do learners feel about the instruction?|
|2|Learning|Did learners achieve the learning objectives?|
|3|Transfer|Can learners apply the knowledge and skills to the job?|
|4|Results|What is the impact on the organization?|
In my experience with the many people with whom we work, people who build instruction often make serious mistakes when designing learning assessments, and those mistakes compromise their competence and the quality of the instruction they build. That’s a situation in serious need of remedying, in my opinion. In the rest of this article, I’ll describe some of the typical mistakes in more detail. They include assessments that are too often:
- Given only cursory attention
- Not integrated properly into the instructional design process
- The wrong type
- Not valid (enough)
- Poorly written
Given only cursory attention
This mistake is widespread. People who build instruction often don’t realize that designing adequate learning assessments is critical to designing and developing instruction. They don’t give designing learning assessments enough time or effort. And many who build instruction don’t have the depth of foundational skills (performing task analysis and writing learning objectives) needed to write adequate learning assessments, making the job that much harder.
People who work on a design team may think that they personally don’t need to understand this process, but they do need to make sure it is done well, which requires more than a cursory level of skill.
Consider the Introduction to Workplace Safety course. The team that developed this course expended a fair amount of effort developing a fun “Jeopardy-type” game as the final assessment for this course. Does the fact that they had a Flash programmer build this game and incorporate amusing sound effects rather than using a more typical multiple-choice quiz mean that the assessment received adequate attention? Actually, it doesn’t, and their efforts may have been misguided. The next few mistakes should explain why.
Not integrated properly into the instructional design process
Designing assessments should happen early, right after identifying learning objectives. Below is a typical sequence of tasks for building technology-based learning. Notice how early in the process assessments should be designed.
- Analyze tasks
- Identify learning objectives
- Design assessments
- Select instructional strategies
- Select media and delivery options
- Design content and activities
- Conduct formative evaluations
- Conduct formative and summative evaluations
- And so on ...
We want to develop assessments right after identifying learning objectives because assessments need to measure whether the objectives were met. The ideal time to make sure objectives and assessments line up properly is immediately after identifying the learning objectives. If you wait until later to develop assessments (a common oversight), other parts of the design process will likely influence the assessments. It should be the other way around: the assessments should influence the other parts of the design. Assessments designed as an afterthought are far more likely to be less meaningful or even inappropriate (as in the Jeopardy-type quiz example). Writing objectives and designing assessments go hand in hand.
Given the objectives for the Introduction to Workplace Safety course, does a Jeopardy-type game as a final assessment make sense? It might be a fun knowledge assessment to include in the course if resources permit, but it cannot measure the objectives as written. That’s because there isn’t a good match between the objectives and the type of assessment.
The wrong type
There are two primary assessment formats: performance assessments and “test” assessments. The former involves assessing performance in a more realistic way (in situ). The latter involves paper or computer-based forms with multiple-choice, matching, fill-in-the-blank, and short- and long-answer (i.e., essay) questions. Test assessments are by their nature a less authentic way of assessing learning, even though they are very practical.
The optimal assessment type depends primarily on whether the objective is declarative (facts: name, list, state, match, describe, explain...) or procedural (task: calculate, formulate, build, drive, assemble, determine...). Research shows that there is a big difference between these two types — the difference between knowing about and knowing how (practical application to real world tasks).
A copier technician may need to know the names of a copier’s parts (declarative knowledge) in order to find applicable information in the troubleshooting manual. But knowing part names only goes so far. Knowing how to troubleshoot the copier (procedural knowledge) involves far deeper skills. So asking copier technicians to name parts, or even to list the troubleshooting steps, is an inadequate assessment of troubleshooting skills. The bottom line is whether they can, in fact, troubleshoot, and that requires a performance assessment. When it’s important to determine whether learners can actually perform in the real world, it’s not enough to determine only whether they know about something.
The type of assessment should match the type of objective. Declarative objectives require right or wrong answers you can easily measure with test questions. If we want to know if a person can actually perform (not just name facts), we need to design more complex types of tests (scenario questions, for example), or utilize simulations or real-life performance assessments. Table 2, below, shows how types of objectives match to their purpose and assessment type.
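Table 2. Matching objective type to purpose and assessment type
|Type of objective|Purpose|Assessment type|
|---|---|---|
|Declarative|Right answer|Test|
|Procedural|Overall measure of performance|Test or performance assessment|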
Is the Jeopardy-type game in the Introduction to Workplace Safety course likely to measure performance of the listed objectives? No. Assessing if a person can analyze and prevent common safety problems, for example, requires, at a minimum, scenarios to which the person can respond. Simulations, or real-life performance assessments (like a checklist of observable behaviors), would provide even more assurance that the learner has met the objective.
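To make the matching rule concrete, here is a minimal sketch in Python. It uses only the example verbs listed earlier in this article, and the function names and the wording of the suggestions are my own illustrative choices, not a standard tool. Classifying by the first verb alone is a crude heuristic: any objective whose verb is not on the lists is flagged for a designer to review rather than guessed at.

```python
# Verb lists are the article's own examples; they are not exhaustive.
DECLARATIVE_VERBS = {"name", "list", "state", "match", "describe", "explain"}
PROCEDURAL_VERBS = {"calculate", "formulate", "build", "drive", "assemble", "determine"}

# Suggested assessment format per objective type (wording is illustrative).
SUGGESTED_ASSESSMENT = {
    "declarative": "test questions with right or wrong answers",
    "procedural": "scenario questions, simulation, or performance assessment",
    "unclassified": "needs review by a designer",
}


def classify_objective(objective: str) -> str:
    """Classify an objective by its leading verb (a deliberately crude heuristic)."""
    words = objective.strip().lower().split()
    if not words:
        return "unclassified"
    verb = words[0]
    if verb in DECLARATIVE_VERBS:
        return "declarative"
    if verb in PROCEDURAL_VERBS:
        return "procedural"
    return "unclassified"


def suggest_assessment(objective: str) -> str:
    """Return the suggested assessment format for an objective."""
    return SUGGESTED_ASSESSMENT[classify_objective(objective)]


if __name__ == "__main__":
    # Hypothetical objectives used only to exercise the sketch.
    for obj in [
        "Name the parts of the copier",
        "Assemble the paper feed unit",
        "Analyze and prevent accidents",
    ]:
        print(f"{obj}: {classify_objective(obj)} -> {suggest_assessment(obj)}")
```

The third objective comes back as unclassified because “analyze” is not on either example list. That is intentional: a verb list can prompt the right question, but deciding whether an objective is about knowing or doing still takes human judgment.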
Not valid (enough)
It’s easy to design less-than-optimal assessments, and we see examples of these all around us. These assessments tend to measure the wrong things and their value is dubious or even harmful. At worst, they can damage learners and organizations.
The gold standard for assessment quality is validity. A valid assessment measures what it claims to measure. For example, a copier troubleshooting assessment should measure the skills of the person doing actual or simulated troubleshooting. It’s easier than you might think to design assessments that measure something other than what is intended. Let’s say the copier troubleshooting assessment is written at too high a reading level. What is it measuring? For one thing, reading skills. Is that what the assessment designer wants to measure? Probably not. And think of the implications!
Valid assessments provide evidence that permits appropriate conclusions about whether the learner has achieved the objectives. The extent to which an assessment does this is the extent to which that information is valid for the intended purpose. For training assessments, establishing validity generally requires careful matching of job tasks, objectives, and assessment items.
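One simple way to start checking that match is a coverage check: tag each assessment item with the objectives it is meant to measure, then look for objectives no item covers and items that measure no objective. The sketch below, in Python, uses the objective numbers from the Introduction to Workplace Safety example. The item IDs and their tags are hypothetical, invented only for illustration. Coverage alone does not establish validity, but gaps in coverage are a clear warning sign.

```python
# Objective numbers from the Introduction to Workplace Safety example.
OBJECTIVES = ["1", "2", "3.1", "3.2", "3.3", "3.4", "4"]

# Hypothetical assessment items, each tagged with the objectives it targets.
ITEMS = [
    {"id": "Q1", "objectives": ["1"]},
    {"id": "Q2", "objectives": ["2"]},
    {"id": "Q3", "objectives": ["3.1"]},
    {"id": "Q4", "objectives": []},  # e.g., a trivia question tied to no objective
]


def uncovered_objectives(objectives, items):
    """Return objectives that no assessment item claims to measure."""
    covered = {obj_id for item in items for obj_id in item["objectives"]}
    return [obj_id for obj_id in objectives if obj_id not in covered]


def unmapped_items(objectives, items):
    """Return IDs of items that are not tied to any known objective."""
    known = set(objectives)
    return [item["id"] for item in items if not known & set(item["objectives"])]


if __name__ == "__main__":
    print("Objectives with no items:", uncovered_objectives(OBJECTIVES, ITEMS))
    print("Items with no objective:", unmapped_items(OBJECTIVES, ITEMS))
```

With these hypothetical items, objectives 3.2, 3.3, 3.4, and 4 have no items at all, and Q4 measures nothing the course set out to teach.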
Why should we care about having valid assessments? A major motivation is ethical concerns. If the instruction is important, and there are consequences for assessment results, it’s unethical to have assessments that don’t map to achievement of desired objectives. Adequate assessments help learners gain mastery, but inadequate assessments don’t. That’s simply unfair, and a waste of time and effort on everyone’s part. Plus it’s frustrating (or worse) to ask learners to show achievement in ways that don’t matter. Moreover, if assessments are used to make decisions about promotion, training, or other opportunities, invalid assessments are not only unethical but potentially illegal.
The higher the potential consequences, the more thought and effort are needed to assure assessment validity. If passing the assessment for the Introduction to Workplace Safety course is a prerequisite for taking more advanced safety courses, and completion of the Safety Series is part of each person’s performance review, the validity of the assessment becomes very important indeed.
Poorly written
Many assessments, even if they are the right kind, are poorly written. Two of the most common mistakes are confusing or ambiguous language and implausible distractors (the incorrect alternatives presented alongside the correct answer). A poorly written multiple-choice question automatically lowers the validity of the assessment.
Writing good performance assessments and test questions is a skill that takes training, time, and feedback. Consider the following multiple-choice question from an ethics course:
A vendor offers you 2 tickets to the World Series. Based on the rules listed in the Vendor Gift Policy, you cannot accept them:
a. Unless the tickets have no street value
b. If the vendor is expecting “quid pro quo”
c. If the gift is worth more than $25 or is considered to be an inducement
d. Unless the vendor is also a friend
e. None of the above
See any problems? Here are just a few: The language requires understanding of “quid pro quo” and “inducement” and includes confusing negatives. Choice c packs two answers into one option; d is obviously implausible; and e, “None of the above,” is not recommended as a choice. Overall, this is a very poorly written question. One of the things I commonly do before presenting assessment workshops is to review assessments used in that organization. In my experience, a great number of the multiple-choice questions I review are poorly written and pitched at too low a level.
A few conclusions and some words of advice
Designing adequate assessments is one of the most critical tasks we do. Inadequate learning assessments are at best silly and frustrating. At worst, they can damage people and organizations. Adequate learning assessments are one of the hallmarks of competence in building good instruction, and that competence is sorely needed. Adequate assessments markedly improve the quality of the instruction we build.
One reason many people who design instruction lack these skills is that many graduate-level programs don’t include in-depth training on designing assessments. Train-the-trainer and instructional design courses often talk “about” assessments (declarative knowledge) but don’t train people how to design them adequately (procedural knowledge). In addition, many people who design instruction assume that assessment design doesn’t require much effort, and this common-but-wrong thinking is passed down from one person to another. In other words, this thinking suffers from many of the same flaws that many of our training designs do.
The final assessment for the Introduction to Workplace Safety course suffered from all of the mistakes listed, even though the design team was well intentioned and the rest of the course contained good content and activities. The team designed the assessment as an afterthought, determining what would be “fun,” rather than what was needed to show that learners had achieved the objectives. Most of the listed (procedural) objectives mapped better to scenario-based multiple choice or performance assessments, not simple (declarative) test assessments. Because the assessment did not map to critical objectives, it wasn’t valid for the intended purposes, thus creating potentially unfavorable consequences for learners and the organization.
It’s easy to make mistakes when designing learning assessments, but designing adequate learning assessments is a skill well worth learning. Consider what you need to improve your competence, including courses, books, and tons of practice, until you can train others (they’ll need it and you’ll benefit as well). Adequate assessment is critical to the quality of what we produce and to its stakeholders.
How to get started
Testing and Assessment: An Employer’s Guide to Good Practices http://www.onetcenter.org/dl_files/empTestAsse.pdf
e-Learning Centre Testing and assessment online http://www.e-learningcentre.co.uk/eclipse/Resources/testing.htm
Hale, J. (2002). Performance-Based Evaluation. San Francisco: Jossey-Bass/Pfeiffer.