How can you know whether a learning experience (a course) teaches what it was intended to teach? How do you know whether all of the participants had equally successful experiences? There are several ways to answer those questions, but for the sake of readers who are new to eLearning I am going to give a brief history of the technology that makes evaluation work.

Basic evaluation

At first, learning outcomes were evaluated through administrative records (attendance and course completion), learner evaluations (“smile sheets”), and criterion measures (test scores). All of these are still in use, although only criterion tests provide a more or less objective indicator of learning. In the 1990s, eLearning proliferated to the point that each eLearning application required its own suite of supporting products. The US Department of Defense addressed this fragmentation by designing a software management specification to ensure near-universal compatibility among eLearning products. That specification came to be known as SCORM.


SCORM (Shareable Content Object Reference Model) is the technical specification for eLearning products, intended to ensure that nearly every learning management system (LMS) will recognize eLearning content that conforms to it. SCORM also represented an early attempt to measure whether learning outcomes were met, relying on the same instruments described above: administrative records, learner evaluations, and criterion measures.

What was wrong with criterion measures, administrative records, and SCORM? They don’t actually measure learning, and they don’t provide reliable information on which to base improvements to learning experiences. Once again, an organization within the US Department of Defense, ADL (Advanced Distributed Learning), initiated work to correct these deficiencies. The result was the Experience API.


The Experience API (xAPI) is a data standard for reporting learning activities. xAPI supports identifying places where learning interventions might be needed, measuring the impact of blended learning experiences, and documenting places where learning experiences could be improved. Work continues in the eLearning community to improve both the measurement of learning and the cross-compatibility of LMSs and learning applications. One deficiency remains, however: much eLearning still relies on multiple choice tests to assess learning. Fortunately, there is a proven and reliable method for dealing with it: item analysis.
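To make the standard concrete: an xAPI statement is a small JSON document that records an activity as an actor, a verb, and an object. The sketch below builds a minimal statement in Python; the verb IRI is a standard ADL verb, but the learner name, email, and course identifier are illustrative placeholders, not real data.

```python
import json

# A minimal xAPI statement: who (actor) did what (verb) to what (object).
# The verb IRI is from the ADL verb vocabulary; actor and object
# identifiers below are invented for illustration.
statement = {
    "actor": {
        "name": "Pat Learner",
        "mbox": "mailto:pat.learner@example.com",
    },
    "verb": {
        "id": "http://adlnet.gov/expapi/verbs/completed",
        "display": {"en-US": "completed"},
    },
    "object": {
        "id": "http://example.com/courses/intro-101",
        "definition": {"name": {"en-US": "Intro to eLearning"}},
    },
}

# Statements like this are POSTed to a Learning Record Store (LRS),
# which any xAPI reporting tool can then query.
print(json.dumps(statement, indent=2))
```

Because every statement follows this same actor–verb–object shape, records from a course, a simulation, and on-the-job activity can all land in one Learning Record Store and be analyzed together.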

Item analysis

Item analysis was (and still is) a statistical method for improving certain types of criterion tests, chiefly multiple choice. For each test item, it measures difficulty (the proportion of students who answer correctly) and discrimination (how well the item separates high scorers from low scorers). Together, these indicate what students have learned and how reliable the test is as a whole. At one time, item analysis was done with a spreadsheet and a lot of statistical calculations. Today there are apps that do the work, and xAPI data will help.
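The two classic statistics are easy to compute by hand. The sketch below uses the traditional upper/lower group method: rank students by total score, take the top and bottom fractions, and compare how each group did on each item. The response matrix is invented sample data (rows are students, columns are items, 1 = correct), and the 27% group fraction is the conventional choice, not a requirement.

```python
# Item analysis sketch: difficulty (p) and discrimination (D) per item.
# Sample data for illustration only: 6 students x 4 items, 1 = correct.
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
]

def item_analysis(matrix, group_fraction=0.27):
    n_students = len(matrix)
    n_items = len(matrix[0])
    # Rank students by total score, highest first.
    ranked = sorted(matrix, key=sum, reverse=True)
    k = max(1, round(n_students * group_fraction))
    upper, lower = ranked[:k], ranked[-k:]
    results = []
    for i in range(n_items):
        # Difficulty: proportion of all students answering item i correctly.
        p = sum(row[i] for row in matrix) / n_students
        # Discrimination: upper-group minus lower-group proportion correct.
        d = (sum(row[i] for row in upper) - sum(row[i] for row in lower)) / k
        results.append({"item": i + 1,
                        "difficulty": round(p, 2),
                        "discrimination": round(d, 2)})
    return results

for r in item_analysis(responses):
    print(r)
```

Reading the output: an item with difficulty near 0 or 1 tells you little (almost everyone missed it or got it), and an item with discrimination near zero or negative is failing to separate strong students from weak ones and is a candidate for revision.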

Ordinarily in this type of article, I provide links to software reviews on the web. That will not work with this topic: the products go by too many different names. You are welcome to do your own searching; I would recommend search terms such as “Multiple Choice”, “Test Management”, and “Exam Maker” or “Exam Builder”.


Rather than make this article any longer than it has to be, I will close by citing three excellent tutorials:

Mary Ann Tobin, PhD. Guide to Item Analysis

Jason Haag, Don’t Just Give Me All the Data – Align KPIs With xAPI

Peter Berking and Steve Foreman, xAPI and Analytics: Measuring Your Way to Success