The business environment, learner profiles, the training environment, and IT infrastructure are all factors that instructional designers consider in their design plans. For many in the instructional design space, the term “big data” probably seems neither interesting nor relevant to the craft of design.

In the coursework leading to a master’s in educational technology, any discussion about using data to inform the design process is generally tied to creating courses that improve test scores. There is nothing about designing experiences to generate a specific type of data.

Where does big data fit in?

Too much and too fast for us to keep up

Let’s begin with a definition of big data:

“Big data is data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn’t fit the strictures of your database architectures.” (Source: Edd Dumbill, writing in O’Reilly Radar)

In other words, big data is the term the technology world uses to describe the problem of data-gathering tools amassing data at volumes and speeds that our current set of tools cannot parse and analyze in intelligent ways.

Big data is akin to a production and distribution process in which production moves faster than distribution. The result is overflowing shelves and storage units full of product that keeps piling up at exponential rates, with no means to organize and distribute it.

So how does this problem affect instructional design? Part of our job has been to assess the effectiveness of our designs in helping to achieve the desired outcomes of our initiatives. We have traditionally done this by measuring whether we were able to affect test scores (beginning with a pre-test and concluding with a post-test), whether learners were happy with their learning experience (measured through smile sheets), and, in the ideal fictional world, whether our initiatives impacted business results.

Why do I call this last element the “ideal fictional world”? Mostly because, aside from a few rare cases, training initiatives are hard to isolate as a significant factor in business result transformations; rarely can a manager single out training as the deciding factor. It’s also important to note that all of these data points are “after the fact” touch points. None of them is meant to be collected during the learning experience itself.

Dealing with big data in learning initiatives

In today’s digital business, using computer-generated data to support business decisions is the norm. Enter the problem of big data: there is simply too much data, gathered and delivered too quickly, to make the “right” decisions. That said, the disciplines of analytics and data science are helping businesses understand what data to focus on, given a company’s key initiatives.

Solutions to big data, regardless of the tools you’re using, will always require a strategy to mine the data you need to make the right business decisions. The sheer volume of data available, and the fact that it arrives in real time, make the science of analytics that much more exciting and relevant to intelligent business decisions.

Given the power of big data to inform business decisions, posting web content that feeds the data stream you’re mining is a skill that will rise in demand. For the most part, web content creators develop their content first and then try to find the data streams that most closely match their intentions for posting.

Feed the data stream

The alternative to that approach is to design web content specifically for the data you are looking for. This goes back to the old adage of not collecting data you’re not going to use. Given the power of the technology and toolsets available, we can now design content and build user experiences that feed a specific data stream, one aligned with the measurement of business goals and with the data that informs business decisions.

Use the right tools: take control

The introduction of TinCan to the technology-based learning industry gives the learning and development world an infrastructure with which to begin collecting, in real time (this is key), the data that we deem relevant to our businesses. If you read that sentence quickly, you may have missed the important part: the phrase “that we deem relevant.”

In other words, we have the opportunity to begin collecting real-time data during (that’s right: during) user experiences of our content, in much the same way that website analytics collects real-time data. Before TinCan there were industry-specific analytics standards; for example, the aviation industry had a standard that allowed it to collect real-time data during flight simulations and match that simulation-generated data to in-flight data to identify performance gaps in pilots.

But TinCan is the first infrastructure that gives anybody who designs web experiences for teaching and learning (including mobile web) the opportunity to dictate what data they intentionally want to gather, as opposed to simply trying to figure out how to use whatever data they happen to gather. It’s a reversal of the process.
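
To make that concrete, here is a minimal sketch of what dictating your own data can look like with TinCan: the designer decides which action is worth recording, expresses it as an actor-verb-object statement, and sends it to a Learning Record Store (LRS). The LRS endpoint, credentials, learner identity, and activity ID below are hypothetical placeholders, not a reference implementation.

```typescript
// Minimal sketch: sending one TinCan (xAPI) statement to an LRS.
// Endpoint, credentials, and IDs are hypothetical placeholders.

const LRS_ENDPOINT = "https://lrs.example.com/xapi";            // hypothetical LRS
const LRS_AUTH = "Basic " + btoa("lrs-user:lrs-password");      // hypothetical credentials

interface XapiStatement {
  actor: { mbox: string; name: string };
  verb: { id: string; display: Record<string, string> };
  object: { id: string; definition?: { name: Record<string, string> } };
  timestamp: string;
}

async function sendStatement(statement: XapiStatement): Promise<void> {
  const response = await fetch(`${LRS_ENDPOINT}/statements`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "X-Experience-API-Version": "1.0.3",
      "Authorization": LRS_AUTH,
    },
    body: JSON.stringify(statement),
  });
  if (!response.ok) {
    console.warn(`LRS rejected statement: ${response.status}`);
  }
}

// The designer decides that "experienced this content chunk" is the data point worth collecting.
sendStatement({
  actor: { mbox: "mailto:new.hire@example.com", name: "New Hire" },
  verb: {
    id: "http://adlnet.gov/expapi/verbs/experienced",
    display: { "en-US": "experienced" },
  },
  object: {
    id: "https://example.com/orientation/history/chunk-3",      // hypothetical activity ID
    definition: { name: { "en-US": "Company History, Part 3" } },
  },
  timestamp: new Date().toISOString(),
});
```

The point is not the syntax; it is that the statement’s verb and object are chosen by the designer, up front, to feed a data stream the business has already agreed matters.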

How do you know they’re engaged? An example

Now, where it gets interesting is that we can include in our design strategy specific activities or experiences that deliver data to a stream feeding our goals.

In a recent blog post I gave a very simple example of building an online orientation course that feeds a data stream dedicated to measuring learner “engagement.” A goal for most orientation programs is to engage new hires with the company: to get them to appreciate the company’s history so that they feel part of a legacy.

If you’re having someone read content online, there is no way to tell if they actually read the content or merely stare blankly at it. The same is true for paper-based reading, of course. So how can we get real-time data that feeds a stream measuring engagement? It’s good to note that we don’t have to “prove” engagement; we simply want to create data that supports the case for engagement. In other words, we are not trying to prove causation, just correlation.

Well, one approach that might help measure whether someone is reading and engaged with the content is to:

1) Remove any parameters that force somebody to read anything; forcing people through content just makes them click “Next.” Make reading optional.

2) Separate the content into small chunks, with moving from one piece to the next as an optional navigation element.

3) Attach data gathering to the action of moving from one content piece to the next (see the sketch that follows).

4) Analyze questions like: How deep do users go into the content? How much time do they spend on each piece, matched to their navigation? Is it all relatively the same, or are users spending a decreasing amount of time per piece?

As a collection, this data set looks at the real-time interests of the users. Are they interested in moving forward? How much? Are they as enthusiastic at the end as they were at the beginning? Do some content pieces pique more interest than others?
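
As a rough illustration of step 3, the sketch below attaches data gathering to the optional “next” navigation and records how long the learner stayed on each chunk before choosing to go deeper. The element ID, chunk IDs, learner identifier, and the reportEngagement() collection endpoint are all hypothetical; in practice the same hook could emit a TinCan statement like the one sketched earlier.

```typescript
// Sketch: generate an engagement data point each time the learner
// voluntarily moves to the next content chunk. All IDs and the
// collection endpoint are hypothetical.

interface ChunkVisit {
  learnerId: string;
  chunkId: string;
  secondsOnChunk: number;
  visitedAt: string;
}

// Hypothetical transport; could just as well wrap an xAPI statement.
function reportEngagement(visit: ChunkVisit): void {
  navigator.sendBeacon("/engagement", JSON.stringify(visit));   // assumed collection endpoint
}

const chunkIds = ["history-1", "history-2", "history-3", "history-4"]; // hypothetical chunks
let currentChunk = 0;
let chunkOpenedAt = Date.now();

function showChunk(index: number): void {
  // Rendering logic omitted; only the data-gathering hook matters here.
  currentChunk = index;
  chunkOpenedAt = Date.now();
}

// Reading is optional: the learner decides whether to go deeper.
document.getElementById("next-chunk")?.addEventListener("click", () => {
  reportEngagement({
    learnerId: "new.hire@example.com",                          // hypothetical learner identifier
    chunkId: chunkIds[currentChunk],
    secondsOnChunk: Math.round((Date.now() - chunkOpenedAt) / 1000),
    visitedAt: new Date().toISOString(),
  });
  if (currentChunk + 1 < chunkIds.length) {
    showChunk(currentChunk + 1);
  }
});
```

Because every data point is tied to a voluntary action, the stream itself becomes a record of how far, and how eagerly, learners chose to go.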

Create intentional designs

The example given isn’t a recommendation for designing orientation programs; it illustrates an intentional design that feeds real-time data matched to a specific goal. The picture gets increasingly complex when you take something like new-hire engagement and correlate it with worker performance, or even with company brand recognition. If the reason you’ve hired marketing people is to get better brand recognition on the Web, then designing learning experiences that feed the data streams you need to analyze that correlation ought to be as important to an instructional designer as choosing the right visuals.
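
Once the engagement stream and a performance metric both exist, the first pass at that analysis can be quite simple. The sketch below computes a basic correlation between a per-learner engagement score and a later performance score; the record structure and sample values are purely illustrative, not real data.

```typescript
// Sketch: correlate an engagement score (e.g., fraction of optional content
// chunks a new hire opened) with a later performance metric. Values shown
// are illustrative placeholders only.

interface LearnerRecord {
  learnerId: string;
  engagementScore: number;   // e.g., share of optional chunks visited
  performanceScore: number;  // e.g., 90-day performance rating
}

function pearson(xs: number[], ys: number[]): number {
  const n = xs.length;
  const meanX = xs.reduce((a, b) => a + b, 0) / n;
  const meanY = ys.reduce((a, b) => a + b, 0) / n;
  let cov = 0, varX = 0, varY = 0;
  for (let i = 0; i < n; i++) {
    cov += (xs[i] - meanX) * (ys[i] - meanY);
    varX += (xs[i] - meanX) ** 2;
    varY += (ys[i] - meanY) ** 2;
  }
  return cov / Math.sqrt(varX * varY);
}

const records: LearnerRecord[] = [
  { learnerId: "a", engagementScore: 0.9, performanceScore: 4.2 },
  { learnerId: "b", engagementScore: 0.4, performanceScore: 3.1 },
  { learnerId: "c", engagementScore: 0.7, performanceScore: 3.8 },
];

const r = pearson(
  records.map((rec) => rec.engagementScore),
  records.map((rec) => rec.performanceScore)
);
console.log(`Engagement vs. performance: r = ${r.toFixed(2)}`);
```

A correlation like this doesn’t prove the orientation caused better performance, which is exactly the point: the data supports the case rather than proving it.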

Here lies the intersection of instructional systems design and big data. Now that emerging standards like TinCan are being set in motion, there can be no excuses about the lack of value from learning analytics. The instructional designer can now build content in a very specific way, to generate very specific sets of data that THEY determine are beneficial and valuable to the organizations they work for.

The trick for new and old instructional designers is to stop trying to “prove” that training and learning do x, y, and z. Instead, build a supporting case that training and learning were systematically part of the business achieving its objectives. In other words, we didn’t cause the business to achieve its objectives, but our data supports our role in the system that helped the business achieve them. You can do this by designing your interventions to work within the system (I’m not talking about the LMS, or any technology, for that matter) and generating data that’s important to the business.

Do you need engaged employees? Generate real-time data for engagement!