Training, like everything else in the twenty-first century, is undergoing a transformation to become data driven and outcome focused. However, most of us will agree that it’s not always clear whether training delivers on those outcomes. What appears at first to be a training outcome might prove, with thorough data analysis, to be unrelated. Tools like Power BI can help you determine your real training ROI.
Causation or mere correlation?
To explore the link between training and performance improvements, it’s necessary to understand the difference between correlation and causation.
Correlation means that two measures tend to move together. For instance, if you deliver training to reduce safety incidents, and safety incidents decrease after you deliver the training, then the two are correlated.
Causation, however, means that changing one variable has a direct impact on another variable or outcome of the system.
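To make correlation concrete, here is a minimal Python sketch that computes the Pearson correlation coefficient, a standard measure of how closely two series move together (values near -1 or 1 indicate a strong relationship, values near 0 a weak one). The monthly figures and the variable names are invented for illustration; they are not data from any real training program.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical monthly figures: share of staff trained vs. safety incidents.
trained_share = [0.0, 0.1, 0.3, 0.5, 0.7, 0.9]
incidents = [12, 11, 9, 8, 5, 4]

r = pearson(trained_share, incidents)
print(f"Correlation between training share and incidents: {r:.2f}")
```

A strongly negative coefficient here only says that incidents fell as more people were trained. It says nothing about why they fell.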
How do you know if an effect is simply correlated or causal? Did the training actually cause the improvement in your safety metrics? You may be thinking this is obvious, but consider the following:
Your organization has developed an online training module and rolled it out. Following the rollout, key metrics improve. You may be tempted to think that the module caused the improvement.
However, along with the actual training module, perhaps the organization:
- Spent time thinking about the material that would be in the module
- Codified best practices, which solidified in the minds of managers what those best practices are
- Built internal marketing material to tell employees why they should take the training
- Kept talking about completing the module, which meant talking about the importance of reducing safety incidents and raised everyone’s general awareness
- Had controversial elements in the training, which triggered debates among employees—again, talking about safety practices and raising awareness
So, which of those aspects actually had the positive impact? It might have been the training course module; it also might have been the increase in the “safety culture” at the organization. This is an interesting question—without a clear answer. In this type of scenario, causation is very difficult to prove.
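The scenario above can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of any real organization: a hidden "safety culture" score drives both module completion and incident counts, and completion is deliberately given no effect on incidents at all. All names and numbers are hypothetical.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def partial_corr(xs, ys, zs):
    """Correlation between xs and ys after controlling for zs."""
    r_xy = pearson(xs, ys)
    r_xz = pearson(xs, zs)
    r_yz = pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

random.seed(42)  # fixed seed so the run is repeatable
sites = 500

# Hidden driver: how strong the safety culture is at each site (standardized).
culture = [random.gauss(0, 1) for _ in range(sites)]
# Both observed metrics depend on culture plus independent noise.
# Completion never feeds into incidents in this simulation.
completion = [c + random.gauss(0, 0.5) for c in culture]
incidents = [-c + random.gauss(0, 0.5) for c in culture]

r_raw = pearson(completion, incidents)
r_adjusted = partial_corr(completion, incidents, culture)
print(f"Raw correlation:            {r_raw:.2f}")
print(f"Controlling for culture:    {r_adjusted:.2f}")
```

The raw correlation between completion and incidents is strongly negative even though, by construction, the module does nothing. Once the hidden driver is accounted for, the relationship largely disappears. In practice, of course, something like "safety culture" is rarely measured directly, which is exactly why causation is hard to prove.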
Data analysis can help
This is a data analytics problem. We can use Power BI, Microsoft’s data analytics platform (the Power BI Desktop application is a free download), to help us visualize it and to explore whether the relationship is merely correlative or plausibly causal.
For correlations, we can use a scatter graph, as in Figure 1. This provides a visual image of correlations between different metrics.
Figure 1: This scatter graph shows store sales and their correlation with the mastery levels that employees at the store achieved in training
In Figure 1, the Y axis shows the level of sales and the X axis shows the mastery level. Each bubble represents a location, and the size of the bubble represents the number of units sold.
Looking at Figure 1, it’s easy to see that the store with the largest number of units sold also has the largest sales total and the highest mastery. But is it causal? That is, did the higher mastery levels cause the increased sales?
To answer that, we must dig deeper.
A statistical technique called regression analysis helps determine dependence between variables in a model. It can’t actually determine causation, but it’s a helpful tool.
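As a rough illustration of what regression analysis does, the following Python sketch fits a straight line through store-level data using ordinary least squares. The mastery scores and sales figures are made up for the example, and `fit_line` is a hypothetical helper name, not part of Power BI.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: share of the variation in y that the line accounts for.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared

# Hypothetical per-store data: average mastery level (%) and sales (thousands).
mastery = [55, 60, 65, 70, 75, 80, 85, 90]
sales = [110, 118, 131, 139, 152, 158, 171, 181]

slope, intercept, r_squared = fit_line(mastery, sales)
print(f"Each extra mastery point is associated with {slope:.2f}k more sales")
print(f"R-squared: {r_squared:.3f}")
```

The slope describes how sales change alongside mastery in this data set, and R-squared describes how well the line fits. Neither number shows that raising mastery would raise sales, which is the limitation noted above.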
Identifying the ‘key influencer’
In Power BI, to find out whether a change in one variable, such as the number of units sold, is caused by another—training—you’d use what’s called the “key influencer” visual.
First, you build a data set with as much information as possible, such as:
- Demographic information, like years of employment
- Location information
- Supervisor information, like years of supervisory experience
- Test scores
- How long employees spent in training
- How many times they completed the training module
- Number of past incidents
- Past sales data
Power BI performs a machine learning analysis on this data and looks for dependencies.
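To give a feel for what "looking for dependencies" means, here is a deliberately simplified Python sketch that ranks candidate factors by how strongly each one correlates with an outcome. This is not Power BI's algorithm, which is not described in this article; it is a much cruder stand-in. The column names and every value in `rows` are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def rank_factors(rows, outcome):
    """Rank every non-outcome column by absolute correlation with the outcome."""
    target = [row[outcome] for row in rows]
    factors = [name for name in rows[0] if name != outcome]
    scores = [(name, pearson([row[name] for row in rows], target))
              for name in factors]
    return sorted(scores, key=lambda item: abs(item[1]), reverse=True)

# Hypothetical employee records (one dict per employee).
columns = ("years_employed", "test_score", "training_minutes", "units_sold")
values = [
    (1, 60, 45, 100), (7, 65, 30, 108), (3, 70, 50, 115), (10, 72, 35, 118),
    (2, 80, 60, 130), (8, 85, 40, 137), (5, 90, 55, 146), (4, 95, 65, 150),
]
rows = [dict(zip(columns, record)) for record in values]

ranking = rank_factors(rows, outcome="units_sold")
for name, r in ranking:
    print(f"{name:18s} r = {r:+.2f}")
```

A ranking like this is only a starting point for investigation. A factor near the top of the list is strongly associated with the outcome in this data, but the earlier caution still applies: association alone does not establish cause.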
Let’s use a common example to see what that looks like. What factors do you think most influence the cost of a home? Figure 2 uses the key influencer visual to determine which factors have a direct impact on house prices.
Figure 2: The key influencer visual clearly shows that “kitchen quality” is the largest single contributor to a change in house value
Using this visual, Power BI shows that the largest impact actually comes from the KitchenQuality metric. Imagine what you could uncover using your own training data.
The key influencer and other visualizations that you can easily create using Power BI can help you determine which factors have the largest impact on your performance metrics. You may be surprised to discover that it’s not the factor you think!
Learn to use Power BI
Dan Belhassen will present a bring-your-own-device (BYOD) session, “Use Power BI to Track, Analyze, and Visualize Training Data for Free,” at DevLearn 2019 Conference & Expo, October 23–25, in Las Vegas. Gain the basic skills to figure out whether your training is driving improvement in the metrics you care about most.