The first step of a data journey is getting an overview of your current situation. For most organizations this is a great starting point: overview and status reports tell us where we are right now. In time, though, you might want more information and more options for creating deeper insights.
That’s where historical data comes in.
Historical data is data collected about past events, circumstances, or key figures. It gives us context and direction. It tells us how we got here.
When you use historical data in your reports, it gives you insight into the difference between what was and what is, allowing you to evaluate and modify your processes in a data-driven way.
Reports that combine historical and current data include trendline reports, which show how the numbers have developed over time, and snapshot reports, where you compare a snapshot from the previous month with this month’s data and highlight the changes.
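As an illustration of the snapshot comparison described above, here is a minimal Python sketch. The KPI names and values are made up for the example, and diff_snapshots is a hypothetical helper, not part of any particular reporting tool.

```python
# Hypothetical snapshots: KPI name -> value, captured one month apart.
previous_month = {"open_projects": 12, "budget_used_pct": 41.0, "at_risk": 3}
current_month = {"open_projects": 14, "budget_used_pct": 41.0, "overdue": 2}


def diff_snapshots(old, new):
    """Return the keys that were added, removed, or changed between two snapshots."""
    added = {k: new[k] for k in new.keys() - old.keys()}
    removed = {k: old[k] for k in old.keys() - new.keys()}
    changed = {
        k: (old[k], new[k])  # (previous value, current value)
        for k in old.keys() & new.keys()
        if old[k] != new[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


changes = diff_snapshots(previous_month, current_month)
print(changes)
```

A report built on this kind of comparison shows what differs between the two snapshots, but not when or how often the values changed in between.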
In order to create these reports you need a data foundation that includes both sets of data – your current data and your historical data.
There are typically two ways to obtain this data: the traditional snapshot approach, or change-based data capture.
Snapshotting is time-based data capture: at a fixed interval, such as every month or every week, you take a copy of your data, a snapshot. This covers most reporting needs, but it does have some disadvantages.
First of all, you need to be specific about which subset of data you want to copy; anything you don’t snapshot will be hard to dig out later.
Snapshotting is also very rigid: if you change something in your data set by adding new sources, columns, or properties, the additions won’t be included in your snapshots until you update the snapshot definition.
The method also limits the insights you can derive from the data. That narrows your options for reports and further analysis, because you can’t go back and drill down into your data to understand connections or causes.
Another drawback is that you don’t know what happened between snapshots. Your data could have been changing for a whole month, but you don’t know how it changed. This makes further analysis difficult: you’re stuck with just a moment in time.
However, snapshotting is a quick and easy way to get started with historical data – it’s just not that robust in the long term.
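To make the snapshot approach and its blind spot concrete, here is a small Python sketch. The project names, statuses, dates, and the take_snapshot helper are all assumptions made up for this example.

```python
import copy
from datetime import date

# Hypothetical live data set: project name -> status.
live_data = {"Project A": "green", "Project B": "green"}
snapshots = {}  # snapshot date -> full copy of the data on that date


def take_snapshot(on):
    """Store a full copy of the live data under the given date."""
    snapshots[on] = copy.deepcopy(live_data)


take_snapshot(date(2024, 1, 31))

# Changes made between two snapshots are not recorded anywhere.
live_data["Project A"] = "red"     # mid-February
live_data["Project A"] = "yellow"  # late February

take_snapshot(date(2024, 2, 29))

jan = snapshots[date(2024, 1, 31)]["Project A"]
feb = snapshots[date(2024, 2, 29)]["Project A"]
print(jan, "->", feb)
```

The two snapshots only show that Project A went from green to yellow. The period in which it was red is lost, and Project B was copied twice even though it never changed.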
A more modern approach is change-based data capture. Instead of taking a snapshot at a given time, you record data every time something changes.
So if someone updates a KPI, for instance, that update is timestamped and recorded, and you know exactly what the data looked like from that moment until the next change. This is more efficient because you don’t copy data that hasn’t changed; you only store the changes.
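A minimal sketch of this idea in Python might look like the following. The ChangeEvent and ChangeLog names, their fields, and the sample KPI are hypothetical and only meant to show the principle: every real change is timestamped and appended, and unchanged values are not stored again.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ChangeEvent:
    changed_at: datetime
    key: str
    old_value: object
    new_value: object
    changed_by: str


class ChangeLog:
    """Keeps the current values plus an append-only list of changes."""

    def __init__(self):
        self.current = {}
        self.events = []

    def update(self, key, value, changed_by, changed_at):
        """Record a change; return False if the value did not actually change."""
        if key in self.current and self.current[key] == value:
            return False  # nothing changed, so nothing is stored
        old = self.current.get(key)
        self.events.append(ChangeEvent(changed_at, key, old, value, changed_by))
        self.current[key] = value
        return True


log = ChangeLog()
log.update("revenue_kpi", 100, "anna", datetime(2024, 1, 5, 9, 0))
log.update("revenue_kpi", 100, "ben", datetime(2024, 1, 6, 9, 0))   # same value, ignored
log.update("revenue_kpi", 120, "ben", datetime(2024, 1, 20, 14, 30))
print(len(log.events), log.current)
```

In this sketch the second update is skipped because the value is identical, so the log holds two events rather than three.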
Change-based data capture gives you a current picture of your data and an archive of your data, so you always have an accurate insight into what your data looked like – at any given point in time. This gives you more options when it comes to reporting and analysis.
With the entire event timeline of your data, you can zoom in on any given point in time and see exactly what the data looked like. You don’t have to worry about snapshotting everything you might need up front; you can always go back and create a snapshot later if you need it.
It lets you drill down into your data at any point in time and choose the granularity as well: you might want to look at it every quarter or every month, and you can even go down to the minute if you want to!
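The "go back and snapshot it later" idea can be sketched as replaying the change log up to a chosen moment. The log layout (timestamp, key, new value), the sample entries, and the state_as_of function are assumptions for this example.

```python
from datetime import datetime

# Hypothetical change log entries: (timestamp, key, new value).
change_log = [
    (datetime(2024, 1, 5, 9, 0), "status", "green"),
    (datetime(2024, 1, 5, 9, 0), "budget", 100),
    (datetime(2024, 2, 12, 16, 45), "status", "red"),
    (datetime(2024, 2, 25, 11, 10), "status", "yellow"),
    (datetime(2024, 3, 3, 8, 30), "budget", 130),
]


def state_as_of(log, moment):
    """Rebuild the data as it looked at `moment` by replaying earlier changes."""
    state = {}
    for changed_at, key, value in sorted(log, key=lambda entry: entry[0]):
        if changed_at > moment:
            break  # everything after this point had not happened yet
        state[key] = value
    return state


mid_feb = state_as_of(change_log, datetime(2024, 2, 15))
end_feb = state_as_of(change_log, datetime(2024, 2, 29, 23, 59))
print(mid_feb)
print(end_feb)
```

Any moment can be chosen after the fact, so a "snapshot" becomes a query against the change log rather than something that has to be planned in advance.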
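Choosing the granularity can be sketched as grouping the same timestamped changes into different time buckets. The KPI values, the granularity names, and both helper functions below are hypothetical.

```python
from datetime import datetime

# Hypothetical timestamped values of a single KPI.
kpi_changes = [
    (datetime(2024, 1, 5, 9, 0), 100),
    (datetime(2024, 1, 20, 14, 30), 120),
    (datetime(2024, 2, 12, 16, 45), 90),
    (datetime(2024, 4, 2, 10, 15), 140),
]


def period_key(ts, granularity):
    """Map a timestamp to a label for the chosen granularity."""
    if granularity == "quarter":
        return f"{ts.year}-Q{(ts.month - 1) // 3 + 1}"
    if granularity == "month":
        return ts.strftime("%Y-%m")
    if granularity == "minute":
        return ts.strftime("%Y-%m-%d %H:%M")
    raise ValueError(f"unknown granularity: {granularity}")


def last_value_per_period(changes, granularity):
    """Keep the last recorded value within each period."""
    result = {}
    for ts, value in sorted(changes, key=lambda change: change[0]):
        result[period_key(ts, granularity)] = value  # later changes overwrite earlier ones
    return result


by_month = last_value_per_period(kpi_changes, "month")
by_quarter = last_value_per_period(kpi_changes, "quarter")
print(by_month)
print(by_quarter)
```

Note that this sketch only lists periods in which a change occurred; a real report would probably carry the last known value forward into quiet periods such as March in this example.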
If you have this change-based data in your data foundation it gives you a wider set of options.
You can do audit reporting, where you zoom out and look at the changes: what happened to the data? Which changes were made, how often and by whom? You can look into business process analysis, looking at time between events or the order of events. Or you can highlight changes to see what’s happened in your projects and portfolios while you were away.
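The audit and process questions above can be sketched with a few lines of Python over a change log. The event layout (timestamp, project, field, user) and all sample values are assumptions for this example.

```python
from collections import Counter
from datetime import datetime

# Hypothetical change events: (timestamp, project, field, changed by).
events = [
    (datetime(2024, 3, 1, 9, 0), "Project A", "status", "anna"),
    (datetime(2024, 3, 4, 13, 0), "Project A", "budget", "ben"),
    (datetime(2024, 3, 4, 15, 30), "Project B", "status", "anna"),
    (datetime(2024, 3, 11, 9, 0), "Project A", "status", "anna"),
]

# Audit reporting: which changes were made, how often, and by whom?
changes_by_user = Counter(user for _, _, _, user in events)
changes_by_field = Counter(field for _, _, field, _ in events)

# Process analysis: hours between consecutive changes to Project A.
project_a_times = sorted(ts for ts, project, _, _ in events if project == "Project A")
gap_hours = [
    (later - earlier).total_seconds() / 3600
    for earlier, later in zip(project_a_times, project_a_times[1:])
]

# Catching up: everything that changed since a given moment.
away_since = datetime(2024, 3, 4)
changed_while_away = [event for event in events if event[0] >= away_since]

print(changes_by_user, changes_by_field, gap_hours, len(changed_while_away))
```

None of these questions can be answered from periodic snapshots alone, since they depend on who made each change and when.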
A data foundation with historical data is the backbone of AI and advanced analytics. It lets you train your AI on company-relevant data and unlock predictive insights and strategic foresight with advanced analytics.
Clean, well-structured historical data enables smarter forecasting and scenario modeling, and it’s how you can move your organization from reactive decision-making to more proactive organizational planning.
Historical data is truly the only way to future-proof your data strategy.