Future proof your time series data
Time series measure change. What if you also included a series of notes so that later you could assess what you knew at the time? Read: "Everything Starts Out Looking Like a Toy" #131
Hi, I’m Greg 👋! I write essays on product development. Some key topics for me are system “handshakes”, the expectations for workflow, and the jobs we expect data to do. This all started when I tried to define What is Data Operations?
This week’s toy: an emulator for a long-gone Macintosh that you can run in your browser. It’s hard to remember life before the web, but this idea takes me right there to the land of BBS uploads and downloads. It’s also the first computer that really made fonts look good. Edition 131 of this newsletter is here - it’s February 6, 2023.
The Big Idea
A short long-form essay about data things
⚙️ Future Proof Your Time Series Data
A time series measures how a value changes from x to y over a period: its value at the beginning of a time range and its value at the end. The continuous line created by connecting these points implies movement over time. You see graphs like this in sales charts that track performance, in marketing metrics that count leads, and in customer success graphs that count the number of active tickets for important customers. You might also see comparison or snapshot charts that show period-over-period data for metrics, as in a week-over-week or month-over-month comparison.
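To make the period-over-period idea concrete, here is a minimal sketch in Python. The weekly lead counts and the period_over_period helper are made up for illustration; they are not from any particular system.

```python
from datetime import date

# Hypothetical weekly lead counts, keyed by the Monday that starts each week.
weekly_leads = {
    date(2023, 1, 9): 120,
    date(2023, 1, 16): 150,
    date(2023, 1, 23): 135,
    date(2023, 1, 30): 162,
}


def period_over_period(series):
    """Return (period, value, change, pct_change) for every period after the first."""
    points = sorted(series.items())
    rows = []
    # Pair each point with the one that follows it.
    for (_, prev_value), (period, value) in zip(points, points[1:]):
        change = value - prev_value
        # Percent change is undefined when the previous value is zero.
        pct_change = round(change / prev_value * 100, 1) if prev_value else None
        rows.append((period, value, change, pct_change))
    return rows


for period, value, change, pct in period_over_period(weekly_leads):
    pct_text = f"{pct:+.1f}%" if pct is not None else "n/a"
    print(f"{period}: {value} leads ({change:+d}, {pct_text} week over week)")
```

Notice what the output can and cannot tell you: it shows that leads dipped in one week, but nothing about why.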
Here’s the problem. When you see a change in an important metric, it’s often very hard to see why it changed. In a typical system, you can’t easily see the history of past values. And even when you can see that history, you don’t have the context for what you were thinking at the time.
Turning Data into a Data Layer
Ok, so you have a point-in-time problem. When you look at the metric today you don’t have a good view of how it changed. If you are lucky enough to have a time series of your data you can at least see how things have moved through time, but you don’t know what else was changing simultaneously. One way of expanding your view from a single point in time – the current view of your metrics – to a continuous stream of time is to create a metadata layer around your data.
It sounds fancy, but it boils down to a few truths about your data:
What is the structure of the data? What are the fields that make up an “opportunity”, for example, and how do you store them?
How does that structure change over time? Is there an internal process that records what a stage name means, or that updates your metrics when those definitions change?
What is that data layer related to? What entity does this represent and what other items does it reference? In the example of an opportunity, we’re tracking a sale, but we also are referencing the company that is in the sales cycle, contacts who are at that company, and potential requests that are happening during that same cycle.
When the data changes, what do we know and what gets notified? Things go up and down with every metric. How do you know what’s a good change and what’s one that needs to trigger an alert or an alarm? When that data changes, how do other systems get notified?
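One way to picture these four questions is as a small sketch in Python. Everything here is hypothetical: the Opportunity fields, the FieldChange record, and the listener callbacks are illustrative names, not the schema of any real CRM. The sketch assumes Python 3.9 or later.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Optional


@dataclass
class FieldChange:
    """One recorded change to one field, with the reason captured at the time."""
    field_name: str
    old_value: object
    new_value: object
    changed_at: datetime
    reason: str


@dataclass
class Opportunity:
    """Hypothetical 'opportunity' entity; the field names are illustrative."""
    opportunity_id: str
    company_id: str          # related entity: the company in the sales cycle
    contact_ids: list[str]   # related entities: contacts at that company
    stage: str
    amount: float
    history: list[FieldChange] = field(default_factory=list)
    listeners: list[Callable[[FieldChange], None]] = field(default_factory=list)

    TRACKED = ("stage", "amount")  # fields whose changes we record (not a dataclass field)

    def update(self, field_name: str, new_value, reason: str,
               changed_at: Optional[datetime] = None):
        """Change a tracked field, record why, and notify other systems."""
        if field_name not in self.TRACKED:
            raise ValueError(f"{field_name!r} is not a tracked field")
        old_value = getattr(self, field_name)
        if old_value == new_value:
            return None  # nothing changed, so nothing to record
        change = FieldChange(field_name, old_value, new_value,
                             changed_at or datetime.now(timezone.utc), reason)
        setattr(self, field_name, new_value)
        self.history.append(change)
        for notify in self.listeners:
            notify(change)
        return change


# Usage: a listener stands in for "other systems get notified".
alerts = []
opp = Opportunity("opp-1", "acme", ["contact-7"], stage="Discovery", amount=10_000.0)
opp.listeners.append(
    lambda c: alerts.append(f"{c.field_name}: {c.old_value} -> {c.new_value}")
)
opp.update("stage", "Proposal", reason="Pricing call completed")
```

The design choice worth noticing is that the reason travels with the change. The history list answers "what did we know at the time?" without relying on anyone's memory.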
What does a data layer mean to you? I believe it’s an essential idea to match the data you’re looking at now to the meaning of that data in your organization. By defining items in a data layer, you are clarifying what matters to your business.
Andy Mowat, CEO of Gated, does a great job of explaining the concept of using a data layer to learn more about the core metrics you see in a system and also to enable you to gain context about past metrics.
Listen from 06:30 to 07:45 in the clip below to hear Andy’s explanation:
What do you put in your Data Layer
Now, the inevitable question should be in your head: what data makes sense to put in your data layer? The answer can’t be nothing and probably shouldn’t be everything.
Here’s a heuristic for thinking about what to include:
The actual data. Duh. What do you want to track on an ongoing basis, and at what interval? This one’s pretty easy.
Some way to link notes or observations with a series of data. This means having a common way to reference a time stamp and a piece of data, so that you can later line up the notes you make on data with a series of data in a like time frame or time grain.
Descriptive metadata about each object - think of this as a “version” of an entity – indicating when things were added or modified. This is crucial for playing the part of data detective and investigating past data through the lens of the current state of that data.
A concept of thresholds for important metrics, along with versioning for those thresholds. What is “good” now may be “great” or “horrible” in the future, so it’s important to know not only that our measurement changed but also that we made a decision and why we made it.
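The list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ticket counts, notes, and threshold values are invented, the shared time grain is assumed to be a week starting on Monday, and week_start, threshold_on, and annotate are hypothetical helper names.

```python
from bisect import bisect_right
from datetime import date, timedelta


def week_start(day):
    """Map any date to the Monday of its week, the shared time grain."""
    return day - timedelta(days=day.weekday())


# Hypothetical weekly metric: active tickets for important customers.
active_tickets = {
    date(2023, 1, 9): 14,
    date(2023, 1, 16): 22,
    date(2023, 1, 23): 19,
}

# Notes written on arbitrary days, recorded at the time.
notes = [
    (date(2023, 1, 18), "Release shipped; expect a spike in tickets."),
    (date(2023, 1, 25), "Hotfix deployed for the login bug."),
]

# Versioned thresholds: (effective_from, alert_above, reason), sorted by date.
threshold_versions = [
    (date(2023, 1, 1), 15, "Initial alert level"),
    (date(2023, 1, 23), 25, "Raised after onboarding two large customers"),
]


def threshold_on(day):
    """Return the threshold version that was in effect on the given day."""
    starts = [effective_from for effective_from, _, _ in threshold_versions]
    index = bisect_right(starts, day) - 1
    if index < 0:
        raise ValueError(f"no threshold defined on {day}")
    return threshold_versions[index]


def annotate(series, notes):
    """Line up each data point with its notes and the threshold of its day."""
    notes_by_week = {}
    for written_on, text in notes:
        notes_by_week.setdefault(week_start(written_on), []).append(text)
    rows = []
    for week in sorted(series):
        _, limit, reason = threshold_on(week)
        rows.append({
            "week": week,
            "value": series[week],
            "threshold": limit,
            "threshold_reason": reason,
            "alert": series[week] > limit,
            "notes": notes_by_week.get(week, []),
        })
    return rows


for row in annotate(active_tickets, notes):
    print(row)
```

Because each point is judged against the threshold in effect at that time, a value of 22 raises an alert in one week while a similar value later might not, and the recorded reason explains the difference.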
The data layer, then, viewed holistically, is a way to line up the data in your business with the data about your business so that you can assess change over time. It makes the data layer itself a perfect way to future-proof your time series data by providing a link to know what happened, when, and why.
What’s the takeaway? The solution to future-proofing your time series data is to create a way to track the observations about your time series data over time. By having these wayposts (think of them as guides on a map), you’ll be able to know where you’re going and where you’ve been. Measuring how long it takes will give you a sense of whether progress is expected or not.
Links for Reading and Sharing
These are links that caught my 👀
1/ What kind of visualizations should you build? - If you look around you’ll see lots of great ways to show data. But not every chart is right for every kind of data.
Thanks to Sara Dholakia, we’ve got a solid guide to help you understand how to match the data that you have with the dataviz you want to build. What I particularly like about this piece is that it goes back to basics (data, audience, impact, presentation method) to help you choose how to present the information, and does it in an elegant way.
2/ How many people out there build data catalogs? - SeattleDataGuy has a really good write-up of the tech engineering organizations actually use today when building a data stack. (SPOILER: there aren’t as many data catalogs out there as you think, especially in smaller companies.)
3/ [unintelligible mumbling] - this Vox explainer addresses the burning question of why all movie and TV dialog has become so difficult to understand, even if your hearing is relatively good.
What to do next
Hit reply if you’ve got links to share, data stories, or want to say hello.
The next big thing always starts out being dismissed as a “toy.” - Chris Dixon