This week’s toy: drawing an iceberg and seeing how it floats. It certainly won’t solve global warming, but it might get us thinking about the icebergs in our midst. Edition No. 35 of this newsletter is here - it’s Feb 27, 2021.
The Big Idea
CI/CD (or Continuous Integration/Continuous Delivery) is an idea that has thrived in software development since the early 1990s. Originally conceived as a way to minimize defects by merging software developers’ work more often, it’s now entering other workflows in the workplace. Most specifically: the spreadsheet.
Not for me, I only build documents
In the olden days of the spreadsheet (perhaps you remember Lotus 1-2-3 or Borland’s Quattro Pro), culminating in the heyday of Excel templates that launched a thousand financial models, a spreadsheet was fundamentally a standalone, non-networked document. Its innovation was taking calculations that previously needed a programmer and building syntax and formulas that could be used by the average knowledge worker. Excel was the original no-code solution ;)
The basic model of a spreadsheet is work that might be shared in a business group as a template, or that uses a common formula to solve similar business problems. If you’ve ever searched Stack Overflow for a formula to extract the text in an HTML tag and split it into columns, or to do a VLOOKUP from one spreadsheet tab to another, you know what I mean. However, things are now changing fast.
What does CI/CD mean for spreadsheets?
Taken literally, we need to share spreadsheets with other people in our teams. We need to have templated solutions for many different kinds of problems. And many of these recipes are probably relevant for many different spreadsheet users (even those outside of our company). Yet today you can’t just share your models with other people because the data in the underlying model is often proprietary or secret.
Continuously improving the spreadsheet implies that we are on the verge of creating a data pipeline to separate the calculations and schema and logic of a spreadsheet from the underlying data.
For example, if you were able to do the following:
create an enriched sales account list from a name of a company and a website
identify whether an address is correct according to the US Postal Service
build a dashboard from weekly and monthly standard sales data in your CRM
Your peers might think you have analytical superpowers. Yet all of these solutions already exist today with a little bit of integration. Add a function here and a plugin there and your lowly spreadsheet can be special.
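To make the idea of separating a spreadsheet’s logic from its underlying data a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the column names (units, unit_price), the revenue formula, and the sample rows are all made up. The point is only that the formula can be shared and improved without the proprietary data ever leaving your hands.

```python
# A minimal sketch: the "formula" lives apart from the data it runs on.
# Column names and sample values are hypothetical, chosen for illustration.

def revenue(row):
    """The shareable logic: units sold multiplied by unit price."""
    return row["units"] * row["unit_price"]


def run_model(rows):
    """Apply the formula to any rows that have the expected columns."""
    return [{**row, "revenue": revenue(row)} for row in rows]


# Made-up sample data that is safe to share alongside the logic.
sample_rows = [
    {"account": "Acme", "units": 3, "unit_price": 10.0},
    {"account": "Globex", "units": 2, "unit_price": 4.5},
]

results = run_model(sample_rows)
for row in results:
    print(row["account"], row["revenue"])
```

Swap sample_rows for your real (private) rows and the same run_model call works unchanged, which is roughly what a data pipeline for spreadsheets would give you.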
Continuous Delivery of Innovation
In this context, continuous delivery means improving your standard order process by making your spreadsheets and models more accessible to other people in your organization, and by abstracting them so they can be shared more broadly. What would need to happen for this to be possible?
functions in your spreadsheet (whether driven by LAMBDA functions in Excel or named ranges in Google Sheets) need to be abstracted to automatically show which libraries they require, and automatically prompt you to load those libraries (from a public, signed location)
spreadsheets need to have a “how the heck does this happen” page that explains the logic of the model to normal humans. Ideally, a team will build grammar and logic to auto-generate these explanatory models (JavaDoc for spreadsheets?) so that every time the model is updated, the documentation is updated too (a guy can dream).
the model needs to describe the schema of the data it needs, from the objects (e.g. CRM records with the following minimum data fields and types, related with a common ID or name or value)
that model, schema, and documentation need to be bundled up into an observable package that can be signed and verified by a developer.
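The last three items in the list above (generated documentation, a declared schema, and a signed bundle) can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not how any spreadsheet product works today: the schema fields, the revenue formula, and the helper names build_package, sign, and verify are all hypothetical. The signature here is an HMAC with a shared demo key, standing in for the public-key signing a real developer-signed package would more likely use.

```python
import hashlib
import hmac
import json

# Hypothetical schema: the minimum fields and types the model needs.
SCHEMA = {"account_id": "str", "units": "int", "unit_price": "float"}


def revenue(row):
    """Revenue is units sold multiplied by the unit price."""
    return row["units"] * row["unit_price"]


# Hypothetical registry of the model's formulas.
FORMULAS = {"revenue": revenue}


def build_package(schema, formulas):
    """Bundle the schema with documentation generated from docstrings."""
    docs = {name: fn.__doc__ for name, fn in formulas.items()}
    return {"schema": schema, "docs": docs}


def sign(package, key):
    """Return an HMAC-SHA256 hex digest of the package's canonical JSON."""
    payload = json.dumps(package, sort_keys=True).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify(package, key, signature):
    """Check that the package has not changed since it was signed."""
    return hmac.compare_digest(sign(package, key), signature)


key = b"demo-key-not-a-real-secret"  # assumption: a shared demo key
package = build_package(SCHEMA, FORMULAS)
signature = sign(package, key)
print(verify(package, key, signature))
```

Because the docs are pulled from the formula docstrings each time the package is built, updating a formula and its docstring updates the documentation too, and any change to the schema or docs after signing makes verify return False.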
This is starting to sound a lot like delivering software directly by spreadsheet. An interesting idea.
What’s the takeaway? Your spreadsheet is actually a collection of ideas that forms software. Adopting the practices of traditional software will supercharge your process development, from understanding the needed inputs and outputs to being able to teach someone how it works.
A Thread from This Week
Twitter is an amazing source of long-form writing, and it’s easy to miss the threads people are talking about.
This week’s thread: Josh Elman on building a business a little too early…
Links for Reading and Sharing
These are links that caught my eye.
1/ Seeing is believing - A data pipeline is a record of the steps you took to transport, ingest, clean, and remediate the data. How the heck do you show that? Nicholas Chammas does an excellent job demonstrating the progression of original data into a materialized view. This is otherwise known as “show your work.”
2/ Integration is eating the world - when you look at the growth pattern of Salesforce.com, the thing to notice is the acceleration of growth coming from newer clouds. MuleSoft, Force.com, and Tableau all depend upon and create a secondary ecosystem based on data. (hat tip to @edsim for highlighting this)
3/ Treat yourself … to a list of what you need - As we go into month 13 of this pandemic, it’s worth checking whether you are doing a good job of taking care of yourself. No, I’m not talking about a marshmallow test - I’m referring to the practical and tactical steps to make sure you don’t burn out. This excellent article has some suggestions you can try today.
On the Reading/Watching List
As a data analyst, I keep thinking about how to make data science more interesting and palatable. I am pleased to see that someone wrote a book for me ;) Andrew Carr has written a book on Everyday Data Science. Since I’m a little bit old-fashioned, I bought a real printed version. Here’s to showing your data work more accurately!
I’ve written about WandaVision before, and it’s still on my favorites list. This is some of the best work the Marvel Studios folk have done in a long time, and it’s so creative. Almost any clip I could post here has spoilers, so just go watch it.
What to do next
Hit reply if you’ve got links to share, data stories, or want to say hello.
I’m grateful you read this far. Thank you. If you found this useful, consider sharing it with a friend.
Want more essays? Read on Data Operations or other writings at gregmeyer.com.
The next big thing always starts out being dismissed as a “toy.” - Chris Dixon