Discover more from Data Operations
What is a graph database?
"Everything Starts Out Looking Like a Toy" (No. 52)
This week’s toy: print some stuff like it’s 1984 and you are using the Print Shop on an Apple IIe. Keeping old software alive is an interesting problem. Few people in the world have access to a working Apple IIe, much less those 5.25” floppies, so a lot of decent software has been left behind. Maybe browser-based emulation (along with virtual PCs in the cloud) will be our version of computer museums in the future. Edition No. 52 of this newsletter is here - it’s July 18, 2021.
The Big Idea
What is a graph database? And why should you care? A graph database represents the relationships between things in a very different way than you typically might. Instead of representing things only as rows in a table, a graph database stores them as nodes connected by edges and uses graph math to tell you the “nearness” of one thing to another.
Here’s a great introduction to the idea of a graph database:
Relational Databases connect data by tables
Most of the databases you use are table-based, meaning that they persist information in a table, like a PEOPLE table. They retrieve information using Structured Query Language (SQL), which tells the database what information you need and leaves the database to work out how to fetch it.
An example query might look like:
SELECT COUNT(ID) FROM PEOPLE WHERE STATE_ABBR = 'WA';
to select the count of rows in the PEOPLE table where the state value is the abbreviation for Washington state. But you might run into trouble when finding things associated with those people, like a list of all of the marketing campaigns they engaged with before they started to buy things from your company.
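To make the contrast concrete, here is a minimal sketch of both queries using Python's built-in sqlite3 module. Only the PEOPLE table and its STATE_ABBR column come from the example above; the CAMPAIGNS and ENGAGEMENTS tables, their columns, and all of the rows are assumptions made up for illustration.

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PEOPLE (ID INTEGER PRIMARY KEY, NAME TEXT, STATE_ABBR TEXT);
    CREATE TABLE CAMPAIGNS (ID INTEGER PRIMARY KEY, NAME TEXT);
    -- Hypothetical link table: which person engaged with which campaign.
    CREATE TABLE ENGAGEMENTS (PERSON_ID INTEGER, CAMPAIGN_ID INTEGER);
    INSERT INTO PEOPLE VALUES
        (1, 'Person A', 'WA'), (2, 'Person B', 'WA'), (3, 'Person C', 'MA');
    INSERT INTO CAMPAIGNS VALUES (10, 'Campaign X'), (11, 'Campaign Y');
    INSERT INTO ENGAGEMENTS VALUES (1, 10), (1, 11), (3, 10);
""")

# The simple, closed question: how many people are in Washington?
wa_count = conn.execute(
    "SELECT COUNT(ID) FROM PEOPLE WHERE STATE_ABBR = 'WA'"
).fetchone()[0]

# The associated-things question already needs two joins through a link table.
campaigns_for_wa = conn.execute("""
    SELECT p.NAME, c.NAME
    FROM PEOPLE p
    JOIN ENGAGEMENTS e ON e.PERSON_ID = p.ID
    JOIN CAMPAIGNS c ON c.ID = e.CAMPAIGN_ID
    WHERE p.STATE_ABBR = 'WA'
    ORDER BY p.NAME, c.NAME
""").fetchall()

print(wa_count)
print(campaigns_for_wa)
```

The count is a single lookup against one table, while the campaign list only works because the relationship was modeled ahead of time as its own table.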
Here’s where graph databases can help. They connect data to other data by storing the relationships between records directly, so you can follow those connections from one record to the next.
Graph Databases connect data by relatedness
What exactly are we talking about? Let’s think about the kind of questions that you typically ask of relational databases. They are typically strongly typed questions, like:
how many items do we have in the database?
show me the set of things matching a structured query, like “accounts in Massachusetts”
Relational databases have a hard time answering questions like “which actors starred with someone else who also starred with Kevin Bacon in a movie?” You might know this as the 6 Degrees of Kevin Bacon game. The idea is that people in the world are quite interrelated and that it doesn’t take too many jumps to get from one thing (a person, or node) to another by following relationships (edges). Relational databases do not do this sort of search well.
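The Kevin Bacon question is a shortest-path search over nodes and edges. Below is a minimal sketch in plain Python using breadth-first search; it is not how any particular graph database implements traversal. The CO_STARS graph, the placeholder actor names, and the degrees_of_separation function are all hypothetical.

```python
from collections import deque

# Hypothetical co-star graph: each key is a person (node), each value is the
# set of people they have appeared with (edges). Names are placeholders.
CO_STARS = {
    "Kevin Bacon": {"Actor A", "Actor B"},
    "Actor A": {"Kevin Bacon", "Actor C"},
    "Actor B": {"Kevin Bacon"},
    "Actor C": {"Actor A", "Actor D"},
    "Actor D": {"Actor C"},
    "Actor E": set(),  # has no co-stars in this sample
}


def degrees_of_separation(graph, start, target):
    """Return the fewest hops from start to target, or None if unreachable."""
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, hops = queue.popleft()
        for neighbor in graph.get(person, ()):
            if neighbor == target:
                return hops + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hops + 1))
    return None


print(degrees_of_separation(CO_STARS, "Actor D", "Kevin Bacon"))  # 3
print(degrees_of_separation(CO_STARS, "Actor E", "Kevin Bacon"))  # None
```

Each hop only touches the neighbors of the current node, which is why this kind of question is a natural fit for data stored as a graph.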
Graph Databases allow flexible questions
Instead of strongly typed relationships, graph databases allow you to ask loosely typed questions and get semi-ambiguous relationships. Relational databases are better at answering the kind of questions that have closed answers and direct relationships. Indirect relationships are difficult for relational databases to handle, because every extra hop means another join. You also might need to do things like copy your data to “helper tables” that physically duplicate and denormalize it.
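As a rough illustration of why indirect relationships get awkward in SQL, here is a sketch that answers the two-hop version of the Kevin Bacon question with Python's sqlite3 module. The APPEARANCES table, its columns, and the rows are assumptions invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table: one row per actor per movie.
    CREATE TABLE APPEARANCES (ACTOR TEXT, MOVIE TEXT);
    INSERT INTO APPEARANCES VALUES
        ('Kevin Bacon', 'Movie 1'), ('Actor A', 'Movie 1'),
        ('Actor A', 'Movie 2'), ('Actor C', 'Movie 2'),
        ('Actor C', 'Movie 3'), ('Actor D', 'Movie 3');
""")

# "Who starred with someone who also starred with Kevin Bacon?"
# Two hops already takes three self-joins on the same table.
two_hops = conn.execute("""
    SELECT DISTINCT far.ACTOR
    FROM APPEARANCES kb
    JOIN APPEARANCES co  ON co.MOVIE = kb.MOVIE AND co.ACTOR <> kb.ACTOR
    JOIN APPEARANCES co2 ON co2.ACTOR = co.ACTOR AND co2.MOVIE <> kb.MOVIE
    JOIN APPEARANCES far ON far.MOVIE = co2.MOVIE AND far.ACTOR <> co.ACTOR
    WHERE kb.ACTOR = 'Kevin Bacon' AND far.ACTOR <> 'Kevin Bacon'
    ORDER BY far.ACTOR
""").fetchall()

print(two_hops)
```

Reaching Actor D, one hop further out, would mean adding more joins or rewriting the query as a recursive one, which is the kind of work a graph traversal avoids.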
A flexible question looks a bit different when you are asking questions of your data.
How many people who looked at our marketing campaign last week are also people who belong to this customer group and also have tried our product?
A graph allows you to map this new question (that you haven’t heard before) with the known pieces of data to produce a result telling you:
how close is this record to the optimal record that would answer your question
how inter-related are the answers to each other
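One way to picture the first result is as a score for how many parts of the question each record satisfies. The sketch below is only an illustration in plain Python, not how a graph database computes relevance. The EDGES list, the relationship labels, the QUESTION set, and the closeness function are hypothetical, and the “last week” time filter is left out to keep it short.

```python
# Hypothetical edges: (person, relationship, thing). All names are placeholders.
EDGES = [
    ("Person A", "VIEWED", "Campaign X"),
    ("Person A", "MEMBER_OF", "Group G"),
    ("Person A", "TRIED", "Product P"),
    ("Person B", "VIEWED", "Campaign X"),
    ("Person B", "TRIED", "Product P"),
    ("Person C", "MEMBER_OF", "Group G"),
]

# The flexible question, expressed as the relationships an ideal record has.
QUESTION = {
    ("VIEWED", "Campaign X"),
    ("MEMBER_OF", "Group G"),
    ("TRIED", "Product P"),
}


def closeness(edges, question):
    """Score each person by the fraction of the question's parts they match."""
    matched = {}
    for person, relation, target in edges:
        if (relation, target) in question:
            matched.setdefault(person, set()).add((relation, target))
    return {person: len(hits) / len(question) for person, hits in matched.items()}


scores = closeness(EDGES, QUESTION)
full_matches = sorted(p for p, score in scores.items() if score == 1.0)

print(scores)
print(full_matches)
```

Person A matches every part of the question, while Person B and Person C are partial matches that a strict relational filter would have dropped entirely.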
Graph databases build flexible result sets based on the way you are asking the question. They also aim to return results at a fairly constant speed even as the dataset scales dramatically. Graph databases are not a panacea, but they are powerful tools for evaluating the relationships between a lot of records. You can also do dumb things with them, and they are not the solution for every database problem.
What’s the takeaway? Graph databases let you visualize and relate data even if you don’t know the questions you are going to ask yet. This could be ideal for messy data relationships and iterative data queries.
A Thread from This Week
Twitter is an amazing source of long-form writing, and it’s easy to miss the threads people are talking about.
This week’s thread: on improving communication by using Amazon’s tips for writing
Links for Reading and Sharing
These are links that caught my eye.
1/ On the benefits of persisting - We often worry about planning, OKRs, goals, and the like. One of the most important inputs is doing something toward your goal every day. Here’s an example of the compounding value of showing up. Not everyone can afford to show up when not earning an income, and we all can take to heart the message that showing up does generate benefits. So it’s worth it to read this one even if you are not a solopreneur or a founder.
2/ Robots are learning how to walk like humans - A Facebook AI team, along with university researchers, is teaching robots how to adapt to uneven terrain. The implications are staggering if they get this right. If machines can do repetitive tasks out in the world without human input, the whole economy will shift dramatically. One wonders if gig workers will be tasked with “last mile” work only, or if the whole “robots can do stuff” result is a lot farther from reality than we think.
3/ Audio technology fades into the woodwork - IKEA and Sonos have been working on several different collaborations under the Symfonisk brand. The newest speaker is (almost) a picture frame. What’s cool about this is that high-quality speaker technology is getting more affordable and becoming integrated into the design of houses.
On the Reading/Watching List
What’s going to happen with the economy over the next couple of years? I have no idea. But looking at past booms is an interesting way to put today’s climate of crypto, NFTs, and other trends in context. What Happened - a documentary about the Dot-Com Boom from 1995-2001 - is a fascinating window into the history of that time. History does not repeat itself, but it often rhymes. Ask me in 2041 how it turned out ;)
What’s most annoying about B2B buying? For me, it’s either SDRs that have no idea what your real need is or the inability to get pricing when you want to know if that software is in or out of budget range. I’m digging into this research from PeerSignal.
Here’s an example of what they found in 1,031 responses from B2B software buyers who were asked what the most annoying thing is.
What to do next
Hit reply if you’ve got links to share, data stories, or want to say hello.
I’m grateful you read this far. Thank you. If you found this useful, consider sharing with a friend.
The next big thing always starts out being dismissed as a “toy.” - Chris Dixon