The next data bottleneck
What clever or crazy data questions do people ask when there’s nothing to stop them from asking?
Nearly a decade ago, a book called Everybody Lies was published. Written by a former Google data scientist, it discusses the power of Big Data to reveal who people actually are, as opposed to who they present themselves as, with a particular focus on digging into search engine query data. The book feels of a different era now, but at the time the analysis was more surprising. It suggested that when people can type anything they want into a text box with no consequences to their words, they act a bit… uncivilized. From fuming about their prejudices to spiraling about medical anxieties to indulging their fetishes, the contents of a query stream tell us that social-desirability bias has not given us an accurate impression of what’s going on in the average person’s head.
In short, what people say they want and what they actually do are rarely the same.
You see plenty of examples of this as a data professional. There are too many examples to count of high value customers or power users swearing they’ll churn if a company doesn’t do something, only to retain and exhibit the exact same patterns of usage as before. Or in another tale as old as tech, an inexperienced product manager interviews customers about something they want to build, gets them to say “yeah sure I guess that sounds nice”, and then spends a whole quarter building something that those same users don’t touch after it’s shipped. My own career has even had me spelunking in search engine query streams where I saw for myself the yawning gulf between the pleasant things users post publicly and the less savory things they seek out at 2am.
Unsurprisingly, this same gap between what people say they want and what they do applies to our colleagues as much as it does to our companies’ customers and users. For a long time, this was something you could only observe in slow motion, when someone created a fire drill about some analysis, dashboard or data pull that they ended up not using, or at best pulled out once in a purely defensive maneuver to prove to their boss that something Wasn’t Their Fault. A more dramatic version might have come in the form of some senior leader insisting their team was going to be data-driven from now on, enlisting your help to ensure everyone had their own driver metrics that perfectly cascaded down from the organization’s north star, but then never following through once said senior leader had told a good story about the transformation they’d led.
This is all to say, a non-trivial amount of data usage within most companies is posturing. Folks say they want to be data-driven or data-informed because it makes them feel smart and strategic, and whether it’s a conscious cynical thing or simply social pressure, there has been little in the way of evidence to contradict their self-reported behavior. A motivated member of the data-fetching class might dig into query logs of their data warehouse to study their colleagues’ actual usage patterns, but beyond that most of what has existed is view counts in BI tools or summaries of tickets flowing through a help-desk-style intake process. These things can be informative but they still feel very much like reading tea leaves. And a data professional hardly sounds like a neutral party when they claim someone’s data usage isn’t up to snuff.
We’re entering an interesting new era, though, with analytics agents coming onto the scene[1]. Whether we actually achieve self-service this time or not, these agents promise to make accessing and using data more similar to the unfettered experience of typing a question into the Google search query box. Stakeholders’ friction is substantially reduced in asking those much-dreaded qq’s to pull data, and while anonymity might not be guaranteed, not having to work through a harried data professional or rigid intake process changes the dynamic quite a bit. People are asking more questions and a wider variety of functional roles are asking them to boot. The query stream for an analytics chatbot is a tantalizing dataset. What clever or crazy data questions do people ask when there’s nothing to stop them from asking?
It turns out it’s mostly “can you pull XYZ data for me?”
If you take people’s data posturing at face value, this is surprising to see. So much has been said about how data is essential, a strategic differentiator when deployed toward the right ends. So many people have complained that they would get more out of data teams if they just focused on insights or storytelling, not building pipelines and making dashboards. When these people suddenly find themselves with unimpeded access to data, why are they just asking questions that could be answered with, well, a dashboard? Where are all the important business questions they’ve supposedly been blocked from asking up until this point?
There’s certainly not a single reason for this behavior, but I have a few theories.
The first: for most people, data plays a purely supportive and operational role in their day-to-day work. At any given moment, someone’s priorities are probably already set, shaped more heavily by their functional role or their boss than any chart they looked at. A product manager needs to write a spec. A salesperson needs to meet and pitch a prospect. An FP&A manager needs to build a revenue forecast to make sure growth is on track. Data might be a key component of all of those tasks, but it’s only a component. Easy retrieval via a natural language interface is a huge time saver for them and it may even shape the direction they take, but it’s generally going to be something they plug into an existing procedure or thought process rather than something that’s applied in a novel or creative way.
The second: finding the right data is hard and a chat makes it a lot easier. This is related to the previous point, but knowing what data exists and where it lives is difficult enough when you spend all day thinking about it. The ability to determine what data is relevant for a question or task is a genuine skill, hard-won by many data professionals through years of wading through messy data and working with stakeholders to refine a vague question into something quantifiable. If you layer in the fact that data is a mirror of a company’s operations and that companies’ operations are rarely perfectly orderly, asking the average person to keep up with your ever-changing data is not reasonable. In a world where so many pieces of information can be retrieved with a natural language query, it’s silly for data to be the one place where that’s not true. Asking a chatbot for data is the superior experience since it removes the need for any specialized knowledge and it’s much, much faster.
The third: there are people for whom text is the best medium for consuming data. This one might feel weird when data visualization is as ubiquitous as it is, but enlightened data professionals have had an inkling of this for a long time. Why else would the advice to include an executive summary on top of your analysis be so common? Some data folks get grumpy about the rest of their work going unremarked-upon, but distillation in bullet points is a pragmatic acknowledgement of the modern knowledge worker’s fragmented attention. Even for people comfortable with interpreting data, the cognitive load of translating charts and stats into a plainly stated conclusion is too much to fit into the cracks of their work day, and this is especially true for a busy executive who may have spent their whole day hopping between meetings being bombarded by information and having to make snap judgments about it. Whether you could do the interpretation yourself or not, there are many situations where an agent translating a data request into something quickly digestible is preferable.
Now some data folks might read the above three theories and feel anxiety, frustration, despair or something else equally negative. Even though you’d be hard pressed to find anyone who says they love being a designated data fetcher, it’s been a lucrative niche and many people have built well-developed professional identities around it. The idea that a chatbot might be better at our jobs than we are is hard on the ego, not to mention nerve wracking when so many companies are cutting jobs and trying to keep teams lean. But I don’t think this behavioral pattern of folks mostly asking analytics agents to pull data for them means there’s no future for data work (beyond, of course, configuring the chatbots and curating their context). My fourth theory for why people mostly ask for data pulls is that a lot of people don’t know what to do with data once they have it.
An under-appreciated manifestation of this phenomenon is the stakeholder who requests their data counterpart to provide them with “proactive insights”. The first time you’re asked to generate these, it can feel like you’ve hit the jackpot. Suddenly you’ve got the business’s blessing to go think creatively, to do some real exploratory analysis and come back with an influential finding. But you only need to go through a cycle or two of this to realize it’s a hard needle to thread—knowing what’s actually useful for your stakeholder requires a real understanding of the problems they’re grappling with, and you rarely have the experience with their domain to suggest something to them that they haven’t already thought of. Unfortunately, trying to get in front of this by asking the stakeholder for an idea (even a broad one) of where they could use your help digging tends not to work; if they knew what they wanted, they’d ask more directly.
The root cause of this, I expect, is that pesky gap between what people say they want and what they actually do. People love big flashy stories of insights from data changing the direction of their business, like using triadic closure to grow the LinkedIn social graph, but most people just want clarity about where to look next. The same people who ask for proactive insights are also usually happy with a barebones qualification metric that filters out noise for sales reps struggling to keep up with a high volume of free trials, or with a simple product analytics funnel that helps them understand that the copy on a button that completes a conversion flow is ambiguous. Being impactful does not require being a revolutionary, just identifying tractable problems and getting the right people working on them.
I think this is the next big bottleneck in the data world: recognizing situations where data can be genuinely clarifying and knowing how to get it into the right state to make it a problem-solving tool[2]. AI will certainly play a role in that, but it’s as much of a tool to be wielded by observant and solutions-oriented people as the data itself.
For a data professional, this is no time to fall asleep at the wheel. It’s your game to lose. New technology is lowering the barrier to entry on your specialty, but when the boundaries between different jobs are so fluid, there’s no reason why you shouldn’t feel entitled to yolo your way into other folks’ areas of expertise. I promise they’re not any smarter than you, and while you may not know what to do next either, creating clarity, even imperfect clarity, will always be valued.
[1] I should note that while I am gainfully employed by a company building analytics agents as one of its core offerings, I am not speaking on their behalf in this post.
[2] Hand in hand with this is recognizing situations where data is not useful and gracefully removing yourself from them.


I can relate to most of what you’re saying.
Back in the 90s I worked for a company called BusinessObjects which, for those that don’t know, was an ad-hoc query tool. It was aimed at allowing the business user to run their own queries and not need to use standard reports and dashboards developed by a central data team.
I remember an early sale to British Steel. When they got their hands on the software, they discovered that they were using argon twice in the production line (at least that’s how I remember it). They were able to play around with the data rather than being reliant on the data team to build reports for them.
I think the world has lost its way a little as we seem to be going back to the old days of centralised data teams producing reports … it’s a shame.