You sit down at your computer for a day of work. It’s Wednesday, the company-wide no-meeting day at the startup where you lead a data team, and even though you do still have one meeting on your calendar, it’s only an interview debrief. You open up Slack, hoping this might be a day where you actually have a little time to think for a change.
You already have 50 DMs. You live on the US west coast so it’s not uncommon for you to wake up with a dozen or so from folks further east who have already started their day, but this many means there’s probably something going on. It turns out you’ve been tagged into a thread in the operations team’s Slack channel. They just had a big meeting where they were discussing business metrics and everyone was feeling motivated to spend more time immersed in them. You smile at this as you scroll through the thread, pleased with the depth and nuance of their discussion, but then you see where your name was introduced to the thread. The manager of the team suggested everyone should set the goal of learning SQL by end of quarter and @’d you to ask if you could teach a class or recommend any online courses.
You’d love for folks to be able to self-serve more with SQL but then think nervously about your data warehouse. It’s in a relatively good state but still has a confusing blend of modeled dimension and fact tables, and far more unmodeled clickstream data, replicated application databases, and data from a dozen or so SaaS products brought in by an ingestion tool. There are so many time grains, so many keys that seem like they should connect but don’t, so many entities defined in subtly different ways depending on where the data is coming from… Turning the ops team loose without guidance in your data warehouse feels risky.
Rather than say as much in the thread, you pull aside the manager in a DM and explain why simply learning SQL won’t be enough. You don’t want to leave them stranded so you offer to write some documentation on your most important tables and to start holding weekly office hours to help them troubleshoot their queries. Before they have time to respond, though, you need to join your hiring huddle.
It’s for a junior analyst role on your team, one that you envision will be able to field many of the ad hoc questions you receive from your marketing team. You look through the scorecards and start to get excited. It looks like everyone loved this candidate! But then you see that the demand generation manager, who you included in the loop to make sure you had cross-functional buy-in, is a no. She summarizes her feedback in the huddle meeting and her comments make you think that she may not be well-calibrated on what it means for someone to be a junior analyst. You ask her to talk individually after the meeting.
“I’ll be honest,” she starts immediately after joining the call, “I think what I really need right now is someone who can follow instructions and knows Python.”
This is completely out of left field for you. “Python? Do you mind if I ask what you’re trying to use it for?”
She pulls up the notebook tool your company uses, where she has some boilerplate code for fitting a regression. “I was reading about media mix models,” she says, “and I thought, this technique is so powerful, why not apply it to all the campaigns I’m managing to figure out why they’re working?” She goes on to explain that the library she’s using keeps throwing errors that the notebook’s built-in AI copilot isn’t able to fix, then scrolls through the notebook to show you.
She is moving quickly but you see enough to know what’s wrong. “Your errors are actually coming from violating the assumptions of this type of model,” you tell her, and then give her a high-level explanation of how regressions work as you silently thank the library’s authors for building in these checks. “You don’t need someone who knows Python,” you conclude, “you need someone who knows statistics.”
Although she’s skeptical, you work through the model specification with her a bit and get a version of it running without errors, even if it doesn’t perform well by any means. She’s ultimately persuaded and you tell her to direct future stats questions to the marketing scientist on your team. Then you steer the conversation back to the interview debrief so you can talk about how a junior analyst is different from a senior analyst. You both leave relatively happy.
You return to Slack and your notifications are blowing up again, this time because of a group DM with you, a newly hired product manager, and his boss. Their group is collectively responsible for your site’s checkout experience, and they’re rehashing evidence and theories for why conversion rate has been declining since February.
You sigh. They’ve been at this for almost a month. This new product manager got put on this investigation in his second week, and while you were grateful your team didn’t get pulled in too deep, you’ve also been nominally helping this PM by pointing him to existing dashboards and helping him find the right events to construct funnels in your product analytics tool. There haven’t been any obvious smoking guns and the pair’s questions have been getting increasingly and irrelevantly niche. You feel like they’ve been spinning their wheels, so you finally suggest to them that data probably can’t answer this question. The conversation pauses for a moment, but when they start up again they seem receptive to the sentiment, if not relieved. They’d been feeling that way themselves but worried that suggesting they’d reached a dead end would look like giving up, and they’re grateful to have you validate that perspective.
You get through all of this and your no-meeting day is more than halfway through. You’re a little frustrated but you’ll still take it; at least you now have a few hours to yourself. Maybe it’s time for you to turn off notifications…
“These days, everyone’s a data professional”
It’s hard to argue against the idea that data is a part of most tech jobs these days. Folks in senior and leadership roles especially are expected to have a quantitative story about how their area is doing and a command of the handful of metrics that help them build that story. Some folks observe this trend and make a stronger claim: working with and interpreting data is a part of so many people’s jobs these days that we’ve reached the point where everyone is a data professional. Perhaps data is a skill that other folks have and shouldn’t need to be a job in its own right.
I have my doubts that everyone who works with data is prepared to do so, but there’s something to be said for this perspective since technological advancement has a way of collapsing distinctions between roles. Generative AI is a topical example, but before that cloud data warehouses made it possible to do data engineering without the systems knowledge, scikit-learn made it possible to do machine learning without understanding the math, Tableau made it possible to build advanced data visualizations without having to code them up yourself, and so on and so forth. What feels like the core of a job today could very easily be a vendor or framework used by some other role tomorrow.
But asking whether data is a job or a skill feels a bit like it’s setting up a false dichotomy to me (doesn’t getting a job usually mean demonstrating you have the requisite skills to do that job?), or at least like it’s homing in on the wrong core skills to expect of someone whose entire job is data. Is a data professional’s job writing SQL, or is it knowing how to unify heterogeneous data from multiple conflicting sources to get a useful representation of your business? Is it knowing the APIs of Python libraries, or is it understanding how to frame a problem in a way that allows you to apply a statistical method and get an answer that isn’t falsely rigorous nonsense? Is it knowing how to find and pull data enough times that you finally get the answer to a seemingly unanswerable question, or is it knowing what types of questions quantitative data can actually be used to answer so you can stop digging before you’ve poured weeks of effort into them?
This is definitely a weird time for the tech industry, with mass layoffs and generative AI making everyone wonder what their job will look like in 5 years, if it continues to exist at all. As if trying not to be the slowest runner when getting chased by a bear, you see folks claiming that roles adjacent to theirs will either be automated away or else folded into what they do, or implying everyone else’s job was a zero-interest-rate phenomenon. Once the consolidation is done, such folks argue, their role (the One True Job!) will be the only one left on the org chart.
Data professionals, perhaps by virtue of being adjacent to so many things and hearing so many folks claim data will be folded into their jobs, seem to be especially nihilistic about job consolidation. I can’t tell you the future any more than your (least) favorite LinkedIn influencer, but at least for me, it feels like a positive sign to see so many people emphasize the necessity of data skills and scramble to learn them. In a world where tools reduce the barrier to entry, the higher-level skills and the understanding of when they’re relevant and how to use them become more differentiating. The way that those skills and understanding get grouped together into jobs will certainly change over time, but that doesn’t make them any more or less necessary. The thing that makes specialists worth hiring tends to be the size and complexity of the problems that specialists are good at solving.
Imagine a solo founder, running a company entirely on their own. They start by doing everything, but eventually this becomes infeasible simply due to time constraints. Some part of their job starts taking up enough time that they need to hire someone else to do it for them. Maybe they start off building a no-code product and decide to hire an engineer when they realize they need some more custom functionality. Or perhaps they’re an engineer themself and need to attract customers beyond their first few design partners, so they hire someone to write blog posts for a content marketing play.
In my own role running a data organization, recruiting is similar. I’ve built teams at both my previous jobs and my current one, and you’d better believe that’s built my skills at recruiting. I’ve gotten much better at writing job descriptions, writing and conducting interviews, and selling candidates. I’ve become familiar with job-specific tech and tools like Gem, LinkedIn Recruiting, and Kamsa. I’ve even run entire hiring loops myself in the past and could do it again if I needed to, but I would really prefer not to. Recruiting is a demanding job that requires specialized skills and lots of experience to do well. Even though (or perhaps because) hiring the right team is one of the most important parts of my job, I am grateful to partner with a specialist.
Roles split and fork in these ways as companies develop, driven by the size of the need for certain skills. It follows, then, that the need for data jobs is driven by the data parts of other jobs becoming complex and time-consuming enough that they need to be spun out into a separate role. Tech and tooling advances change where that line is drawn, but it’s hard to imagine a world where the line disappears entirely.
I suspect the implicit fear of becoming irrelevant is what drives so many data professionals into a weirdly ahistorical spiral of self-loathing and self-flagellation. We seem to get new job titles every hype cycle or so, and while some of that is cynical hype-wave-riding, it’s also recognition of the fact that the way we work evolves quickly, even if the core tasks remain similar. Talk to enough grizzled veterans and you’ll hear that even though we’ve called ourselves statistical analysts, BI developers, data scientists, analytics engineers, and so forth, our industry has been solving the same sorts of problems for decades. They’ll grunt bitterly that whatever we call ourselves, we’re still stuck spinning our wheels, doing something truly Sisyphean by trying to make companies “data-driven”.
It does stand out to me that you don’t hear marketers talk about their skills and jobs this way. None of them are lamenting that customers still need to be made aware of all the great products and services that could solve their problems, even after all these years of advancing marketing techniques and all these improvements in martech. Marketers know they have sets of skills that address a core need of businesses, and that the way they wield those skills has had to evolve as the industry and world did. Why shouldn’t we think of data skills, whatever job they’re grouped into, the same way?
I think there is way too much self-flagellation going on in the so-called "modern data stack" community. There is no doubt about the utility of the modern data stack. All that happened was that rates were kept too low for too long, and valuations were too high for too long, and they're not living up to those.
That's all. And it's not funny how poor ChatGPT is at data analysis - occasionally we see some value there and get scared. But I'm positive it's been trained on Kaggle answers, and you know the quality of the average answer there!
I like the Marketing comparison! Definitely not just an issue for data teams. While we get caught up in the specifics of our field, what we're wrestling with generalizes to: Tool makers will always try to make jobs easier, non-pros will always think they can make do with those tools, and pros need to explain the pitfalls that non-pros don't know to look for to justify their involvement.
In my first career as a UX Researcher, this same thing played out / is still playing out when companies say, "talking to customers is everyone's job." At the end of the day: Professional training is still going to be valuable. If you can look at this trend as an opportunity for scale by providing structure, you'll continue to be valued.