A Librarian Walks Into a [Data] Warehouse...

Introduction

What is an analytics engineer?

I’ve spent my career traversing the data stack: chasing business metrics as an analyst, writing pipelines as a data engineer, and eventually leading the teams trying to make those two sides talk to each other. Out of everything I’ve done, analytics engineering is where I’ve grown to feel most at home. When asked about my job, if I tell people I do “data engineering,” they assume I manage servers or do database administration all day (wrong). If I say “data analytics,” they assume I make pivot tables in Excel (wrong). Analytics engineering sits in the messy, high-stakes void between those two worlds and… “the business”.

The job is only fractionally about writing code, and it isn’t so much about directly answering business questions. It’s listening to what people are pursuing, where they’re running into friction, picking up on things between direct questions. The tools they’re using, questions they have, connections they’re missing. From the smaller pieces I’m building the puzzle, piecing together the semantic foundation of the organization or problem space.

Analytics engineering is a relatively young field, and like every other technical role, its long-term fate under AI is still up for debate. Good data is gasoline for AI, though. With the current gold rush demanding pristine, well-structured data to train on, analytics engineering is not slowing down anytime soon. What I don’t want to do is bore you with technical jargon or the endless tools you might find yourself having to learn and adapt to. With that in mind, let’s talk about the history of the role, what the day-to-day looks like, the real skills required, and the highly specific headaches that come with it regardless of domain or tech stack.

Three Jobs in a Trench Coat

mindmap
  root((🐙 The Analytics Engineer))
    🏗️ Modeling
      Picking a structure you won't regret in two quarters
      Tracking history without rewriting it
      Conforming dimensions across six warring data marts
      One definition of "customer" that survives a reorg
    🔧 Engineering
      Version control, because hope is not a strategy
      Tests for the bugs you can already smell
      Pipelines that survive Mercury retrograde
    🎯 Product
      Turning "we need a churn dashboard" into a real metric spec
      Saying "no" without making enemies
      Hallway requirements vs. real ones
    🕵️ Detective Work
      Why is Q3 off by exactly $14,000
      Tracing a metric through six dashboards
      Finding the one analyst who knows

The work itself had been taking shape for a few years before the role had a name: cloud warehouses got cheap, SaaS ingestion tools handled the plumbing, SQL literacy spread, and in the midst of it all “the modern data stack” was born. Tristan Handy at dbt Labs popularized the name for the role in 2019, describing it as the missing seat at the table between data engineers and data analysts. That framing has aged pretty well, but in practice the job is even messier than he suggests. The career manifested because someone had to sit between the infrastructure and the insight to govern consistency (hello, data contracts). Some would tell you “it’s a made-up title”… to that I say “all titles are made-up titles”.

The analytics engineer sits at the crossroads of data modeling, software engineering, and product ownership.

It’s the ability to envision and drive forward data products at that intersection that makes an analytics engineer great. Inevitably, the role morphs to fit whatever gap the org has. But at its best, when data modeling, governance, cross-team negotiation, and engineering are all in harmony, it’s a different job entirely from any one of those pieces on their own.

Tesseracts, Not Cubes

Sorry, Kimball. Strict dimensional modeling is so last century.

In a world that demands instant answers with split-second attention spans, you model with directness and anticipation. You come to the table with a picture already in mind, and that picture has to change based on who’s looking at it. You’re taking raw, garbage data and the same “ambiguous” requirements you get asked about in interviews (but for real) and forcing them into structures that can serve a dozen different use cases tracing back to the same root story. You see the forest for the trees, and know it’s home to many species.

Today’s question is the tip of the iceberg: if one person asked it now, ten more will ask it tomorrow, and someone six months from now will ask a version of it about a business context that doesn’t exist yet, in a market the company hasn’t entered. The model has to have a clean answer waiting, or at minimum, a clear path to one that doesn’t slow down decision-making.

Semantic modeling is the practice of encoding business concepts (metrics, dimensions, entities, relationships) into a shared, governed layer that maps plain language to technical implementations, and of building the structures that hold those definitions in place. It’s a core skill of analytics engineers. “Revenue” has to mean the same thing whether Finance is closing the books, Sales is forecasting their next quarter, Marketing is calculating CAC payback, or the Product team is checking whether a feature is worth the time investment.
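To make that a little more concrete, here is a minimal sketch of the idea in Python. It is not how any particular semantic layer tool works; the `Metric` and `SemanticLayer` classes, the `fct_orders` table, and the `net_amount` column are all names I made up for illustration. The point is only that one plain-language name maps to exactly one definition, and every team's query is rendered from it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """One governed definition of a business concept (all fields are illustrative)."""
    name: str        # the plain-language name people actually say
    expression: str  # the technical implementation, here a SQL aggregate
    source: str      # hypothetical table the metric is computed from
    owner: str       # team accountable for the definition


class SemanticLayer:
    """A tiny registry: one name maps to exactly one definition."""

    def __init__(self) -> None:
        self._metrics = {}

    def register(self, metric: Metric) -> None:
        key = metric.name.lower()
        if key in self._metrics:
            # A second "revenue" is a governance conversation, not a silent overwrite.
            raise ValueError(f"metric '{key}' is already defined")
        self._metrics[key] = metric

    def compile(self, name: str, group_by=()) -> str:
        """Render the same definition for whichever dimensions a team asks for."""
        metric = self._metrics[name.lower()]
        columns = [*group_by, f"{metric.expression} AS {metric.name.lower()}"]
        sql = f"SELECT {', '.join(columns)} FROM {metric.source}"
        if group_by:
            sql += f" GROUP BY {', '.join(group_by)}"
        return sql


layer = SemanticLayer()
layer.register(Metric("Revenue", "SUM(net_amount)", "fct_orders", "Finance"))

# Finance and Sales slice by different dimensions but share one definition.
finance_sql = layer.compile("revenue", group_by=("fiscal_quarter",))
sales_sql = layer.compile("revenue", group_by=("region",))
print(finance_sql)
print(sales_sql)
```

Both queries differ only in the grouping columns; the `SUM(net_amount)` expression comes from the single registered definition, and trying to register a second "Revenue" fails loudly instead of quietly forking the meaning.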

I’ve come to think of what I’m building less as data “cubes” and more as data “tesseracts”: structures that hold their shape across more dimensions than you can comfortably visualize at once.

This goes well beyond standard dimensions, but picture it: time, business unit, customer segment, product surface, fiscal calendar versus calendar year, all rotating around a single definition without distorting it, served to whatever system needs it at the latency and in the format it’s needed. If you get this wrong, you don’t just have ten dashboards that disagree; you have ten teams making decisions in ten different realities, and maybe even customer-facing mishaps. It takes art and discipline to let data flex without letting it fracture.

You’re not the one deciding what the data means. You’re the one writing it down and holding people to a standard systemically (because no one wants to be the office pedant). When someone walks up and asks “How do we measure retention?” or “What counts as an active user?”, they’re not asking a technical question. They’re asking a semantic question (maybe even a political question), and the answer probably already exists in fragments scattered across the organization. The job is to be the scribe who pulls those fragments together, the mediator who gets product and bizdev to agree on a single definition of “active user”, and the engineer who implements the governance layer that keeps that definition consistent once it ships.

Standing on Business

pie title Analytics Engineering Day-in-the-Life
    "Translating panic into the actual question" : 30
    "Chasing down rogue numbers before they spread" : 25
    "Ownership and 'who maintains this' conversations" : 20
    "Writing the spec, contract, and definition down" : 15
    "Modeling the thing once everyone finally agrees" : 10

Analytics engineers build data products. This amounts to treating data as an asset whose outputs outlive the conversation that created them, get used by people who will never know who built them, and have to hold up under scrutiny.

A large chunk of making that successful is fielding product managers, or half-playing one: figuring out what people need to run the business instead of what they frantically asked for in the hallway or demanded in a midnight ping. That means holding the line when a request doesn’t match the underlying question, and translating panic into something the data can honestly answer. Done right, it’s the horizontal bar of the T-shaped professional: a generalist willing to get in the trenches, unfazed by moving targets, and stubborn enough to keep asking “but what action or outcome are we hoping to drive?” of every person they encounter.

I cannot tell you how many times someone on an ops team has run a quick query against a production replica, gotten a number, dropped it in an exec slide, and by Thursday it’s the number. Three teams are already planning around it before I even hear about it. The query joined on the wrong key. The number measures the wrong thing. But it’s in a deck now, so it’s real. A huge part of this job is catching those before they harden into gospel, and building something trustworthy enough that people reach for it instead of the quick query next time.
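For anyone who hasn't watched a wrong join key inflate a number in real time, here is a small, hedged illustration using Python's built-in `sqlite3` module. The `orders` and `payments` tables and their rows are invented for the example. Joining on `customer_id` instead of `order_id` fans out the rows, so every payment is counted once per order the customer has.

```python
import sqlite3

# In-memory database with made-up data: customer c1 has two orders, c2 has one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id TEXT);
CREATE TABLE payments (
    payment_id INTEGER PRIMARY KEY,
    order_id INTEGER,
    customer_id TEXT,
    amount INTEGER
);
INSERT INTO orders VALUES (1, 'c1'), (2, 'c1'), (3, 'c2');
INSERT INTO payments VALUES (10, 1, 'c1', 100), (11, 2, 'c1', 50), (12, 3, 'c2', 70);
""")


def total_paid(join_key):
    """Sum payments after joining to orders on the given column.

    join_key is interpolated into the SQL only because it comes from the
    fixed strings below, never from user input.
    """
    sql = (
        "SELECT SUM(p.amount), COUNT(*) "
        f"FROM orders o JOIN payments p ON o.{join_key} = p.{join_key}"
    )
    return conn.execute(sql).fetchone()


wrong_total, wrong_rows = total_paid("customer_id")  # the quick-query version
right_total, right_rows = total_paid("order_id")     # the intended grain
payment_rows = conn.execute("SELECT COUNT(*) FROM payments").fetchone()[0]

print(f"joined on customer_id: total={wrong_total}, rows={wrong_rows}")  # total=370, rows=5
print(f"joined on order_id:    total={right_total}, rows={right_rows}")  # total=220, rows=3

# A cheap guardrail: a join that should be one-to-one must not add rows.
if wrong_rows > payment_rows:
    print("row count grew after the join: check the join key")
```

The query runs without error either way, which is exactly why the 370 makes it into the deck. Comparing row counts before and after the join is one of the simplest tests to put around a model whose grain matters.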

And even when the numbers are right, nobody can agree on who’s responsible for keeping them that way. The 2026 dbt State of Analytics Engineering report found 41% of teams still cite ambiguous data ownership as a top challenge (unchanged from the year before) while 83% call trust in data a top strategic priority and poor data quality remains the field’s single biggest obstacle. The field has had years of better tooling and better warehouses. Ownership is still a mess. Analytics engineers aren’t the deciders (I will keep saying this), but you’re usually the one facilitating the conversation, doing the research, and holding the definition steady while someone else calls the ultimate shots about what matters most to which part of the organization.

Without someone holding the middle, every team builds their own version of the truth in a vacuum. You make the canonical answer easier to reach than the wrong one, defend the definition when someone wants to bend it for their slide deck, and do it in a way people don’t take personally (smile!). Done well, the asks slow down because the answer is already sitting there waiting for people to use at scale.

Walking a Crooked Mile

Few grow up knowingly aspiring to a career in data (fun fact: I stumbled into SQL around age 12 trying to maintain a database for a text-based MUD). The reality is most people fall into analytics engineering specifically on the road to something else. You chase business questions until you need to learn SQL, learn SQL until you need to automate something, automate until people start depending on the outputs, and then you’re in a room explaining to a VP why their dashboard and their analyst’s spreadsheet don’t agree. At some point you realize the long and winding road was a career trajectory all along.

The shape of the analytics engineer role changes depending on where in an organization you live. At a startup, the analytics engineer might be the entire data function: modeling, pipelines, governance, and the person explaining to the CEO why their favorite metric is lying to them. At an enterprise, the role narrows: semantic layer person, data contracts person, the one who owns lineage. The headaches are different; the skills are the same.

As an analytics engineer, I’m a logical-librarian in a machine. I don’t own the stories the data tells, and I don’t own the needs people have for it. I decide where things live, what they’re called, who gets to change them, and how the next person walking in finds the right shelf. And like any good librarian, I’m caretaking a living garden. I spend a large chunk of my time weeding: retiring the models that stopped getting queried, pruning the definitions that drifted, deprecating the table someone built two years ago that’s still in production (for reasons no one can explain). I write it down, defend the definition, and (if I’m lucky) I don’t have to argue about it again for at least six months.

Now, I’ve got a model to migrate and a definition to defend.

Have your own thoughts on analytics engineering you’d like to talk about?

Do you run a conference you’re interested in booking me to speak at?

Reach out on LinkedIn!