In the last few years, there’s been a noticeable shift at cutting-edge organizations in how data teams are structured. No longer are data engineering and data warehousing a backwater of the software engineering team. No longer is data science walled off from the data engineers who control the data needed to build models. No longer are data analysts at the mercy of a long development cycle before they have access to a new metric in their visualization tool.
The biggest change, however, is that the makeup and organization of data teams are being designed, not left to chance.
Three Key Functions
“Data team” is an overly broad term, I know! To be more specific, I believe that there are three functions that make up an ideal data organization:
- Data Engineering
- Data Science
- Data Analysis
Some brief definitions:
The Data Engineering team is tasked with gathering data from each source, ensuring its validity, and delivering it to data warehouses, data lakes and other sources used by the Data Science and Data Analysis teams.
The Data Science team uses the data provided by Data Engineering to make predictions, run statistical analysis and build models to power products such as recommendation engines and personalization on a website.
The Data Analysis team uses the data provided by Data Engineering to answer questions about what has already happened, and uncover insights that help the business make decisions and inform customers.
Depending on the size and needs of a given company, these teams may exist under different names or forms. For example, data analysis is a standalone team in some companies, while in others analysts are embedded across departments. However, as I study and work with great organizations, I find they all have, or are converging on, a structure that supports all three functions and their ability to interact efficiently.
Because there’s no one-size-fits-all design of a data team, I find it helpful to learn from real examples. Thankfully more teams are publicly sharing their team designs and insight into their data strategy. Here are some of my favorites.
A recent post on the dbt Blog profiles the design and strategy of seven different data organizations. It includes some great companies like HubSpot, Away Travel and more.
Dimitri Masin, the Head of Analytics & Data at Monzo, writes in detail about building his data team from 1 to 30 people. It’s a rare look at a data team growing from the very beginning. He shares not only his journey but also his decision making along the way.
Also from the dbt Blog, Sagar Velagala (Operations Manager at Lola.com) describes how he supports the entire company of about 100 people as a single-person analytics team! Though an extreme example in my opinion, it’s incredible to see how he leverages products such as Snowflake, dbt, Stitch and Looker to enable himself and others in the organization to execute a clear data strategy.
As you read through each, you’ll notice that the differences in the design of the teams are based on the realities of each business and what tradeoffs the data leader is willing to make. In designing a team, there’s a lot to learn from how similar companies have done it.
The “Absolute Musts”
Though the design of a data team varies, there are a few constants that I feel are most important:
Don’t invest in data science before you have data engineering in place
I wrote a more in-depth post about this, but in summary: without the data they need to do their work, data scientists will struggle to produce any real value. The same is true for a Data Analysis team. Data is their oxygen.
Sound obvious? Sure, but it’s a mistake that’s repeated often. Just like leaving data team design to chance, many organizations assume that someone in IT or Engineering will be able to get data scientists and analysts what they need. That assumption is wrong.
Assign a leader of your data strategy and give them true authority
Just as leaving data team design to chance will leave you in a pinch, so will taking on the challenge without a leader who has both vision and authority. Making data strategy a portion of another leader’s job means that it won’t get the attention it deserves.
As far as vision, choose a leader who has experience but is able to apply their vision to the specific needs of the organization. Can they adapt their vision to a startup if they came from a larger organization? Will they invest in data science and machine learning before the rest of the organization is ready? As with any hire, fit is important.
Authority is also key. If they don’t have a seat at the table alongside the leaders of Engineering, Marketing and other functions, their strategy is likely to play second fiddle. If you’re serious about making data a driver for your business, then empower its leader.
Create an explicit line item in your budget for the data team
If you don’t already have a data team, it can be hard to accept that the additional expense is necessary. However, don’t assume that you can get by on a shoestring budget for data. It’s not just about how much you spend, but about being explicit and honest about your budget. Set expectations on budget with the leader of the data organization and then let them move forward. Don’t make them fight and claw for a piece of someone else’s budget.
It’s been fascinating to see data organizations mature over the years. I’m delighted that data engineering, data science and analytics are no longer just buzzwords, but often their own departments with VP and even C-level leaders. The landscape will continue to evolve, and that’s a good thing.
Cover image courtesy of https://pixabay.com/de/users/geralt-9301/