Katie Bauer
Wrong but useful. Tweets about data work, careers and teams
Sep 10, 2023 9 tweets 2 min read
Over time I've started to think that data is not fundamentally a technology job, but rather a technologically-assisted job. People have been doing similar work for centuries, but with cruder methods and tools. Modern data scientists and analysts operate at MUCH faster and larger scales than, say, 18th century demographers trying to extrapolate about populations to levy taxes on them. But analytics as it's practiced in modern businesses is still a similar discipline
Apr 28, 2023 14 tweets 3 min read
Over the years I’ve managed dozens of data scientists, analysts and analytics engineers across multiple companies, and it stands out to me that the most common theme I see in feedback on data folks is that their stakeholders wish they would say what they thought more.

Yes, really. If you work in data, you are probably pretty comfortable looking at tables of metrics or charts and forming an opinion about what they mean. It doesn't matter if you’ve not had formal training in it--it's a daily part of data jobs and that familiarity means comfort with the task
Dec 9, 2022 11 tweets 2 min read
I don't like "enable better decision making" as the goal of data teams. It just doesn't do justice to all the other work that data teams need to do as a part of getting to that one most-visible piece of their mandate. Instead, I've been telling folks that my org's mission is to make the company's data more valuable, and as a head of data, it's my job to think holistically about what data we have, how we can use it to improve our business, and what gaps we need to close to enable those outcomes
Dec 8, 2022 6 tweets 3 min read
@pdrmnvd A highly idealized version:

1. Clarify and refine your overarching question with the stakeholder, then think through sub-questions you might need to answer along the way
2. Ask teammates or search knowledge/write up repos for prior art on the subject
3. Inventory available data to figure out what's actually possible / feasible. Potentially update plans from step 1 and check in with the stakeholder
4. Check data for integrity and quality. Potentially debug queries or data sources depending on what I see
Sep 3, 2022 9 tweets 3 min read
@vboykis in my days working for a certain popular news aggregation platform, i remember running an experiment for a new recommendation algorithm. it was relatively naive, just recommending similar subreddits as measured by subscription overlaps using jaccard similarity. we hadn’t run a lot of recommendations experiments at that point, so we also included a variant that was totally random, just to gauge the impact of recommending content generally
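For readers unfamiliar with the technique, here is a minimal sketch of overlap-based recommendations next to a random baseline. It is not the platform's actual implementation: the function names, the toy `subscribers` data, and the tie-breaking rule are all assumptions made for illustration.

```python
import random


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: |A ∩ B| / |A ∪ B| (0.0 when both sets are empty)."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def similar_communities(target, subscribers, k=3):
    """Rank the other communities by subscriber overlap with `target`."""
    base = subscribers[target]
    scored = [
        (jaccard(base, subs), name)
        for name, subs in subscribers.items()
        if name != target
    ]
    # Highest similarity first; ties broken alphabetically (an arbitrary choice).
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:k]]


def random_communities(target, subscribers, k=3, seed=None):
    """Baseline variant: recommend k other communities uniformly at random."""
    rng = random.Random(seed)
    candidates = sorted(name for name in subscribers if name != target)
    return rng.sample(candidates, min(k, len(candidates)))


# Hypothetical toy data: community name -> set of subscriber ids.
subscribers = {
    "python": {"ana", "bo", "cy", "di"},
    "datascience": {"ana", "bo", "cy", "eve"},
    "cooking": {"di", "fay"},
    "gardening": {"gus"},
}

print(similar_communities("python", subscribers, k=2))
print(random_communities("python", subscribers, k=2, seed=0))
```

Comparing both variants against a no-recommendation control is what lets you separate the effect of the similarity ranking from the effect of showing any recommendation at all.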
Jul 30, 2022 13 tweets 3 min read
a difficult part of managing data teams is that they're a lot flatter than, say, engineering orgs. while there are many things to love about flat organizations, they require their leaders to context-switch between thinking at a few levels of abstraction. all people leaders start life as front line managers, folks who directly manage individual contributors. they have usually done the job they're overseeing and are largely responsible for quality of work. they probably have direct knowledge of most of their team's daily activities
Jul 10, 2022 14 tweets 3 min read
i have mixed feelings about this one roundup.getdbt.com/p/analytics-is… i agree it's not the data team's job to convince the company it needs to use data, and i agree it's worth being explicit about this since a lot of companies seem to operate under the model of

1. hire a data scientist
2. ???
3. be data-driven!
Jun 17, 2022 13 tweets 3 min read
"data scientist" is a vague job title, but the question in my mind is whether it'll become more like "webmaster" or "software engineer". both are high level terms that could entail more specific types of work, but webmaster implies you do everything whereas software engineer suggests a family of jobs that people may step into or move between depending on what their situation requires of them
May 18, 2022 11 tweets 2 min read
relevant thread re: the conversations about tools and skillset vs. impact that have been circulating on data twitter recently. tools and skillset are a reflection of previous work environments, and while this may seem unfair, there's a reason for preferring those backgrounds. as we’re seeing analyst-style roles become more technical (by which i think we all really mean more like software engineers) we’re starting to see a big shift in working styles to be more team-oriented
May 8, 2022 8 tweets 2 min read
I've seen it said that engineering teams should have 6-8 developers because that's enough people for a sustainable on call rotation. The thing I like about this most is that it's org design based on constraints rather than demand. Implicit in this guidance is the recognition that all systems will eventually need maintenance to keep working, and if you don't budget in time for this, the team will fall behind on scheduled work
Apr 1, 2022 11 tweets 2 min read
I'm starting to think that a big part of why data folks hate the helpdesk or "service" workflows is because it implicitly means you are responsible for a miserably broad scope. It's hard to do data work about a subject you have little context for, especially if you don't know about the relevant data or the stakeholder who's making the request in the first place. If literally anyone in the company can ask you about anything, you're gonna have a bad time
Dec 2, 2021 17 tweets 4 min read
Hi, I'm Katie! One of the main things I use this account for is writing threads about data science management. This Thread of Threads™ is a collection of some of my favorites. What is the difference between a data science manager and an engineering manager?
Nov 28, 2021 6 tweets 1 min read
Data scientists share superficial similarities with software engineers, but they create value in different ways. For evidence of this, look no further than when a PM makes a request of them. Requests made of software engineers are generally vetted somehow. By the time a SWE hears from a PM, it’s because there’s a specific problem they want solved
Nov 18, 2021 12 tweets 2 min read
It's high planning season, which means it's again time to decide how the hell we measure data teams' impact. While this has always been a source of handwringing for data leaders, this year I've got two categories of metrics that I not only like--I'm actually planning to use them. The first category is Adoption of Artifacts. It's very much in the tradition of treating-data-as-a-product and straightforward to formalize--for any platformy or tool-like deliverable your team creates (dashboards, tables, libraries, etc.), how many people use it and how often?
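One way the "how many people use it and how often" question might be formalized, as a rough sketch only: the `usage_log` format, the artifact names, and the three metric names below are assumptions for illustration, not the author's actual definitions.

```python
from collections import defaultdict


def adoption_metrics(usage_log):
    """Summarize adoption per artifact from (user, artifact) usage events."""
    users = defaultdict(set)   # artifact -> distinct users who touched it
    events = defaultdict(int)  # artifact -> total number of uses
    for user, artifact in usage_log:
        users[artifact].add(user)
        events[artifact] += 1
    return {
        artifact: {
            "distinct_users": len(users[artifact]),
            "total_uses": events[artifact],
            "uses_per_user": events[artifact] / len(users[artifact]),
        }
        for artifact in users
    }


# Hypothetical usage events, e.g. dashboard views or table queries.
usage_log = [
    ("ana", "revenue_dashboard"),
    ("ana", "revenue_dashboard"),
    ("bo", "revenue_dashboard"),
    ("cy", "orders_table"),
]

for artifact, stats in adoption_metrics(usage_log).items():
    print(artifact, stats)
```

In practice you would likely bucket events by week or month so the numbers show whether adoption is growing or fading, rather than a single all-time total.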
Oct 15, 2021 9 tweets 2 min read
No matter where you sit in an organization, it's easy to look at the level above you and wonder what the hell they're thinking. When you step into a leadership role yourself, it can be a shock to realize that the folks on the level below are thinking exactly the same about you. Controlling chaos is a central goal of organized life, and as your role changes, you find yourself faced with classes of problems you never could have anticipated. Your job as a people leader is to prevent them from being passed down the management chain, which is no small feat
Oct 11, 2021 10 tweets 2 min read
Few phrases have more power to demoralize a data scientist than "product intuition." It evokes carelessness, a lack of discipline, and sometimes even tyranny. It's the favorite justification of a product manager who does whatever they want regardless of what anyone else says. It reminds DSes that some people don't care about numbers and probably never will, and worse yet, that those types of people are leaders at the companies we work for. Bringing empiricism into environments like this feels impossible, but it can be done with a careful approach
Sep 21, 2021 13 tweets 2 min read
Every so often I see a Twitter thread where a new manager asks for advice, and inevitably, someone in the comments below says something to the effect of, "hire awesome people, listen to them and get out of their way"

Let me tell ya, this is not advice. It's wishful thinking. This advice has its roots in the servant leadership philosophy, which holds that a leader’s primary job is to help their people be great at what they do.
Aug 28, 2021 32 tweets 5 min read
DS teams' work is often referred to as "consulting," an appropriate term for swooping in and lending your expertise to a group you might not be a permanent member of. I generally find the consultant framing useful, altho at first blush some concepts may not seem transferrable. Take billable hours, for example. Activities that are done in direct service of a client--doing research for their case, planning and executing the work, corresponding with them (in all forms)--the time spent on those things is what you directly invoice your client for.
Jul 18, 2021 8 tweets 2 min read
The push to execute quickly--to "move fast and break things"--is easy to vilify. It thrashes the team, leads to accumulating tech debt, and tends to be motivated by poorly weighed risk-reward tradeoffs. But IMO, speed is not the enemy as much as a lack of humility is. If you're moving fast to test hypotheses and figure out what works in practice vs. theory, you're going through a helpful exercise of reducing risk. It prevents you from investing a ton of effort into scaling something that leads you straight into a dead end
Jun 13, 2021 14 tweets 3 min read
It's fun to speculate about what direction DS as a profession is going, but it's also instructive to dig into how it grew into its modern form. This paper's an example of that, a snapshot from a few years before the term itself was coined circa 2008 projecteuclid.org/download/pdf_1… The author Leo Breiman (who you may know from such greatest hits as bagging and random forests) talks about going from stats academia to working in industry as a statistical consultant. When he eventually returned to academia, he experienced a sort of reverse culture shock
Feb 13, 2021 9 tweets 2 min read
The surest sign that a company has no idea how to work with Data Science is requesting Insights™ as its primary output. You can tell just from the word--it's vague and sort of mystical, which is not exactly how you want to describe your quantitative teams. Yes, data analysis is about convincing someone (perhaps yourself) that something is or isn't true, but it takes a special kind of talent to do it consistently. It's easier to remember your analyses that changed someone's mind because it's easier to remember unusual events