Preview: Interview with Abi Noda for the Engineering Enablement Podcast

I recently had the opportunity to talk with one of my favorite podcast hosts, Abi Noda, who hosts the Engineering Enablement Podcast, and I had a lovely time chatting with him. Stay tuned for the audio to be released on his feed!


Abi Noda:                                              

Scott, so excited to have you on the show today. Thanks for your time.

Scott Burger:                                          

Yeah, thank you for having me here. It’s an honor to be on one of my favorite tech podcasts. I can’t believe I only discovered this podcast recently. It’s something I’ve been listening to on my daily drives every morning to get my brain moving and thinking about engineering metrics, and it really helps prime me for the day.

Abi Noda:                                              

Well, I’m happy to hear that and excited to share your story. I reached out to you because, as I mentioned, I had stumbled across your profile and saw this title of yours, head of engineering analytics, at Qualtrics, which is of course a very analytical, data-driven company. So I’m really excited to dive in a little bit at the beginning here. Let’s start there: share with listeners a little bit about your role and what your focus is today.

Scott Burger:                                          

Gosh, a little bit. The idea of being the person in an engineering org who is wrangling as many data-related cats as possible kind of sums up my job description in a nutshell. Everything from Tableau server wrangling, data governance, engineering operational review metrics, and designing KPIs and metrics for board-level consumption, to things as nebulous as how we handle corporate communication roll-outs for eng features, and basically everything in between. So it’s one of those things where it’s like, “If you can think of it, I’ve done it.”

Abi Noda:                                              

And you’ve been doing this for four years now. How did this role come about originally?

Scott Burger:                                          

I’m always really amazed at how good the questions you ask on this show are; I’m always like, “Well, you just have such a way of diving in.” How did this role come about? I think Qualtrics has always, like you said, been a very data-driven, analytical company. And in the pre-pandemic world, it was just such a growth phase. The engineering org was growing and growing. We had all these different reports and things that needed to be wrangled and made sense of, and there were just a lot of new questions being asked about the business. And this was one of those things where the engineering staff at the time were like, “We’re all pretty analytical people, but we need someone to come in and be the data guy.” And that wound up being me.

Abi Noda:                                              

And originally, what were the primary questions? I know this was right before the pandemic, and so maybe a little bit before the wave of questions being asked about the productivity impact of the shift, which I want to talk about later. But what were some of the biggest questions initially, or was it just a broad view that the data situation needed more focus?

Scott Burger:                                          

Yeah, I think there’s always been a culture of wanting to have the most robust and reliable eng org that we can get. And what are the sort of metrics that help drive that? Things like data center reliability, data center costs, making sure engineers are focusing their time on the right things, code quality, really just, I would guess, the standard things that any company of Qualtrics’ size, which is about three to four thousand people, that kind of mid-sized company, is pretty concerned about. Obviously we’re not a Microsoft size, so we don’t have to worry about problems anywhere near that scale.

And also, we’ve graduated out of the really small startup phase, so we don’t have those kinds of questions to worry about either, but you have those kinds of meaty middle questions of a company in its adolescent stage. And you want to make sure you’re on good footing in order to have future growth on top of that. So I think the things we were initially interested in when I first joined were things like: how are customer pulses being resolved? What does that whole universe look like? It was very much a, “We have a lot of data, but we need ways to organize it, make sense of it, and make it digestible by senior leadership as well.”

Abi Noda:                                              

And where does your… Let me rephrase that actually. Tell me more about your function. Is it just you, or is there a team, and does this sit under engineering, people ops, data engineering? Where does this sit within the org?

Scott Burger:                                          

That’s another great question, really. These kinds of things are never a one-person operation, as much as I would love to take the credit and be like, “Oh yes, I’m the only person doing all of these things.” Functionally it is a team of one, technically, but I’m surrounded by such talent, from our core team of data engineers who kind of do everything across the company, to working with people ops, to working with quality engineering folks, that kind of thing. So really it is me and a handful of people that I rope in from time to time to help with stuff like, “Hey, I need access to this people data over here. Can you grant me this thing? Hey, can I access this financial data set temporarily to build out this thing over here?”

And really it’s a cross-functional team effort wherever the problem applies. So for example, for corporate communications, if we have eng TPMs who are saying, “Hey, we have this new feature that’s going through the launch readiness process and we need to message it,” they need a data set that they can do a listserv out to. And I’m the person that runs that data set, and that data set branches from our brand data to our launch readiness data and financial data and kind of everything in between. And I have to rope in all those people from different sides of the company to get those things working. But it’s a concert, it’s an orchestra, and there may be one conductor at any given time, but the music doesn’t happen without the players.

Abi Noda:                                              

So where does the function sit within… What part of the org?

Scott Burger:                                          

So up until pretty recently this had been in the engineering org, and pretty recently we reorged the company. Before, we had product off in one silo doing their own thing, user experience in another silo doing their own thing, and then of course us in engineering over here kind of doing our own separate stuff, and having to deal with the challenges of people working in their respective silos and all those good things.

And relatively recently we reorged and decided to combine product, user experience, and engineering into PXE, P-X-E, one kind of cross-functional org where we’re knocking down walls and breaking down barriers, so that product people can be more intertwined with the launch readiness process and engineers can have better insight into what product areas their features are going out to, that kind of thing. So the long story, or the TL;DR, is that this now sits under the PXE org, which is this sort of amalgam of product, user experience, and engineering, and it’s effectively an eng operations team that I sit under. But we really try our best to be that kind of interoperable, cross-functional organization.

Abi Noda:                                              

Makes sense. Okay. So now I want to determine where to go. I think getting into this critical set of 50 different KPIs for engineering leadership seems like the natural place to go. Let me just try to think; I want to cover the full story arc there. So, just to help me position my question: is that what you started working on right from the get-go, or was this a project that was born later? I don’t know…

Scott Burger:                                          

This was something that kind of evolved over time.

Abi Noda:                                              

One of the things you were sharing with me that you’ve done over the past four years is effectively led the design and implementation of KPIs for engineering leadership, C-suite, even the board. I want to talk about that journey. How did you begin going down that path?

Scott Burger:                                          

Yeah, I guess in the time right after I joined, I was very fortunate that Qualtrics has a pretty flat org structure, in the sense that people are very approachable and designing a metric for senior leadership is a pretty easy thing to do. And being able to get it in front of them and have their eyes on it is, for better or worse, sometimes really, really quick, where if someone’s like, “Hey, can you spin up this thing for me?” it’s “Yep, you got it.” And then you start to get a lot of attention that way. But yeah, I think there’s been a journey to this, where we used to have this internal corporate portal for metrics, and it was a thing that you would access from the main internal company site.

You come in, you land on the internal company site, and there’s this one little button up at the top that says reports. It’s very easy to miss. Click on that, and there are reports from all over. And we had a handful of engineering metrics on there in various states of maturity, with a lot of “when was the last time someone actually looked at this?” kind of a thing. And one thing that I was tasked with doing was figuring out, “Well, which of this stuff do we actually still care about? Which of this stuff can we cut out, and which of it is actually useful but needs to be redesigned in such a way that it’s more useful now?” And so of course, starting with eng metrics, it was a lot of stuff having to do with pulse metrics, quality-type things.

And basically going in and digging through ancient Tableau reports and ripping out the data and rebuilding it in a way that’s like, “Ah, instead of these garish tables on tables on tables, you just have a nice simple bar chart. If it’s green, it’s good. If it’s red, it’s bad,” kind of a thing. But there was a lot going on in that portal there where we had metrics from sales, finance, lots of other parts of the company all in one spot. But anytime that you would want to put a report in there, you’d have to submit a merge request to the internal development team. They would have to embed it in an iframe and all of this rigmarole to get it to work. And one way we decided to break down a barrier for this was to integrate our company wiki into this process and being able to wiki-ify our metrics.

And what that means is being able to, instead of, “Oh, I’m in engineering and I have this metric here or this dashboard that’s really cool and I want more people to see it, but I’ve got to submit a merge request and jump through all these flaming hula hoops to get it productionized and everything.” We wound up getting a little bit of buy-in from that core development team to say, “Oh, well, if you can apply that iframing business to our internal corporate wiki, I’ll take it from there and I’ll get all this stuff out of your hair.” Where now, instead of them having to manage all of this stuff, I wound up taking over management of basically that whole “Where do people go to find metrics?” question. Because we have this little search bar on the internal corporate website.

You type stuff into there and it will index over the internal wiki. So it really knocks down a barrier there of being able to say, “Ah, I remember seeing this report, it had some keywords and stuff on there, but in the old system type type type, it’s not showing up because it wasn’t indexing over that reports bit.” But now because it’s going over our internal wiki, it’s able to find that stuff no problem. And also being able to apply some crucially needed data governance steps on top of there of, “Got a report, that’s great. Who owns it? How often is it updated? What other sort of wiki magic can we apply there with categorization and data taxonomy stuff,” really I think just helped to alleviate pressure off of that internal development team. But also empower other people to really take the ball and run with it for metrics that they have that they really wanted to report on but just couldn’t because of that bottleneck.
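
For a sense of the kind of governance metadata that can ride along with each wiki-hosted report, here is a minimal sketch; the fields and values are hypothetical, just to illustrate the “who owns it, how often is it updated, where does it sit in the taxonomy” idea rather than any actual schema.

```python
# Minimal sketch of per-report governance metadata (hypothetical fields/values),
# capturing the "who owns it, how fresh is it, where does it live" questions
# that make a wiki-hosted report findable and trustworthy.
report_metadata = {
    "title": "Customer Pulse Action Plans",
    "owner": "eng-analytics@example.com",      # who to ask when it breaks
    "refresh_cadence": "daily",                # how often the underlying extract updates
    "source_systems": ["JIRA", "Tableau"],
    "taxonomy": ["engineering", "quality", "pulses"],  # wiki categories for search
    "last_reviewed": "2024-01-15",             # governance check: is it still used?
}
```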

Abi Noda:                                              

So you unblocked the organization as far as self-serve access and discovery around metrics and insights. How did you handle the actual design? Have you been in a role where leaders are coming to you with, “Hey, we need information about code quality?” Or are you consulting with leaders as they try to build their own dashboards?

Scott Burger:                                          

Yeah, again, another really great question, because this is a thing that I’ve been involved with: re-implementing those existing data sets to be more grokkable and really helping follow the threads for action plans. And some great examples of this are old dashboards that we would have for customer pulses, where things would be like, “Hey, something’s wrong over here,” and that was basically the extent of it. But now we have a thing that’s like, “Hey, here’s this particular part of the company, here’s how it’s…” There’s just a lot more detail involved. And being able to drill down to individual things in JIRA and give people the ability to click, “Oh, here’s a pop-up link to follow to the actual JIRA item in question,” I think has been really helpful for engineering leaders to chase down problem children that may have been in the system for too long and that people just forgot to close out, or things that are way past their SLA that have a really high priority on them.

And the design of these things always has to be very conversational. Because if you as a data person have this visualization in your mind of, “Well, this makes sense to me,” and you build it out and you deploy it, someone might not think the same way, and they might encounter that tool that you’ve built and be like, “Well, where can I find this information on the dashboard?” Or, “How do I figure out what my next steps are?” And having that back-and-forth conversation about the usability of the data products that you build is really key, and really something that you need to have at all levels. Whether it’s junior people in the company using your stuff or senior executives that are looking at your thing every day, understanding their use case and how they plan to build action plans off of it is super critical.

So the design of these things can start from the conversational, ground-up level of: what’s the logic that we use to build out these tables, and how do we expand that so that other people can build off of it? Or, “Hey, we have this dashboard here, this is how your org is performing right now, and here are the steps that you need to take if something is going bad.” I guess also, crucially, the other side of that coin is, “Here are the things to celebrate that are going well.” Not everything is bad with engineering dashboards. Sometimes things are going really well, and you want to highlight those things as well. So where you strike that balance depends on the data problem that you’re solving. A lot of really hard visualization questions really just come from conversations with your stakeholders.

Abi Noda:                                              

What I want to get into is maybe more of where you’ve… this is an aside question. For the operational metrics, is that an initiative where you’ve designed, “Hey, here are the 50 KPIs.” Or is that the end product of a bunch of people building a bunch of metrics? I guess that’s a clarification for me.

Scott Burger:                                          

Yeah, it’s definitely more of the latter, because… Well, there are some metrics where, and we’ve only just gone through this exercise a few times over the past couple of quarters, executives are like, “We have an idea of something we want to measure.” Well, what is that thing that you want to measure, really? So there are aspects where I’ve been involved with scoping out and building out that definition based on what I know of the data and its limitations. And then other times it’s like, “Well, here are other metrics that we want to track that are more mature things that other people have developed.” That’s great. How do we incorporate that into this one-stop shop of the data universe that execs can go to?

Abi Noda:                                              

So you’ve built this self-service platform where metrics and insights are… folks can come in and build their own. They can easily discover things that already exist. I’m sure one aspect of your job is executives coming to you and saying, “Hey, we have an idea of something we want to measure. How do we do it?” I would love to hear some of your recent experiences with how you’ve dealt with this.

Scott Burger:                                          

Yeah, I think executives, like all humans, are curious about the world. And sometimes a nebulous request floats by your desk of, “Hey, we should measure code.” And you’re like, “Go on. What is it about the thing that you want to measure? What are some details?” There was a recent episode of your show where someone was being asked, I think this was the person from LinkedIn, “Hey, how many employees do we have?” It took them six months to answer the question, and I was sitting there, driving into work, listening to that and just nodding my head in agreement. I’m like, “Been there.” So yeah, I think when you have requests come in, it’s important to put yourself in the shoes of the requester. And if they are asking questions like, “Hey, we want to know how good is our code?”

Or some kind of question that’s super nebulous like that, there are lots of follow-ups that can pivot off of there, like, “Well, do they mean how good is our code in terms of how many bugs we have, or how many code refactors we have to do, or time to getting the code into the code base?” There are lots of different ways that you can ask that question. So really, when you have requests like that come by, it’s important to get more details and really drill down, like, “Well, are you asking about this, or are you asking about this?” And sometimes you even get ahead of that: executives are busy, they don’t want to be going back and forth in conversations all day, so you come back and say, “All right, your ask was this. We have these five different metrics that we’ve built out; which of these seem closest to what you’re looking for?” And in some cases they’ll say, “Yeah, I’ll take all five. That’s great.” And yeah, in recent developments of these, we have been trying to build out what’s sort of like a KPI hierarchy in a way.

What are the six to 10 metrics that execs care about the most? Which of those fall under engineering from my purview? And then what are the sort of strategies and ways that we can build those out and what happens when we encounter an issue with that and we need to go back and reset the definition? All those kinds of challenges. I think that’s an ongoing thing that really, we’ve had a lot of fun tackling over the past couple quarters. But really it’s a thing that always just kind of exists in the background where execs will be interested in how the company’s evolving in this way or that way. And sometimes you’ll pivot on that question with a slightly different theme or variation over time, but sometimes it’ll just be the same kind of question of, “Are people submitting pulses because they’re mad? Or are these all things that were actually… works as designed? And do we need to go back to the drawing board in terms of how do we do our pulse prioritization process,” for example.

Abi Noda:                                              

Do you find yourself mostly getting these requests from senior leadership or are you also getting similar requests from middle management, even frontline managers?

Scott Burger:                                          

Yeah, I think the non-answer is a mix of both. I think more recently it’s been senior leadership largely because we’re trying to build out this sort of hierarchy of metrics. But yeah, there are times where middle management will want the drill-downs, where if you build out a top level engineering dashboard for pulse action plans or something like that, the top level view that the executive is going to see is either just a green or red dot. And that might be fine for their case, be like, “Well, is it good or bad?” But their next step is like, “Well, if it’s bad, I’m going to go down the chain of command and see how people are designing their plans to mitigate it.”

And that’s where the nuance comes in with working with the middle management layer, of, “Aha, yes, we have this…” And with a lot of this, you’re trying to think three or four steps ahead of the process, like, “Okay, I need to build out the data set such that it can do all of these things down the line, but also be flexible enough to provide that simplistic view up at the very tippy top.” So at the very tippy top, you have a very simplistic view. And then the drill-down from that would be like, “Oh, well, here are the product areas that are experiencing the most pain.” And then you might go to the TPMs who own those product areas and say, “Hey, what’s going on here? It seems like you might have some insight as to why this thing is going sideways.”

And their answer might be, “Oh yeah, we’re aware of it. We just pushed the new feature. Don’t worry about it,” kind of a deal. Or it might be a surprise to them. They might be just as curious about it as you are, in which case your data strategy really needs to empower both you and the stakeholder to say, “Okay, if this is a thing that we need to build an action plan for, how can I best help you to be empowered with that data?” And the data strategy of, “Ah yes, we have all of this data in here that’s broken out by product unit and maybe even down to the atomic level of these individual items.” Being able to empower middle managers with stuff like that, where they want to be able to see…

I was just talking about this with someone yesterday of, “Hey, we have this top level dashboard and our eng directors want to be able to pinpoint for these various product units, what are the JIRA projects involved with it? What are the individual items that might be the key drivers for why this thing is going sideways?” Yeah, I think the top-down approach of senior executives coming and saying, “Hey, we want a dashboard to do this.” You have to go in with the mindset of knowing, “Ah, well, there’s always going to be a next step and a follow-up to that.” And putting yourself in the shoes of, “Okay, I’m the eng TPM middle manager person who’s gotten this request from the mountaintop and I need to do something with this in order to provide an action plan. What could I, as a data person provide them to help them get over this hurdle?”
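
As a sketch of what that item-level-to-top-level roll-up can look like, here is a minimal example, assuming a flat extract of JIRA items with a product unit, a priority, and a days-past-SLA column; the file name, column names, and thresholds are all hypothetical, and a real dashboard would carry more nuance.

```python
# Minimal sketch: roll item-level JIRA data up to a per-product-unit status,
# then to a single top-level red/green flag (hypothetical columns and thresholds).
import pandas as pd

items = pd.read_csv("pulse_items.csv")  # hypothetical extract, one row per JIRA item
# expected columns: jira_key, product_unit, priority, days_past_sla

# Flag the individual problem children first, so the drill-down keeps the JIRA keys.
items["breached"] = (items["priority"] == "P1") & (items["days_past_sla"] > 0)

# Middle layer: each product unit gets counts and a status the TPMs can act on.
by_pu = items.groupby("product_unit").agg(
    open_items=("jira_key", "count"),
    breached=("breached", "sum"),
)
by_pu["status"] = by_pu["breached"].apply(lambda n: "red" if n > 0 else "green")

# Tippy top: a single dot for the executive view.
top_level = "red" if (by_pu["status"] == "red").any() else "green"
print(top_level)
print(by_pu.sort_values("breached", ascending=False))
```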

Abi Noda:                                              

One of the things you’ve touched on a couple of times is this hierarchy of metrics and this ambition to maybe distill things down to a set of six to 10 top-level metrics that are shared or focused on by engineering. I’d love to better understand the aim of doing that. Is this driven by executive or board-level leaders saying, “Hey, we need a consistent set of KPIs”? Or is there a different motivation for trying to really categorize or distill down the insights into a smaller number?

Scott Burger:                                          

Yeah, I think it’s probably more toward the first part of the answer: here are things that the C-suite and board really care about. Some of those exist within the realm of engineering, but some of them also cross-pollinate as well. If we have a metric that is aligned with how easy our product is to use, that might influence people’s likelihood to renew, for example. So there’s a lot of interplay between these metrics, such that now very rarely is anyone really in their own protected silo of just, “Well, this is my data and my universe, and code quality and pulses and things like that don’t influence any other part of the company.” I think it’s more about how these spark the next level of metric drill-down questions, where one of the top-level ones might be our ease-of-use metric.

A click down from that might be, “Well, what does the overall input and output, or the total flow, of customer-requested tasks and issues look like? How are they coming in from the front line? Where are they going?” It just opens up this pyramid of metrics, where up at the top you might have a few, and then walking down the chain it’s like, “Oh, these are the things that build up into that top-level metric.” But you really don’t want to overwhelm a C-suite and board with, “Oh yes, well, ease of use is driven by these 10 metrics, and these 10 metrics are driven by these other 10 metrics.” Like any good person, their eyes will just kind of glaze over and be like, “Yeah, okay.” But yeah, I think the top-level metrics are really there to just provide a signpost of here’s how things are going generally. And obviously the situation under the hood is more complicated, but you have to have some kind of dashboard warning light to suggest whether things are going well or badly.

Abi Noda:

So I know you work with different, let me say [inaudible 00:37:51]. So Scott, you work with a broad universe of different types of data: from surveys, from your products, from internal tools. I know one of the challenges you’ve worked on is reporting off of JIRA data. I’d love to hear about that; this is something I hear from other leaders all the time, that it’s difficult, especially at an organization of your size, to get JIRA into a state where you can do reliable reporting off of it. Curious to hear your journey.

Scott Burger:                                          

Yeah, I’ll be happy to take up the next three or four episodes here just talking about JIRA data challenges, but I think I might be one of the few people in the data sphere who actually doesn’t mind JIRA data, though maybe that’s just from using it for so long. As for challenges and journeys associated with it, when I first came to Qualtrics, I hadn’t really used JIRA much at all because I had used so many other kinds of project management systems. JIRA was sort of new to me, and it was kind of a new toy to play with, in the sense of, “Oh, this is fun and different.” And then the deeper you go into it, the more challenges you see: with any sort of user-editable universe, things are very transient in nature and can potentially change long after the fact.

So you might see things in the data like, “Oh, here’s how many JIRA tickets were resolved,” noodling along at a totally normal pace, and then bam, this huge spike on one particular day. I’m like, “What’s the deal with this?” “Oh yeah, that was like 3,000 issues that were closed out in one day because they were all deprecated and didn’t need to be in the system.” But anyway, I think a major challenge with JIRA data, and I think people who deal with Salesforce a lot will be nodding in agreement, is that it’s a lot of human-input data, with all the challenges associated with that. And a good example of this recently is one of our data strategy challenges with a top-level usage metric, or not a usage metric, but a top-level board metric, where in order to define aspects of this metric, we had to rely on JIRA data.

And we had to really go through a committee process of, “Well, what is the specific question that we’re trying to ask and answer? And how do we go about getting a reliable data set to do this?” “All right, well, we’ve done enough finger in the wind, let’s actually go out and build it. Okay, type, type. All right, this looks like a pretty reasonable data set.” But there’s a challenge with JIRA data, or I guess maybe any data set that has transient phenomena in it, where on one day a particular item could be labeled as a level-two pulse, and then, oh, actually this needs to be escalated to a level-one pulse six days later, but maybe that’s in a different month, and that winds up screwing up your data accounting.

These are all challenges that we kind of knew about in the back of our minds, like, “Well, it probably won’t be that big of a deal.” But if you have some eng teams that are being graded off of those things, you want to be really cognizant and aware of them. So I think there’s an element of JIRA data that is challenging from the aspect of things can change, but that’s just human nature. We learn stuff about our engineering processes, and if we don’t allow ourselves room to change and update things as we go along, then we might just be kind of stuck in these static sticks in the mud of, “Ah, yeah, well, this is what we reported on previously, but that’s how it was. And things have changed now.” I think the trade-off for being flexible is worth it, where you can have dashboards that are reporting, “Hey, here’s what the snapshot was a week ago, here’s what the snapshot is now.”

That might be good for certain purposes, but there might be other purposes where, “Well, I need a live snapshot right now.” And there are times where you have to actually go into the JIRA change log and see, “All right, well, what was this issue at this specific time?” and build out off of those kinds of change snapshots. A fun thing about JIRA data, and I’m sure one person listening right now will be pumping their fists in the air like, “Yes, I’m not the only one that’s dealt with this,” is that you have some JIRA data that’s current state and some JIRA data that’s based off of changes. And there’s a bit of a data lawyering thing going on under the hood sometimes, where if someone in JIRA submits an item and they didn’t change anything on it, it doesn’t show up in the change log because technically, air quotes, “it wasn’t a change.”

So you have to be cognizant of, “Well, if we use JIRA change log stuff for one thing, we might have to bring in what the current state is for other things and be flexible enough to do both in some cases.” So it can be a real challenge in some specific instances, but it’s manageable as long as your stakeholders are informed and aware of how these things operate, and I think everyone at Qualtrics is really understanding of JIRA, for better or worse. I think another challenge that we have is data reproducibility, and maybe it’s not necessarily a challenge, but another data strategy thing for listeners to be aware of: if you have your JIRA data dumping from the backend API system into your data warehouse or some such thing, you might be more empowered, with specific query tools at your disposal, than the poor engineer who only has access to the JIRA UI and the very simplified set of SQL-like commands they have via the JIRA query language.
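
To make that current-state-versus-change-log distinction concrete, here is a minimal sketch that replays an issue’s change log to reconstruct what a field looked like at a point in time. It assumes the standard JIRA REST API with the expand=changelog option; the base URL, the field, and the pulse example are hypothetical, and this is one possible approach rather than a description of our actual pipeline.

```python
# Minimal sketch: reconstruct what a JIRA field looked like at a point in time
# by replaying the issue change log (assumes the expand=changelog REST option).
from datetime import datetime, timezone
from typing import Optional

import requests

JIRA_BASE = "https://jira.example.com"  # hypothetical instance


def field_value_at(issue_key: str, field: str, as_of: datetime, auth) -> Optional[str]:
    """Return the value `field` held on `issue_key` at time `as_of`.

    Start from the current value and walk the change log newest-to-oldest,
    undoing any change made after `as_of`. If the field was never changed,
    it has no change-log entries and the current value stands.
    """
    resp = requests.get(
        f"{JIRA_BASE}/rest/api/2/issue/{issue_key}",
        params={"expand": "changelog"},
        auth=auth,
        timeout=30,
    )
    resp.raise_for_status()
    issue = resp.json()

    value = issue["fields"].get(field)   # current state
    if isinstance(value, dict):          # object fields (priority, status, ...)
        value = value.get("name")

    for history in sorted(issue["changelog"]["histories"],
                          key=lambda h: h["created"], reverse=True):
        changed_at = datetime.strptime(history["created"], "%Y-%m-%dT%H:%M:%S.%f%z")
        if changed_at <= as_of:
            break                        # everything older was already in effect
        for item in history["items"]:
            if item["field"] == field:
                value = item["fromString"]  # state before this later change
    return value


# Hypothetical usage: was this pulse still level two at the end of the month?
# level = field_value_at("PULSE-1234", "priority",
#                        datetime(2024, 3, 31, tzinfo=timezone.utc),
#                        auth=("svc-account", "api-token"))
```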

One thing I deal with a lot is that I have these very bespoke and robust dashboards. They have a lot of pretty heavy Redshift logic built into them, and they all work great and they all provide all the source data and all those good things, but inevitably someone will want to say, “Hey, how do I build this out in the JIRA UI?” And I have to take a step back and think, “How would they?” Because JQL can only give you so many options in terms of window functions and row numbers and things like that. And so I think there’s a challenge there with JIRA data more generally, of just people wanting to reproduce what you have in a dashboard for their own purposes. And the strategy that I found to be useful is just to make the source code available.

And that might not be a solution for a lot of people because they might look at it and say, “Well, I can’t do window functions, I can’t do CTEs in the JIRA query language.” But what they can see is, “Oh, well, this is the sort of logical flow that the data is going through.” And if you can’t build out a thing in the JIRA dashboarding utility, that’s probably okay, come to us, come to the ops org and we’ll build it for you. We’ll be able to give you the keys to the high-powered data Ferrari that you need. So yeah, challenges with JIRA data, gosh, I could go on this topic forever, but I think they’re interesting and there’s always something new to learn with this kind of data.

Abi Noda:                                              

One interesting project you told me about earlier was studying the impact of the pandemic on developer or engineering productivity. Tell me about that analysis. How did you design the analysis? What types of metrics were you looking at and what were the findings?

Scott Burger:                                          

Yeah, after the first few months of the work-from-home experiment, I think people were pretty keen to keep an eye on that: “Hey, this is something that’s not really affecting our code base, is it?” Because we have a lot of people who are in a different environment now. Maybe work from home isn’t suitable for everybody, or maybe there are some people who really excel in this new environment. And one of the items that I was tasked with was: what do all of our currently measured eng metrics look like six months before and six months after the specific date that we had for, “All right, everyone grab your monitors and head home. We will see how long this takes.”

And one surprising thing that I found out of this, going in with the hypothesis of, “Okay, this bit will go down, this bit will go up,” was that there really wasn’t much of an impact, which was surprising to me. Even for me, who already has a very robust work-from-home office, the transition really wasn’t that hard. But I was putting myself in some poor engineer’s shoes, someone who came into the office precisely because their house or their small apartment is really not well suited for work-from-home stuff. There might be noise distractions, they might have their very energetic cat jumping up on them every five minutes.

Those are things that could affect developer productivity. So we were looking at things like: how long does it take for merge requests to get put into production? How many commits are people putting through? These really basic, atomic-level questions about our code base. And the conclusion was that, well, between the six-month period before and the six-month period after, there really wasn’t a statistically significant difference, which I personally took as kind of a win, really. Because there was so much turmoil going on with that sort of change in environment that anything could happen, and a massive decrease would not be unexpected. A massive increase would be great, but the fact that there really wasn’t any statistically measurable change in things like that was very surprising.
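
As a rough illustration of that before-and-after comparison, here is a minimal sketch, assuming a flat extract of merge requests with a merge timestamp and a lead-time column; the file, column names, and cutover date are hypothetical, and the choice of a rank-based Mann-Whitney test is an assumption on my part, since lead times tend to be heavily skewed.

```python
# Minimal sketch: compare merge-request lead time in the six months before and
# after a work-from-home cutover date (hypothetical file, columns, and date).
import pandas as pd
from scipy import stats

CUTOVER = pd.Timestamp("2020-03-13")   # hypothetical "grab your monitors" date
WINDOW = pd.DateOffset(months=6)

mrs = pd.read_csv("merge_requests.csv", parse_dates=["merged_at"])
before = mrs[(mrs["merged_at"] >= CUTOVER - WINDOW) & (mrs["merged_at"] < CUTOVER)]
after = mrs[(mrs["merged_at"] >= CUTOVER) & (mrs["merged_at"] < CUTOVER + WINDOW)]

# Lead times are usually long-tailed, so a rank-based test is less fragile
# than comparing raw means.
stat, p_value = stats.mannwhitneyu(
    before["lead_time_hours"], after["lead_time_hours"], alternative="two-sided"
)
print(f"median before: {before['lead_time_hours'].median():.1f}h, "
      f"after: {after['lead_time_hours'].median():.1f}h, p={p_value:.3f}")
# A large p-value here is consistent with "no statistically measurable change."
```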

Abi Noda:                                              

Interesting to hear the findings. I imagine that right now, GenAI developer tools are a similarly hot topic or area of interest for leaders at Qualtrics. I’m curious if you’ve run any analysis using the similar metrics on the impact of GenAI and if so, if there’ve been any findings there?

Scott Burger:                                          

Yeah, really great question. I haven’t done any kind of analysis with that as of yet. We do have at Qualtrics an internal AI language model assistant that’s been quite helpful; we’ve been using GPT-4 Turbo for our AI-assisted sort of stuff, and it’s been really helpful. I think there’s an interesting element to explore in terms of engineering-related metrics for something tangentially related: code reviews. If you have a very enthusiastic robot junior dev sitting next to you at all times who’s more than willing to spit out some code for you, “Oh, that’s great, but do I have to be adding that to the code review time that I’m spending with all of my other junior devs?” So I think as of right now, that’s not something that we have been measuring.

But yeah, that universe of AI-assisted code development has its own entire universe of metrics tied to it. Like, what kinds of questions are people asking? How in-depth is people’s prompt engineering? Is that something that we need to train engineers up on in order to help reduce their paired programming time with the AI assistant, that kind of thing? It’s almost like this entire parallel universe of engineering metrics, where instead of measuring things for the human engineers, you’re measuring the humans and the AI engineers together. I tend to think of this through a chess analogy as well, where you have humans versus humans in chess, and you can only get up to such a high level there.

And you have pure AI-driven solutions for chess these days that can be even better than humans, and humans will never catch up. But then you have the human-plus-AI bit, and that is almost on a totally other level. If you have Magnus Carlsen working with AI assistants at chess, who could stop them, really? And that’s the vision that I have for software engineers working with AI as a tool to help really 10x and exponentiate their own productivity, which would be another great six-month before-and-after kind of measure: how has people’s code been before we rolled out this tool and after? Yeah, there’s a bottomless amount of questions that I would just love to ask and answer. Maybe I can get the AI assistant to answer them for me.

Abi Noda:                                              

I was going to ask you about how you think about the best ways of getting data to answer some of these questions. We’ve talked about productivity, we’ve talked about some of the challenges of JIRA data. Qualtrics being qual focused, I’m curious, in what cases do you think about getting survey-based data versus system-based data, and what’s the responsiveness and receptiveness of your leaders in both types of data or in using both types of data?

Scott Burger:                                          

Yeah, that’s a good question, because being a very quantitatively focused company, we really just have to follow the numbers and see what the numbers say. And a lot of times there are things that you can’t measure from a pure database, code-completion perspective. So being effectively a survey company helps out a lot for things like, “Hey, how are you feeling on your team? How is your developer experience on this team?” And being able to analyze sentiment from user text input, of someone saying, “This is great,” or, “I’m having a real bad time,” that’s stuff that helps out a lot: the qualitative aspect being transformed into a quantitative number.

I think in some cases you don’t want to do that, you don’t want to lose that element. But yeah, there have been a lot of times where we’ve used engineering qualitative sentiment data to figure out, “Well, how is the developer experience at the moment? Is it getting better or worse?” And being the kind of company we are helps to really lower that barrier. And yeah, I think if you try to put everything into DORA or SPACE metric boxes, you kind of lose that human perspective a little bit. I mean, again, they work really well in concert with each other. It’s maybe not a picture of one or the other, but the intersection of both of them that really helps to elevate that whole metric process in the end.
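
As one way of turning that free-text sentiment into a number, here is a minimal sketch using NLTK’s off-the-shelf VADER scorer; the survey export, the comment and quarter columns, and the choice of VADER are all assumptions for illustration, not a description of Qualtrics’ own tooling.

```python
# Minimal sketch: turn free-text survey comments into a sentiment score and
# trend it over time (hypothetical export; VADER is one off-the-shelf option).
import pandas as pd
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)
sia = SentimentIntensityAnalyzer()

# Hypothetical export: one row per comment, with the quarter it was collected in.
responses = pd.read_csv("dev_experience_survey.csv")
responses["sentiment"] = responses["comment"].fillna("").apply(
    lambda text: sia.polarity_scores(text)["compound"]  # -1 (negative) to +1 (positive)
)

# Quarter-over-quarter trend: is developer experience getting better or worse?
print(responses.groupby("quarter")["sentiment"].mean())
```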

Abi Noda:                                              

One of the biggest challenges with metrics and this type of data is answering the next question of, “Okay, so this says our score is a 10. What does this mean? Is this good? Is this bad?” How do you approach answering that question?

Scott Burger:                                          

Yeah, that is a good one, because with some metrics I think it helps to have the curtain pulled back just a little bit so you can see what the data going in is telling you. Again, it’s very timely that you’re asking that, because for one of the major eng metrics that we’ve been developing and measuring, you might be comparing team A, which has a KPI of something like 25%, against team B, which has a KPI of something like 70%, where the higher the number, the worse it is. And you might look at that at face value and go, “Hmm, team B at 70% is kind of a big problem. That obviously should be at the top of our list, right?” Well, if you have your data set structured in such a way that you can pull back that math curtain just a little bit, you can say, “Ah, well, this 70% is really just… the total number of data points in it is like four, versus 4,000 for the other one.”

And clearly one is going to be weighted higher than the other, and this kind of situation is something you can add nuance to if the end result is going to be dumped in a slide deck somewhere. You can add that nuance from a data visualization perspective, like sizing the dot, weighting it according to the number of data points involved, for one example. Another example might just be showing the table: “Hey, this team had this numerator and this denominator and this total number of data points; their end result was 25%. They’re great, they’re under the curve. But team B, well, yeah, they were 30% over what we should be expecting of them, but they had four data points, so it doesn’t really impact things that much.”

And then at that point, you’re more or less sorting by your total data point weighting value, and you’re using the actual KPI that you’re scoring the teams against as, “Okay, well, for the ones that actually have a lot of the data, the ones that really matter, here’s what their performance is. And yeah, these ones down here might be more noisy, but they’re a onesie-twosie kind of a deal.” So in that respect, being able, with your data products, to have the flexibility and mindset to pull back the curtain just a little bit when people ask those questions is kind of key, because people will inevitably say, “Well, how’d you get that number?”
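
Here is a minimal sketch of that “pull back the math curtain” idea, assuming a per-team table with a numerator, a denominator, and the resulting percentage; the team names, numbers, and sort order are made up for illustration. The point is simply to keep the raw counts next to the percentage and order by volume, so a 70% built on four data points doesn’t outrank a 25% built on four thousand.

```python
# Minimal sketch: keep the numerator/denominator visible next to a percentage KPI
# so small-sample teams don't dominate the ranking (hypothetical numbers).
import pandas as pd

teams = pd.DataFrame({
    "team": ["Team A", "Team B"],
    "numerator": [1000, 3],      # e.g. items past SLA
    "denominator": [4000, 4],    # total items the team handled
})
teams["kpi_pct"] = 100 * teams["numerator"] / teams["denominator"]  # higher is worse

# Sort by volume first, then by the KPI, so the teams carrying most of the data
# lead the view; the low-volume outliers stay visible but don't top the list.
ranked = teams.sort_values(["denominator", "kpi_pct"], ascending=[False, False])
print(ranked[["team", "numerator", "denominator", "kpi_pct"]])
# On a chart, the same idea shows up as sizing each dot by `denominator`.
```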

Abi Noda:                                              

I think that’s all the questions I have. Anything else you’d be interested in covering?

Scott Burger:                                          

I guess maybe one question that I have for you, because you talk to so many key figures in the enablement space: I’m curious, from your perspective, where do you see that kind of AI-assisted developer productivity headed? And maybe that’s a nebulous, top-of-the-board kind of question with numerous click-downs and things like that. But I get a sense that there’s a lot of consternation and worry, that SDEs and other folks are worried that, “Oh, well, a robot’s just going to come and take my programming job.” Is that something that you see as a legitimate worry, or do you see it more as a cooperative partnership?

Abi Noda:                                              

Speaking as someone who still writes code myself, I don’t have a worry about developers being replaced. I think the biggest theme that I see and hear from leaders right now is that there’s an enormous pressure, an enormous amount of money, going into trying to drive adoption of GenAI tools within engineering. And there’s worry about the return on investment, and there’s worry about adoption. There’s worry about, “Hey, we just spent 10 million bucks on these GitHub Copilot licenses. Why aren’t people even using them? And is the impact that we’re getting worth $10 million?”

That’s the biggest theme I’m hearing. I think also organizations are just starting to realize that just because you provide these tools to developers doesn’t mean that you will get the value. And you touched on this, actually: training developers on how to use these tools, and on which use cases are effective and which aren’t, is a big part of getting that return on investment. And organizations are realizing that this is work that they have to do. They have to understand how these tools can be used within their own environments and then spread that knowledge and enable developers. So those are some of the trends I’m seeing right now.

Scott Burger:                                          

Nice. Is there anything you’re really pumped or excited about for the upcoming DEVEX conference later this year?

Abi Noda:                                              

So which one, which DEVEX conference?

Scott Burger:                                          

Any of them, all of them, which one are you most excited for?

Abi Noda:                                              

Well, there’s DPE Summit. I don’t know if you’ve… is that the one you’re referring to?

Scott Burger:                                          

Yeah.

Abi Noda:                                              

Yeah, that was a really good conference. Scott, I’ve really enjoyed this conversation. Thanks for sharing insight into what your role is at Qualtrics and some of the different problems you’ve been working on. Really appreciate the time.

Scott Burger:                                          

Yeah, thank you. Like I said, this is one of my favorite tech podcasts that I’ve stumbled upon recently. I’m amazed at the quality of all the other guests you’ve had on, so hopefully I’m not bringing the average down by being on here.

Abi Noda:                                              

Awesome. Thanks so much, Scott.

Scott Burger:                                          

Yeah, thank you.