Episode 3 - The Origin Story of the Omelette

Summary: Kareem Yusuf, GM of IBM AI Applications and Blockchain, tells the story behind the SRE Omelette. He gives the audience insights into why SRE is essential in delivering meaningful services to customers and describes how he measures ROI (Return on Investment) for SRE. Kareem also shares his thoughts on the challenges SRE needs to tackle next and gives us his recipe for the SRE Omelette.

Timestamps:
[02:30 - 03:58] The Omelette
[04:24 - 06:35] What SRE means to Kareem and his customers
[07:02 - 11:07] Kareem on prioritization
[13:11 - 17:34] Measuring the ROI in SRE
[19:05 - 23:36] What does our future hold?
[23:53 - 28:54] Kareem's ingredient and recipe for the SRE omelette

The Omelette (01:28 min)
What SRE means to Kareem (02:11 min)
Kareem on prioritization (04:04 min)
How Kareem measures ROI on SRE (04:22 min)
Kareem on future of SRE (04:31 min)
Kareem's SRE Recipe (05:01 min)
Building culture (00:22 min)

Kareem Yusuf: When you think about culture and you think about behavior, you are really driving for an outcome. So it is less about the chicken and the egg, and indeed, let's break that paradigm and let's make some omelets.

Kevin Yu: Welcome to the Making of the SRE Omelette Podcast, the show where we explore the positive business and client success outcomes from site reliability engineering and hear from experts on how they influenced the cultural and mindset shift that led to those results. I'm your host, Kevin Yu, and the principal SRE at IBM AI Applications. Today's episode is the origin story of this podcast. As I mentioned in the intro episode, the title Making of the SRE Omelette was an inspiration from my general manager, Kareem Yusuf. Kareem is the general manager of IBM AI Applications and Blockchain. For context, AI applications is the backbone of essential business processes, be it for retail shopping, enterprise asset management, or ability to react to weather and environmental events. Chances are, when you're buying that doorbuster on Cyber Monday and checking that store inventory for in-store pickup, those experiences are enabled and powered by Kareem and his team. Welcome to the show, Kareem.

Kareem Yusuf: Thank you for having me.

Kevin Yu: So Kareem, let's start with the story behind the omelet. I remember it was a conversation we had around driving cultural and mindset shift. If we were to go back in time, I recall I asked you, "How can we solve the problem that people don't have time to make things better when they're so busy with existing tasks and putting out fires?" And I gave the analogy that it is a little like chicken or the egg in that we know things will get easier if we put our automation to detect and mitigate issues, we will spend less time reacting to smoke and fire, but we simply don't have the time. Hence, a little like chicken or the egg. And you said, "Kevin, culture is the outcome of what we do. In the context of the chicken or the egg, it is like an omelet." It was not only a clever capture. It really helped us to focus on driving the behavior to serve as the outcome we want to achieve with SRE. Can you give the audience here a little more context on that response?

Kareem Yusuf: In many a way, it was almost about breaking the paradigm or shifting the perspective you had as you engaged in that question and we talked about that analogy. I would love to say I was struck by a bolt of great inspiration at the time, but it really was simply informed by a basic perspective that said, "Look, everything we do is really about outcomes, right?" When you think about culture and you think about behavior, you are really driving for an outcome. So it is less about the chicken and the egg, and indeed, let's break that paradigm and let's make some omelets, which you can't make without breaking eggs. If you step back, it means that when you think about those fires that you have to put out and the smoke and all the reacting, well, yet something's got to give. Driving towards an outcome you want allows you to prioritize in some ways or take into context what needs to give ultimately to achieve, I guess I would say, a better day at the end, which was what was really shaping us all around the culture discussion. So that's kind of where it came from, as much a way to break the mental framework of the choices that you were having to face by picking another path. And by picking that other path, also fixating on what were we really trying to achieve, which as we've discussed there was the outcome of culture change.

Kevin Yu: Thank you, Kareem, for the context behind the omelet analogy. The key takeaways I got were the need to understand the outcome we're looking to achieve, acknowledging that something's going to give to break that paradigm, and for us to prioritize what to change with the outcome in mind. So let's touch on that outcome. What does SRE mean to you and your customers?

Kareem Yusuf: As you've already mentioned, a lot of what we do within this business unit around the supply chain realm and even in the asset management world is all about supporting what essentially are essential business processes, and these business processes underpinned by our software require our software to operate effectively. As you well know in the realm of supply chain specifically, a lot of that software is delivered as a service. And thus, the site reliability engineering team, or in the older parlance, SaaSOps, is essential. It's critical to keeping those services up and running and performing. Not just up and running, but also performing to spec so that real work can be done. I think it's always important to remember, and I know I've said this quite a few times, nobody wakes up. None of our customers, our clients wake up every single day going, "Ooh, I want to use your software today." No. When they drive into the office, they're not thinking, "Oh, I'm going to go play with order management," or, "I need to move some EDI messages through SCBN," or, "I need to figure out where things are with IBM's Transparent Supply solution." No, they're trying to get an actual task done, fulfill an order, optimize their inventory levels, understand the current status of things so that they can make informed decisions. The software we provide, often delivered as a service, is a critical enabler in them getting their job done. So inasmuch as SRE therefore is essential to our delivery of that service, it therefore becomes essential to our clients getting their work done, and essential essentially then to the very business that they operate in and use to serve others. So that's what it really means to me at its most basic. SRE is essential to delivering a service in a meaningful and performant way that can meet the needs of the clients and customers we serve.

Kevin Yu: That is a great capture. SRE is essential to deliver services in a meaningful way to the customers. And if done right, it is transparent and allows customers to get their work done. So Kareem, here comes a tough question and one you and I spoke of a few times. For the benefit of the community, could you please share with the audience how you prioritize for SRE?

Kareem Yusuf: Look, prioritization is always the tough nut to crack. It's always the real challenge in any business leadership, in any outcome that you're driving for, because you are always juggling a set of constraining factors. I like to use a basic framework that I learned back in my civil engineering days. It comes from construction project management, right? That any outcome is bounded by time, cost, and quality. Those are your three variables that you are always looking to optimize. You can do something quickly, but it may be at great cost or at the expense of quality. You may optimize for quality, but you may need therefore to take more time, right? And so on and so forth. And I've always found that always applies into this subject of prioritization. So how do you break that? How do you approach it? Well, it's back to the omelet. What is the outcome? The outcome is the customer being able to successfully use the service to get what they need to get done. Now, you have to trade that against what do they need, so those are features, and can the service enable them to get it done? Is the service actually up and running? And so it makes no sense to throw lots of fancy features if you can't actually have an effective service running, because no one will ever be able to leverage the features, right? When you really step back and look at the marketplace, I will tell you that the biggest challenge for us is one of scale and adoption, or put another way, adoption at scale. I'm not talking about scale in terms of number of orders per second. I'm talking about scale here in the context of pervasive use, pervasive use within an organization. That means they need to adopt it for every single project, for every single product line, et cetera, et cetera. Consider whatever be the right units of measure there. And that means that the service needs to be able to effectively scale in their support. 
So it's one of the reasons why, yes, I think from a prioritization perspective, and understanding where we are with many of the product lines. A lot of focus currently goes into I would say prioritizing SRE sometimes above other features, because as I said, if you cannot drive adoption at scale, it doesn't matter what new feature you support. And key to that, when we say prioritizing SRE, you know my particular bugaboo is really around automation, right? I think automation is one of the holy grails, if you will, of an SRE operation. And the more we can automate, the more we can optimize, the more it frees us up, frees you up as SRE professionals to do the other important work. So there's no silver bullet, to keep mixing analogies on this podcast, but it's a judgment really defined at a point in space and time often by the state of your market, the state of your product line, and therefore on what you kind of need to achieve. My sincere hope as we go forward and as we begin to complete this journey of standardization on a single platform as we're doing with OpenShift, a single way therefore of doing automation with operators and Ansible and the like. My hope is as we build new products, it will no longer be this massive trade-off because a lot of the SRE things we need are kind of built in, if you know what I mean. They're just part of the DNA. Of course, you are going to automate this with operators. That's what you're going to do on day one. It's not a latter-day retrofit. And a lot of these prioritization discussions are where we are modernizing and having to do these latter-day retrofits.

Kevin Yu: Time, cost, quality still very much applies here. Focusing on that outcome with customers' perspective, knowing features are useless if customers cannot use them meaningfully to get their work done. Kareem, I'm glad you shared that there is a light at the end of the tunnel, in that as we become better at it, meaning with more built-in capabilities, the trade-off becomes less massive. That is really where that shift-left mentality in that we want to be considering SRE all the time not just in operations, but early in the development life cycle to get that outcome.

Kareem Yusuf: By the way, I would say that, think about it, this was the journey we all went on in pure development. Remember when in the development world, we used to spend weeks writing code, then we'd go spend weeks going into test cycles, right?

Kevin Yu: Yes.

Kareem Yusuf: Now, we talk about testing as you develop, right?

Kevin Yu: Exactly.

Kareem Yusuf: But how do you enable that, by the way? What's the secret to doing continuous testing? Automation. But now, nobody's debating the importance of working that way. They may have challenges about getting that way, but you're not going to hear anybody say, "Ooh, I should test later." So I think it's the same thing. We're on a maturity curve. And so in the same way, I would say we should be gentle with ourselves, to be slightly forgiving in the sense that we are maturing. We are getting better at this. And I think the fact that we all now talk about SRE and recognize SRE as an actual profession shows how far we've come.

Kevin Yu: I like that. Make sure we celebrate the successes along the way and recognize achievements. Speaking of how far we have come, I know we're lucky to have a healthy budget for SRE. I'm sure the audience would love to hear your perspective of how you measure the ROI, return on investment in SRE. Or perhaps, how should SRE measure ourselves to make sure that the investment keeps on coming?

Kareem Yusuf: Well, a real simple measure of ROI for me is I ain't getting calls, right? When you think about things at my level, it'll be no secret that a big part of my life is based around exception management, right?

Kevin Yu: Mm-hmm.

Kareem Yusuf: I have to fixate where the problems are, which means I'm trying to make moves to minimize where the problems come from. And therefore, if I'm not getting problem calls, it kind of means the investment has paid off, right? Kevin, you've been around as long as I have, back to our early days of WebSphere Commerce.

Kevin Yu: Yes.

Kareem Yusuf: Think about Black Friday as just a simple case in point, and think about how far we've come. I don't mean to minimize it when I say it's almost become a non-event. What I mean to say is that it's a testament to the skills and the focus and the dedication of the SRE team, of the dev teams, of the partnerships that have been built, of the way we have now built software and run a service like OMS. That we're able to tackle these major events almost, touch wood and without jinxing it, without breaking a sweat. Now, I know there's a lot of sweating going on, but you do know what I mean.

Kevin Yu: Yes. Oh, yeah.

Kareem Yusuf: You've lived through a very, very different time, right?

Kevin Yu: Yeah. Yes.

Kareem Yusuf: A very, very different time when we were in pure crisis mode. So right in there, you begin to see the return on investment. But if I move beyond just what I would call things remain up and running, the real return on investment I'm looking for-- That's still coming down the road, and I'll pick a different portfolio to make the point-- is this vision of being able to deliver our software consistently everywhere, no matter how it's deployed. So let's take the Maximo Application Suite as an example, which is probably at the forefront of this discussion. We've modernized it. We've put it on OpenShift. We run a managed service that the SRE team is responsible for. Our customers wish to run their customer... manage it themselves on other clouds. So we begin to talk about how are we enabling the right operators and automation to enable them to do that. We wish to deliver a service on other hyperscalers like AWS in response to customer needs and stuff like that. That's what the SRE team is doing. If we can do all of that in a singular way with a singular set of capabilities, we will know that we've arrived. Because in the past, each of those scenarios would have literally been a different code base, right? That's not the direction we're in. That's not where we're headed. That's not the work. It's the work that we're doing in SRE to support our running of the managed service or our supporting a multi-tenant service on AWS that inspires and drives the very same capabilities we put in our customers' hands who want to manage it themselves on one of these platforms. Rinse and repeat. That is a serious return on investment for me. That is a major, I would call it, delivery on the promise of kind of like, "Build once, run everywhere." That is something that I'm watching very closely, because I know how we've done it in the past. And the efficiency, the economy of scale that the path we're on delivers for us as a business should not be underestimated at all.
So that's how I continue to measure the ROI. It's that transformation that I'm really waiting on. That's the next stop. It's the next stop on the train journey. Keeping things up and running, less fires and all that, that's the meets min. That's the basics. But, truly transforming therefore how we can deliver and where we can deliver service and deliver our capabilities, yeah, that's the next tranche. That's what I'm looking forward to seeing pay off over the course of this year based upon the plans we've already collectively defined.

Kevin Yu: Kareem, I recall the first time we met was actually right before Black Friday at a customer site.

Kareem Yusuf: That's right.

Kevin Yu: I recall we used to joke that the end game for Black Friday is for us to be able to spend Thanksgiving at a resort by the beach.

Kareem Yusuf: You mean actually sit down for dinner with our families and not stand out to go take a call every hour and a half? Yes, I do remember those days.

Kevin Yu: Yeah. Exactly. When it becomes a non-event and when it's boring, you know we did it.

Kareem Yusuf: I actually believe it. I think you've just defined the new mission statement for the SRE organization. Making things boring. That is a tagline that we would all embrace.

Kevin Yu: I like it. I'll try to sell that. And Kareem, I mean, I see your point. In a way, you kind of touched on where I was going with this, because I see IBM and the industry is going to... We used to be, "Hey, get the customer ready for on-prem." Then it's like, "Okay, now there's SaaS." And now, there's a hybrid model. So I think what you describe is that repeatability, that efficiency to scale and adopt is where you see it going. And I think actually one of our customers actually... I remember [inaudible] actually made the comment, "Hey, if you guys do this with every one of your customers, why don't you have this out of box?"

Kareem Yusuf: Yeah. No. Look, I will tell you, I truly believe hybrid is the future. I mean, this team has known me long enough. I'm not espousing party line here. I truly believe hybrid is our future. I do fundamentally believe that that strategy is the right call. I look at the customers we serve, enterprise customers. There is a mix. There will always be a mix. And our ability to respond to that heterogeneous world not only speaks to our roots in middleware. I grew up in MQ. There was no more heterogeneous piece of software on the planet than MQSeries. Could run on any platform, be spoken to in any language with the same five basic API calls, and it got the job done. So when I think about our DNA, where we've come from, when I think about what I truly think is the real IT landscape that the world is evolving to, our ability to deliver capabilities as a service or to be run on-prem or to be run on other clouds in a consistent model is important and critical. Standardizing on a singular platform with which to do that and driving the next level of application lifecycle management with what we can do with automation and operators and the like, and do that consistently I think is key. Not every product will be delivered in every single form, right? Not all require it. But the fact that we can, and the fact that we will and do it consistently for the right use cases I think is tremendous. And then layering onto that, I talk about hybrid models, but I would be remiss not to add into that the thing that gives birth to us as an organization, which is AI. Let me tell you why that becomes even more critical and what I think it means for SRE. I think, Kevin, you've heard me say this before. When I think about AI, I think about the enabling of intelligent decisions. That says the capabilities we build have to live in situ in the application logic itself, right? It's right there in the UI.
It's there helping the user make a decision, whether it is alerting to areas of focus in terms of equipment, whether it is deciding where things should be shipped from in terms of fulfillment optimization. These things are living directly in the application logic flow, right? In the business process to enable key decisions to be made. The question no one is really fixated on and to my mind becomes key to our differentiation is what happens after all this logic is deployed. How is it going to be managed? How do you upgrade models in flight? I talk here about day-two operations. Everyone is still fixated on how you deliver that first model or deliver that first capability. But when you're running this service and you want to begin to upgrade models, or you want to factor in a new version of the model, how is that swapover going to occur? How are we going to manage that? How are we going to automate that? It's stuff we do today with application logic without thinking, but what happens when you want to introduce an AI model with new features? This is now the realm of what the application becomes. These are the next set of challenges the SRE organization is going to be stepping up to. When the application you manage becomes heavily infused with AI, that's going to bring a whole nother set of considerations. Considerations I know that are on our minds, but not everybody's talking about, but what I believe we as IBM are extremely well positioned to tackle, because we actually think about this or think this way all the time. That's where I kind of see things going, and we are well on our journey with all the exemplars and the Petri dishes or the use cases to push us forward in becoming masters of this art. So that's where I think it's going to be, and that's where I think SRE is going to have to step up as well.

Kevin Yu: And do all that seamlessly without interruption to how the customer does business on our platform.

Kareem Yusuf: Indeed.

Kevin Yu: Kareem, thank you so much for sharing that. In closing, I always go back to the inspiration of this podcast. Which is, Kareem, what would be your ingredient and recipe for the SRE omelet?

Kareem Yusuf: Look, we end it with a cooking show. The SRE omelet. Take three eggs, dash of salt and black pepper. I really think that when you think about our success, or success as an SRE organization, or success in building a world-class SRE function, the number one ingredient, the first egg if you like that we have to crack, it's got to be people. It's got to be people and their skills, right? At the end of the day, look, we are a human-based organization. Everything we do is human capital. It's humans. It's us who bring our brains to work to generate this IP every day. So I do think the recipe sits on the people. How do you attract people to this function? Well, first of all, you have to readily acknowledge, and I think we've done a really good job of that over the course of the last year, this is a profession. It's a critical skill that we are going to need ad infinitum and forever, right? It's not going away. So how do you better formalize a profession? How do you better formalize the skills that need to be built around that? How do you create the climate and environment in which people can thrive and do their best work? These are all the elements and the ingredients of, to be honest with you, creating any exemplary functional organization. It's as true for dev organization as it is for SRE, as it is for product management, as it is for marketing, sales. I could go on and on and on. In the world in which we operate, it ultimately all begins with the people. And people, generally speaking, want to do meaningful work. So I think in that recipe it is, and as you achieve here with your podcast, making sure we're reinforcing that notion of the meaningfulness and the importance of the work, creating pride in mastering our craft in being an SRE professional, and therefore clearly understanding what that means and what growth in that context implies on how those skills can be parlayed into other professional choices as people move on down the road in their careers.
Not many of us, I for one, have never stayed in just one job or one particular functional tier. I've done tech sales, development, product management, strategy, M&A. It's the nature of how I wanted to be, and I've been able to parlay my skills and my domain knowledge into all of these things. As you well know, I managed the SaaS operations team once upon a time. So I have lived deep in the world of SRE before it was called SRE and have a genuine appreciation for what it takes. All of this stuff goes to us as individuals to build, to be our best selves. So in simple terms, I would say the recipe for the SRE omelet is find good people, provide meaningful work, a great environment in which to apply one's skills, master one's crafts and grow. And enduring, and this for me is a very important word. Enduring outcomes. And for those enduring outcomes, we have to look no further than our customer base. Ultimately, when I try to seek satisfaction in all I do every day, when I try to make sense of my frustrations, I look back towards our customers and the people we serve and what we enable. Before coming onto this podcast, I just had a call with a client who we've gone through some very challenging times together. I was actually joking, although it really wasn't a joke. I'm on some level closer to them than I am to certain members of my family, at least if you judge closeness by the frequency of conversation and how much we've had to speak and engage. But listening to them, listening to the work they're trying to do, what they're trying to create, and recognizing that is what we are enabling always gives me a sense of enduring purpose, a sense of enduring pride. That speaks to the ultimate outcome. It speaks to why I joined IBM. It speaks to why I remain in IBM, because I do fundamentally believe in my heart of hearts that what we do is essential and is meaningful to our customers.
So put that in your frying pan, flip it a couple of times, and hopefully you'll have a tasty, tasty omelet at the end of it.

Kevin Yu: That is a fantastic capture, Kareem. I would say that is one of the main reasons why our customers keep on coming back to us. It is because of that relationship and trust we have built.

Kareem Yusuf: Indeed. Indeed.

Kevin Yu: So there you go, ladies and gentlemen, the SRE omelet for Kareem Yusuf, general manager of IBM AI Applications and Blockchain. Thank you so much, Kareem, for spending time with us and sharing your insights with the community.

Kareem Yusuf: Well, thank you for having me. As I always like to say, onwards. We've got much more to be doing. Stay the course, stay focused, and let's just keep bringing value to the table.

Kevin Yu: Thank you, Kareem. I'm looking forward to the future. And I'd like to thank you, ladies and gentlemen, for listening to the episode. This is Kevin Yu, principal SRE at IBM AI Applications. See you again on upcoming episodes.

Today's Host

Kevin Yu

Principal SRE, IBM Sustainability Software

Today's Guests

Kareem Yusuf, Ph.D.

SVP, IBM Software