Season 3, Episode 3 - "Where's the source of truth?" with Avi Freedman, CEO, Kentik

On this episode of Network Disrupted, Avi Freedman is here to discuss networking. Avi is a Co-Founder and CEO at Kentik, a network intelligence platform that helps businesses with network performance, reliability, and security. Avi is also the host of the podcast Network AF.

Today, Avi and Andrew discuss the challenges their customers come up against, how they can try to solve them, why cloud adoption won't get easier, and the merits of monitoring and observability. Listen now; you'll also hear expert advice on how to start adopting cloud the right way.

Let me know what you thought of today’s discussion! You can tweet me at @netwkdisrupted + @awertkin, leave a review on Spotify or Apple Podcasts, or email me at andrew@networkdisrupted.com.
Overview of episode
00:33 MIN
Avi's background and why he started Kentik with his cofounders
01:12 MIN
What prospective customers are struggling with when they come to Kentik
02:00 MIN
How customers tend to think about adopting networking vendors, and ideas on holistic integration
04:18 MIN
On-ramps that bring customers to Kentik: Internet centricity, observability mandates, and pure cloud or hybrid
01:30 MIN
Kentik's solutions: Do they have an on-premises world for the backend data?
03:23 MIN
How hybrid cloud has become a generic term, and the challenges of networking being a constant land of changing protocols and technologies
03:38 MIN
Sprawl technology and reasons it's important to think strategically about network infrastructure with experts instead of amateurs
03:36 MIN
Differentiating observable and monitorable, and observability and monitoring
03:41 MIN
The messiness of instituting machine learning and streaming telemetry standards in network observability
04:07 MIN
How a need for network observability is causing organizations to pursue detours to holistic monitoring
09:04 MIN
Advice to those starting to adopt cloud and cloud networking
01:00 MIN

Andrew: Hey, it's Andrew, and welcome back to season three of Network Disrupted, where I, along with some very smart guests, help fellow technology leaders trade notes on navigating disruption in our space. This season, I've set a goal of exploring the issue of enterprise cloud adoption from as many angles as I can. Today, I'm joined by Avi Freedman, co-founder and CEO of Kentik, who himself just launched a podcast. Good luck, Avi. In this episode, you'll hear us talk about some of the challenges we see our customers running up against, the merits of monitoring and observability, why cloud adoption will never be easy, and how Russ White is just so right with his thoughts on automation, which I'll link in the show notes. All right, let's get into it. And if you have a moment, please don't forget to leave me a review on Spotify, Apple Podcasts, wherever you listen to these. The feedback is always so helpful and you'll be helping more people like you discover the show.

Speaker 2: Maybe you can give me a sense of the complexity.

Speaker 3: We left the [inaudible].

Speaker 4: Influences everything. It influences the human experience.

Speaker 6: There were several values along the way.

Speaker 7: We want to be early about the customer.

Speaker 8: You are handling sensitive information.

Speaker 9: Let's work this [inaudible].

Andrew: Avi, thank you so much for joining me today.

Avi Friedman: You're welcome. It's great to be here.

Andrew: Why don't we start by just telling us a bit about your story at Kentik?

Avi Friedman: Sure. I started Kentik with some of my co-founders seven years ago to help bring modern techniques to network analytics, operations, and observability. At Akamai, one of the first things that I did back in October '99 was write some software to use our data and figure out where we should deploy and how to operate the network. And when I left Akamai in 2009, it was clear that that technology was not out there. I actually had started building some network sensors, and everyone that bought them said, "Actually, our big problem is analytics. How do we see the internet, the orchestrated, automated world, cloud, on-prem? How do we make all this go?" Because networking has gotten more diverse. So we're now about 140 people, 300 customers, a venture-backed company, and focusing on just making network lives awesome.

Andrew: Yeah, that's amazing. And I think, as you know, what's happening inside the intranet, let's say, I don't even know if that term is still valid or cool, is basically reflecting what's happening on the broader internet now with the technology that's out there. So you can't just monitor things the way you once monitored them if you care about application performance, user engagement, employee engagement, and everything else. But that's fantastic. So what do you see... When you meet with a prospective customer, what are they struggling with?

Avi Friedman: So often they're struggling because they have five different kinds of networks that they run, whether it's a WAN, an SD-WAN, a data center of architecture number one, cloud number one, cloud number two. And then of course there's things that they don't run, like the Internet or the SaaS applications, and all these things wind up, as you say, affecting employee productivity and performance and agility, revenue, and application performance. So bringing everything together in one place is a big factor, as is understanding things that are beyond your border, so the Internet, and being able to sync with systems that are systems of record like IPAM and [inaudible] and CMDBs. That's really important because the days where we could memorize all the IPv4 addresses and name machines Fred and Wilma and Jason are past. So appliances where you have to statically go in and configure things don't really work in that kind of environment.

Andrew: Right. I still do have some retail customers where if you name an IP address they'll be like, that's the printer in our Piscataway, New Jersey, just because of the step-and-repeat nature of it. And some really interesting subnet math. But regardless, it always occurs to me we're in this world now where the large networking vendors are consolidating like crazy. Oftentimes they do things in a proprietary nature. So they're doing, for instance, network monitoring or user experience monitoring or whatever, application performance monitoring. And sometimes those things prefer an ecosystem that's mostly their world. And sometimes they work, especially if it's through, like, M&A consolidation across multiple stacks. It occurs to me that those vendors usually win because they've got the relationship: we're already buying $100 million a year of switches or whatever, we just made this massive Nexus 9000 acquisition from Cisco or something, and so we're just going to use what they have. Do you find customers realize that it's not even necessarily best of breed versus whatever my predominant networking vendor has to sell me? I think it's more so... My hypothesis is more so like you said. I mean, there's all these different networks and those networks are probably not through the same vendor. And so how do I look at all this together? But I'm curious if customers tend to try to default to the large networking vendor, whatever they have.

Avi Friedman: So the large networking vendors have never been super great at the software side, which is ironic since it all is software, ultimately, that they [inaudible]. But the monitoring plane, the management plane... And I think what really is a stark fact is the BU-itis that some of the large vendors, especially the biggest one, have, which makes it so that breadth... So even within the kinds of networking that a given vendor may sell technology for, if the software is out of a different BU, they also have the software that maybe works with APM and the data center, but doesn't see the LAN. Or it sees the LAN, but doesn't see the Internet. It sees your WAN and not the Internet. Or it sees security, but doesn't understand Internet or application. And so that's pretty common... I think it's true across all the vendors that we see. So that's the major factor. Also, sometimes what we see as we've grown is people saying, well, that company bought this startup. They're not going to work like a startup anymore. I'd rather work with-

Andrew: For sure.

Avi Friedman: It's not even as much, "I hate vendor X." It's, "Hey, if you do a good job at this, I'd rather have it all integrated." And again, they may have six SKUs that do network monitoring, observability, or something. And generally they're more monitoring than observability. They're not API-first big data platforms. But you left out... I mean, who are the biggest network vendors out there? In some sense, the biggest SDN vendor: VMware. VMware has vRealize Network Insight, they have Wavefront, they work with the VMware ecosystem especially well. And then the cloud. Look at how much network there is, and their tools... Google does a little, but if you take VPC flow logs in most of the cloud tools, they look like just logs, which is not how people want to use them. So that's an exciting place to try to help solve. But it's also frustrating for customers. We see that a lot.

Andrew: No. No. For sure. For sure. I think the changing nature of the requirements is because of those things. We now have these overlay networks, or we no longer have access to the underlying networking equipment. This is all virtualized and that's just going to continue to get more complicated.

Avi Friedman: Yeah. And one kind of networking that we do is eBPF on the host, right?

Andrew: Yeah.

Avi Friedman: Sometimes if you're really virtualized, that's the only thing you have access to. You're running containers or VMs; that doesn't help you. Then you need to do synthetic transactions. You need to find a path, because it's not your network, and put it all together. I think it was Russ White, I don't think he said it was his quote, but I was sitting next to him and he said, "Automation does not mean simplicity."

Andrew: Yeah.

Avi Friedman: But abstraction does not mean simplicity from operations either.

Andrew: Actually, he writes quite a lot about that. And he's very insightful about this whole idea of simplicity and how to abstract that in a way that's meaningful versus, " Oh, it just works? Well then it's probably not going to work for me."

Avi Friedman: Well, that's a separate topic, which is something we see a lot in startup land, because we do work with a lot of web companies and companies that were recently startups as well. "The new hotness, I need to use it." And it's like, well, if you ask [inaudible] people, they don't want to put it in production until they know how it breaks, because everything breaks.

Andrew: Yeah. We talk a lot about that. I talk a lot about it with customers as well, but it's just the whole idea that wisdom comes from failing, breaking stuff, real experience, not from something brand new that's out, or even things like cloud, which companies are just investing in. You can hire some people with some experience, but there's not a lot of people who have the wisdom yet. It's just-

Avi Friedman: Well, experience is what you get when you didn't get what you wanted.

Andrew: Yeah. No, for sure. Is it the cloud aspect and the virtualized network aspect that normally... I mean, I would imagine starts bringing customers to potentially looking at Kentik or your solution. And obviously there's a lot to be said for observability in a traditional network, but I would imagine whatever was being done, which was probably SNMP polling or whatever was going just obviously isn't working in the new world and they don't know what's happening. There's a blind spot there. Is that the intro to the conversation usually or?

Avi Friedman: Yeah, there's a few big on-ramps. One of them is Internet centricity, because most of the existing systems are blind to the Internet, which your SD-WAN is sitting on top of. That's how you get to SaaS, that's how your employees are getting to you, that's how your customers are getting to you, that's where the CDNs are. Another is people have an observability mandate and they're saying, well, how do we bring the network into this? And they'll find us through the New Relic integration that we've done, or our integrations with Splunk or others in that ecosystem. And then the other is pure cloud or hybrid. So typically if you just have a leaf-spine data center and you have whatever you have, that's not typically the entry point. We may add that based on these other use cases. But as soon as you go hybrid and you're doing data center migration, and you're trying to understand, well, what's still there, what's the performance between this app, which is at least DR if not HA, sitting between my data center and a virtual data center that someone else is running, that's a big use case too.

Andrew: Got it. And your solution is delivered as SaaS, or are you... Obviously there are sensors or whatever the case, but the backend data, intellectual, that stuff, or is there an on-premises world for you on the backend as well?

Avi Friedman: Well, interestingly, we actually-

Andrew: By on-premises, I just mean customer-managed.

Avi Friedman: Thank you for saying on-premises [crosstalk].

Andrew: I'm a stickler for that by the way.

Avi Friedman: Yes, I'm [crosstalk].

Andrew: I'm super annoyed that the definition has now changed of on-premise, which just-

Avi Friedman: Oh my God, first they lowercased Internet, the proper noun, because there's only one of it. Then they made figuratively mean literally, and now on-premise has become [crosstalk].

Andrew: Yeah. Exactly.

Avi Friedman: I think I need... I don't drink, but I think I need to drink [inaudible].

Andrew: Right.

Avi Friedman: We actually don't require any agents. You can just send network telemetry directly to us. People often deploy a proxy because UDP telemetry, they want to encrypt, and NetFlow, SNMP, well, SNMP a little bit. But our service, yeah, it's a very big database, you're not going to put it on a laptop or in a VM running on your infrastructure. But we have two ways of doing that. So most of our customers are on our larger clusters that we run, that they can get to over the internet, or we peer with a number of our customers, which is awesome because it means we get transit from our service provider customers who peer with us actively to their customers. Or we'll run a single-tenant copy. So often you see in a financial company their security group says, no, I'm sorry, you can't take this data, it's PII, we have IP addresses. Or a national telephone company says, this is critical infrastructure, we can't send it out. So we don't do licensed software. We don't sell you the software and have you install it, because we're maintaining it, monitoring it, we're upgrading it daily. But if you need it in your own domain, then we will do that for you.

Andrew: Yeah. Got it. Now, so I bring that up because of an interesting thing that we have found. I mean, first of all, just the likelihood that customers, even large financial services companies, will accept a SaaS-based solution has changed pretty dramatically throughout the years, but still there's a point... And yeah, you can do it single tenant or not, but there's always this point where they're comfortable with these larger vendors or this other solution, but for whatever reason, not necessarily comfortable with what you're doing, what you're storing in the cloud, and obviously there's governance, but it still seems to be a hurdle. But you're like, wait a second, you're using Office 365. So like, okay, here's your network telemetry, what's more risky? That, or every email and the entire... But anyway, regardless, I think it's definitely [crosstalk].

Avi Friedman: Yeah. I mean, we have SOC 2, we log all the queries, they can get a copy of the logs, there's audit trails. Yeah, it's a multi-tenant system, but again, we can run a single-tenant copy. But sometimes we do run into people and they say, no, no, no, it needs to be that you can't ever have access, ever, even though the dirty little secret is many appliance vendors basically have the equivalent of a modem hooked into the back that the tech support can get into, whether it's an SSH tunnel or an actual modem, which I know some of them still exist. So what's really the difference? It's still just software.

Andrew: For sure. For sure. And let me point out that my employer, BlueCat, we do not have a backdoor into our customers' appliances.

Avi Friedman: I don't mean backdoor. I mean, basically-

Andrew: Or even front door.

Avi Friedman: I mean a front door where you can say, turn this on so I can support you. And then some people just put it on because it's easier.

Andrew: Yeah.

Avi Friedman: But they pretend that it's appliances and that no one has access even though.

Andrew: Yeah. Right. For sure. The term hybrid cloud, speaking of on-premises, is also now fairly generic. In other words, I hear it used all the time to just mean we still have data centers and we're using public cloud, versus we're implementing technologies such that we can, whether it's DR or whether it's just peak usage scale-out type use cases, have compute, similar applications, similar workloads moving between these domains. So from that hybrid cloud perspective, and maybe it's more just around virtual networking in general, how do you think that has affected the broader networking landscape? We've talked a little bit about how a lot of this technology is new, but the promise of the technology, certainly on the virtual network side internally, was always like, well, this will be easy, you don't need to make all these changes, just deploy this. One of the vendors used to have this little video, this explainer video, where the business was asking the network admin to do a bunch of stuff. And he was like, "I need six months and this amount of money." And some other guy was like, "Oh, but we can just deploy this and we can do the change immediately." So I think there's a lot of naive assumptions over what that actually entails. How do you think that's impacted broader networking?

Avi Friedman: Well, networking has always been a land of change, which generally people view as good, and the move to infrastructure as code and being descriptive is a great one. And in some sense, cloud is just other people's infrastructure as code with funny names and terminology, but underneath it, there's tunnels and routing and forwarding and switching and all that stuff. So the challenge is that the way cloud networking works with the default primitives in Amazon is different than Azure, is different than running your own data center with a major vendor and protocol, is different than if you used a magic cluster technology, is different than if you use SONiC and write your own stuff. So the good thing is it's an opportunity for learning. The bad thing is the wonderful thing about standards is there's so many to choose from. So what we see is the more successful clients of ours pick a couple of standard architectures, whether it's in the cloud, a couple of vendors, or as they evolve their own on-premises infrastructure. The people that we see have more issues corralling everything or operating it will have 15 different... Well, that group wants to use the Arista as the thing and not use the substrate. That one wants to use the virtual MX, that one uses transit gateways, that one peers onsite, and within one entity that can get difficult, not to take the telemetry into one place, but just to operate. Because ultimately you still have to get into that to do some levels of debugging when you form a thesis based on what you're seeing. But we see an awful lot of hybrid, even if not necessarily HA. So yes, we have cloud, we have content provider and web company customers, and some SaaS customers that will scale into the cloud, the same app, logically a network extension, just because they need to cloudburst. That's not the typical pattern. Usually these are networks that peer with each other or connect to each other and are operated differently. You just want to help the people that are doing that operations have as few concepts to get around as possible.

Andrew: Right. And I couldn't agree with you more just on the likelihood of sprawl of technology. Maybe it's just because there wasn't some upfront planning and thought, maybe it's because different business units or different parts of a business were solving independently, but sometimes it's just that people want to use the next cool thing for whatever reason. And that creates some serious problems.

Avi Friedman: There's another factor, which is that network in the cloud is often turned on and set up not by networking people, but by API developers, SREs, people that, again, are... That's great from an agility perspective, but it also can be a challenge because sometimes network people don't have access to that telemetry. They need to go begging and borrowing. That's why we're open sourcing our agents. And if eBPF is all you can get, let's give that value to them. And then bring that into the network systems, bring it across observability. So that's a cultural challenge, just like security and NetOps can be in some companies too.

Andrew: And we've certainly seen it and talked to some people on this podcast, and definitely talked to customers, about some real downsides of that challenge, where, yeah, look, now anybody can go deploy a network. There needs to be architecture, there needs to be upfront planning, and I think there's just too much discounting of the wise network engineer or architect that's been around forever because he's speaking a language we don't know. "It's way easier in Amazon. You just go to this VPC thing and click Peering and accept over there and the networks are peered." Obviously there's ramifications. The other aspect of it is that what a single cloud provider provides today will mature faster than I think most enterprises are used to in terms of technology adoption. A good example is just with AWS, with best practices: VPC peering turned into Transit Gateway, which has turned into... and so you're-

Avi Friedman: We have an e-book. There's a gentleman at Intuit who's very helpful, who's been there for the whole thing and led us through: here's the evolution of the eight generations, and here's what was wrong with each generation [inaudible], you still have these caveats, and this can do overlapping IPs and this can't. So we have an e-book about that, which was very helpful to me because it can be confusing if you just look down, look up. Yeah, it's changed.

Andrew: Yeah. It's changed. Right. And then if your mechanism to learn is dated courseware or Googling, you'll find the thing that works and do it, even if it's not the best architecture for the company. And because you don't need a separate network lab anymore, it's easier to solve by trial and error, which is rarely the best way to solve something.

Avi Friedman: Yeah. I mean, it's better to understand. And on the flip side, labbing is easier now. We as a community need to do better at teaching the kind of advanced debugging that you're challenged with in a CCIE, which I finally heard about, I don't have. How to reason about things when they're breaking and do the debugging and all that is really important, but harder to get in an abstracted world. So the cloud that... We should use the other side, which is the benefit, and use the flexibility of virtualization to build open environments, to let people learn, to help bring them in, because that's a frustration I've heard from people wanting to break in: "What, I have to buy stuff on eBay? But that's not modern stuff. And how do I do this? How do I learn?"

Andrew: Yeah. Agreed. All right, let's go real basic for a second. What is the core difference between observable and monitorable, or observability and monitoring?

Avi Friedman: So there's a lot of different takes on that. Observability, in the engineering sense, is the ability to infer what's happening inside the thing being observed from its outputs. So think of tomography, where you're looking at a bunch of vectors and signals and you're trying to figure out what some structure internally is doing. From our customer perspective, from very early on, one of the first appliances we competed with was Arbor, and it only did roll-ups. And so it was like taking the SAT. If you told it the questions you were going to ask it, it could answer them, but if you needed it to double-click or pivot or whatever, it really couldn't. I tend to also think that monitoring has a lot to do with the things that you know go wrong and the things you know to look for, and taking those telemetry streams and doing the right things with them. Being able to really do observability means I've got all these outputs, and maybe the way I was looking at it doesn't get at it, but I, the human, can go in and do my diagnostics. Like, we do proactive notification, baselining, learning over the data. But if you're going to affect something that could take your business down, sometimes you want a human to look at it and say, yeah, I believe this, this is really the thing to do. And for that you really need an observability platform that can see a wide range of telemetry, keeps it, does the right things with it, but also lets the humans interact with it. That's the way that our customers think about it and one of the differentiators that we think of. Now, some people say observability means you have to ask the question, and you're not a real engineer unless you know what questions to ask. I don't buy that. For an enterprise to get value from these platforms it's building, a NOC technician, a CFO, an accountant, a salesperson, network operations, everyone should be able to use the platform, which means you do sometimes need just dashboards and maps and things like that, and you also need the experts to be able to dig in all the way down to the detailed data if they need it. How do you think about it?
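
The roll-up versus raw-data distinction Avi describes can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the flow records, the field names, and the `rollup_by_port` and `pivot` helpers are hypothetical and do not reflect how Kentik or Arbor actually store data. The roll-up can answer only the question chosen in advance, while keeping the raw records lets a person ask a different question later.

```python
from collections import Counter

# Hypothetical flow record layout (an assumption for this sketch).
FIELDS = ("src", "dst", "port", "bytes")

# Hypothetical flow records using documentation/private address ranges.
FLOWS = [
    ("10.0.0.1", "192.0.2.10", 443, 1200),
    ("10.0.0.2", "192.0.2.10", 443, 800),
    ("10.0.0.1", "198.51.100.7", 53, 100),
    ("10.0.0.3", "192.0.2.10", 22, 5000),
]


def rollup_by_port(flows):
    """Monitoring-style roll-up: bytes per destination port, fixed up front."""
    totals = Counter()
    for _src, _dst, port, nbytes in flows:
        totals[port] += nbytes
    return dict(totals)


def pivot(flows, group_by, where=None):
    """Observability-style query: sum bytes grouped by any field of the
    raw records, optionally filtered by a predicate over the record."""
    totals = Counter()
    for record in flows:
        row = dict(zip(FIELDS, record))
        if where is None or where(row):
            totals[row[group_by]] += row["bytes"]
    return dict(totals)


if __name__ == "__main__":
    # The pre-built roll-up answers "how much traffic per port?"
    print(rollup_by_port(FLOWS))
    # A question nobody planned for: who is talking to 192.0.2.10?
    print(pivot(FLOWS, "src", where=lambda row: row["dst"] == "192.0.2.10"))
```

If only the per-port totals had been stored, the second question could not be answered at all, which is the "taking the SAT" limitation described above.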

Andrew: Very much similar. I also don't think of it as an either-or, and I think too often it becomes one versus the other, the value of one versus the value of the other. Being in a world where you're getting the telemetry doesn't mean you should stop monitoring, because there's questions you do know that are specific to that device, as opposed to any correlation of the telemetry streaming off of it. The power supply is letting you know it's got a problem. I don't want to predict the power supply is having a problem based on seeing telemetry data. I would love an alert, an SNMP trap, building to monitor and just know that, you know what I mean? So I think people think of them as one or the other versus both valuable.

Avi Friedman: Yeah, I think we'll move to being more of a continuum. Now I'd say it's more synergistic, the way that you can take the different approaches and combine them.

Andrew: Yeah, no, for sure. I see it starting to show up on things like RFPs. The word is there. They're asking if it's observable, and sometimes it's okay. Well, of course it's observable, what are you? And I don't mean that by the, you can observe it. I mean that we're generating tons of data.

Avi Friedman: They mean you have the right telemetry.

Andrew: When the question is asked that way, I feel like they're just asking based on the buzzword.

Avi Friedman: Right. One of the challenges in the network world has been that it's really funny when you know the three or four people that have caused such influence. Sometimes it's positive and that's awesome. Sometimes it's negative. In my view, the hype around streaming telemetry has been causing a lot of wasted energy. So the idea, before observability was a term, was, "Hey, I don't trust you, Mrs. Vendor. I'm going to use white box unless you open up, Cisco, Juniper, or Arista, and send me all the data from the box, and I will make sense of it." Now we have probably 50 customers doing streaming telemetry instead of, or in addition to, SNMP. Zero of them do less than 30-second export, because they'll explode the box. Except Facebook and Microsoft, as far as I know. Netflix, I think, would like to do something in the sense of, "Ooh, magic ML, I see all these signals," and maybe none of them are doing the fantasy of, "Oh, I can detect inconsistent forwarding because I'm looking at all your route updates and seeing that this route table is different." Like for debugging, maybe, but "it's observable, I'm going to do this magic ML stuff"... And in the end, if you go to the big vendor presentations, they say, take all the streaming telemetry, put it in Influx, and query it manually when you need something, which actually makes it less usable than SNMP in the traditional old monitoring systems that people make fun of. So that's one of the things that we think a lot about: how do you unify it? Because for the same thing inside, you could get SNMP, streaming telemetry, API, CLI, and there's some things you still have to screen scrape to get, optic temperatures and stuff that isn't elsewhere. And how do you provide that as a bus and help Splunk or New Relic, like a traditional observability platform, see it, and how do you help networkers, even if they're going to DIY? And the thing is, there's no incentive for any single vendor to solve it. So as a representative of the observability industry, I'll say that we'll take it on ourselves and try to work towards it. The vision has caused, in some ways... like, the first version of streaming telemetry forgot the MIB. Every vendor had their own semantics and that was a huge pain. Now there's OpenConfig and trying to combine it, but it's still a little bit messy.

Andrew: Yeah. And it's still that thing streaming the data versus something next to it or something that can see. Like, I don't need you to tell me, pretend for a second you're some switch out there, certain things, because I can just observe them, because I can calculate how many socket connections are being opened or how much traffic is streaming off, without being on the box. And so the whole idea of a sidecar, which I'm broadening, but I don't need to be there, I don't need to be on-box to stream telemetry to observe something.

Avi Friedman: Yes. One of the things that we do is application-aware network observability. Like, yeah, if you're running a sidecar and you're running Istio to coordinate your Envoys, we have a plugin and we can take that. And some people go, oh my God, that's not network traffic. It's like, well, it doesn't have a TCP flag, but some bytes went from somewhere to somewhere, and we actually know more data about them and their performance, the transaction name. But that's still, I would say, a work in progress. Most observability platforms, well, not New Relic, don't really take the underlying network infrastructure data, and most network projects don't try to bring in those other sources. But you know what, RUM data plus BGP equals peering performance. And again, maybe it is better to do that than NetFlow from the router. Some people think that's a heretical viewpoint, but you and I both come from the world before the specialization of nerd into sysadmin, networking, architect, security, where the general view was: take the data where you can and use the right data to solve the right problem.
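
As a rough illustration of the "RUM data plus BGP equals peering performance" idea, the Python sketch below joins per-client latency samples to a routing table by longest-prefix match and summarizes latency per origin AS. Everything in it is an assumption made for the example: the prefixes, AS numbers, sample values, and the `origin_as` and `latency_by_as` helpers are hypothetical, and a real system would use a full BGP feed and a far more efficient lookup structure than a linear scan.

```python
import ipaddress
from collections import defaultdict
from statistics import median

# Hypothetical routing table: prefix -> origin AS (documentation ranges,
# private-use AS numbers). A real table would come from a BGP feed.
ROUTES = {
    "192.0.2.0/24": 64500,
    "198.51.100.0/24": 64501,
    "198.51.100.128/25": 64502,
}

# Hypothetical RUM samples: (client IP, page load latency in milliseconds).
RUM_SAMPLES = [
    ("192.0.2.15", 120),
    ("192.0.2.99", 180),
    ("198.51.100.5", 40),
    ("198.51.100.200", 300),
    ("203.0.113.9", 75),  # no covering route in ROUTES
]


def origin_as(ip, routes):
    """Return the origin AS of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(ip)
    best = None  # (prefix length, AS number) of the best match so far
    for prefix, asn in routes.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, asn)
    return best[1] if best else None


def latency_by_as(samples, routes):
    """Group latency samples by origin AS and return the median per AS."""
    grouped = defaultdict(list)
    for ip, latency_ms in samples:
        grouped[origin_as(ip, routes)].append(latency_ms)
    return {asn: median(values) for asn, values in grouped.items()}


if __name__ == "__main__":
    for asn, latency_ms in latency_by_as(RUM_SAMPLES, ROUTES).items():
        print(asn, latency_ms)
```

Samples with no covering route are grouped under `None` rather than dropped, so gaps in the routing data stay visible instead of silently skewing the per-AS numbers.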

Andrew: Yeah. Right. And the opportunity to figure out how everything works. So tying this back to cloud... Obviously things like observability are core to cloud native architectures, and so as I'm building new applications, these are things I'm constantly thinking about. And as you were just discussing, in the world of on-premises, things tended to be different historically, but do you see the... Look, this has become essential at this point. In other words, as I'm driving cloud networking, this is something I need as opposed to relying on whatever the cloud vendors have, or if companies aren't going with you, they're going with somebody. I've got to be able to collect this data and understand the performance of my stuff. Are we at the point where people are having problems, so they're buying solutions? Or as they're thinking about cloud networking, do they realize they need these components as well?

Avi Freedman: Yeah. So there's an awful lot of suffering in the dark, more than I would expect, but I come from the background of being a toolmaker. So when I ran a Usenet company, I had syslog that took every transaction, had a token of the user at transaction time, and then wrote something that took the syslog, combined it with BGP and showed me where I might have current performance. But first of all, you have to realize most enterprise, especially corporate IT, isn't resourced for everyone going and building their own tools. And they're brilliant people, but often very interrupt-driven, and they don't get the flow that you need to be able to go do that if you're in a web company. And a lot of people try these detours where they say, "Oh, Datadog has network observability." And then it's eBPF, which means if it didn't run on the kernel, it doesn't exist. You have a virtual F5 or virtual Arista, whatever, it doesn't exist. You want to see the routers and switches underneath, you get SNMP. But that may be the platform that people have in the company, so they'll try to do that. Or again, they may keep begging the traditional appliance vendor to go do it, or they may have started, whether it's Elastic or Influx or whatever, trying to roll their own. So we see an awful lot of Greenfield when we're coming in, especially on cloud. Like when you're doing performance testing, there's a bunch of vendors out there that do performance testing as SaaS. When you're doing the core NetFlow, SNMP, there's a bunch of... It's more appliances, there's no SaaS. On the cloud side, we see a lot of people trying to use CloudWatch. And again, they're like, "That's like logs, I can't. I want to see my network." They're trying to do their own and there's a lot more Greenfield, or they've used Sumo or, again, a multi-observability platform. That's changing.
And I think eBPF as a wave on the cloud native side, and some of the visibility of the sidecar type things, is bringing people into it, but you still have the internet underneath, you still have diverse network architectures. And our hope is that, whether it's partnering with us or some of the open source that we're doing, observability platforms will come. CNCF will go down the stack a little bit and that'll help in terms of OpenTelemetry and observability. But again, it's amazing how many customers we talk to that are, for cloud specifically, on VPC flow logs or eBPF. And they say, "Here are our problems, we feel there should be something, but we don't have a good solution for those."

Andrew: Right. Got it. So they're thinking ahead.

Avi Freedman: Yeah. And look, some people made millions or tens of millions of dollars of investment in packet-copying network things. And those vendors, their answer is, "Mirror your packets in the cloud and take our things there." And then it still looks at it like a LAN or VLAN, not "how does everything connect?" And so sometimes we see the cycle of people having those investments and refresh cycles that they can be limited by.

Andrew: Yeah. And just can't adopt new technology that quickly, because even if the cloud is Greenfield, they're dealing with just a massive amount of a huge machine that is IT that just can't do things faster for whatever reason.

Avi Freedman: I mean, it's really interesting because I see a lot of organizations that are trying to live with the duality of ITIL and DevOps. And they're enemies of each other in some ways, but the truth is, as networkers, we may make fun of people just turning on networking in the cloud and whatever. But when you're at a certain scale, if you don't have any process and everything is wild, wild west and infinitely agile, you wake up in three years and nothing looks like anything else. And it's those problems we're talking about. So I don't pretend to have the magic answers. That's not the way I've taken my career. It's harder in some sense than some of the things that we do, frankly.

Andrew: We can have a whole discussion on different methodologies, agile, should DevOps even be a thing, whatever, but my point is what companies should be attempting to do is finding mechanisms to drive smaller change faster. And in order to do that, stepping away from your Six Sigma black belt project managers, where there's going to be a maintenance window and you're going to execute this procedure, and these people are going to go test and make sure that it was executed correctly, and here's how you're going to make sure it worked. And then if it didn't, here's how we're going to roll back. And one hour before the end of the window, we're rolling everything back if it's not done. Stepping from that to a world where you're making smaller changes faster, which are therefore easier to consume. Not that a small change hasn't taken down many a company over the last year. But I think the point is, with some level of observability, you can immediately understand the impact of the change, beyond the "we made the change, we went home, Asia came online at 2:00 in the morning, whatever, and this application wasn't working, oh, we didn't think about that application." And so I think that's part of the power. In order to enable rapid change, small or large, you need to understand the impact of the change on the system. And those methodologies are a whole lot easier in Greenfield cloud, whatever the case. It's hard to do that if you've got a load balancer with 1,000 applications behind it. Way easier to do with a small little application load balancer or something. But regardless, I think that's an area where you need the data. You can't assert the change worked by running a set of tests that might be faulty on their own.

Avi Freedman: Even before we added ML to the platform, one of the very first set of things that we started alerting on was "something has vanished." Something that used to be an application pair or a top remote network has vanished. How are you going to see a black hole? You see it by its absence. It's, how would you see something stealth? You look for light behind it and see that there was nothing there, if you read science fiction, physics stuff. And that's really important. Now, at scale, it's not a magic bullet. I remember when we started working with Yahoo and I said, that's a life cycle which a lot of our customers do: I made a change, is performance okay? Are all the applications sending the traffic and volume I expect? Is the distribution to the internet, of countries and networks, the right way? They use that as part of their check, or somebody will take that data into a ServiceNow ticket. Even if it's not a network ticket, they'll take the network data, they'll take the application data and enrich it so that someone can look at it internally with context and acumen. And then they said, "Well, how many changes do you think we make?" And I said, "I don't know, 500 a day." He's like, "No, 1,000 a minute." Now, that's an aggregate across all properties and everything and metadata, but still, there is a point at which that can be hard. But for most, even pretty decent size enterprises and service providers, there isn't that much if you focus on performance applications. You focus the dimensions, and you can see whether there's an effect, whether it's positive or, again, black holing, negative. And you need to be able to look across, because maybe that traffic moved over to the cloud. So if you're only looking at your data center, it's going to look like there's a problem.
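
The "something has vanished" check described here can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not Kentik's alerting logic: traffic is assumed to be summarized as bytes per (source application, destination application) pair for a baseline window and a current window, and the function name, threshold, and sample data are hypothetical.

```python
def vanished_keys(baseline, current, min_baseline_bytes=1_000_000):
    """Return keys that carried meaningful traffic in the baseline window
    but have no traffic at all in the current window."""
    return sorted(
        key
        for key, baseline_bytes in baseline.items()
        if baseline_bytes >= min_baseline_bytes and current.get(key, 0) == 0
    )

# Hypothetical byte counts per application pair.
baseline = {
    ("web", "db"): 5_000_000,
    ("web", "cache"): 2_000_000,
    ("batch", "db"): 10_000,  # too small to alert on
}
current = {
    ("web", "db"): 4_800_000,
}

alerts = vanished_keys(baseline, current)
print(alerts)
```

As the discussion notes, the current window should span every environment being monitored; if a pair's traffic simply moved to the cloud and only the data center is counted, this check would report a false disappearance.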

Andrew: Yeah. What happened to all the traffic? Where's all the blinking lights? Fantastic. Look, I mean, I think observability, what you're doing, your platform, just understanding the network that is now being carried over, well, it's still being carried over traditional stuff, but it's being encoded in so many different ways, is a critical part of this journey to cloud. As we said earlier, I think the vendors out there are in many ways pushing this idea of "do this, it's easier." And I think that's one of the fallacies of the industry.

Avi Freedman: The Russ White quote, [crosstalk] right. It's not magic. I mean, it can be easier to consume, but you have to look at the full life cycle of it.

Andrew: Yeah. Right. Exactly. So it's easier to consume. You have to look at the full life cycle. So now you've actually created something that you thought was easy and is hard, and then there's the series of vendors out there. They're saying... Kubernetes is a good example. Kubernetes is actually hard.

Avi Freedman: Well, there's K0s, K-minus-1s, there's the minimal minimal. It's the helmsman in a straitjacket.

Andrew: Yeah. So it's "it's hard and we've got a product to make it easy." So you end up buying just layers on top of layers and layers. And I'm a strong advocate of looking for capabilities that work across the hybrid cloud or across the different types of networks that are out there, versus this tool propagation and complexity that can really drive downtime. But anyway, regardless, it was an absolute pleasure talking to you. I feel like we can talk forever at this hour. I don't know how long we're going to cut this thing down to, but we've talked about an hour and I still have questions about some of the books behind you. But with that, I think it's a good time to wrap up. Any final word or advice, outside of "buy your product," for companies that are starting the transition of adopting cloud and cloud networking?

Avi Freedman: I would actually go to your domain and say, think about, as much as possible, focusing in on sources of truth. Because in a world which is hybrid, in a world which is orchestrated, if you have conflicting systems... Part of observability is the metadata about what is running, what shouldn't be running. And thinking carefully about that, it's not just about networking and technology and terminology, but also being able to unify with applications, users, customers. And one of the biggest challenges we see with some sophisticated customers is we say, "Look, we have the ability to take all that as input. Where's the source of truth?" They may just start laughing, and maybe eventually they finish at some point.

Andrew: Yeah. Right.

Avi Freedman: So NetBox was a good example of that on the open source side, but DDI, CMDB, using all that properly, I would just encourage people that that's a really important part of the network architecture, actually, nowadays.
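
One way to picture "what is running, what shouldn't be running" is a reconciliation between a source-of-truth inventory and what observability data actually sees. The sketch below is only an illustration under assumptions: both inputs are taken to be plain collections of hostnames (for example, a hypothetical export from an IPAM or CMDB versus hosts seen in telemetry), and the function and field names are made up for this example.

```python
def reconcile(intended, observed):
    """Compare a source-of-truth inventory against observed hosts."""
    intended, observed = set(intended), set(observed)
    return {
        # In the source of truth but not seen in telemetry.
        "missing": sorted(intended - observed),
        # Seen in telemetry but absent from the source of truth.
        "unexpected": sorted(observed - intended),
    }

# Hypothetical inventories.
source_of_truth = ["edge-rtr-1", "edge-rtr-2", "lb-1"]
seen_in_telemetry = ["edge-rtr-1", "lb-1", "test-vm-7"]

report = reconcile(source_of_truth, seen_in_telemetry)
print(report)
```

A check like this is only as good as the inventory behind it, which is the point being made about conflicting systems: with two competing sources of truth there is no single `intended` set to compare against.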

Andrew: Yeah, well said. I'll take that advice. All right.

Avi Freedman: Thank you.

Andrew: Good to talk to you again and thanks again for joining.

DESCRIPTION

On this episode of Network Disrupted, Avi Freedman is here to discuss networking. Avi is a Co-Founder and CEO at Kentik, a network intelligence platform that helps businesses with network performance, reliability, and security. Avi is also the host of the podcast Network AF.

Today, Avi and Andrew discuss the challenges their customers are up against (and how they can try to solve them), the difficulty of cloud adoption and why it won't get easier, and the merits of monitoring and observability. Listen now; you'll also hear expert advice on how to start adopting cloud and cloud networking.

Let me know what you thought of today’s discussion! You can tweet me at @netwkdisrupted + @awertkin, leave a review on Spotify or Apple Podcasts, or email me at andrew@networkdisrupted.com.

Today's Host

Andrew Wertkin

Chief Strategy Officer, BlueCat

Today's Guests

Avi Freedman

CEO, Kentik