Kentik - Network Observability
Telemetry Now  |  Season 1 - Episode 31  |  January 23, 2024

Total Network Operations with Scott Robohn

The network is how we deliver the services and applications we use every day at home, at work, and when we're on the move. That means the network is more important than ever, but it's also more complex than ever. To effectively manage this entire system means having a holistic view of the network, or in other words, Total Network Operations. In this episode, Scott Robohn joins us to talk about TNops and why we need to rethink how we approach network operations.

Transcript

The funny thing about networking is that pretty much everything we do with computers today relies on it. I mean, most of the applications that we use are delivered to us over a network. And of course, all the online activity like banking and scheduling and entertainment and shopping is all done over a network.

Even the way that AI workloads run today relies completely on a really, really serious data center network.

Now the thing is that this network, this system, is made up of many, many parts: so many different components, services, and even components that we don't usually think of as part of the network.

So how do we understand what's going on with this system? And how can we actually run this system and make sure that applications get to where they need to go?

This is the idea of total network operations, a network operations framework being developed by Scott Robohn, a veteran network engineer, veteran trainer, and a co-founder of the Network Automation Forum. And in this episode, we'll be unpacking what total network operations is and why it's pretty much necessary to run most networks today.

My name is Philip Gervasi, and this is Telemetry Now.

Scott, thanks for joining today. Really appreciate it. And it's really been great to get to know you over the last few months or so, especially with my interaction at the Network Automation Forum, having attended AutoCon in Denver recently, having some really good conversations, you know, in the hallways off to the side. And then since then... actually, wait. I take that back. We met prior to that, like, a year ago.

I was gonna say, do you remember? I actually reached out to you about producing podcasts. And that was our very first conversation, I think, just about a year ago.

So yeah. Absolutely. I do remember that very much, actually. And so, yeah. So and here we are.

Thank you for joining. Before we get started, though, I'd like to give you an opportunity to introduce yourself to the audience. I think folks are probably pretty familiar with who you are now because AutoCon was a huge success. And it was all over tech news, tech media, the tech social media influencers and all that.

But in any case, maybe a little bit about your background as well, and how you came to be where you are today.

Sure. Let me say right back at you in terms of getting to know you and spending a little more time, especially as we worked up to the NAF event in Denver and since then. All good. I've got thirty-plus years in the industry. I hate to say that.

We don't have video associated with this, but you'd see the gray hair so you would know. And the grandchildren count keeps going up. So things I'm very, very, very thankful for. So I had a formal education in industrial engineering, which many people like to joke about being imaginary engineering or the liberal arts of engineering.

There you go.

But it gave me a systems view that has come back very helpfully recently. We'll touch on that later. But I came out of school bitten by the network engineering bug. I did some projects in grad school that made this networking thing really interesting.

My first job out of university, I basically found ways to do networky things as side projects, in addition to my normal duties, and that helped me land at Bell Atlantic, you know, one of the Baby Bells and a precursor to Verizon. I went from there, not directly, but spent a lot of time at Juniper Networks. There's probably some interesting conversation we could have there about Juniper and HPE today. After working for corporate employers for those thirty-plus years, I really wanted to break out and do my own consulting work, which I planned for eighteen months ago and pulled the trigger on just about a year ago. And that's what enabled me to have the flexibility to work with Chris Grundemann and then co-found the Network Automation Forum. So, the white-knuckle ride of my career so far.

You mean the last eighteen months and then your work with NAF. Right? Correct. Yep. Yeah.

Yeah. Yeah. Very cool though. And I'm really glad that you and Chris chose to do that.

Maybe we could talk a little bit about why you chose to do that as well. But I will say that I'm glad because I think probably one of the biggest networking focused events out there, like, specifically networking, right, network engineers? It's probably Cisco Live. And, obviously, it's very Cisco focused because it's Cisco Live, which is totally fine.

I built my career on that. But I loved it. And I still do. I haven't been in a couple of years.

But there really aren't that many enterprise networking events out there, conferences small or large. There haven't been, aside from that. And now I'm starting to see them, you know, with AutoCon focused on network automation, for sure. But still, I remember going there and sitting in that big hotel conference room, whatever they call it, like, the ballroom in Denver. And I'm like, I know, like, a third of the people here. You know, these are my people.

And so that was awesome. And then, of course, the local network user groups, the USNUA, are now just exploding in popularity. I think that you are leading the very inaugural Virginia network user group. Correct?

Correct. I'm sure it will have happened by the time this publishes, but Thursday, January eighteenth, which is one week from the time we're recording this, we're ready to kick it off in Northern Virginia and the DC suburbs. Very, very excited about it. I think we have thirty-five, forty folks registered so far. We'll see who shows up on that evening.

Yeah. That's fantastic. Really glad to hear it. I think that that, along with what we had in that DevOpsDays.

We had AutoCon. There does seem to be this new resurgence of definitely smaller but high quality, very high value events, very specifically for networking, enterprise networking, and maybe again, you know, more on the automation side. Or, I went to an SD-WAN thing that was obviously specifically about WAN routing and SD-WAN. And I'm starting to see those, and that's awesome. Because there are just so many on the service provider side.

There's, I mean, there's, like, a DevOpsDays every other month or every month, I think. So there's so many other things going on out there that it's cool to see our niche of technology, networking in particular, really just growing, at least the community growing and coming together again.

Right? No. I totally agree.

Yeah. Yeah. Your background is mostly on the service provider side because you mentioned Bell Atlantic, so I'm just taking a guess here.

Correct. Full disclosure: I actually worked building and operating the enterprise network within Bell Atlantic, but obviously got exposed to, what is this SONET/SDH thing? You know, T1s, T3s, things that no one recognizes anymore. But I got a good start there. It was a great place to learn a lot.

Yeah. I learned about all those things, but I admit I didn't use a lot of that.

Right.

And you're right.

This is an audio only podcast. So nobody can see the gray. But to be fair, nobody can see the gray in my beard either. And I have a little bit.

It's salt and pepper now. But what I have found, because I've shaved my head the past few years, I have found, kinda like Patrick Stewart in Star Trek, as I get older, you can't tell as much. So I will claim that as a benefit of losing your hair and then shaving it.

Literally daily. You can't quite tell how old I am.

So I will resist the temptation to do a Jean-Luc Picard right here.

But, I'll keep that in mind for future use.

Yeah. Yeah. Okay. Very good. And I also see on your hat, you're wearing a Syracuse hat.

Absolutely.

Yeah. So you're from Central New York originally or you live there?

Born and raised in Syracuse, went to school in Rochester. You know, a hundred and seventy two inches of snow a year, baby.

Bring it on. I'm sure they don't even get that much anymore there due to you know, weather changes.

But, Yeah.

Good place to be from, is the way I like to put it. Yep.

Okay. I'm from, outside New York City, but I live in the capital now, which is about two and a half hours from the Syracuse area. I live in the suburbs of Albany. So about half an hour from Albany.

And we don't get the same amount of snow. I find that the lake effect snow from the Great Lakes sort of stops around Syracuse, Utica, and doesn't quite get to the Hudson Valley where I am. On the other hand, though, we do get sometimes the last remnants or sometimes, you know, a big chunk of the nor'easters coming off the coast. I was gonna say.

Especially, you know, off of New England. So we'll get that and not necessarily the lake effect snow, but we do get our snow here. And it is cool to drive along ninety going westbound. I love it. It's beautiful.

Central New York is beautiful, for those who don't know. Really, especially in the summertime, it looks like you're in Northern California. It's very, very pretty.

Yep. I do. I miss that part of things for sure.

Oh, yeah. For sure. And you'll see like, the giant plow parked on the side of the, the highway and a big sign that, like, gives you, like, this date, we had this many inches of snow and, it's a lot. Not quite like Buffalo, I don't think. Rochester too. Rochester's a lot of snow.

I would put it this way.

Syracuse and Buffalo are situated perfectly to their respective Great Lakes. Mhmm.

Where weather systems come across, soak up moisture, and dump it on Buffalo and Syracuse.

Mhmm.

And Watertown, north of Syracuse. Rochester is lucky. Just like you don't get it off Ontario, they don't get it off Erie as much. But it's cold and windy in Rochester.

Yeah. I'm surprised to learn that because Rochester is right on the lake.

But it's directly south. Right? And it's that west to east motion that picks up the moisture and dumps it.

Thing. Okay.

Alright. I'm not a forecaster and I don't play one on television. I've probably exhausted the depth of my meteorological knowledge. Right.

There you go. Okay. Alright. Alright. And so you don't live there anymore. You moved out of the area, and that's when you began your career at Bell Atlantic.

I think you worked for Juniper for some time, or for quite a bit of time. Right? And then you decided to go on your own, which I respect and envy to an extent. I think any hard driving kind of person in tech, we feel we sort of work for ourselves. Right?

Mhmm.

Regardless of the fact that I'm a W-2 employee, I'm building my own skill set, which I take with me. You know, it's in my brain. So I've always felt sort of like I work for myself. I really respect that you did that. But what was the impetus to start the Network Automation Forum?

Yeah. This was not on my mind eighteen months ago as an event or an organization. But as we went from late twenty twenty-two into twenty twenty-three, I became aware of Chris Grundemann's work in publishing the State of Network Automation survey. And I thought, that's interesting.

I would love to see those results, and I would like to have a conversation with Chris. And long story short, we got to know each other a little bit. Found that we had very complementary views on the state of network automation, and why isn't it moving along faster than anybody thought it would? Why do you think that is?

I don't know. Why do you think that is? Mhmm. Well, let's get some smarter people around a big, big table, and let's start talking about it.

And that's what led us to the Denver event in November twenty twenty-three. We set a goal of two hundred people. We thought, if we can get two hundred people to come to this, we'll call that a success.

We ended up having over three hundred and forty, with high-quality content, great presenters, and people that were just hungry to talk about this. You know, I think that post-COVID effect and quarantine was still weighing on people. It was good to be in person, and people just had a lot to share about what they've actually done that other people can try in their own environments.

Yeah. Yeah. And I appreciated the fact that it was largely practitioners. As far as I could tell, I'm sure there was a mixture. And I know there were folks that were coming from the service provider side, folks from various types of enterprise, folks from various vendors as well.

I know that everybody was represented, but it really felt like it was, at least what I heard in my ears and saw with my eyes, very practitioner focused, very independent focused. There were vendors, and I get all that, but it felt very, you know, engineers talking to other engineers about what's going on. So I really appreciated that. That was probably my favorite part. Though that doesn't technically tie directly to network automation, because it could have been another topic.

But, sure, that was very important to me as a former practicing engineer. But also the fact that you asked that question very explicitly: why aren't we where we thought we would be? Because I remember when, like, every other blog post was talking about Python and Ansible. And, like, network automation's gonna take over the world and all these things. Like, in two thousand thirteen, maybe two thousand fourteen, it started to get a little bit more traction.

In the tech media and social media and things like that. And then it sort of didn't really progress, exactly like you just said, which again, you know, caused you to ask the question: why aren't we where we thought we'd be? Now that's not necessarily what I wanted to talk to you about in this podcast today.

Sure.

This is tough. I'm sorry to do this to you. Do you have a one or two sentence answer to that question?

Culture change has more momentum than you think or resistance to culture change.

Okay.

That's a huge learning, I think, for me coming out of the last year and kinda culminating in the the Denver event. The technology and the tools matter, don't get me wrong. It's funny. Others have asked the same question.

And in particular, Terry Slattery, who I worked for at Chesapeake Computer Consultants a long, long time ago. He heard one of our early podcasts and said, well, I don't think you really answered the question, and I'd like to help you answer the question. And the resistance to culture change was essentially his number one reason, and I paused and started thinking about that. And I'm pretty convinced that that's a key contributor to the friction against.

Okay.

More widespread adoption of automation.

So it's not necessarily that there's a technical stumbling block. Like, all the devices out there are completely closed. We're not able to access them in a programmatic way. And so it's just something that we can't do technically.

Or we don't have the tools. Like, Python hasn't been invented yet, so we can't do it, or whatever you're using. Terraform doesn't exist.

So we don't have a means, a technical means. Those aren't the problems. You're saying that it's really largely based in people.

Yeah. And Yeah. There are things we can tease out of that. Right? One dominant factor today, I think, is the hype and fear around AI.

You know, if you think about it, AI is another form of automation, or at least it's very, very adjacent, right, and maybe a superset depending on how you think about it. But there are the comparisons to Skynet: these things becoming sentient and taking on a life of their own. And I don't want to be out here saying, hey, folks.

There's absolutely nothing to worry about. Of course, there are things to be concerned about and think about. And you can drive to a much bigger question of AI helping us understand what it means to be human. That's maybe another podcast.

But, it's a scary podcast.

Or it could be a very encouraging one, actually.

Yeah. Okay.

But hold that thought. I don't think AI is gonna come along and replace people anytime soon.

A more likely scenario is to be replaced by somebody who knows how to leverage AI tools.

To use my horribly overused analogy, you know, the movie Aliens, you know, Sigourney Weaver at the end in the big loading-dock exoskeleton enabling her to defeat the super scary hybrid human alien. I think AI is more like that suit, not like the alien.

Yeah. It's what can give you superpowers, you know, if we go MCU on this conversation.

Yeah. Yeah. That makes sense. I've always looked at these technologies as augmenting an engineer, adding to, not taking away, certainly not taking away, and not replacing.

So, you know, there may be a day when we have a Commander Data-like AI with a positronic brain. I like to allude to Star Trek a lot, Scott, just so you know. If you come back as a returning guest, which I hope you do, we'll be discussing more and more Star Trek as we talk more and more about AI. This is not a problem. That's cool stuff.

And I think that a lot of people look at AI and other technology in general in the light of fiction. Like, literally, they say, oh, well, I'm scared because I saw those movies. I'm like, you know that that's Hollywood. Right? That's not real.

You don't need to have that fear. Like, you gotta detach those two things. But if that's all you know, that's all you know. I mean, what else does popular culture have? And not everybody's out there with a computer science degree, learning n-gram models and things like that.

But what I'd like to do is talk to you about something specific. Yes. We could definitely talk about why automation isn't where it is. Or where we think it should be.

Excuse me. But I wanna talk to you about an idea that you had, a concept called total network operations, and how that ties into automation, of course, how that ties into this drive towards programmability, programmable networks, maybe ties into AI. I don't know. But, of course, how networking as an industry and as a practice is changing.

So can you give me a quick definition? What is total network operations? I mean, I know what network operations is, I think.

And I know what the word total means, but what does that mean combined together?

So to sum it up, as a result of work that we've done in the last year, very focused on automation, I kinda zoomed out and took a broader view of, okay, automation is one discipline or set of disciplines that we need in operating networks. There's a lot of change happening in the tools coming to bear for automation. The way I put this together: point one and point two. Point one: zoom out of the silo and look at the system as a whole, really engaging systems thinking. How does automation need to work with visibility?

With multi-cloud networking, with collaboration tools, all operating on network infrastructure. And that's not a complete list of all the disciplines that we need to worry about in operations. But have a systems view: point number one. Point number two: things are changing so fast, technologically.

You need to allocate and budget time to proactively investigate new tech and tools that you're gonna use from an operations perspective.

You know, not just, you know, what new version of segment routing am I gonna turn on in the network? You know, what's the new tech and tools coming online that will actually help me operate? Again, all those silos, now with permeable membranes, as a whole system.

Okay. So then when you say network operations, and then you use the term system, are you talking about this entire system that we call the network? And, of course, I know there's many components changing all that, and not necessarily the entire technology stack. Am I right? Because you are saying network operations.

So, Phil, that's an excellent question, and no, I did not stage this question with you. I do believe there's a zoom out here where you can apply the same principle to viewing network and storage and compute and applications as a system as well.

So I think the principle applies as you zoom out to the entire IT tech stack. But, you know, how do you eat an elephant? One bite at a time.

Sure.

And because I'm really grounded more in networking than any of those other, you know, technology areas, that's where I'm gonna try and make an impact on that. Yeah.

And, I mean, you know who I work for and what I do for a living. I work in observability, and we take the approach, again, of looking at the network as a system, or, you know, the system of the network, the substrate that all applications are delivered on. And then we use the term network observability to mean the same thing. We look at the system holistically, and the system is comprised of a lot of different components and also network-adjacent components like, like, DNS, for example. Right?

Yep. But I gotta say, the reality is, when you look at what's going on in the network, the delivery mechanism for the application or for the data or whatever, ultimately, we don't care that much about it. We care about the human being or the one computer getting access to the data on the other side. And most of the applications are delivered via network. I'm looking at my computer, and of the applications that I use regularly, I don't know, eighty percent are delivered over a network from somewhere else. So to me, we say we don't care about the network, but the stuff that we care about is completely reliant on the network as far as reliability, performance.

Yep. All of that.

Yep.

You can even, I've found, infer what's going on outside the bounds of the quote unquote network by looking at the network activity, simply because all of that application data is going over the network. So you see the delays are not happening here between these hops. There's no latency.

There's no jitter, whatever. And we can see that the actual time that we got our response from that server all the way on the other end, inbound into my switch, was at this time. So we can see the delays clearly happening on the server side. What's happening there, we don't know.
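To make that inference concrete, here is a minimal Python sketch. It is an illustration only and not something from the episode: the `Transaction` fields, the `split_delay` helper, and the sample timings are all hypothetical, and it assumes the network round-trip time has been measured separately (for example, from the TCP handshake).

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    """One request/response pair as seen from the client side (hypothetical fields)."""
    request_sent_s: float       # time the request left the client
    response_received_s: float  # time the response arrived back
    network_rtt_s: float        # round-trip time measured on the network path


def split_delay(t: Transaction) -> dict:
    """Attribute total response time to the network path vs. the far-end server.

    Whatever is not explained by the network round trip is inferred to be
    time spent on the server side, which the network cannot see directly.
    """
    total = t.response_received_s - t.request_sent_s
    network = min(t.network_rtt_s, total)
    server = max(total - t.network_rtt_s, 0.0)
    return {"total_s": total, "network_s": network, "server_s": server}


if __name__ == "__main__":
    # Made-up numbers: a 250 ms response over a path with a 20 ms round trip.
    sample = Transaction(request_sent_s=0.0, response_received_s=0.250, network_rtt_s=0.020)
    for name, value in split_delay(sample).items():
        print(f"{name}: {value * 1000:.0f} ms")
```

In this made-up case nearly all of the delay lands in `server_s`, which is the "it's not the network" conclusion, reached using only measurements taken on the network.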

So you can actually infer quite a bit. And that's the whole point of observability: determining the health of an entire system by looking at its components and its outputs. And so when you say total network operations, is that where we're headed? I'm talking about observability, of course, determining the health of the system. You're talking about going even beyond that, not just understanding its health, which is kinda visibility on steroids. Right?

Yep.

But then also in its daily operations with actual, of course, understanding what's going on determining its health, and then maybe pushing config, making adjustments, in order to then again ensure performant reliable delivery mechanism. Right? So that's where the whole part and parcel comes in.

So you asked the question, is that where things are going? My response would be, I'm not so sure, but I think they should.

Okay.

So I'm just one individual. And I haven't been in an active operations role for some time. I'm putting these thoughts together, calling it a framework, and bringing it out here to get some feedback on. Right?

I would love to get more people involved in a conversation, whether that's: Scott, this is crazy. This is already being done, and you shouldn't waste your time. Okay.

Great. Thank you. That's great information. Or the other: you know, no, I haven't thought of it this way before, and I'm very focused on the optical piece and optical interaction with IP.

I'd love to help there. Or I'm a network firewall person. I would love to bring that element into the conversation. I'm looking for other people that wanna collaborate and kinda flesh this out if we see utility in it.

So when I started my career, in tech that is, specifically in tech, I was working a help desk. Then eventually, I got into more of, like, a systems administrator role. And I had to touch everything because my customers were SMB.

Exactly.

Yep. It was kinda like total network operations. Sure. I had to I had no choice.

Yep.

Now as I got into more sophisticated engineering, that was very difficult to do, simply because the depth of knowledge and expertise required for each domain was such that it was very difficult to just be an expert in all those areas. So I think what might happen is that as you have complexity and scale, you have almost no choice but to silo your skill sets and your different operational practices.

You're calling for a reversal of that. How does a human being do that? How do I become an expert in every single operational domain so that I have this holistic view that you were talking about?

So, good point for clarification. Right? I'm not implying everyone needs to know everything. We all have our human cognitive limits, right, and people are gonna specialize. Right? But there needs to be someone, and I would say there needs to be a function, that intentionally understands the end to end, how those things fit together. And one of the ways I've thought about this is maybe we create the role of operations architect.

Somebody who's not just creating diagrams in Visio or not just talking at conferences, but who has had some, you know, real hands-on time in at least some of those disciplines, has got to the point, and has enough of a systems thinking perspective, that they can help organize. Okay. How do we play Moneyball? Right?

I'm not a baseball guy, but I love that movie. Right? I don't need to replace person A with person B exactly. I need to figure out what the mix of skills is on the team, regardless of where those skills might sit. That's more of what I'm trying to imply here.

Alright. So, again, seems like what you're trying to imply is implementing an operational framework that governs all of these domains, whether it be network, network adjacent, whatever. Correct. And so maybe there's an individual or a small team that has a more than cursory, but not necessarily expert level knowledge in these domains, but really the focus on how they interact with each other.

Correct. And so the framework is not a person necessarily. It's, again, an operational mechanism that all of what we used to keep in very disparate, distinct silos will then operate under. What's the result?

You think that's just gonna make things better as far as, like, people always say, like, oh, breaking down silos. Like, well, who cares? Everything's working. I checked my Gmail just now.

It worked just fine. You know?

Do we really need to break down all these silos? What's the result? What are you looking to do?

Make life better for people who are operators.

A good big deal. Yeah.

But also bring higher availability and resiliency at lower cost to network operators, whether it's an enterprise network, a carrier network, a cloud network, or something else. There's almost always room for improvement. Right? And when's the last time you ran into a very well rested operations person? Okay.

It's a good point.

You know, I feel pretty confident in making that statement. Right? Yep. You know, I made an allusion earlier in the conversation to some things from early in my career, actually my academic career, kinda coming back around thirty-plus years later.

At the Denver NAF event, one of our keynotes, John Willis, who comes out of the DevOps arena primarily but has done a lot of work applying those concepts to networking.

He's where I got the challenge to really start thinking outside of your own silo. And so the way I took that was Okay. I've been really hyper focused on automation for the last twelve to eighteen months. What else should we be stitching together here?

He actually referenced a statistical process control luminary from the nineteen forties and fifties through the eighties named W. Edwards Deming, D-E-M-I-N-G. And Dr. Deming was someone who, post World War Two, really worked to help improve manufacturing quality in the United States.

Didn't get a lot of traction here. He actually went to Japan.

And helped Japan accelerate and really grow their manufacturing, and in particular automotive businesses, and kind of leapfrog quality in the US.

And it was really eye opening. This is a guy I studied as an undergrad industrial engineer.

Never ever thought about applying any of his concepts to IT disciplines, let alone networking. You know, my eyes were kind of opened, and I went and read Willis's book, Deming's Journey to Profound Knowledge, which sounds hoity-toity. It's actually a really good book, if you're grounded in the manufacturing world. And then, as a follow-on, he recommended a book called The Phoenix Project.

Have you ever heard of that?

Oh, yeah. Okay. I remember reading that.

I'm in the middle of it right now.

Actually Are you?

Okay.

So there's a lot of talk of process and observing bottlenecks, with work in process, or WIP, being a killer for any system. If you have a function or a workstation where WIP is piling up, inventory is piling up, that's the weak point. That's the constraint for the whole system.
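The WIP idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not something from the episode or from either book: the three stage names, their per-tick capacities, and the arrival rate are hypothetical, and the model is a simple serial pipeline in which each stage finishes at most its capacity per tick.

```python
def simulate(capacities, arrivals_per_tick, ticks):
    """Return the work-in-process queue at each stage after `ticks` steps.

    capacities[i] is how many items stage i can finish per tick. Finished
    items move straight to the next stage; the last stage's output leaves.
    """
    queues = [0] * len(capacities)
    for _ in range(ticks):
        queues[0] += arrivals_per_tick
        for i, cap in enumerate(capacities):
            done = min(queues[i], cap)
            queues[i] -= done
            if i + 1 < len(queues):
                queues[i + 1] += done
    return queues


def constraint(stage_names, queues):
    """The stage with the most WIP piled up in front of it."""
    return stage_names[queues.index(max(queues))]


if __name__ == "__main__":
    # Hypothetical operations pipeline and capacities (items per tick).
    stages = ["intake", "change review", "deploy"]
    wip = simulate(capacities=[5, 2, 4], arrivals_per_tick=4, ticks=10)
    print(dict(zip(stages, wip)))
    print("constraint:", constraint(stages, wip))
```

With these made-up numbers, work piles up only in front of the slowest stage, and adding capacity anywhere else would not change the throughput of the whole pipeline.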

Right. Yeah.

We can think about that in terms of lots of network operations or other operations roles where there are people in certain functions that just get consistently slammed. How do we think about flow control? I'm not talking about TCP windowing.

Right. Right. Yeah.

Across the whole system. Right? Not just with this Python script or not just with these ethernet interface errors or not just with, you know, my Slack integration not working.

Some of the things you said remind me of when I learned about lean and when I learned about control theory years and years ago, and how that's basically the fundamental idea behind observability.

So the same idea. Again, looking at the many components that exist within a system, and how they are operating and how, you know, a component could cause the health of the entire system to degrade. Which is what you're talking about. Right?

Exactly.

Yep. How there is a bottleneck in one area, and it could be in an operational context. So there's some config that's pushed, or some, you know, script that's running, or some activity. In this sense, we're talking about the network. We're in this conversation.

We're talking about the network, but the thing is that the network is just so much. So I really wanna make sure we understand, as far as the audience is concerned, that when we're talking about the network, think about everything involved with delivering an application to you, right down to your phone or the computer that you're sitting in front of. It's a tremendous variety of devices, and devices that a lot of people don't associate with the network. Many of them are, you know, in the cloud. You don't own them. You can't really do much with them, except just trust that they work.

And again, going back to what I do for a living, then understanding that I always use it at tournaments, like, the difference between seeing and understanding. Yeah. I can see more stuff on a graph. But how does this particular component relate to the performance of this component.

Why is it that when I swing BGP from data center A to data center B to do some, like, you know, routine work, and I wanna swing my traffic over, why is it that all my DNS resolution times, like, go in the toilet? That makes no sense. Like, what's going on? So understanding how components fit together and where there's a bottleneck, like you said, ultimately, to me, will then help me determine why applications are performing the way they are, which is the whole point of this thing.
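The BGP and DNS question above is, at its simplest, a before/after comparison around a change event. A minimal sketch of that idea, assuming DNS resolution times are already collected as timestamped samples; the sample data, the event time, and the 2x threshold are all hypothetical and only there for illustration:

```python
from statistics import median


def split_by_event(samples, event_time):
    """Split (timestamp, value) samples into values before and after an event."""
    before = [value for ts, value in samples if ts < event_time]
    after = [value for ts, value in samples if ts >= event_time]
    return before, after


def degradation_ratio(samples, event_time):
    """Return median(after) / median(before), or None if it cannot be computed."""
    before, after = split_by_event(samples, event_time)
    if not before or not after:
        return None
    baseline = median(before)
    if baseline == 0:
        return None
    return median(after) / baseline


# Hypothetical data: (seconds since start, DNS resolution time in ms).
dns_ms = [(0, 20), (60, 22), (120, 21), (180, 95), (240, 110), (300, 102)]
bgp_swing_at = 150  # hypothetical moment traffic was swung to data center B

ratio = degradation_ratio(dns_ms, bgp_swing_at)
if ratio is not None and ratio > 2:  # arbitrary illustrative threshold
    print(f"DNS resolution got {ratio:.1f}x slower after the BGP change")
```

This only shows that two things moved together in time. Explaining why they did still takes knowledge of how the components fit together, which is the point being made in the conversation.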

Some of those applications are mundane. Right? You're using Office 365 to write out your shopping list. Or, you know, like the 911 services in your city, you know, running over an IP network. So they could be mundane, or the operations, literal, like, people operations in your hospital.

They could be very, very mission critical.

Yep.

So that's how I see, like, the result here. Yes. I agree with you, making my life better as an engineer, right, or whoever's out there still doing engineering. Not me anymore. Making life better, making your job more efficient and things like that. But ultimately, it's to make applications more performant, more reliable, and for this system to work better. Because we do rely on it.

It's life and death in a lot, absolutely.

Yeah. You might disagree with me here. I got two things I wanna say. I'm gonna play kind of the devil's advocate. I hate that term, by the way, but I don't know another term.

One, do you remember when folks were advocating for network automation, twenty fifteen, twenty fourteen? And they were saying, it's a good way to you can automate away the mundane tasks so you have more time to do more interesting work. And I was always like, what? You're gonna help me, like, do this stuff better so I can go do other work that I don't wanna do?

Like, what? That's not a compelling reason to me for network automation or for programmability or for what we're talking about now, total network operations. What's more compelling is making the system work better. So, yeah, my life is better, true, because there are fewer faults and the end users, in our case, right, our customers, quote unquote, are happier.

Sure.

That was never a real compelling thing. The second thing you also might disagree with me on is, I don't know. Is networking really changing that much? You go talk to folks in operations, and they're like, yeah, I had to configure, you know, my root bridge today.

Oh, yeah. I had to configure. I had to troubleshoot this thing, and I was, you know, doing a MAC trace on Cisco switches. Okay.

Well, like, yeah, I set up OSPF. I left the defaults because it's easy and blah blah blah. And you talk to people, and you're like, yeah, you know, granted maybe it's an SD-WAN box instead of, like, a, you know, ISR or ASR.

Right?

I don't know. I wonder. Is it really that different?

Let's talk about both of those, and briefly on your first point: automating away mundane tasks so I can do more interesting work, not being a compelling reason. And I actually would agree with you on that. By itself, that's not enough of a reason to engage in structural wholesale change with automation, but there doesn't have to be mutual exclusion here.

Sure. Okay.

People can get benefit like we just talked about there. And also make the entire system work better.

Mhmm.

It's kind of a meta point from Deming, where in his process analysis, he very intentionally took responsibility off individual performance and put it on the design of the system and associated processes.

Now I'm not sure I'm fully on board with him because individual performance does matter.

But you really do need to ensure that your processes are designed well and functioning well. And I would say, in my experience in three decades of network engineering, man, it was the Wild Wild West for a long time, and it was so cool. Your pedigree didn't matter. You know, piece of paper, no piece of paper. If you could configure that AGS+ or insert favorite network device name here, if you could make it work, you're in. And on one hand, that's awesome.

On the other hand, it does kind of inspire hero syndrome.

And really making people like to be the hero all the time. The technology has moved so fast that we haven't put as much time and effort, I think, into designing process and making sure that's working well. Some shops have done a much better job of it than others.

I don't wanna make it sound like a blanket indictment. But there's room for more focus there. Remind me what your second point was. I'm sorry.

No. My second point was really all about whether the networking industry has really changed that much. We're still passing packets. And now I have SD-WAN instead of an ISR. Sure, I'm using the cloud, but it's still, like, routers at AWS and switches, isn't it?

So there have been some other interesting podcasts and other information out there addressing that point. There was a great podcast late last year from Packet Pushers between Greg Ferro and Johna, I cannot remember her last name, but they often do, I think, the Heavy Networking podcast together. And they ask the very same question. And I would highly recommend your listeners go check that out.

I can't remember the exact episode right now. But the answers were basically yes and no. Is networking really changing? Greg said, no.

It's not really changing. Johna said, yes. It absolutely is. And so let's peel that apart a little bit.

If it's not really changing that much, you know, and you got a compelling case, therefore, we're still pushing packets and frames, etcetera. What an opportunity to catch up and operate it better. If the technology isn't really moving that fast right now, let's kinda catch up on the operational side.

Good point.

And bring that in line. Mhmm. So I think there is an element of truth to that thread. The other side is, well, AI and AI-enabled tools are bringing some interesting things both to changing the network infrastructure and the way we can operate on the network infrastructure.

And I would say the emphasis on GPU clusters for AI training and operations is gonna be a pretty promising growth area for folks that wanna be involved in that networking for probably the next five years or longer.

Mhmm.

Doesn't mean it's necessarily super complex.

You see the emphasis in the last year on how Broadcom is trying to do more here, play a role, but other network equipment vendors too, really, really focusing on that use case. That's, yeah, that's interesting. But then to go back to, you know, Sigourney Weaver, Ripley, in Aliens. Now I've got new tools that are empowered by this new technology to get over my human cognitive load limits.

Right? You know, we had a little snippet of conversation earlier. We're not asking everybody to understand everything, but these tools embedded in products exactly like yours, to do data reduction, to connect dots that a human may not see readily. And to be able to get more meaningful information out of lots and lots of data. I'll pause and breathe there. Sorry for the rant.

No. Not a rant. In fact, I have a kind of a roadshow presentation called finding meaning in the data, which is just that. Using these tools allows us to, you know, analyze data faster, more efficiently, and be able to find insight in ways that we just weren't able to.

Unless you got a whole team of, like, PhDs from MIT. And even then, you're still waiting for people to do the work. So being able to do that faster because of the complexity of networking, and I was just arguing that it hasn't really changed that much. It's not that I think that networking hasn't changed.

I mean, it's gotten complex in the amount of devices that we use, the types of devices. Sure. And now I got these ephemeral, like, Linux networking stacks inside of containers I gotta take into account. Still TCP/IP, it's still stuff.

So that's what I mean by, like, yeah, it's more complex, but it hasn't changed at the same time. Right? I got load balancers I don't own. So I gotta get, like, this dribble of flow that the provider will give me.

I gotta incorporate that in there. So you got all this information.

And now do something meaningful with it to make the system better. So finding meaning in the data. So what would you say then are the domains or the disciplines that we're talking about within networking? Are we just talking about basically network engineers and cloud engineers? What are the domains that you're talking about under this umbrella of total network operations?

So the ones that come to mind for me, and I don't count this a comprehensive list. Right? But just the seven or eight buckets.

Mhmm. Sure. Work in progress.

You know, first of all, operating the infrastructure and still using the CLI for different flavors of network gear. That's a discipline in and of itself. Right? It's mature.

Been with us for a long time, but knowing how to, you know, set BGP next hops in IOS, in Junos, in EOS, etcetera, that matters. Right? All that chunk of stuff matters. Then there's automating it.

Right? And again, this is tip of the iceberg descriptions. Right? There's so much buried into the automation bucket.

I would put telemetry, visibility, and observability as a bucket.

The whole AIOps arena is another bucket that can include AI models embedded in tools and the chat tools that we're starting to, you know, more than play with today. ChatGPT, Bard, etcetera, different use cases for different tools there, obviously.

Multi-cloud networking is another bucket. So to your point, are we just talking about network engineers and cloud engineers? Well, this kind of puts one foot on each side. Right?

And maybe I need four or five feet if I'm gonna do my network, you know, connected to cloud a and then cloud b and then cloud c. And what does the market want for workloads to be highly mobile? Right? Multi cloud networking is definitely gonna play a key role in that.

I'm really interested in the whole digital twin discussion.

Again, some of my academic background is in modeling and simulation. And I scratched my head for the first ten years in the business. Why don't we do more modeling of networks? And the short answer to that is because we won't believe it unless we see it work in the lab, and having a model or even just VMs run on x86, or, you know, Kubernetes-based, container-based models, it's not the same as working on real hardware.

And the last bucket I'd call out here is security, in particular network security. We do a really poor job of integrating, you know, that into the bigger picture, and I think there's room for better there. Lots of other things we could talk about there. Again, what would collaboration tools have to do on top of this? What about optical and IP? There are other adjacent technologies I think that are all in scope here.

It really feels like you're talking about an operational framework, where there's this one person who's got this understanding, again, more than cursory, but not expert-level understanding, of how these domains operate loosely. And then really focus on how do they interact, how do they fit together. And then you get the experts in a room. So it's not that you're advocating for somebody to know all these areas and be an expert in all these areas, and it's great if you can have a good solid knowledge in all these areas. Fine. And a lot of people do. Right?

But it's really having the operational framework that combines those areas under, maybe there's some hierarchy there, to, again, ultimately, then produce a better operating system: you know, having an automation specialist, having somebody who's really familiar with the ML models and how to apply them and how to tweak an LLM to produce fewer hallucinations and things like that and how to train models in the first place, and your security engineer. I mean, so you asked the question: is this anything new? I mean, it's kinda what we've been doing for a long time, sort of, isn't it? I mean, well, back in the day, we didn't have Zoom. We'd get on, like, a Webex. Right?

Even before that, we got on an audio only bridge.

You got it. I remember those too. A conference call. Right? And you'd have the twelve people responsible for each domain and then usually a project manager that knew very little about any of them, right, and you're trying to troubleshoot a problem, figure it out or whatever it was. You are sort of advocating a more formalized, structured version of that. Right?

Correct. Yep. Again, to my preamble. Right? I'm not saying this is necessarily anything brand spanking new.

But if there's a little revival and a re-emphasis on the concepts needed, let's do that. Let's put a finer point on it. But combined with the fact that key technologies are moving very fast today, maybe that makes this different this time around.

Maybe. I'm not a hundred percent sure in that statement, but AI is certainly a candidate for that.

Sure. And ultimately, I don't think it matters if it's something that we've done for years and years and years even prior to technology existing the way it does today. It's just you're trying to create the framework and formalize it, like you said, right, putting a finer point on it so that way we can implement it in such a way where we really are doing it. It's not just, like, ad hoc random conference calls, and we're literally winging it half the time.

And then, like, me, as the engineer, as soon as I'm, like, yeah, pings work, I go back on mute. I'm not participating. And there's nobody that really knows that holistic view of how all of this stuff fits together.

We're all just sort of checking the boxes of, alright, that domain works. This domain works. This domain works. Alright.

Well, what else could be the problem, you know? Are we missing any domains? You know, where's the server guy? Yeah.

We got all the same people in the same room, but it's a different approach. You are talking about a very, very explicit framework of understanding how those things fit together. That, I think, would be a differentiator between just people getting together and collaborating over the... Right?

For sure. And I would even take that one step further.

Maybe it's a baby step further: intentionally allocating time and resources to look at the new stuff that's coming down the pipe, that almost... Oh, I see.

Almost never happens in an operations team. You know, look, there's a lot about operations that's, you know, the bottom rung on the ladder, and you know what flows downhill, and people in ops end up catching lots of stuff that slips down that way.

Yeah. Being more intentional and investing in the operations layer, I think it's a very good thing. And it's a very Deming-esque way of approaching it.

So today, it's safe to say twenty twenty four. Sometimes I don't like to say the date or the year because, you know, when the pod... but it's the beginning of twenty twenty four. You mentioned things kinda flowing downhill. What about within operations?

Is there a hierarchy of domains and disciplines there? Is it the network engineer that gets the brunt of it? But, you know, the AI/ML person, they're the rock star today. What would you say is sort of the preeminent domain or discipline that we should be focusing on?

Maybe to improve. I don't know. Maybe I'm asking the question the wrong way.

No. It is a great question. There's certainly a risk of what you're saying. Look, as people in tech, right?

We know what it's like to gravitate towards the shiny objects. Right? I wanna work on the new cool thing. And Oh, yeah.

And that's not a bad impulse. Right? I think you need strong leadership to be watching out for things like that and helping balance when imbalances like that come into play. It's a horribly generic answer.

I'm sorry. But you're, you know, okay. I'm the new AI person on the ops team. And therefore, since I'm on the most important new technology, I get to call all the shots.

It's like, well, no. You need somebody dampening that behavior. BGP dampening as a paradigm for life is not a bad thing. So, yeah, something to be watched for, for sure.

This is outside of the packets and circuits. It takes real people watching these dynamics, knowing about their people, and actually caring about them and, you know, what are we trying to accomplish for our employer? Right. Both matter.

It always comes back to people. It really does. Sounds like then this fits right in line with the work that you've been doing with specifically network automation and dare I say orchestration if we can differentiate the two. We can.

Yeah. Yeah. Again, it goes right in line with that, considering that to automate a system rather than just automate pushing, like, interface descriptions, but going into any complexity, which is probably transcending into the orchestration realm, you have no choice but to have an understanding of the entire system as a whole. Correct.

Yep.

Yeah. I mean, the way I see it is having a completely programmable infrastructure. I mean, let's go back to something very mundane. I wiggle the wrong wire, and I have a BGP adjacency go down.

Yep.

Like, from literally, physically, like, playing with a wire where, like, some twisted pair was, like, messed up. That's layer one. Right? BGP adjacency goes down.

It kicks off an entire convergence, reconvergence. Right? Some application is now being hosted out of a different data center. You know, all sorts of things are happening as a result of something that you can, like, maybe say is very... no.

Should that happen? That's a different question. Right? There should be some SRE there that's like, alright.

That's a silly thing. We should not have all that stuff cascade from, like, a wire being touched. But in any case, that's how I look at it. You have all of these individual components operating within the system that you're trying to orchestrate and make programmable.

And if any one of them kinda isn't well understood, it can affect the health of the entire system.

It almost certainly will. We love the cloud term blast radius. Right?

Oh, yeah.

Could be a small blast radius, but it could be much bigger than you think. Right?
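The wire-to-BGP-to-application cascade described above can be thought of as a walk over a dependency graph. A minimal sketch, assuming the dependencies are already known and written down; the component names and the `DEPENDENTS` map are hypothetical, and in practice building and maintaining that map is the hard part:

```python
from collections import deque

# Hypothetical dependency map: each key lists the things that depend on it.
DEPENDENTS = {
    "cable-7": ["bgp-session-dc-a"],
    "bgp-session-dc-a": ["routes-dc-a"],
    "routes-dc-a": ["dns-resolver", "app-frontend"],
    "dns-resolver": ["app-frontend", "voip"],
    "app-frontend": [],
    "voip": [],
}


def blast_radius(failed, dependents):
    """Return every component reachable from `failed`, excluding `failed` itself."""
    seen = {failed}
    queue = deque([failed])
    while queue:  # breadth-first walk over the dependency graph
        node = queue.popleft()
        for child in dependents.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    seen.discard(failed)
    return seen


impacted = blast_radius("cable-7", DEPENDENTS)
print(sorted(impacted))
```

In this toy map a single cable ends up touching every other component, which is the "much bigger than you think" case.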

Yeah. So to wrap up this conversation so far, Scott, what would you say are kind of the biggest takeaways as far as how one should start thinking about this entire network, this system that we call the network?

I would recommend starting in the place that I started on this. Think about the bigger picture. Think about the system. It's not hard. Right? It just takes stepping back a little bit and looking at all the piece parts associated that you have in view. That would be my first recommendation.

The second recommendation is really go out of your way to make time to investigate new tools. And all the buckets are important. Some of the more important ones, I think, are AI, automation, and multi-cloud networking.

Some technologies are more equal than others. Apologies to George Orwell. Do your homework. Right? Go back to a time when you really loved learning about new stuff in tech.

Trying to revitalize that. And then the last thing I would just say is fear not. Just try. Just start small.

Start playing with ChatGPT. You know, whether it's for genning up configs, sorry, even using that language dates me, or asking it, hey, what else should I be thinking about in this problem? As a tool for what else can I go do my own human-centered investigating on? Don't wait until you get the perfect idea. Start playing with tools now.

Those are good recommendations, and I'll just piggyback off the last one. Something that, when I was a teacher, I taught my students. I taught, you know, basically the equivalent of ICND1 to... And I'm like, listen. Just start.

You have this idea of like, oh, but there's this CCIE out there. It's like, fine. And maybe one day you'll get there. Just start with what's in front of you.

Exactly.

With network automation, when I brought that to leadership at a pharmaceutical company I worked for, they kept looking at the end. And they were like, well, we don't wanna do... I'm like, no. No. No.

No. How about we just start with, you know, IDF two down the hall and gathering information. So we're not even pushing config. We're just making life a little easier there.
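Starting with information gathering and no config pushes might look something like the following. This is a minimal sketch, not a real collector: the hostnames, the facts, and the `fake_fetch` function are all hypothetical stand-ins for whatever read-only method a given shop already has.

```python
def gather_inventory(hosts, fetch_facts):
    """Run a read-only fetch per host; record failures instead of stopping."""
    inventory, errors = {}, {}
    for host in hosts:
        try:
            inventory[host] = fetch_facts(host)
        except Exception as exc:  # one unreachable device should not end the run
            errors[host] = repr(exc)
    return inventory, errors


# Hypothetical stand-in data for one wiring closet ("IDF two").
FAKE_FACTS = {
    "idf2-sw1": {"model": "example-48p", "os_version": "1.2.3"},
    "idf2-sw2": {"model": "example-24p", "os_version": "1.2.1"},
}


def fake_fetch(host):
    """Stand-in for a real read-only collector; raises KeyError if unknown."""
    return FAKE_FACTS[host]


inventory, errors = gather_inventory(["idf2-sw1", "idf2-sw2", "idf2-sw9"], fake_fetch)
print(f"collected {len(inventory)} devices, {len(errors)} failed")
```

Nothing here changes device state, which is what makes it a low-risk first step.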

And just taking those first few steps, you can apply that to really anything in life. Right? You want a new exercise routine. Yep.

Well, you know, you start to, like, look at some bodybuilder's routine. How about you just start? Get in the gym for five minutes. And then you build.

And I think that's great advice, Scott, that last one. Just don't be so scared. Just start somewhere, and you can build from there. So really great talking to you today, Scott.

I really appreciate it. This is something that we can explore, certainly, especially because we've been talking about culture and people so much. That's kind of a never-ending discussion.

For sure.

And, certainly, I'd love to talk to you more in-depth about what is going on with the Network Automation Forum, with future upcoming AutoCons, but more specifically your take on what's going on in the industry with regard to network automation and programmability.

So, for anyone that would like to reach out to you, to ask a question, maybe to disagree with you about something. How can they find you online?

LinkedIn is the best way to get ahold of me. I'm Robohn. It's like putting a robe on, and don't let the H fool you: R-O-B-O-H-N. And I'm Scott Robohn on Twitter as well, or Twitter/X or whatever we're calling it.

Very good.

I'm still finding good value there. I've got a pretty curated set of accounts that I watch. So I'm there. I'm trying the other things too, but I haven't found anything that still provides the value that X does.

Absolutely. Absolutely. I'm experiencing the same. And you could find me still on Twitter. I'm active there at network_phil.

You could search my name, Philip Gervasi, on LinkedIn, also very active there, and my blog, networkphil.com. I also encourage you to go over to networkautomation.forum to learn more about the Network Automation Forum, of which Scott is co-founder. There are a couple of future events that you could check out as well. There's AutoCon 1 at the end of May and then AutoCon 2 in November of this year. Now if you are interested in being a guest on Telemetry Now or if you have an idea for an episode, I'd love to hear from you.

Reach out at telemetrynow@kentik.com. So for now, thanks very much for listening. Bye bye.

About Telemetry Now

Do you dread forgetting to use the “add” command on a trunk port? Do you grit your teeth when the coffee maker isn't working, and everyone says, “It’s the network’s fault?” Do you like to blame DNS for everything because you know deep down, in the bottom of your heart, it probably is DNS? Well, you're in the right place! Telemetry Now is the podcast for you! Tune in and let the packets wash over you as host Phil Gervasi and his expert guests talk networking, network engineering and related careers, emerging technologies, and more.