I’ve interviewed many network engineering candidates over the last few years, and one theme that keeps coming up is a total lack of understanding of what I consider to be “the basics.”
Do younger engineers really have no in-depth knowledge?
The Basics
First let me say that I am conscious that there may be an implicit bias here in terms of what these “basics” would be, based on my own experience. Maybe my idea of what’s a basic fact is just out of whack with reality; you can tell me so in the comments if so. But let me share a couple of examples of what I’m talking about.
How ARP Works
If two IP hosts want to communicate with each other over Ethernet, how do they obtain the layer 2 addresses necessary to populate the Ethernet frames correctly if they’re on the same IP subnet? What about if they’re on different IP subnets? To what address is the ARP sent (at layer 2), and who answers it?
Even in the last few months, I’ve seen problems caused by ARP after network failovers. If you don’t understand ARP, how can you reasonably troubleshoot an IP network?
I have run out of fingers and toes on which to count the number of times I have been told that the ARP packet is forwarded to the destination host, and the destination host responds with its MAC address. When I extend the question so that the ‘remote’ host is google.com, only about 25% of those who took this stance stop and say “Uh, wait, that can’t be right.”
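The point the question is probing can be sketched in a few lines of Python. This is an illustrative model rather than any real IP stack's code; the function name `arp_target` and the example addresses are made up for the sketch. A host only ever ARPs for an address on its own subnet: the destination itself if it is local, otherwise the default gateway. Either way the request is sent to the Ethernet broadcast address, and the owner of the requested IP is the one that replies.

```python
import ipaddress

# ARP requests are sent to the layer 2 broadcast address.
BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


def arp_target(src_interface: str, dst_ip: str, default_gateway: str) -> str:
    """Return the IP address a host would ARP for in order to reach dst_ip.

    src_interface is the host's own address with prefix length,
    for example "192.0.2.10/24" (hypothetical example values).
    """
    local_net = ipaddress.ip_interface(src_interface).network
    if ipaddress.ip_address(dst_ip) in local_net:
        # Same subnet: resolve the destination's own MAC address.
        return dst_ip
    # Different subnet: resolve the gateway's MAC address instead.
    # The ARP request never leaves the local broadcast domain.
    return default_gateway


if __name__ == "__main__":
    print(arp_target("192.0.2.10/24", "192.0.2.20", "192.0.2.1"))
    print(arp_target("192.0.2.10/24", "198.51.100.7", "192.0.2.1"))
```

In the second call the remote host could just as well be google.com: the frame is addressed to the gateway's MAC, and the remote host never sees an ARP request from us.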
ARP Table vs CAM Table
Here are some terms that are often thrown into conversation and used interchangeably as the situation merits. And I don’t mind for most of them, but ARP Table?
- ARP Table
- CAM Table
- MAC Address Table
- Switching Table
- Bridging Table
I have been assured that switches maintain ARP tables and answer on behalf of devices on the same subnet. The ARP table is apparently populated by devices broadcasting their ARP and the switch hears it and adds the IP to its lookup table. It also tells the router all about it so the router will know too. By this point in the explanation I’m tightening the noose around my neck so I lose track slightly.
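For contrast, here is a rough sketch of what the two tables actually hold. It assumes a plain layer 2 switch with no VLANs or aging timers, and the class name `ToySwitch` and all addresses are invented for the example. The switch learns from the source MAC of every frame it receives and maps MAC to port; it never looks at IP addresses. The ARP table lives on IP hosts and routers and maps IP to MAC.

```python
BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


class ToySwitch:
    """Toy layer 2 switch: learns source MAC -> port and knows nothing about IP."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # MAC address -> port (the CAM / MAC address table)

    def receive(self, in_port, src_mac, dst_mac):
        """Learn the sender's location and return the list of output ports."""
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Broadcast or unknown unicast: flood out of every other port.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            # Destination is on the port the frame arrived on: nothing to forward.
            return []
        return [out_port]


# An ARP table, by contrast, is kept by an IP host or router (IP -> MAC).
host_arp_table = {"192.0.2.20": "00:00:5e:00:53:20"}

if __name__ == "__main__":
    sw = ToySwitch(ports=[1, 2, 3])
    # Host A's ARP request is a broadcast, so the switch floods it...
    print(sw.receive(1, "00:00:5e:00:53:10", BROADCAST_MAC))
    # ...and host B's unicast reply is forwarded only to A's port.
    print(sw.receive(2, "00:00:5e:00:53:20", "00:00:5e:00:53:10"))
    print(sw.mac_table)
```

Note that the switch ends up knowing where both MAC addresses live without ever having stored an IP address or answered an ARP request itself.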
Rote Memorization
One of the problems with computer-based testing is that it rewards people who can memorize definitions and facts by rote, rather than proving that they necessarily understand the technology itself.
In “Surely You’re Joking, Mr. Feynman!” physicist Richard Feynman recounts teaching a physics class at a Brazilian university (it’s long, but stick with it):
I discovered a very strange phenomenon: I could ask a question, which the students would answer immediately. But the next time I would ask the question – the same subject, and the same question, as far as I could tell – they couldn’t answer it at all! For instance, one time I was talking about polarized light, and I gave them all some strips of polaroid.
Polaroid passes only light whose electric vector is in a certain direction, so I explained how you could tell which way the light is polarized from whether the polaroid is dark or light.
We first took two strips of polaroid and rotated them until they let the most light through. From doing that we could tell that the two strips were now admitting light polarized in the same direction – what passed through one piece of polaroid could also pass through the other. But then I asked them how one could tell the absolute direction of polarization, for a single piece of polaroid.
They hadn’t any idea.
I knew this took a certain amount of ingenuity, so I gave them a hint: “Look at the light reflected from the bay outside.”
Nobody said anything.
Then I said, “Have you ever heard of Brewster’s Angle?”
“Yes, sir! Brewster’s Angle is the angle at which light reflected from a medium with an index of refraction is completely polarized.”
“And which way is the light polarized when it’s reflected?”
“The light is polarized perpendicular to the plane of reflection, sir.” Even now, I have to think about it; they knew it cold! They even knew the tangent of the angle equals the index!
I said, “Well?”
Still nothing. They had just told me that light reflected from a medium with an index, such as the bay outside, was polarized; they had even told me which way it was polarized.
I said, “Look at the bay outside, through the polaroid. Now turn the polaroid.”
“Ooh, it’s polarized!” they said.
After a lot of investigation, I finally figured out that the students had memorized everything, but they didn’t know what anything meant. When they heard “light that is reflected from a medium with an index,” they didn’t know that it meant a material such as water. They didn’t know that the “direction of the light” is the direction in which you see something when you’re looking at it, and so on. Everything was entirely memorized, yet nothing had been translated into meaningful words. So if I asked, “What is Brewster’s Angle?” I’m going into the computer with the right keywords. But if I say, “Look at the water,” nothing happens – they don’t have anything under “Look at the water”!
This sums up with depressing accuracy the state of so many technical interviews. I might ask somebody to describe Spanning Tree Protocol, and they will give a definition of which Radia Perlman would have been proud. Ask them what happens when I connect two links between two switches without using some kind of port aggregation protocol and they have no idea.
Exceptions
Don’t get me wrong, I do also speak to some people who really do understand these things. But I know I’m not alone in having found that the two examples I gave above are depressingly common in candidates interviewing for quite senior roles.
Why Is This?
Why is this happening? People are exiting high school with an understanding of IP that used to be the exclusive realm of the consultant. And perhaps that’s the problem; the basics aren’t being taught, they’re being learned along the way. Then when these young whippersnappers move on to college and beyond, perhaps nobody goes back and makes sure that they actually understand this “IP” thing they seem to know about already beyond typing IP addresses and default gateways into an operating system. Am I being unfair?
Maybe these candidates are victims of abstraction. IP tends to work pretty well, and while it does, who needs to know anything about how it really works? I mean, I don’t know how my TV really works in detail – I turn it on and it works (most of the time). Similarly, when I code in Perl, I’m abstracted from the underlying machine code necessary to drive the CPUs, and I wouldn’t have the first clue where to start to optimize code to work for a particular processor (assuming I even could with Perl!). Are we all in our own way so focused on the higher level capabilities that we take the underlying functionality for granted?
Am I wrong to expect a senior network engineer to understand these things? Or should I roll with it and accept that it’s the higher level protocols that are more important? I’d love to hear your opinions.
If you don’t build the foundations the structure would eventually collapse, right?
Unfortunately, as the foundations of networks become more and more abstracted from the operator’s or network admin’s mind through fancy software, GUIs and so on, things like how ARP works or how a frame is tagged will become lost.
Back in my CCNA days I learned what ARP was, what a subnet was and that was a fact, just the way it was, simple. It was a struggle for me to grasp in the beginning and the way I figured these things out over time was to sniff the traffic. How DOES ARP actually work, how DOES a frame come into a switch and where inside it does the tag live etc. Maybe people just don’t put in the effort to see what actually happens in the network and just assume it is the way it is written.
I agree, Keith. As a (perhaps important) side note, I don’t expect people to remember nauseating detail about frame formats and the like; that’s stuff you can look up. But in the example of 802.1Q tagging that you mention, I’d at least like somebody to understand roughly how the tag is added to the frame, and what happens to untagged frames, for example.
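The “roughly how” fits in a short sketch. It assumes a frame held as raw bytes without its FCS, and the function names `add_dot1q_tag` and `vlan_of` are invented for illustration. The 4-byte 802.1Q tag (TPID 0x8100 followed by a 16-bit field carrying the priority, the DEI bit and the 12-bit VLAN ID) is inserted straight after the source MAC address, pushing the original EtherType along. A frame that arrives without a tag is treated as belonging to the receiving port's native VLAN.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def add_dot1q_tag(frame: bytes, vlan_id: int, priority: int = 0) -> bytes:
    """Insert an 802.1Q tag after the destination and source MACs (12 bytes)."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("VLAN ID must be in the range 1-4094")
    if not 0 <= priority <= 7:
        raise ValueError("priority must be in the range 0-7")
    # Tag control information: 3 bits priority, 1 bit DEI (left at 0), 12 bits VLAN ID.
    tci = (priority << 13) | vlan_id
    tag = struct.pack("!HH", TPID_8021Q, tci)
    # A real device would also recalculate the FCS; this sketch ignores it.
    return frame[:12] + tag + frame[12:]


def vlan_of(frame: bytes, native_vlan: int) -> int:
    """Return the VLAN a frame belongs to on a port with the given native VLAN."""
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype == TPID_8021Q:
        (tci,) = struct.unpack("!H", frame[14:16])
        return tci & 0x0FFF
    # Untagged frame: it is placed in the port's native VLAN.
    return native_vlan


if __name__ == "__main__":
    dst = bytes.fromhex("00005e005320")
    src = bytes.fromhex("00005e005310")
    untagged = dst + src + struct.pack("!H", 0x0800) + b"payload"
    tagged = add_dot1q_tag(untagged, vlan_id=100)
    print(tagged[12:16].hex())             # the inserted tag
    print(vlan_of(tagged, native_vlan=1))
    print(vlan_of(untagged, native_vlan=1))
```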
Did I mention Frames vs Packets? Another common casualty of, I suspect, not really knowing the difference. Or maybe people just use the terms without meaning to imply something. I hate to hear about an “Ethernet Packet” though.
To your closing questions, I’ll give you the consultant answer: “it depends”. In the Perl example, you can be a newbie or a talented Perl hacker, but still lack the details only gained by someone who codes and compiles ‘perl’ itself (the code that actually creates the perl interpreter). However, somewhere along the line, someone has to have that knowledge otherwise Perl (perl the interpreter) would cease to exist – or at least improve.
I share your pain as I’ve interviewed some candidates recently and found the same issues. If your questioning is along the lines of the standard troubleshooting script they learned – all goes well. Throw a curve and no telling what happens. And as programming, network design and servers all become more abstracted / virtualized, the skills needed for the “basics” shift from knowing how to and what happens when you build a server and connect it to the network to how to click on the “Provision New Server” button. Those underlying mechanisms still exist but more and more people (management and even those in technology) are being lulled into believing it all just works – like your TV (most of the time).
I see in many cases today’s architect is much more big picture and the details are left to those who actually do the implementation (connection and configuration). Maybe like the difference between architect (in the old sense) and general contractor (and sub-contractors). Gone are the days of the jack-of-all-trades expert consultant – or so it would seem.
cheers.
You’re getting dangerously close to plagiarizing my next post on this topic even though it’s not posted yet 😉 Absolutely agreed – and of course there’s an element of “it depends”; no objection there!
ARP is one of those things you really learn when things break at that level. Sure, you read it in a book, maybe answer a few questions on a test, and then it starts to fade away unless you are having to troubleshoot it. Point is, you might be focusing on this as “the basics”, but quite frankly, I could see a younger Network Engineer having come up and never tripped over an ARP problem. They were more common back in the day, but now ARP bugs are extremely rare (popping up only when some new L2 redundancy technique comes along) and hardly anyone ever statically (mis-)configures them.
Is ARP really that obscure? If I ask you to find out where a server is connected based on an IP address, how do you accomplish that? To me, ARP ties together the fundamental understanding that L3 and L2 addressing don’t always match.
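One way to answer that question by hand is a two-step lookup, sketched below with made-up table contents. The function name `locate_host` and every address and port name are hypothetical, and in practice the two tables would come from the router's ARP output and the switch's MAC address table output. The first step turns the IP into a MAC address; the second turns the MAC address into a switch port.

```python
def locate_host(ip, arp_table, mac_table):
    """Map an IP address to (MAC, switch port) using two separate tables.

    arp_table: IP -> MAC, as seen on the router or a host in that subnet.
    mac_table: MAC -> port, as seen on the switch.
    Returns None if the IP has no ARP entry; the port is None if the
    switch has not learned the MAC address.
    """
    mac = arp_table.get(ip)
    if mac is None:
        return None
    return mac, mac_table.get(mac)


if __name__ == "__main__":
    # Hypothetical example data.
    router_arp = {"192.0.2.20": "00:00:5e:00:53:20"}
    switch_macs = {"00:00:5e:00:53:20": "Gi1/0/7"}
    print(locate_host("192.0.2.20", router_arp, switch_macs))
    print(locate_host("192.0.2.99", router_arp, switch_macs))
```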
I use a basic understanding of ARP to troubleshoot or trace something down literally every week; but I’m also aware that may just be me 😉
Your comment about never tripping over an ARP problem does speak to the idea of things “just working” thus we never bother looking at them. I’ve been driving a car for years, and still don’t have a real understanding of how a gasoline engine really works. When it goes wrong, I call up somebody who does. Is there a danger that at some point, once all the “people who do” have got old and retired, that there are no mechanics left? I’m straying into my follow up post here, but this is clearly a concern.
In the same way, I can build a PC by purchasing parts off the shelf and putting them together into a functioning whole (most of the time). But I’m no Steve Wozniak; if it doesn’t work, I’m screwed, and the information available to me in a format I can deal with is almost non-existent, so things exist in a binary state of “Works | Doesn’t Work”.
Perhaps if the tools available to us are sufficiently good at diagnosing problems and tracing down devices on our behalf, we don’t need to understand ARP any more.
I agree…understanding ARP is a good thing. Just pointing out why something so “basic” might be missing from someone’s knowledge.
I would say the 25% who realized there was a problem with their understanding of ARP based on your google.com followup are still potential candidates for the position. It at least shows they can diagnose their own faulty knowledge.
Every industry struggles with this problem. As things get more complex, the “basics” of any system become a smaller and smaller subset of the core training materials. Pick up a networking book from the 90s and ARP was probably a whole chapter. Today, if ARP was given such prominence that book would largely be ignored.
Indeed so.
^^^ This. That’s pretty much how the whole interview is framed. I’m as interested in how a candidate can think through a problem when new information is presented as I am in whether or not they can remember facts cold. Being able to rethink your position when somebody points out a flaw is a great sign and counts positively.
A great post and valid points.
I would even go so far as to say that I share some of your frustrations.
However, the records I hear played oh so often are (in the general networking world):
* “We need to automate”
* “You need to learn to code to evolve”
* “In five years you won’t have a job as a network engineer, you’ll be writing code”.
Why is nobody saying –
* “We need to focus on simple networking”?
Do we have ourselves and the “industry” to blame?
*grin* So I would say that:
– Automating is a very powerful thing to do, and very useful.
– Learning to code – on one level or another – certainly won’t hurt. See above.
– I’ve talked about the “5 years until you’re a coder” thing before, though somewhat tongue in cheek, reflecting thoughts being bandied around – and followed it up with what I hope was a more level headed analysis.
However in addition to those things, we can’t forget about the underlying networking. We’re so tied up in overlays and fabrics and controllers that it’s easy to forget that underneath all that, the same basic principles still apply. So I see a need for both the simple networking and the complex networking. If you try and do either without the other, you’re asking for problems.
If I didn’t know better I’d almost say you are discussing the future of careers in the industry in general with the onset of SDN, NSX, ACI and all the rest of it. Bye bye the middle tier in the next few years perhaps.
Back to your main topic, I believe quite strongly that in time this low level knowledge won’t be needed. Someone else commented that you tend to learn (or at least really understand) such things when networks break or at least bad things happen. That doesn’t really happen as much as it used to and ITIL and change control also play their part to minimise such events.
I used to know STP inside out and still do to some extent but (despite moving around a lot as a contractor) I haven’t seen or dealt with an STP related failure for well over 8 years. There’s no motivation for younger staff and/or those earlier in their careers to really get to know this stuff.
ARP Knowledge is still useful to some extent but if it gets to that point, most low level staff just call in ‘third line’/the old guy right? Back to the future; there’s a gap between the lower and higher ‘tiers’ here that will only get wider unfortunately.
“That doesn’t really happen as much as it used to and ITIL and change control also play their part to minimise such events.”
ITIL = not changing the network.
Unfortunately at the pace that some companies move, this approach is simply not feasible.
Oh I know, ITIL is the bane of my life and I’d agree its main ‘benefit’ is preventing change. Unfortunately I’ve not worked anywhere in over a decade that didn’t adhere to it.
As for those other companies, a) I’d love to work there and b) I bet any youngsters there are not as dumb as some.
Hello Grumpy Old Man =D, and all others,
Your post was a real eye-opener, and I’m thankful for it. On the question you raise, “Do younger engineers really have no in-depth knowledge?”: I don’t know your interview process at all, so I’ll answer from what I know.
In my interviews with young engineers I often have the pleasure of meeting boys and girls who don’t know a thing about networking, and I hire them. Why would I do that? Well, most of them have a clear idea of what they want to be; they know in depth what they are looking for, even if they don’t have the words to describe it (“I’m creative / entrepreneur / innovator / leader-oriented” and so on), and they try their best.
When we hire young people, the ideal candidate shouldn’t be a 23-year-old with five years’ experience and at least a CCIE certification; we should hire, and be hired, on potential, both professional and personal.
I’m still considered a young engineer, and I appreciate having leaders like you who let me know what I should and shouldn’t do. I want to be like you guys: teach young people to look under the hood of a car, because it needs more than gas and water, but applaud their potential and let them know about it.
Angel Inglese V.
Thanks, Angel. The good news is that I don’t consider certification to be the be-all and end-all. In fact I warm more to a candidate that has no certifications but good experience than one who has 100 certs and no real world experience. Additionally, I have an immediate distrust of anybody who tries to manipulate their resumé to imply non-existent certifications (e.g. “CCNP (In progress)”, “CCIE Written (pending)” and stuff like that).
When I interview, I’m asking things that the candidate may or may not know. More important than having the answer memorized, though, is to see how the candidate deals with new information when they’re given it. How quickly can they take in what they’re given and process it into something useful? I also have more respect for somebody who is honest about what they know and doesn’t try to B.S. the answer confidently, like we won’t notice. Seriously, why do that? I know the answer to the questions I ask, so the candidate isn’t going to impress me by blustering through a load of nonsense in response. It might work with their current coworkers but it surely won’t work with anybody who recognizes the smell of somebody trying to fake it. And when a candidate does that, I’m worried that this is how they’d be in the workplace; feigning understanding and knowledge rather than looking for the opportunity to learn it. And I get why people do it, I really do, but it doesn’t work in their favor. I’d rather hear what you do know, and then have an honest answer about what you don’t.
None of us knows everything, and half the battle with such a huge wealth of technologies out there is to know enough to recognize that there’s something to know about in an area, even if you aren’t actually that expert. For example, it’s like recognizing that security is important in a network design even if you don’t know how to configure a firewall yourself; but you know to include it in your design and involve the people who can.
I meant to also respond to your comment about taking new people on. It’s good to be able to take people on, mentor them, train them, and see them succeed. That’s a very different task unfortunately to hiring into a senior position where it’s necessary that the candidate walks in with a high base level of skills.
This is a culmination of two foundational problems: the “you’ll be out of a job as a network engineer” nonsense and the “certs are the way to get jobs” mentality.
The industry as a whole is dumbing down the future networking professionals by telling them that code is all they’ll need to know, that getting certified is all they need for a job and sensational marketing and online personalities are playing right into it. It comes down to the difference between knowing and understanding. Knowing a concept is easy. Understanding a concept is arguably harder, and of course, that is the one that is important.
We learned the fundamentals because we *had* to. Now, with things being easier to deploy, troubleshoot and provision, which is a good thing, those foundational pieces are being glossed over because they are obfuscated by faster hardware, cleaner features and more automation. These are all good things, but when they break, *someone* needs to know why.
Just because power windows are a standard in most cars now doesn’t mean that no one needs to understand how to make a window go up and down.
The right answer from a candidate that may not know an answer to something is “I’m not sure but I know where to look and the next time you ask me I will understand it inside and out”.
I would hire that person.
I know enough about networking to know when I need to call an expert (and to troubleshoot when network interfaces have been done incorrectly on a multi-homed machine) .. but I do know a lot about designing storage. The problems are similar, and I’m turning into more of a grumpy old man every day. The one thing I’ll say though is that almost everything useful I know I learned by fixing things that had broken. It was those times at 2.00AM in a dark cold datacenter when you think your career is on the line and must be working by 4.00AM or the excrement will hit the rotating device, when you suddenly realise “ohhhh, so, THAT’s why needs to be set up correctly”, and then you realise that the so-called-experts that have never had to fix the problem you’ve been banging your head against for the last 6 hours actually know way less than they pretend.
Or in other words, no expertise without experience … most of us grumpy old men had to fix stuff that was way less reliable than it is today, we designed or had to work with stuff in the absence of best practice documents, stuff that was sometimes pure genius, and often broke in interesting ways.
The reason for the XXX as code / or software defined yada-yada is to hide all that breakable complexity behind well defined interfaces, standardised deployments that have been designed so well that they just don’t break in the kinds of interesting ways that helped us learn our trade.
In the same way that we don’t have to pull moths out of the wires poking out of vacuum tubes, the stuff that needs troubleshooting skills will move elsewhere. The role of us grumpy old men may be to teach the next generation how to take a structured approach to fixing those new problems, and that when it’s 2.00AM in a cold datacenter, they’ve got someone they can call that can help them think the problem through and tell them not to worry, because someone has their back.