Algorithms Interviews: Theory vs. Practice

When I ask people at top tech companies why algorithmic quizzes are a staple of their hiring process, the most common response is that their scale is so massive that they can't afford to have someone accidentally write an inefficient algorithm, like one with a time complexity of O(n^2), which could bring down the entire site. What I find amusing about this is that, despite being able to solve phone-screen-level algorithmic problems on the job, I've consistently failed algorithms interviews. In fact, I've failed more than half of the roughly 40 "real" software interviews I've had, with maybe one or two exceptions.

To illustrate what I mean by "phone-screen-level algorithmic problems," let's consider a few examples. At one big company I worked for, a team implemented a resizable array that was so inefficient it caused significant performance issues. The implementation added a constant number of elements and copied the old array to a newly allocated, slightly larger array on each resize. This approach resulted in linear time resizing instead of amortized constant time resizing, a classic example often used to demonstrate amortized analysis.

In contrast, typical phone screens I've received usually involve an "easy" coding or algorithmic question, possibly preceded by a "very easy" warm-up question. Alternatively, I might be presented with a series of "very easy" coding or algorithmic questions, or even trivia, although the latter is less common for generalist roles. The resizable array implementation problem is considered "very easy" and might be used as a warm-up question or bundled with other easy questions. Yet, this inefficient implementation was responsible for roughly 1% of all garbage collection pressure across the company, as well as a significant fraction of CPU usage.

Fixing this issue made my employer more money annually than I've made in my lifetime. Another example from the same company involved converting a pair of long values to byte arrays in a core library. Someone had written a hash function that took a byte array as input, modified it to take two inputs, and left the hash function interface as (byte[], byte[]). To call this function on two longs, they used a utility library function that allocated a byte[] and stuffed a long into it, reversing the endianness of the long in the process.

I fixed this by changing the hash function interface to take a pair of longs instead of byte arrays and having the hash function perform the endianness reversal. Removing these unnecessary allocations made my employer more money annually than I've made in my lifetime. Finding a constant factor speedup isn't technically an algorithmic problem, but it's a common follow-up question in interviews. The answer often involves a simple optimization that results in a constant factor improvement.

For instance, I've been asked in interviews to store IDs as ints, but given context that the IDs are densely packed, I can store them as a bitfield instead. The difference between this bitfield interview question and the real-world example is that the existing solution is far from the expected answer, making it unlikely to be asked in an interview. Instead, I would have likely failed the interview at that point.

Another example from a different company involved configuring a search index used in Bing. The full context is complex, but essentially, there's a set of bloom filters that need to be configured. One approach is to write a black-box optimization function using gradient descent, but this always resulted in suboptimal configurations. I created a more optimized solution by observing that the fundamental operation is equivalent to multiplying probabilities together. By iterating over the space of possible configurations, I could pick out the best set of configurations.

These examples have several common properties: they could be phrased as interview questions, most people on the relevant team would get the right answer in an interview setting, the cost savings from fixing the issue are worth more annually than my lifetime earnings, and the issue persisted for a long time, suggesting it wouldn't have been discovered otherwise.

At the start of this post, we noted that people at big tech companies claim they need to conduct algorithms interviews to avoid inefficiencies at scale. However, my experience suggests that these examples are common at every company I've worked for that uses algorithms interviews. Trying to get people to solve algorithms problems on the job by asking algorithms questions in interviews doesn't seem to work.

One reason is that big companies incentivize developers to avoid deploying algorithmic reasoning to make money. Of the three solutions I mentioned, two are in production, and one isn't. This is about my normal hit rate when I present a diff to a random team without persistently following up. If I go to a random team, it's likely that efficiency is not a team objective or a company objective. The company has likely spent effort incentivizing teams to hit their objectives, making it pointless to have objectives otherwise.

Accepting my diff would require teams to test, integrate, and deploy the change, creating risk. I'm asking teams to do work and take on risk to do something that's worthless to them. Despite incentives, people will usually take the diff, but they're unlikely to spend their own time trying to find efficiency improvements. Their normal work time will be spent on things aligned with the team's objectives.

Hypothetically, if a company didn't try to ensure developers could pass algorithms quizzes but did incentivize developers to use efficient algorithms, I don't think any of the examples above could have survived undiscovered for years or remained unfixed. A developer working at a company where people profile their code would likely look at the hottest items in the profile for the most computationally intensive library at the company.

The "trick" isn't algorithms wizardry; it's just looking at the data. Incentives can fix this. The third example is less inevitable, as there isn't a standard tool to identify the problem. It would be easy to spin the result as wizardry, but the reality is that it involves applying high school math.

I worked at a company that used a different strategy: they didn't ask algorithms questions in interviews, but they did incentivize things that were good for the company. During my time there, I found only one example that nearly met the criteria for the examples above. I think the main reason is that people viewed making the company better as their job, so straightforward, high-value fixes tended not to exist. In rare instances where they did, enough people were trying to do the right thing for the company that someone else would likely fix the issue before I encountered it.

The algorithms and coding part of the company's interview was easier than the phone screen at major tech companies, and they didn't do a system design interview. For a while, they tried an algorithmic onsite interview question, but they stopped asking it because every new grad failed. They simply weren't prestigious enough to attract candidates who could easily answer those questions.

In contemporary discussions on interviews, what they did is often called "lowering the bar." However, I'm unsure why they should care about how high a bar someone can jump over when little of the job involves jumping over bars. When measured on actual productivity, that was the most productive company I've worked for. I believe the reasons for this are cultural and complex, but I think it helped that they didn't filter out good candidates with algorithms quizzes and assumed people could learn on the job if they had a culture of doing the right thing.

If other companies want people to solve interview-level algorithms problems on the job, perhaps they could try incentivizing people to solve algorithms problems when relevant. This could be done in addition to or instead of filtering for people who can whiteboard algorithms problems.

Appendix: How Did We Get Here?

In the past, interviews often involved "trivia" questions. Modern versions of these might look like questions about specific technical topics. This practice was common when Microsoft was the biggest game in town, and people who wanted to copy a successful company would likely copy Microsoft. The most widely read programming blogger at the time, Joel Spolsky, advocated for adopting certain software practices because Microsoft was doing them, and you couldn't compete without adopting the same practices.

At the time, popular lore was that Microsoft asked people questions like brainteasers, such as how to escape from a blender if you were half an inch tall or why manhole covers are round. Since I was interviewing during this era, I got asked plenty of trivia questions and brainteasers. Some other questions that were popular at the time included Fermi problems and behavioral interviews.

When I asked people why they thought brainteasers or Fermi questions were good, the convenient rationalization was usually that they tell you if a candidate can really think, unlike trivia questions, which only tell you if someone has memorized some facts. Looking back, people now realize that this wasn't effective, and cargo culting Microsoft's every decision won't make you as successful as Microsoft.

For interviewees, the process with brainteasers was basically the same as it is now with algorithms questions. You'd review a book like "How Would You Move Mount Fuji?" before interviews instead of "Cracking the Coding Interview" to pick up a bunch of brainteaser knowledge that you'll never use on the job instead of algorithms knowledge you'll never use on the job.

Back then, interviewers would learn about questions from interview prep books and ask them to candidates who learned the answers from the same books. When I talk to people who are ten years younger than me, they think this is ridiculous – those questions obviously have nothing to do with the job, and being able to answer them well is more strongly correlated with having done interview prep than being competent at the job.

Hillel Wayne has discussed how people come up with interview questions today, and it doesn't seem all that different. Outside of groups that are testing for specialized knowledge, it's not clear that the process has changed much.

At this point, we've gone through a few decades of programming interview fads, each one of which looks ridiculous in retrospect. Either we've finally found the real secret to interviewing effectively, or we're in the middle of another fad that will seem equally ridiculous to people looking back a decade or two from now.

Without knowing anything about the effectiveness of interviews, at a meta level, since the way people get interview techniques is the same – cribbing the high-level technique from the most prestigious company around – I think it would be pretty surprising if this wasn't a fad. I would be less surprised to discover that current techniques were not a fad if people were doing or referring to empirical research or had independently discovered what works.

When I asked people how they checked the effectiveness of their interview process and how they tried to reduce bias, the answers I got were often unhelpful or vague. The most common responses were that they didn't do it, didn't know if their process was effective, or simply knew that it worked without being able to explain why. Some people mentioned that someone had looked into it or done a study, but no one could provide concrete details about the methodology or results.

Appendix: Training

When trying to figure out why seven-, eight-, or nine-figure per year interview-level algorithms bugs are lying around waiting to be fixed, there isn't a single "root cause" to point to. Instead, there's a kind of hedgehog defense of misaligned incentives. Another part of this is that training is woefully underappreciated.

We've discussed that, at all but one company I've worked for, there are incentive systems in place that cause developers to feel like they shouldn't spend time looking at efficiency gains, even when a simple calculation shows that there are tens or hundreds of millions of dollars in waste that could easily be fixed. And then, because this isn't incentivized, developers tend not to have experience doing this kind of thing, making it unfamiliar, which makes it feel harder than it is.

So even when a day of work could return $1 million per year in savings or profit, people don't realize that it's only a day of work and could be done with only a small compromise to velocity. One way to solve this latter problem is with training, but that's even harder to get credit for than efficiency gains that aren't in your objectives.

For example, I once wrote a moderate-length tutorial on how to find various inefficiencies, and I've had a couple of people tell me that they were able to debug and fix an issue on their own after reading it. I've also heard secondhand that a few other people were able to increase the efficiency of their service. I'd be surprised if I've heard about even 10% of the cases where this tutorial helped someone, so I'd guess that it's helped tens of engineers, and possibly quite a few more.

If I'd spent a week doing "real" work instead of writing a tutorial, I'd have something concrete, with quantifiable value, that I could easily put into a promo packet or performance review. Instead, I have this nebulous thing that, at best, counts as a bit of "extra credit." I'm not complaining about this in particular – this is exactly the outcome I expected. But, on average, companies get what they incentivize.

If they expect training to come from developers but don't value it as much as they value dev work, then there's going to be a shortage of training. I believe you can also see training under-incentivized in public educational materials due to the relative difficulty of monetizing education and training.

Appendix: Misaligned Incentive Hedgehog Defense, Part 3

Of the three examples above, I found one on a team where it was clearly worth zero to me to do anything that was actually valuable to the company, and the other two on a team where it was valuable to me to do things that were good for the company, regardless of what they were. In my experience, this is very unusual for a team at a big company, but even on that team, incentive alignment was still quite poor.

After getting a promotion and a raise, I computed the ratio of the amount of money my changes made the company versus my raise and found that my raise was 0.03% of the money that I made the company, only counting easily quantifiable and totally indisputable impact to the bottom line. The vast majority of my work was related to tooling that had a difficult-to-quantify value that I suspect was actually larger than the value of the quantifiable impact.

So, I probably received well under 0.01% of the marginal value I was producing. And that's really an overestimate of how much I was incentivized to do the work – at the margin, I strongly suspect that anything I did was worth zero to me. After the first $10 million or $20 million per year, there's basically no difference in terms of performance reviews, promotions, raises, etc.

Because there was no upside to doing more work and there's some downside, the marginal return to me of doing more than "enough" work was probably negative. Some companies will give very large out-of-band bonuses to people regularly, but that work wasn't for a company that does a lot of that, so there's nothing the company could do to indicate that it valued additional work once someone did "enough" work to get the best possible rating on a performance review.

From a mechanism design point of view, the company was basically asking employees to stop working once they did "enough" work for the year. This also happened in another way: managers were given a team-wide budget for raises that was mainly a function of headcount, which was then doled out to team members in a zero-sum way. Unfortunately for each team member, the team pretty much only had productive engineers, meaning that no one was going to do particularly well in the zero-sum raise game.

The team had very low turnover because people liked working with good coworkers, but the company was applying one of the biggest levers it has – compensation – to try to get people to leave the team and join less effective teams. Because this is a common setup, I've heard of managers at multiple companies who try to retain people who are harmless but ineffective to work around this problem.

If you were to ask someone abstractly if the company wants to hire and retain people who are ineffective, I suspect they'd tell you no. But insofar as a company can be said to want anything, it wants what it incentivizes.