As a relatively young person in the technology industry who is interested in low-level software and designing hardware, but spends most of my day job working on backend and distributed systems, I frequently find myself lamenting the fact that many folks in my generation mostly grew up in a world where compute is thought of more in terms of an infinite abstract resource, rather than a capability of a physical machine. That is not to say that the latter is optimal to the former; in almost all measures (e.g. scalability, productivity, etc.) it is not. However, while the rise of cloud providers has made it drastically easier to build software products, I believe many of my peers have been robbed of the joys and pains of interacting directly with tangible hardware, and those of us that haven’t have had to fight tooth and nail for the experience.
This isn’t an old man yelling at the cloud argument. In fact, I have invested most of my short career building projects and products that make consuming cloud infrastructure easier. I am not proposing that more companies should rack their own servers; choosing to play in the cloud sandbox is the right call for 9 out of 10 organizations. However, I am arguing that, for those of us who grew up in the sandbox (“sandbox programmers”), the decision to continue to play in it is a much different calculation than it is for folks who experienced a world where the sandbox was not an option. Playing in the sandbox is a conscious decision for the latter group, fueled by the scars from a world outside of it. For the former group, playing in the sandbox is the status quo, and the decision to remain is driven by the comfort it offers and the fear of a complex unknown world that exists outside of it. Regardless of whether playing in the sandbox is the right decision, new generations of programmers are making the decision with, at best, second-hand information.
This isn’t to say that sandbox programmers are without any agency in this situation. In many ways, experimenting with hardware and building “close to the metal” is easier than ever with countless options for development boards and full computers, such as the Raspberry Pi, readily available for the motivated programmer. This also isn’t to say that a programmer needs to understand physical hardware at a fundamental level to be capable and effective in their role. What I am most concerned with is the idea that not consuming cloud provider managed services is a “bad engineering decision”.
Avoiding reinventing the wheel, and fighting against not invented here syndrome are worthy causes for an engineering organization, but they can be taken to their logical extreme when they start to suppress curiosity and innovation. There is a difference in saying “I know what building this entails, and using someone else’s implementation is a better use of our resources” and “I don’t know what building this entails, but avoiding needing to understand it by consuming someone else’s implementation is a better use of our resources”. Both of these approaches are frequently viewed as pragmatic decisions, but I would argue the first is a much different situation than the second.
The primary danger with the second is that there is a cost to the improved efficiency that is not present with the first: your understanding. Whether consuming someone else’s implementation is the right decision or not, you have put yourself in a position where the things you build on top of the chosen abstraction are going to be designed with less context. The sandbox is not the city you choose to live in, it is your entire universe.
So what is a sandbox programmer to do? One option is investing your nights and weekends learning about hardware, and designing and building your own. This is the route I have taken, but only because I am a privileged individual in that I have access to resources and a lack of constraints on my time relative to the vast majority of people in the world. It is unreasonable to make this path an expectation. Another option is to make the decision to go outside the sandbox, even if you believe that playing in the sandbox is the right decision. Any pragmatic engineer will know that this is infeasible if you have been tasked with making the optimal decision for the organization. Choosing to introduce unnecessary complexity and delay delivery is essentially equivalent to malfeasance.
This is an encounter with “weaponized pragmatism”. A good tenet of engineering has now backed you into a corner where you must choose between building your own understanding and performing the stated duties of your job. Choose the former and you are violating an agreed upon value of your engineering organization. Choose the latter and you forever stay in the sandbox, only ever breaking free in the event of an exceedingly rare scenario that strictly requires it.
An Inefficiency Budget
As mentioned before, it should not necessarily be a requirement of engineers to understand hardware below the cloud abstraction. Some extremely talented folks have no interest in diving to lower layers and instead prefer to continue to hone their skills in the sandbox where they operate. This is an admirable attribute of a progammer.
However, for those who deeply desire to understand the lowest level primitives, the organization should view that exploration as inefficient, but not useless overhead. That is to say, if you view the wisdom of programmers who experienced the world outside the sandbox as valuable, you should also view the gaining of that wisdom as a worthwhile endeavor for folks that do not already have it. As with most things in a well-functioning organization, this outlook can be established by clear communication between leadership and individual contributors.
A good model for establishing a so-called “inefficiency budget” is by mirroring the common pattern of factoring in time to address existing technical debt into any feature implementation. The portion of any effort that is dedicated to technical debt is dependant on a number of variables, including but not limited to:
- How urgently does the business require the feature that is being implemented?
- How much technical debt currently exists?
- How severe is the technical debt?
- How will addressing the technical debt enable future work?
Variables that impact how much time is dedicated to valuable, but potentially inefficient exploration outside of a sandbox for a given effort could include:
- How likely is it that this exploration will be applicable to the work done in this effort?
- How likely is it that this exploration will be applicable to work done on future efforts?
- How can this exploration result in some tangible output?
- How much joy will this bring the engineer?
The first two are straightforward, but the last two are a bit more ambiguous. One way that a seemingly inefficient exploration can be translated into value for the business is by producing content around it. The knowledge gained can be shared with a larger community, potentially impacting the organization’s technical reputation and ability to attract talent. Similarly, these explorations can impact retention by giving engineers the opportunity to exercise their creativity and generally find joy in their work. In the current market for talent, both of these absolutely translate into value for an organization.
Furthermore, by dedicating time to the exploration, you may find that the original answers to the first two questions were incorrect assumptions. Much of engineering is taking ideas from one area and applying them to another. In order to encounter those ideas, you must be willing to venture down non-obvious paths.
Though for my generation the sandbox may be cloud providers, the predicament described in this essay is not new. As technology continues to advance, each generation is standing on the abstractions of those who preceded them and is frequently pressured to accept those abstractions. The vast majority of the time those abstractions are useful and worthy of acceptance, but that does not mean we should discount the value of exploring the world outside of the sandbox they give us.
One of my favorite conference talks is Timothy Roscoe’s OSDI ‘21 keynote “It’s Time for Operating Systems to Rediscover Hardware”. In it, he makes the case that Operating Systems research has stagnated, while the hardware beneath has been changing rapidly. Because the Linux abstraction became ubiquitous, we forgot to continue to evaluate whether its design still made sense as the world around it began to look different. In many ways “Just use Linux” is a form of weaponized pragmatism: do something else and risk being ridiculed if it fails.
The equation looks different in academia vs. in industry, but the concepts are still applicable. I’m still working out what it looks like to encourage this “inefficient” intellectual curiosity in the context of my own career, but I hope that I and others have the courage to advocate for risky exploration, while also being creative about how it can add value for the organization.
In other words, I hope that we look beyond our comfortable sandbox, even if we ultimately choose to remain within it.