The Maximum Possible Size of a Meatball

A few years back, I read a ThinkWeek paper about engineering waste at Microsoft. It painted this picture: from bug management to test case management, from continuous integration to deployment automation, for every problem in software engineering there were a handful of different tools at Microsoft solving the same problem. It did look awful. The author was advocating for a consolidation of engineering systems within Microsoft, and I was in the same camp.

But since then, I have come to realize that Microsoft’s problem is far from unique. It is very common in the software industry nowadays. For example:

  • The List of unit testing frameworks has more than 40 different unit testing frameworks for JavaScript, 35 for Java, 64 for C++ and 28 for .NET;
  • The Comparison of file comparison tools has 24 different text file comparison tools, and there are a couple of others that are not on that list but that I have used in the past;
  • When it comes to code review, the list is also long: Phabricator from Facebook, Gerrit from Google, Crucible, … For more, just search online.

Finally, look at how many programming languages we have.

It’s inevitable that when more people use a tool, it becomes harder for that tool to meet all their needs, including some volatile or subjective ones like “easy to use” and “good UI”, and harder to meet those needs in a timely manner. When a tool gets more bloated as more people keep contributing to it, it becomes harder to learn. When a common library is used in more systems, the odds of a code change in that library breaking somebody become higher. The cost and difficulty of ensuring there is no such regression grow worse than linearly. As the number of people and organizations who share the same tool grows, it will get to a point where these costs, risks and difficulties outweigh the benefits of continuing to stick together. Then the only natural thing is to fall apart.

It’s just like making meatballs. A small meatball stays a ball for days. But as the meatball gets bigger, it becomes harder for it to stick together. Once the size reaches a certain point, regardless of how much flour you add and how hard and for how long you press the meat together, the big meatball will just fall apart as soon as you put it down. That’s the maximum possible size of a meatball.

A Parking Lot in Downtown San Francisco and the Free and Open Internet

In my experience, if a resource is 1) valuable, 2) free of charge, 3) in limited supply and 4) open to use, it will inevitably get abused. Its available capacity and liquidity will sooner or later drop to near zero.

That's just human nature. For example, if there is a parking lot (=valuable) in downtown San Francisco (=limited supply) which is free to park in (=no fee) and doesn't have any time limit such as "free to park up to 2 hours" (=open to use), inevitably it will be full all the time. People will come in the morning and leave their cars there for the whole day. Why not? It's free and has no time limit. If I pull my car out to run a few errands, I'm afraid my spot will be taken immediately and I won't find one when I come back. Instead, I'd better put a folding bicycle in the trunk. I can ride the bicycle around downtown, so that my car can stay parked there for free throughout the day.

This has actually happened at my workplace, more than once. The pattern was: we had a test cluster on which people could create virtual machines to run their tests (=valuable resource); the test cluster was already paid for by our team budget, so everyone on the team could use it for free (=no fee); the test cluster wasn’t big enough for everybody to use freely, nor could we increase the capacity (=limited supply); in the beginning, we didn’t apply any policy (=open to use).

Very soon the test cluster was full, and it remained full. We found that people just didn’t delete their virtual machines after their testing was done. The behavior was self-reinforcing: the fewer people released their virtual machines, the harder it was to get a new one, so the more reason everyone had to hold on to the ones they already had. Everybody suffered, even though we repeatedly told everybody “please use it responsibly and be considerate of others’ needs”.

Eventually, we set up a quota and a time limit: each person could have up to N virtual machines and needed manager approval to get more than that; each virtual machine would be automatically deleted after X days unless explicitly renewed. That solved the availability and liquidity problem, and no one complained about not having a “free and open” test cluster.
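For the curious, here is a minimal sketch of what such a policy might look like in code. It is not what we actually ran; the names (Cluster, VM, create_vm, renew, reap_expired) are made up for illustration, and N and X are left as parameters because the real values don't matter for the argument.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class VM:
    owner: str
    expires_at: datetime


@dataclass
class Cluster:
    """Hypothetical test cluster with a per-user quota and a VM time limit."""
    max_vms_per_user: int   # the "N" in the policy
    ttl: timedelta          # the "X days" in the policy
    vms: list = field(default_factory=list)

    def create_vm(self, owner, now, manager_approved=False):
        # Quota: going beyond N virtual machines requires manager approval.
        owned = sum(1 for vm in self.vms if vm.owner == owner)
        if owned >= self.max_vms_per_user and not manager_approved:
            raise PermissionError(f"{owner} is at quota; manager approval needed")
        vm = VM(owner=owner, expires_at=now + self.ttl)
        self.vms.append(vm)
        return vm

    def renew(self, vm, now):
        # Explicit renewal restarts the clock for one virtual machine.
        vm.expires_at = now + self.ttl

    def reap_expired(self, now):
        # Time limit: anything not renewed in time is deleted automatically.
        expired = [vm for vm in self.vms if vm.expires_at <= now]
        self.vms = [vm for vm in self.vms if vm.expires_at > now]
        return expired
```

The point of the sketch is that the default flips: a virtual machine goes away unless someone takes action to keep it, instead of staying forever unless someone takes action to delete it.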

This is why there won’t be a “free and open” Internet without fees or quotas. This is why President Obama's rules won't help, and may even make things worse. In his statement on net neutrality today, he put forward a few rules, including: no blocking, no throttling, no paid prioritization.

Internet bandwidth is a valuable resource and in limited supply. To avoid what happened to the parking lot in downtown San Francisco and to my team’s test cluster, we have to do one of two things: either set some quota (blocking/throttling), or charge some fees (paid prioritization). Valuable resource + limited supply + no fee + open to use = no one can use it.