Hack, HHVM and avoiding the Second-system effect

Marton Trencseni - Sat 14 May 2016 - Books

Introduction

I read Hack & HHVM: Programming Productivity without Breaking Things on my first vacation after I started working at Facebook and thus became a semi-regular Hack/HHVM user. I highly recommend reading (parts of) it, though not to learn Hack/PHP, which is irrelevant to most people. Instead, read it to learn how Facebook improved its www codebase and performance without rewriting the old PHP code in one big effort, and thus avoided the famous Second-system effect.

Hack & HHVM book

Second-system effect

The Second-system effect was first described by Fred Brooks in The Mythical Man-Month, based on his experiences managing operating system software development at IBM in the early 1960s:

The second-system effect proposes that, when an architect designs a second system, it is the most dangerous system they will ever design, because they will tend to incorporate all of the additions they originally did not add to the first system due to inherent time constraints. Thus, when embarking on a second system, an engineer should be mindful that they are susceptible to over-engineering it.

Let me offer a more modern description: version 1.0 of the product/app/software is successful. Over time the programmers realize that, knowing what they know now, they could do a much better job. Meanwhile, the technology landscape changes, and it'd be nice to take advantage of the shiny new architectures, languages and frameworks that are now available. So the team embarks on the quest to ship 2.0, a rewrite. Inevitably, even good teams will over-engineer, and the result will be a technological and project management mess. 2.0 projects like this miss their original ship dates by several years. Once 2.0 does ship, it's buggy and slow, because unlike 1.0 it hasn't been fine-tuned by real-world usage yet. So several more years go by until 2.0 is also fine-tuned. At this point the new set of programmers (the cohort who joined after 1.0) can repeat the Second-system effect with 3.0, which for them will be the new 2.0. Rinse, repeat.

I think this book contains a nice little lesson about how to avoid the Second-system effect. The book doesn't actually mention the Second-system effect, and I'm not implying anything about the history of the main www codebase at Facebook. I'm not saying that Facebook specifically did this to avoid the Second-system effect. It's just a lesson that I think can be extracted from the design decisions explained in the book.

PHP, Hack, HHVM

The story here is that Facebook started out as a PHP codebase. Over time the product became very successful, which meant that it was:

  • very large (1M+ LOC)
  • serving a large number of users
  • being worked on by a large number of engineers

So there was a desire to:

  • speed it up so it can serve more users per node
  • make it easier for engineers to work on the code

I think that for many programmers (including yours truly) the instinctive reaction would have been to say "PHP sucks, it's slow and unsafe, let's rewrite the www codebase in a real programming language like Java and run on the JVM". What's interesting is that Facebook did not do this; Facebook did not discard PHP.

Instead, Facebook decided to improve the layer below the application code to improve overall performance, and write new code in a way which takes advantage of the features of the improved layer (and very slowly deprecate old code). The "layer" here is actually many things:

  1. Hack, a language like PHP, but much nicer
  2. a static type-checker for Hack
  3. HHVM, a runtime for Hack (and also regular PHP)

Two notes are in order here:

  • Historically, there was something called HPHPc before HHVM. It was a PHP-to-C++ compiler, but it’s no longer being used at Facebook.
  • Hack and HHVM did not come about as a result of a committee sitting down, identifying the problem, scoping out the solutions, and picking one. Both HPHPc and the Hack language originated from Hackathons, an integral part of Facebook engineering culture, where individual engineers attacked problems they thought were promising and important.

My favorite features of Hack/HHVM:

  • very fast
  • 100% interoperability with regular PHP code (e.g. existing code)
  • types (can also run regular PHP code in untyped mode)
  • the type-checker is very fast: it has millisecond response times even in very large codebases, since it maintains state
  • generics, lambdas, etc.
  • async/await keywords for cooperative multitasking: this is very cool and worth reading up on. Essentially it's language/runtime level support for (single-threaded) event-driven architecture (epoll, kqueue, I/O Completion Ports), so you don't have to explicitly manage the state like we did in the plain old C++ ScalienDB codebase.
  • XHP: the way to do www rendering safely (in the XSS sense) in Hack, with language level support for XHTML and custom modules (e.g. a Comments box)
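
To make the async/await point a bit more concrete, here is a minimal sketch of the same idea. Note that it is written in Python's asyncio, not Hack, so it only illustrates the concept and says nothing about Hack's actual syntax or APIs. The fetch function and its delays are made up for the example, with asyncio.sleep standing in for a real network call. Two tasks are in flight at the same time on a single thread, and there is no hand-written state machine or callback bookkeeping anywhere:

```python
import asyncio
import threading
from typing import List, Tuple


async def fetch(name: str, delay: float, log: List[str]) -> Tuple[str, int]:
    """Hypothetical I/O-bound task; asyncio.sleep stands in for a network call."""
    log.append(f"start {name}")
    # 'await' suspends this coroutine and hands control back to the event loop,
    # so other coroutines can make progress on the same thread in the meantime.
    await asyncio.sleep(delay)
    log.append(f"done {name}")
    # Return the thread id so the caller can check that everything ran on one thread.
    return name, threading.get_ident()


async def main() -> Tuple[List[Tuple[str, int]], List[str]]:
    log: List[str] = []
    # Start both coroutines and wait for both; results come back in argument order.
    results = await asyncio.gather(
        fetch("slow", 0.05, log),
        fetch("fast", 0.01, log),
    )
    return list(results), log


def run_demo() -> Tuple[List[Tuple[str, int]], List[str]]:
    """Run the event loop to completion and return (results, log)."""
    return asyncio.run(main())


if __name__ == "__main__":
    results, log = run_demo()
    print(log)
    print(results)
```

The log shows the two tasks interleaving: both start before either finishes, and the shorter one finishes first, even though only one thread is involved. The suspended state of each task lives in the coroutine itself, which is exactly the bookkeeping you would otherwise write by hand in an event-driven C++ codebase.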

Lesson learned

So the interesting lesson here is that a possible way out of the Second-system effect is to start improving the environment (language, runtime, frameworks, etc.) of the main codebase instead of rewriting the main codebase. I certainly don't think this is the one true solution, and in many cases it cannot be applied, but it's something to keep in mind as a design pattern. Some of the challenges of this approach:

  • You need a couple of really smart people who can design and implement a new language that's backwards compatible with existing code.
  • You need to put sustained effort into it afterwards, keeping it mostly compatible with the standard version of the language.
  • Every new engineer will need time to ramp up using the new language.

Another interesting aspect of this is the investment needed. Rewriting the whole application codebase is an all-in project, with all (or much) of the engineering team working on it. I assert that changing out the layers below and around it can be accomplished by a smaller, focused team, iteratively. It's a smaller bet. Writing HHVM was certainly a smaller effort than rewriting all of Facebook in Java would have been! Having said that, an organizational/management note: I do think you need a fairly large group of people to generate enough ideas (and Hackathon projects) so that some really good and impactful ones come out of it.

Conclusion

I will conclude this post with my personal impressions: working with Hack/HHVM is very pleasant. The type checker holds your hand all the way, so it feels much nicer/safer than e.g. writing Python. The syntax is a bit unfortunate in places, but overall it's a non-issue for me. I'd consider using Hack/HHVM for personal projects or a startup. It's completely open source, so anybody can use it for their projects.

Thanks to Zsolt Dollenstein for reviewing this blog post and giving valuable suggestions.
