Broccoli is a new build tool. It’s
comparable to the Rails asset pipeline in scope, though it runs on Node and is
After a long slew of 0.0.x alpha releases, I just pushed out the first beta
version, Broccoli 0.1.0.
Update March 2015: This post is still up-to-date with regard to
architectural considerations, but the syntax used in the examples is out of date.
Table of Contents:
- Quick Example
- Motivation / Features
- Architecture
- Background / Larger Vision
- Comparison With Other Build Tools
- What’s Next
1. Quick Example
Here is a sample build definition file (
Brocfile.js), presented without
commentary just to illustrate the syntax:
Run broccoli serve to watch the source files and continuously serve the
build output on localhost. Broccoli is optimized to make broccoli serve as
fast as possible, so you should never experience rebuild pauses.
Run broccoli build dist to run a one-off build and place the build output in
the dist directory.
For a longer example, see the
2. Motivation / Features
2.1. Fast Rebuilds
The most important concern when designing Broccoli was enabling fast
incremental rebuilds. Here’s why:
Let’s say you’re using Grunt to build an application written with
CoffeeScript, Sass, and a few more such compilers. As you develop, you want to
edit files and reload the browser, without having to manually rebuild each
time. So you use grunt watch to rebuild automatically. But as your
application grows, the build gets slower. Within a few months of development
time, your edit-reload cycle has turned into an edit-wait-10-seconds-reload
cycle.
So to speed up your build, you try rebuilding only the files that have
changed. This is difficult, because sometimes one output file depends on
multiple input files. You manually configure some dependency rules, to rebuild
the right files depending on which files were modified. But Grunt was never
designed to do this well, and your custom rule set won’t reliably rebuild the
right files. Sometimes it rebuilds files when it doesn’t have to (making your
build slow). Worse, sometimes it doesn’t rebuild files when it should (making
your build unreliable).
With Broccoli, once you fire up broccoli serve, it will figure out by itself
which files to watch, and only rebuild those that need rebuilding.
In effect, this means that rebuilds tend to take O(1) constant time with
respect to the number of files in your application, as you generally only
rebuild one file.
I’m aiming for under 200 ms per rebuild with a typical build stack, since that
type of delay feels near-instantaneous to the human brain, though anything up
to half a second is acceptable in my book.
2.2. Chainable Plugins
Another concern was making plugins composable. Let me show you how easy it
is to compile CoffeeScript and then minify the output with Broccoli.
With Grunt, we’d have to create a temporary directory to store the
CoffeeScript output, as well as an output directory. As a result of all this
bookkeeping, Gruntfiles tend to grow rather lengthy. With Broccoli, all this
is handled automatically.
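To make the chaining idea concrete, here is a minimal sketch that uses only Node’s standard library. It is not Broccoli’s API: perFilePlugin, stripComments, and minify are hypothetical names, the two "compilers" are toy string transforms, and a tree is modeled as a flat directory path.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: builds a tree-to-tree function from a per-file string
// transform. It reads a flat input directory and writes to a fresh temporary
// directory, so the caller never has to name intermediate directories.
function perFilePlugin(transform) {
  return function (inputDir) {
    const outputDir = fs.mkdtempSync(path.join(os.tmpdir(), 'tree-'));
    for (const name of fs.readdirSync(inputDir)) {
      const source = fs.readFileSync(path.join(inputDir, name), 'utf8');
      fs.writeFileSync(path.join(outputDir, name), transform(source));
    }
    return outputDir;
  };
}

// Toy stand-ins for a compiler and a minifier.
const stripComments = perFilePlugin((source) =>
  source.split('\n').filter((line) => !line.trim().startsWith('//')).join('\n'));
const minify = perFilePlugin((source) => source.replace(/\s+/g, ' ').trim());

// A source tree with a single file.
const sourceDir = fs.mkdtempSync(path.join(os.tmpdir(), 'src-'));
fs.writeFileSync(path.join(sourceDir, 'app.js'), '// greet\nvar greeting =   "hi";\n');

// Chaining: each step takes the previous tree and returns a new one.
let tree = sourceDir;
tree = stripComments(tree);
tree = minify(tree);

const chainedOutput = fs.readFileSync(path.join(tree, 'app.js'), 'utf8');
console.log(chainedOutput); // var greeting = "hi";
```

The only thing the build definition carries around is the tree variable. The temporary directories are an implementation detail of each step.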
3. Architecture
For those who are curious, let me tell you about Broccoli’s architecture.
3.1. Trees, Not Files
Broccoli’s unit of abstraction to describe sources and build products is not a
file, but rather a tree – that is, a directory with files and subdirectories.
So it’s not file-goes-in-file-goes-out, it’s tree-goes-in-tree-goes-out.
If we designed Broccoli around individual files, we’d be able to compile
CoffeeScript just fine (as it compiles 1 input file into 1 output file), but
the API would be unnatural for compilers like Sass (which needs to read more
files as it encounters
@import statements, and thus compiles n input files
into 1 output file).
On the other hand, with Broccoli’s design around trees, n:1 compilers like
Sass are no problem, while 1:1 compilers like CoffeeScript are an easily
expressible sub-case. In fact, we have a
Filter base class for such 1:1
compilers to make them very easy to implement.
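As an illustration of how a 1:1 compiler becomes a sub-case of tree-goes-in-tree-goes-out, the sketch below wraps a per-file transform in a function that walks a whole directory. This is an assumption-laden stand-in, not the actual Filter base class: makeFilter, its arguments, and the toy "shout" compiler are all hypothetical.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper (not the real Filter class): turns a per-file string
// transform into a tree-to-tree function. Files ending in `fromExt` are
// transformed and renamed to `toExt`; everything else is copied unchanged.
function makeFilter(fromExt, toExt, processString) {
  return function filterTree(inputDir) {
    const outputDir = fs.mkdtempSync(path.join(os.tmpdir(), 'filter-'));
    (function walk(relativeDir) {
      const entries = fs.readdirSync(path.join(inputDir, relativeDir), { withFileTypes: true });
      for (const entry of entries) {
        const relativePath = path.join(relativeDir, entry.name);
        if (entry.isDirectory()) {
          fs.mkdirSync(path.join(outputDir, relativePath));
          walk(relativePath);
        } else if (relativePath.endsWith(fromExt)) {
          const source = fs.readFileSync(path.join(inputDir, relativePath), 'utf8');
          const target = relativePath.slice(0, -fromExt.length) + toExt;
          fs.writeFileSync(path.join(outputDir, target), processString(source));
        } else {
          fs.copyFileSync(path.join(inputDir, relativePath), path.join(outputDir, relativePath));
        }
      }
    })('');
    return outputDir;
  };
}

// Toy "compiler": upper-cases the contents of .txt files into .out files.
const shout = makeFilter('.txt', '.out', (source) => source.toUpperCase());

const inputDir = fs.mkdtempSync(path.join(os.tmpdir(), 'src-'));
fs.mkdirSync(path.join(inputDir, 'nested'));
fs.writeFileSync(path.join(inputDir, 'nested', 'hello.txt'), 'hello');
fs.writeFileSync(path.join(inputDir, 'logo.bin'), 'raw');

const filteredDir = shout(inputDir);
const compiledText = fs.readFileSync(path.join(filteredDir, 'nested', 'hello.out'), 'utf8');
console.log(compiledText); // HELLO
```

The compiler author only writes the one-line string transform. Directory traversal, renaming, and pass-through of unrelated files live in the shared helper.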
3.2. Plugins Just Return New Trees
This one is slightly more subtle: At first, I had designed Broccoli with two
primitives: a “tree”, which represents a directory with files, and a chainable
“transform”, which takes an input tree and returns a new compiled tree.
This implies that transforms map trees 1:1. Surprisingly, this is not a good
abstraction for all compilers. For instance, the Sass compiler has a notion of
“load paths” that it searches for imported modules when it encounters an
@import statement. These load paths are ideally represented as a set of trees.
As you can see, many real-world compilers actually map n trees into 1 tree.
The easiest way to support this is to let plugins deal with their input trees
themselves, thereby allowing them to take 0, 1, or n input trees.
But now that we let plugins handle their input trees, we don’t need to know
about compilers as first-class objects in Broccoli land anymore. Plugins
simply export functions that take zero or more input trees (and perhaps some
options), and return an object representing a new tree. For instance:
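A rough sketch of such a plugin, written against Node’s fs module directly, might look like the following. It is not Broccoli’s real plugin API: inlineImports, its entry and outputFile options, and the one-line @import syntax are made up for illustration, trees are represented as plain directory paths, and imports are not resolved recursively.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical n-input plugin: takes several input trees plus options and
// returns one new tree containing a single compiled file.
function inlineImports(inputDirs, options) {
  // Search the input trees in order, much like a compiler searching load paths.
  function resolve(name) {
    for (const dir of inputDirs) {
      const candidate = path.join(dir, name);
      if (fs.existsSync(candidate)) return candidate;
    }
    throw new Error('Cannot find ' + name + ' in any input tree');
  }

  // Replace each "@import <file>" line with the contents of that file.
  const compiled = fs.readFileSync(resolve(options.entry), 'utf8')
    .split('\n')
    .map((line) => {
      const match = /^@import (\S+)$/.exec(line);
      return match ? fs.readFileSync(resolve(match[1]), 'utf8').trimEnd() : line;
    })
    .join('\n');

  const outputDir = fs.mkdtempSync(path.join(os.tmpdir(), 'inlined-'));
  fs.writeFileSync(path.join(outputDir, options.outputFile), compiled);
  return outputDir;
}

// Two source trees: the application and a directory of shared styles.
const appDir = fs.mkdtempSync(path.join(os.tmpdir(), 'app-'));
const sharedDir = fs.mkdtempSync(path.join(os.tmpdir(), 'shared-'));
fs.writeFileSync(path.join(appDir, 'main.css'), '@import reset.css\nbody { color: green }');
fs.writeFileSync(path.join(sharedDir, 'reset.css'), '* { margin: 0 }\n');

const cssDir = inlineImports([appDir, sharedDir], { entry: 'main.css', outputFile: 'app.css' });
const css = fs.readFileSync(path.join(cssDir, 'app.css'), 'utf8');
console.log(css);
```

Because the function receives its input trees as ordinary arguments, nothing outside the plugin needs to know how many trees it consumes.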
3.3. The File System Is The API
Remember that because Grunt doesn’t support chaining of plugins, we end up
having to manage temporary directories for intermediate build products in our
Grunt configurations, making them overly verbose and hard to maintain.
To avoid all this, our first intuition might be to abstract the file system
away into an in-memory API, representing trees as collections of streams. Gulp
for instance does this. I tried this in an early version of Broccoli, but it
turns out to make the code quite complicated: With streams, plugins now have
to worry about race conditions and deadlocks. Also, in addition to having a
notion of streams and paths, we need file attributes like last-modified time
and size in our API. And if we ever need the ability to re-read a file, or
seek, or memory-map, or if we need to pass an input tree to another process
we’re shelling out to, the stream API fails us and we have to write out the
entire tree to the file system first. So much complexity!
But wait. If we’re going to replicate just about every feature of the file
system, and in some cases we have to fall back to turning our in-memory
representation into an actual tree on the file system and back again, then …
why don’t we use the actual file system instead?
Node’s fs module already provides as compact an API to the file system as we
could wish for.
The only disadvantage is that we have to manage temporary directories behind
the scenes, and clean them up. But that’s easy to do in practice.
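The bookkeeping involved can be quite small. Here is a minimal sketch, assuming Node 14.14 or later for fs.rmSync. TmpDirRegistry and its methods are hypothetical names, not part of Broccoli.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical bookkeeping helper: hands out temporary directories and
// remembers them so they can all be removed once the build is finished.
class TmpDirRegistry {
  constructor() {
    this.dirs = [];
  }

  create(prefix) {
    const dir = fs.mkdtempSync(path.join(os.tmpdir(), prefix + '-'));
    this.dirs.push(dir);
    return dir;
  }

  cleanup() {
    for (const dir of this.dirs) {
      fs.rmSync(dir, { recursive: true, force: true });
    }
    this.dirs = [];
  }
}

const registry = new TmpDirRegistry();
const intermediateDir = registry.create('coffee-output');
fs.writeFileSync(path.join(intermediateDir, 'app.js'), 'compiled');

const existedBeforeCleanup = fs.existsSync(intermediateDir);
registry.cleanup();
const existsAfterCleanup = fs.existsSync(intermediateDir);
console.log(existedBeforeCleanup, existsAfterCleanup); // true false
```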
People sometimes worry that writing to disk is slower. But even if you hit the
actual disk drive (which thanks to paging is rare), the bandwidth of modern
SSDs has become so high compared to CPU speed that the overhead tends to be negligible.
3.4. Caching, Not Partial Rebuilding
When I originally tried to solve the problem of incremental rebuilds, I tried
to devise a way to check whether each existing output file is stale, so that
Broccoli could trigger the rebuild for a subset of its input files. But this
“partial rebuild” approach requires that we are able to trace which files an
output file depends on, all the way back to the source files, and it also
makes file deletion tricky. “Partial rebuilds” is the classical approach of
Make, as well as the Rails asset pipeline, Rake::Pipeline, and Brunch, but
I’ve come to believe that it’s unnecessarily complicated.
Broccoli’s approach is much simpler: Ask each plugin to cache its build output
as appropriate. When we rebuild, start with a blank slate, and re-run the
entire build process. Plugins will be able to provide most of their output
from their caches, which takes near-zero time.
Broccoli started off providing some caching primitives, but it turned out
unnecessary to have this in the core API. Now we just make sure that the
general architecture doesn’t stand in the way of caching.
For plugins that map files 1:1, like the CoffeeScript compiler, we can
use common caching code (provided by the broccoli-filter package), leaving
the plugin code looking quite simple.
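To illustrate the "blank slate plus cache" idea for a 1:1 compiler, here is a small sketch. It is not how broccoli-filter is implemented: makeCachingFilter, the stats counters, and the choice of a SHA-1 content hash as the cache key are my assumptions for this example.

```javascript
const crypto = require('crypto');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical caching wrapper for a 1:1 compiler. Compiled output is cached
// under a hash of each input file's contents.
function makeCachingFilter(processString) {
  const cache = new Map(); // content hash -> compiled output
  const stats = { compiled: 0, fromCache: 0 };

  function build(inputDir) {
    // Every rebuild starts from a blank output directory ...
    const outputDir = fs.mkdtempSync(path.join(os.tmpdir(), 'cached-'));
    for (const name of fs.readdirSync(inputDir)) {
      const source = fs.readFileSync(path.join(inputDir, name), 'utf8');
      const key = crypto.createHash('sha1').update(source).digest('hex');
      // ... but unchanged files are served from the cache, not recompiled.
      if (cache.has(key)) {
        stats.fromCache++;
      } else {
        cache.set(key, processString(source));
        stats.compiled++;
      }
      fs.writeFileSync(path.join(outputDir, name), cache.get(key));
    }
    return outputDir;
  }

  return { build, stats };
}

const filter = makeCachingFilter((source) => source.toUpperCase());
const srcDir = fs.mkdtempSync(path.join(os.tmpdir(), 'src-'));
fs.writeFileSync(path.join(srcDir, 'a.txt'), 'alpha');
fs.writeFileSync(path.join(srcDir, 'b.txt'), 'beta');

filter.build(srcDir);                                  // compiles both files
fs.writeFileSync(path.join(srcDir, 'b.txt'), 'gamma'); // edit one file
const rebuiltDir = filter.build(srcDir);               // recompiles only b.txt

console.log(filter.stats); // { compiled: 3, fromCache: 1 }
```

Hashing means every input file is still read on each rebuild. A plugin could presumably key its cache on cheaper file attributes instead, at the cost of more careful invalidation.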
Plugins that map files n:1, like Sass, need to be more careful about
invalidating their caches, so they need to provide custom caching logic. I
assume that we’ll still be able to extract some common caching logic in the future.
3.5. No Parallelism
If we all suffer from slow builds, should we try to parallelize builds,
compiling multiple files in parallel?
My answer is no: The reason is that parallelism makes it possible to have
race conditions in plugins, which you might not notice until deploy time.
These are the worst kinds of bugs, and avoiding parallel execution eliminates
this entire class of bugs.
On the other hand, Amdahl’s law
stops us from gaining much performance through parallelizing. For a simplified
example, say our build process takes 16 seconds in total. Let’s say 50% of it
can be parallelized, and the rest needs to run in sequence (e.g.
CoffeeScript-then-concatenate-then-UglifyJS). If we run this on a 4-core
machine, the build would take 8 seconds for the sequential part plus 8 / 4 = 2
seconds for the parallel part, still totaling 10 seconds, less than a 40%
reduction in build time.
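The arithmetic can be checked with a few lines of JavaScript. The function and parameter names are illustrative only.

```javascript
// Estimated build time when a fraction of the work is spread evenly across
// several cores and the rest runs sequentially (Amdahl's law).
function parallelBuildTime(totalSeconds, parallelFraction, cores) {
  const sequentialPart = totalSeconds * (1 - parallelFraction);
  const parallelPart = (totalSeconds * parallelFraction) / cores;
  return sequentialPart + parallelPart;
}

const fourCoreSeconds = parallelBuildTime(16, 0.5, 4);
const reduction = 1 - fourCoreSeconds / 16;
const unlimitedCoreSeconds = parallelBuildTime(16, 0.5, Infinity);

console.log(fourCoreSeconds);      // 10
console.log(reduction);            // 0.375
console.log(unlimitedCoreSeconds); // 8
```

Even with an unlimited number of cores, the sequential half keeps this example build at 8 seconds.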
For incremental rebuilds, which constitute the hot path that we really care
about, caching tends to eliminate most of the parallelizable parts of the
build process anyway, so we are left with little to no performance gain.
Because of that, in general I believe that parallelizing the build process is
not a good trade. In principle you could write a Broccoli plugin that performs
some work in a parallel fashion. However, Broccoli’s primitives, as well as
the helper code that I’ve published on GitHub, actively encourage
deterministic sequential code patterns.
4. Background / Larger Vision
There are two main motivators that made me tackle writing a good build tool.
The first motivator is better productivity, through fast incremental rebuilds.
I generally believe that developer productivity is largely determined by the
quality of the libraries and tools we use. The “edit file, reload browser”
cycle that we perform hundreds of times a day is probably the core feedback
loop when we program. A great way to improve our tooling is getting this
edit-reload feedback loop to be as fast as humanly possible.
The second motivator is encouraging an ecosystem of front-end packages.
I believe that Bower and the ES6 module system will help us build a great
ecosystem, but Bower by itself is useless unless you have a build tool running
on top. This is because Bower is a content-agnostic transport tool that only
dumps all your dependencies (and their dependencies, recursively) into the
file system—it’s up to you what to do with them. Broccoli aims to become the
missing build tool sitting on top.
Note that Broccoli itself is agnostic about Bower or ES6 modules—you can use
it for whatever you like. (I am aware there are other stacks, like npm +
browserify, or npm + r.js.) I will discuss all of this in more detail in a
future blog post.
5. Comparison With Other Build Tools
If you are almost convinced but also wondering how other build tools stack up
against Broccoli, let me tell you why I wrote Broccoli instead of using any of them.
Grunt is a task runner, and it never set out to be a build tool. If you
try to (ab)use it as a build tool, you quickly find that because it doesn’t
attempt to handle chaining (composition), you end up having to manage
temporary directories for intermediate build products yourself, adding a lot
of complexity to your Grunt configuration. It also does not support reliable
incremental rebuilds, so your rebuilds will tend to be slow and/or unreliable;
see section “Fast Rebuilds” above.
That said, Grunt’s utility as a task runner is in providing a cross-platform
way to run shell-script type functionality, such as deploying your app or
generating scaffolding. Broccoli will be able to act as a Grunt plugin in the
future, so that you can call it from your Gruntfile.
Gulp tries to solve the problem of chaining plugins,
but in my view it gets the architecture wrong: Rather than passing around
trees, it passes around sequences (= event streams) of files (= streams or buffers).
This works fine for cases where one input file maps into one output
file. But when a plugin needs to follow import statements, and thus needs to
access input files out of order, things get complicated.
For now, plugins that follow import statements tend to just bypass the
build tool and read directly from the file system.
In the future, I hear that there will be helper libraries to turn all the
streams into a (virtual) file system and pass that to the compiler. I would
claim though that all this complexity is a symptom of an impedance mismatch
between the build tool and the compiler. See “Trees, Not Files” above for more
on this. I’m also not convinced that abstracting away files behind a stream or
buffer API is helpful at all; see “The File System Is The API” above.
Brunch, like Gulp, uses a file-based (not tree-based) in-memory API (see
As with Gulp, plugins end up falling back to bypassing the build tool
when they need to read more than one file.
Brunch also tries to do partial rebuilding rather than caching; see section
“Caching, Not Partial Rebuilding” above.
Rake::Pipeline is written in Ruby, which is less ubiquitous than Node in
front-end land. It tries to do partial rebuilds as well. Yehuda says it’s not
heavily maintained anymore, and that he’s betting on Broccoli.
The Rails asset pipeline uses partial rebuilds as well, and uses very
different code paths for development mode and production (precompilation)
mode, causing people to have unexpected issues when they deploy. More
importantly it’s tied to Rails as a backend.
6. What’s Next
I would like to see other people get involved in writing plugins. Wrapping
compilers is easy, but the hard and important part is getting caching and
performance right. We’ll also want to work on generalizing more caching
patterns in addition to
broccoli-filter, so that plugins
don’t suffer from excessive boilerplate.
Over the next week or two, my plan is to improve the documentation and clean
up the code base of Broccoli core and the plugins. We will also have to add a
test suite to Broccoli core, and figure out an elegant way to integration-test
Broccoli plugins against Broccoli core.
Another thing that’s missing with the existing plugins is source map support.
This is slightly complicated by performance considerations, as well as the
fact that chained plugins need to consume other plugins’ source maps and
interoperate properly, so I haven’t found the time to tackle this yet.
Broccoli will see active use in the Ember ecosystem, powering the default
stack emitted by ember-cli (an
upcoming tool similar in functionality to the
rails command). We are also
hoping to move the build process used for generating the Ember core and
ember-data distributions from Rake::Pipeline and Grunt to Broccoli.
That said, I would love to see Broccoli adopted outside the Ember community as
well. JS MVC applications written with frameworks like Angular or Backbone are
candidates for being built by Broccoli.
I don’t currently see any major roadblocks on the path to Broccoli becoming
stable. By using it for real-world build scenarios, we should gain confidence
in its API, and I’m hoping that we can bump the version to 1.0.0 within a few
This blog post is the first comprehensive explanation of Broccoli’s
architecture, and the documentation is still somewhat sparse. I’m happy to
help you get started, and fix any bugs you encounter. Come find me on
#broccolijs on Freenode, or at
email@example.com on Google Talk. I’ll also
respond to any issues you post on GitHub.
Thanks to Jonas Nicklas, Josef Brandl, Paul Miller, Erik Bryn, Yehuda Katz,
Jeff Felchner, Chris Willard, Joe Fiorini, Luke Melia, Andrew Davey, and Alex
Matchneer for reading and critiquing drafts of this post.