TL;DR: OutputBuffer subclasses SafeBuffer which forces us to do runtime
checks that are probably unnecessary

I made a post about YAGNI methods hurting
you

where I said I would provide two examples, but then I got tired of writing the
article so I just did one example. Here is the other example! The previous
example demonstrated a memory leak that was introduced because the “footprint”
(the number of methods implemented on the object) was too large.

This example will show how these YAGNI methods are impacting performance of
your Rails application, and we’ll talk about how to fix it.

This problem is in ActionView::OutputBuffer which inherits from
ActiveSupport::SafeBuffer, which inherits from Ruby’s String. Let’s talk
about the behavior ActiveSupport::SafeBuffer first, then we’ll see how it
impacts to the performance of ActionView::OutputBuffer.

ActiveSupport::SafeBuffer

This class is the class that’s used to mark a string as being “html safe”. It’s
how Rails detects whether a string has been html escaped or not. In Rails, a
normal Ruby string is considered to be not “html safe”. For example:

>> x = "foo"
=> "foo"
>> x.class
=> String
>> x.html_safe?
=> false

If we call html_safe on the string, it returns an instance of
ActiveSupport::SafeBuffer that is “tagged” as html safe:

>> x = "foo"
=> "foo"
>> y = x.html_safe
=> "foo"
>> y.class
=> ActiveSupport::SafeBuffer
>> y.html_safe?
=> true

Whenever we html escape a string in Rails, it returns a safe buffer:

>> x = "<html>"
=> "<html>"
>> x.html_safe?
=> false
>> y = ERB::Util.html_escape x
=> "&lt;html&gt;"
>> y.class
=> ActiveSupport::SafeBuffer
>> y.html_safe?
=> true

Now, using the html_safe? predicate, we can easily tell the difference between
strings that have been tagged as “html safe” and strings that haven’t been tagged as “html safe” (side note: just like encodings, tagging something does not mean that it actually is correct. We can tag things as “html safe” without them actually being html safe).

We can also concatenate unsafe strings to a safe string, and it will be
automatically escaped:

>> x = ERB::Util.html_escape "<html>"
=> "&lt;html&gt;"
>> x.concat "<blink>"
=> "&lt;html&gt;&lt;blink&gt;"
>> x.html_safe?
=> true

Finally, we can concatenate safe strings and they will not be double escaped:

>> x = ERB::Util.html_escape "<html>"
=> "&lt;html&gt;"
>> x.concat ERB::Util.html_escape "<blink>"
=> "&lt;html&gt;&lt;blink&gt;"
>> x.html_safe?
=> true

Infecting a SafeBuffer

So far, the html_safe? predicate is a 1:1 relationship with the class.
Meaning that if the class of the string is ActiveSupport::SafeBuffer, then
html_safe? would return true, and if the class was String, html_safe?
would return false.

Unfortunately it is not a 1:1 relationship. We can mutate a SafeBuffer, making
it unsafe again. For example:

>> x = ERB::Util.html_escape "<html>"
=> "&lt;html&gt;"
>> dangerous = "&gt;<script>"
=> "&gt;<script>"
>> x.gsub!(/&gt;/, dangerous)
=> "&lt;html&gt;<script>"
>> x
=> "&lt;html&gt;<script>"
>> x.class
=> ActiveSupport::SafeBuffer
>> x.html_safe?
=> false

You can see that the string in dangerous has been embedded in to the
SafeBuffer without being escaped. The class of x is still SafeBuffer, but
calling html_safe? will return false. As you can see, the return value of
html_safe? is not a 1:1 relationship with the class.

Concatenating to an infected SafeBuffer

Our infected SafeBuffer still supports concatenation:

>> x
=> "&lt;html&gt;<script>"
>> x.class
=> ActiveSupport::SafeBuffer
>> x.html_safe?
=> false
>> x.concat "<blink>"
=> "&lt;html&gt;<script><blink>"

But you can see that the “<blink>” string was not escaped. This means that the
concat behavior on a SafeBuffer changes depending on the value of html_safe?. If you look at the implementation of concat,
along with it’s helper method,
you can indeed see that this is true.

What is OutputBuffer?

OutputBuffer is a buffer that is fed to Rack and then output to the client
over the wire. Templates instantiate an OutputBuffer and concatenate computed
output to the buffer.

Impact of SafeBuffer on OutputBuffer

How does SafeBuffer impact OutputBuffer? Let’s take a look at a compiled ERB
template to find out. Here is an ERB template after it’s been compiled to
Ruby:

@output_buffer = output_buffer || ActionView::OutputBuffer.new
@output_buffer.safe_append='      <tr>
        <td>'.freeze
@output_buffer.append=( book.name )

The output isn’t pretty, but you can see that we essentially call two methods on
the output buffer: append=, and safe_append= (why we have an = is a
mystery for the ages). append= is used when there are dynamic values like
<%= link_to ... %>, and safe_append= is used for string literals in your
template.

Let’s look at the implementation of
safe_append=
:

def safe_concat(value)
  return self if value.nil?
  super(value.to_s)
end
alias :safe_append= :safe_concat

First, notice that value is never nil. The ERB compiler ensures that fact, so the
value.nil? check is useless. However, this blurrrggghhh post is about how
YAGNI methods are hurting us. We do need to concatenate, so you are gonna
need this.

OutputBuffer is a subclass of SafeBuffer, and safe_concat supers in to
SafeBuffer. Let’s look at the implementation of safe_concat on
SafeBuffer
:

def safe_concat(value)
  raise SafeConcatError unless html_safe?
  original_concat(value)
end

safe_concat raises an exception if the current object is not html_safe?.
Every time our ERB template concatenates a string, it checks whether or not the
current object is “html safe”.

As we saw earlier, the only way to make an instance of a SafeBuffer not
html safe is by calling an unsafe method on that
instance
.

YAGNI method pain

Every time we concatenate to an OutputBuffer object, we’re forced to check a
flag. This flag is directly related to people calling gsub! on the
OutputBuffer. This bothers me because, as you can imagine, we concatenate on
to the OutputBuffer extremely frequently.

I’m betting that it’s extremely rare for someone to call gsub! on an
OutputBuffer. I’m also willing to bet that people calling gsub! on an
OutputBuffer would be willing to call to_s on it before doing their
mutation, so why do we “support” this?

Our OutputBuffer is punishing most people for a feature that is extremely
rare. This is thanks to the features we got “for free” from our superclass.

Secondly, if you look in the same file that defines OutputBuffer, you’ll see
a class called
StreamingBuffer
.
This class is meant to be a replacement for OutputBuffer, but it streams to
the client. Notice that this class does not inherit from SafeBuffer.
Further evidence that gsub! on OutputBuffer is a YAGNI method.

How do we fix this?

I think we can fix this by disconnecting OutputBuffer from it’s superclass. I
suspect that the methods people actually use on OutputBuffer are extremely
few. I’d also say that this class should be completely private. In other
words, we (the Rails team) should design our API such that people never actually
touch instances of the OutputBuffer.

If we disconnect OutputBuffer from it’s superclass, I think we can even do
some tricks with the ERB compiler to attain faster output.

Read more at the source