Small performance improvements for splitting
Compiling the pattern gives us a ~20% performance boost for large split
operations (literal with CRLF).
Not using split_binary seems to perform more consistently (and is
recommended by the Erlang Efficiency Guide).
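
For illustration, a minimal sketch of splitting at a known offset with
binary pattern matching instead of split_binary/2. The module and
function names (split_at_sketch, split_at/2) are hypothetical and not
taken from this change, and this may not be the exact replacement used
here.

```erlang
-module(split_at_sketch).
-export([split_at/2]).

%% Hypothetical helper: split Bin at byte offset Pos using binary
%% pattern matching rather than the split_binary/2 BIF.
split_at(Bin, Pos) when is_binary(Bin), is_integer(Pos),
                        Pos >= 0, Pos =< byte_size(Bin) ->
    %% Head takes the first Pos bytes, Tail takes the remainder.
    <<Head:Pos/binary, Tail/binary>> = Bin,
    {Head, Tail}.
```

For in-range offsets this returns the same {Head, Tail} tuple that
split_binary(Bin, Pos) would.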
I tried binary:split again, but only managed to produce a more
complicated and barely faster solution. The split itself is very fast
(a factor of 10 faster), but we then have to reattach the CRLFs, which
eats up all of the gains. (It's not clear why binary:split can't split
without losing the separator, but that seems to be the case.)
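
A rough sketch of the binary:split approach described above, to show
where the extra work comes from. It assumes the pattern is compiled with
binary:compile_pattern/1 (the message does not say which compile call is
used), and the module and function names are hypothetical.

```erlang
-module(split_sketch).
-export([split_keep_crlf/1]).

%% Hypothetical helper: split on CRLF with a precompiled pattern, then
%% put back the separator that binary:split/3 drops from each part.
split_keep_crlf(Bin) when is_binary(Bin) ->
    %% In real code the compiled pattern would be created once and
    %% reused across calls; it is built here to keep the sketch short.
    Pattern = binary:compile_pattern(<<"\r\n">>),
    Parts = binary:split(Bin, Pattern, [global]),
    reattach(Parts).

%% binary:split/3 always returns at least one part. The last part was
%% not followed by CRLF in the input, so it is returned unchanged.
reattach([Last]) ->
    [Last];
reattach([Part | Rest]) ->
    %% Building a new binary per part is the cost that offsets the
    %% speed of the split itself.
    [<<Part/binary, "\r\n">> | reattach(Rest)].
```

With this sketch, split_keep_crlf(<<"a\r\nb">>) returns
[<<"a\r\n">>, <<"b">>]. Input that ends in CRLF yields a trailing empty
binary, which a caller would probably want to drop.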