(#11246) Fix UTF-8 String#to_yaml exception
Without this patch, String instances with an encoding of UTF-8 cannot be
serialized to YAML. The method that converts non-ascii 8bit printable
bytes to escape sequences uses regular expressions to match specific
bytes in the string. This works fine in Ruby 1.8, but Ruby 1.9 adds
encodings to Regexp and String instances. This is a problem because
regular expressions of one encoding may not work with Strings of another
encoding.
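As a rough illustration of this kind of failure on Ruby 1.9 or later
(the string and the regular expression below are made up for the
example and are not taken from the serialization code itself):

```ruby
# -*- coding: utf-8 -*-
# Illustrative sketch only; `utf8` and `byte_regexp` are hypothetical.

utf8 = "caf\u00E9"            # UTF-8 string; the last character is bytes C3 A9
byte_regexp = /[\x80-\xFF]/n  # the /n flag makes this an ASCII-8BIT regexp

# Matching a byte-oriented regexp against a non-ASCII UTF-8 string raises.
begin
  utf8 =~ byte_regexp
  raised = false
rescue Encoding::CompatibilityError
  raised = true
end

# The same regexp works once it is given a byte-sequence view of the data.
binary  = utf8.dup.force_encoding("ASCII-8BIT")
escaped = binary.gsub(byte_regexp) { |b| "\\x%02X" % b.unpack("C")[0] }
```

Here `escaped` holds the ASCII-only text with each high byte written as
a `\xNN` escape, while `utf8` keeps its original encoding.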
This patch fixes the problem by ensuring that the regular expressions
used in the YAML serialization process, which clearly assume a byte
sequence rather than a character sequence, are always given an
ASCII-8BIT encoded string.
A new helper method String#to_ascii8bit is added by this patch to make
Ruby 1.8 support easier. Strings have no encoding in Ruby 1.8, so there
the helper method simply returns the string itself. If the String does
have an encoding, but it's already ASCII-8BIT, the helper also returns
the string itself.
If the String has an encoding other than ASCII-8BIT, we make a duplicate
of the string so we can force the encoding. The duplicate is necessary
so we don't change the encoding of the original string. For extremely
large strings the extra copy may be an issue and other strategies may be
necessary.
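A minimal sketch of a helper with the behavior described above. Only
the method name String#to_ascii8bit comes from this message; the body
is an assumption about how such a helper could be written, not a copy
of the patch:

```ruby
# -*- coding: utf-8 -*-
class String
  # Hypothetical sketch; the body is assumed, not taken from the patch.
  def to_ascii8bit
    # Ruby 1.8: strings carry no encoding, so there is nothing to do.
    return self unless respond_to?(:encoding)
    # Already a byte sequence: return the receiver without copying.
    return self if encoding == Encoding::ASCII_8BIT
    # Otherwise relabel a copy so the caller's string is left untouched.
    dup.force_encoding(Encoding::ASCII_8BIT)
  end
end

original = "caf\u00E9"           # UTF-8 encoded on Ruby 1.9 and later
bytes    = original.to_ascii8bit # ASCII-8BIT copy with the same bytes
```

Forcing the encoding only relabels the bytes and does not transcode
them, so `bytes` and `original` contain identical byte sequences.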