(PUP-3558) Fix offset error of '[' lexing when multibyte chars are used
If multibyte characters appeared anywhere before a '[' that should have
been lexed as LBRACK, the difference between byte position and char
position could make the lexer look at the wrong position for the
character preceding the '['. If the character at that wrong position was
whitespace, the lexer would emit a LISTSTART token instead of LBRACK.
The correct way to translate a byte offset to a char offset is to use
the locator.char_offset(byte_offset) method (since the translation
varies depending on Ruby version).
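To illustrate how the two offsets diverge, here is a minimal sketch in
plain Ruby. The helper char_offset_for is hypothetical and only stands in
for the idea of a byte-to-char translation; it is not Puppet's locator
implementation. The sample source string is also made up for the example.

```ruby
# Hypothetical helper, not Puppet's locator: counts the characters that are
# encoded by the first byte_offset bytes of source.
def char_offset_for(source, byte_offset)
  source.byteslice(0, byte_offset).length
end

# "\u20AC" (the euro sign) occupies three bytes in UTF-8 but is one character.
source = "$v = '\u20AC'\n$x[ 0]"

# Offset of '[' counted in bytes (String#b gives a binary copy of the string).
byte_offset = source.b.index('[')
# The same position counted in characters.
char_offset = char_offset_for(source, byte_offset)

# Byte offset misused as a char index: lands on the space after '['.
wrong_preceding = source[byte_offset - 1]
# Translated offset: lands on the character that really precedes '['.
right_preceding = source[char_offset - 1]

puts byte_offset              # 13
puts char_offset              # 11
puts wrong_preceding.inspect  # " "
puts right_preceding.inspect  # "x"
```

With a single three-byte character before the '[', the untranslated
lookup is off by two characters and finds whitespace where there is none.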
Depending on the surrounding input, the result could be a syntax error
on the '[' because a literal list was not accepted in that position, or
worse, an access expression (x[y]) could be broken apart into two
expressions: a value followed by a literal list. The latter could go
unnoticed except for producing the wrong result.
This commit obtains the char offset from the locator instead of using
the byte offset as a char index when looking at the character preceding
a '['.
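The effect of the change on the token choice can be sketched as follows.
This is an illustration under stated assumptions, not the lexer's actual
code: bracket_token, char_offset_for, and the translate: flag are
hypothetical names, and treating a '[' at the very start of the input as
LISTSTART is an assumption made for the sketch.

```ruby
# Hypothetical stand-in for a byte-to-char translation (not Puppet's locator).
def char_offset_for(source, byte_offset)
  source.byteslice(0, byte_offset).length
end

# Returns :LISTSTART when the character before '[' is whitespace, else :LBRACK.
# translate: false mimics the old behaviour (byte offset used as a char index).
# Assumption: a '[' at the very start of the input counts as :LISTSTART.
def bracket_token(source, byte_offset, translate: true)
  index = translate ? char_offset_for(source, byte_offset) : byte_offset
  return :LISTSTART if index.zero?

  /\s/.match?(source[index - 1].to_s) ? :LISTSTART : :LBRACK
end

# One three-byte character ("\u20AC") precedes the access expression $x[ 0].
source = "$v = '\u20AC'\n$x[ 0]"
byte_offset = source.b.index('[')

old_token = bracket_token(source, byte_offset, translate: false)
new_token = bracket_token(source, byte_offset)

puts old_token  # LISTSTART
puts new_token  # LBRACK
```

Without the translation, the access expression $x[ 0] would be split into
a value followed by a literal list, which is the failure described above.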