The Great Newline Schism

LOVE this exploration of fiddly implementation gotchas. Thanks!

I happen to have both Word '08 and Pages '09 open (backward compatibility vs. usability), so I decided to see how they responded to all the nominally invisible Unicode ‘separator’ glyphs.

Pages doesn’t respond at all to the inscrutable ‘Information Separator’ glyphs (U+001C through U+001F), but it does interpret ‘Line Separator’ (U+2028) and ‘Paragraph Separator’ (U+2029) as whatever iWork apps use internally for a line break and a new paragraph. (Too lazy to find that out at the moment.)

Word 2008, on the other hand, renders two of the Information Separators as box or hyphen glyphs, but ignores Line Separator, Paragraph Separator, and the other two Information Separators.

BTW…

See http://unicode.org/standard/reports/tr13/tr13-5.html for converting between Unicode and non-Unicode line break schemes.
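For the curious, here is a minimal Python sketch of the simplest conversion in that direction: collapsing every line break to one convention on input. The function name and the exact set of characters treated as line breaks are my own choices for illustration, not something the report mandates.

```python
import re

# Characters treated as line breaks in this sketch (an assumption):
# CRLF, CR, LF, NEL (U+0085), Line Separator (U+2028), Paragraph Separator (U+2029).
# CRLF is listed first so it is replaced as one break rather than two.
_NEWLINE_RE = re.compile("\r\n|[\r\n\u0085\u2028\u2029]")

def normalize_newlines(text: str, newline: str = "\n") -> str:
    """Replace every recognized line break in text with a single convention."""
    return _NEWLINE_RE.sub(newline, text)

if __name__ == "__main__":
    sample = "one\r\ntwo\rthree\u2028four\u2029five"
    print(normalize_newlines(sample).split("\n"))
```

Note that this throws away the line-versus-paragraph distinction that U+2028 and U+2029 exist to preserve, so it only makes sense when the target format has a single kind of line break anyway.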

See http://bugs.python.org/msg97407 for discussion of Information Separators (before ignoring their insufficiently-specified butts forever).
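If you want to see how Python 3 itself treats these characters, a quick sketch along these lines should do it. The helper name and the list of candidates are mine; the only library behavior relied on is str.splitlines() and unicodedata.name().

```python
import unicodedata

# Code points discussed above: the four Information Separators plus LS and PS.
CANDIDATES = ["\x1c", "\x1d", "\x1e", "\x1f", "\u2028", "\u2029"]

def splits_line(ch: str) -> bool:
    """True if str.splitlines() treats ch as a line boundary."""
    return len(("a" + ch + "b").splitlines()) == 2

if __name__ == "__main__":
    for ch in CANDIDATES:
        # Control characters have no Unicode name, so fall back to a placeholder.
        name = unicodedata.name(ch, "<control>")
        print(f"U+{ord(ch):04X} {name}: splits={splits_line(ch)}")
```

Per the str.splitlines() documentation, U+001C, U+001D and U+001E count as line boundaries along with U+2028 and U+2029, while U+001F does not, which is about as tidy as everything else concerning the Information Separators.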