[texhax] Questions about Bibtex page range setting
Pierre MacKay
pierre.mackay at comcast.net
Fri Jun 1 20:50:42 CEST 2012
On 06/01/2012 11:11 AM, Thomas Schneider wrote:
> Ruan:
>
>> Well, I do not just want to keep it in the database. I am seeking
>> for a method to generate a single dash in my final output, since
>> latex will always put two dashes in the pdf file. Do you have any
>> suggestions?
> Not being an expert on these things, I created a PDF with an entry
> from a double dash in the bib database. This comes out as a single
> dash when cutting and pasting. It's a unicode double byte character.
> It does NOT look like two dashes in Skim on Mac OS X. I get 3 unicode
> bytes cutting from an Acrobat display. In acrobat it looks and
> handles like one character.
>
>
The difference in byte lengths for U+2012 Figure Dash (or U+2013 En
Dash; the two are distinguished by function rather than appearance)
comes from the encoding. Mac stubbornly insists on continuing to use
16-bit UTF-16, in which either dash takes two bytes, while most of the
rest of the world has adopted the more general UTF-8, which works on
just about anything. UTF-8 has some one-byte lengths, some two and a
great many three (both dashes among them), after which it is entirely
capable of moving on to four. I prefer to use Figure Dash for instances
like 2--3 (this mailer will not code for either figure dash or en dash),
showing a range, but it is unfortunately not widely in fashion. On the
Mac, all characters until you get into the stratospheric limits of
Unicode fit in a 16-bit length, but that also means that every ASCII
character carries a 00000000 byte along with it, which doubles the
length of an ordinary non-accented Latin text. UTF-8 allows one to
avoid the sometimes problematic appearance of a 00000000 byte in text.
UTF-8 coding is darned near brilliant.
Pierre MacKay
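
The byte counts discussed above can be checked with a short Python
sketch. This is only an illustration: the names `samples` and
`byte_lengths` are made up for the example, and big-endian UTF-16
without a byte-order mark is assumed so that the zero byte of an ASCII
character comes first.

```python
# Compare encoded lengths of the two dashes and a plain ASCII letter.
# Assumption: UTF-16 is taken as big-endian with no byte-order mark.
samples = {
    "FIGURE DASH": "\u2012",
    "EN DASH": "\u2013",
    "LATIN CAPITAL LETTER A": "A",
}


def byte_lengths(ch):
    """Return (utf8_length, utf16_length) in bytes for one character."""
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-be"))


for name, ch in samples.items():
    utf8_len, utf16_len = byte_lengths(ch)
    print(f"U+{ord(ch):04X} {name}: "
          f"UTF-8 {utf8_len} bytes, UTF-16 {utf16_len} bytes")

# The raw bytes show where the zero byte sits for an ASCII letter.
print("A".encode("utf-16-be"))
print("A".encode("utf-8"))
```

Either dash should come out as three bytes in UTF-8 and two in UTF-16,
which matches the "3 unicode bytes" and "double byte" observations in
the quoted message.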