[texhax] Questions about Bibtex page range setting

Pierre MacKay pierre.mackay at comcast.net
Fri Jun 1 20:50:42 CEST 2012


On 06/01/2012 11:11 AM, Thomas Schneider wrote:
> Ruan:
>
>> Well, I do not just want to keep it in the database. I am seeking
>> for a method to generate a single dash in my final output, since
>> latex will always put two dashes in the pdf file. Do you have any
>> suggestions?
> Not being an expert on these things, I created a PDF with an entry
> from a double dash in the bib database.  This comes out as a single
> dash when cutting and pasting.  It's a Unicode double-byte character.
> It does NOT look like two dashes in Skim on Mac OS X.  I get 3 Unicode
> bytes cutting from an Acrobat display.  In Acrobat it looks and
> handles like one character.
>
>
The difference in byte lengths for U+2012 Figure Dash (or U+2013 En
Dash; the two are distinguished by function rather than appearance)
arises because the Mac stubbornly insists on continuing to use 16-bit
UTF-16, while most of the rest of the world has adopted the more
general UTF-8.  Either dash is a single character: two bytes in
UTF-16, three in UTF-8.  UTF-8 works on just about anything; it uses
one byte for ASCII, two or three for most other characters, and is
entirely capable of moving on to four.

I prefer to use Figure Dash for a range such as 2--3 (this mailer will
not code for either figure dash or en dash), but it is unfortunately
not widely in fashion.

All Mac characters, until you get into the stratospheric limits of
Unicode, fit in a single 16-bit unit, but that also means that every
ASCII character carries a 00000000 byte along with it, which doubles
the length of an ordinary non-accented Latin text.  UTF-8 lets one
avoid the sometimes problematic appearance of a 00000000 byte in text.
UTF-8 coding is darned near brilliant.
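
For anyone who wants to check the byte counts discussed above, here is
a minimal Python 3 sketch.  The helper name encoded_lengths is my own,
purely for illustration, and I am assuming big-endian UTF-16 without a
byte-order mark (the "utf-16-be" codec), so that the mark does not
inflate the counts.

```python
import unicodedata


def encoded_lengths(text):
    """Return (UTF-8 byte count, UTF-16 byte count) for a string.

    Illustrative helper, not part of any library. "utf-16-be" is used
    so that no byte-order mark is added to the UTF-16 result.
    """
    return len(text.encode("utf-8")), len(text.encode("utf-16-be"))


# Figure dash, en dash, ASCII hyphen, an ASCII letter, and one character
# from beyond the 16-bit range (U+1D11E MUSICAL SYMBOL G CLEF).
for ch in ("\u2012", "\u2013", "-", "A", "\U0001D11E"):
    utf8_len, utf16_len = encoded_lengths(ch)
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: "
          f"UTF-8 {utf8_len} bytes, UTF-16 {utf16_len} bytes")

# Plain ASCII text: UTF-16 doubles the length and introduces zero bytes,
# while UTF-8 leaves it unchanged and contains no zero bytes.
ascii_text = "pages 2-3"
as_utf8 = ascii_text.encode("utf-8")
as_utf16 = ascii_text.encode("utf-16-be")
print(len(as_utf8), len(as_utf16))          # 9 18
print(b"\x00" in as_utf8, b"\x00" in as_utf16)  # False True
```

Both dashes come out at three bytes in UTF-8 and two in UTF-16, which
matches the two counts reported from Skim and Acrobat; the G clef needs
four bytes in either encoding.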

Pierre MacKay

