[Q] TFM files headers
Doug McKenna
doug at mathemaesthetics.com
Fri Sep 13 19:14:38 CEST 2019
Here's my take:
If there are extra 32-bit words in the header past the end of the standard TFM data, there is documentation for a block of Xerox PARC information, which starts with two BCPL identification strings stored in a 40-byte slot and then a 20-byte slot, respectively.
A BCPL string starts with a length byte that counts the remaining bytes in the string. Everything in the field after that number of bytes should be ignored, and can therefore be garbage or nulls. As far as I recall, there is no requirement that the padding be nulls (or that any TFM parsing code should care). From the point of view of data integrity, though, a null byte within the string is technically legal but not a particularly good idea. Also, both 40 and 20 are well below the 255 that a length byte can hold, so a bad length byte could be out of range. Regardless, a null in the legal part of the string will cause problems for any parser that tries to record it as a C string.
But all of that is only a problem if one assumes the extra data past what TFM cares about is in that Xerox-added format. I can't see any way to tell different private extended formats apart. I don't know of any other documented formats for this extra space, but that doesn't mean there aren't any.
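The length-byte handling described above could be sketched roughly as follows. This is only an illustration in Python, not code from any existing TFM tool: the function name read_bcpl_string and the allow_nul flag are made up for the example, and the caller is assumed to have already sliced out the 40-byte or 20-byte slot.

```python
def read_bcpl_string(slot: bytes, allow_nul: bool = False) -> bytes:
    """Return the payload of a BCPL string held in a fixed-width slot.

    Hypothetical helper: `slot` is the whole field (e.g. 40 or 20 bytes),
    whose first byte is the length of the string that follows.
    """
    if not slot:
        raise ValueError("empty slot")
    length = slot[0]
    capacity = len(slot) - 1  # bytes available after the length byte
    if length > capacity:
        # A length byte can hold up to 255, far more than either slot.
        raise ValueError(f"length byte {length} exceeds slot capacity {capacity}")
    payload = slot[1:1 + length]  # bytes past this point are ignored
    if not allow_nul and b"\x00" in payload:
        # Technically legal, but it would truncate a C string.
        raise ValueError("null byte inside the string payload")
    return payload


# Padding after the payload may be anything; it is never inspected.
slot = bytes([4]) + b"TEXT" + b"\xff" * 35
print(read_bcpl_string(slot))
```

Rejecting embedded nulls by default is a design choice for this sketch only; a parser that never converts to C strings could pass allow_nul=True.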
Doug McKenna
From: "Tom Rokicki" <rokicki at gmail.com>
To: "Didier Verna" <didier at didierverna.net>
Cc: "texhax" <texhax at tug.org>
Sent: Friday, September 13, 2019 10:19:20 AM
Subject: Re: [Q] TFM files headers
Oh, dang, looks like the TFM file format does not require bc <= ec.
The amsfonts/dummy example is loaded by TeX and also accepted
by tftopl. It does require, however, that bc <= ec+1 and that ec < 256.
The tftopl check reads as follows:
if (bc>ec+1)or(ec>255) then abort('The character code range ',
In TeX we see:
if (bc>ec+1)or(ec>255) then abort;
if bc>255 then {|bc=256| and |ec=255|}
begin bc:=1; ec:=0;
end;
That second part is interesting and unexpected; I believe it's there just so
that an empty tfm file that uses bc=256/ec=255 to say so can still
work in an environment where bc and ec are stored in 8-bit bytes.
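For anyone writing their own parser, the TeX check quoted above translates more or less directly. The following is a Python paraphrase, not TeX's actual code; the function name normalize_char_range is invented for the example.

```python
from typing import Tuple


def normalize_char_range(bc: int, ec: int) -> Tuple[int, int]:
    """Mirror the quoted TeX test on bc and ec (hypothetical helper)."""
    if bc > ec + 1 or ec > 255:
        raise ValueError(f"bad character code range: bc={bc}, ec={ec}")
    if bc > 255:
        # Only reachable with bc=256 and ec=255: an empty font.
        # Remap so that both values fit in 8-bit bytes.
        bc, ec = 1, 0
    return bc, ec


print(normalize_char_range(0, 127))    # ordinary range, returned unchanged
print(normalize_char_range(256, 255))  # empty font, remapped to (1, 0)
```

Note that bc = ec+1 passes the check, so an empty range such as bc=1/ec=0 is accepted as is.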
TeX also explicitly ignores extra stuff at the end:
@ We check to see that the \.{TFM} file doesn't end prematurely; but
no error message is given for files having more than |lf| words.
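A parser that wants to flag the over-long files rather than silently ignore them could compare the actual size with the declared one. The sketch below is Python and rests on the standard TFM layout, in which lf is the first 16-bit big-endian field and counts 32-bit words; the function name classify_tfm_length and its return strings are made up for the example.

```python
import struct


def classify_tfm_length(data: bytes) -> str:
    """Compare the real size of a TFM image with its declared lf (sketch)."""
    if len(data) < 24:
        # The twelve 16-bit size fields alone take 24 bytes.
        return "truncated"
    (lf,) = struct.unpack(">H", data[:2])  # file length in 32-bit words
    declared = 4 * lf
    if len(data) < declared:
        return "truncated"  # TeX treats a premature end as an error
    if len(data) > declared:
        return "trailing"   # extra words past lf; TeX stays silent
    return "exact"


# Usage, assuming `path` names a TFM file:
#     with open(path, "rb") as f:
#         print(classify_tfm_length(f.read()))
```

This checks only the overall length; it does not verify that lf agrees with the sum of the other size fields.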
-tom
On Fri, Sep 13, 2019 at 9:04 AM Tomas Rokicki <rokicki at gmail.com> wrote:
I also noticed the tfm files with the wrong length (extra stuff on the end).
While this is moderately unfortunate it's also not critical since all the
TFM parsers appear to read the TFM files front to back and just ignore
the junk.
I confirm (with my own TFM parser) that there are 2060 "tfm" files in the
distribution with bc > ec. They are in the following directories (with counts
of the "bad" files). These files may also have other issues (like bad
header lengths). My suspicion is nobody has ever (successfully) used
any of these fonts. A cursory test appears to show that TeX cannot load
these fonts either.
24 /usr/local/texlive/2019/texmf-dist/fonts/tfm/ptex-fonts/standard
12 /usr/local/texlive/2019/texmf-dist/fonts/tfm/ptex-fonts/nmin-ngoth
8 /usr/local/texlive/2019/texmf-dist/fonts/tfm/ptex-fonts/jis
4 /usr/local/texlive/2019/texmf-dist/fonts/tfm/ptex-fonts/dvips
1080 /usr/local/texlive/2019/texmf-dist/fonts/tfm/public/japanese-otf
522 /usr/local/texlive/2019/texmf-dist/fonts/tfm/public/japanese-otf-uptex
40 /usr/local/texlive/2019/texmf-dist/fonts/tfm/public/jlreq
1 /usr/local/texlive/2019/texmf-dist/fonts/tfm/public/amsfonts/dummy
24 /usr/local/texlive/2019/texmf-dist/fonts/tfm/public/zhmetrics-uptex
260 /usr/local/texlive/2019/texmf-dist/fonts/tfm/public/pxufont
12 /usr/local/texlive/2019/texmf-dist/fonts/tfm/public/hfoldsty
25 /usr/local/texlive/2019/texmf-dist/fonts/tfm/public/morisawa
40 /usr/local/texlive/2019/texmf-dist/fonts/tfm/uptex-fonts/jis
8 /usr/local/texlive/2019/texmf-dist/fonts/tfm/uptex-fonts/min
On Fri, Sep 13, 2019 at 4:13 AM Didier Verna <didier at didierverna.net> wrote:
"Taylor, P" <P.Taylor at rhul.ac.uk> wrote:
> What (if anything) does TFtoPL have to say about the files that your
> parser identifies as being non-compliant ? Philip Taylor
It agrees with me :-D
--
Resistance is futile. You will be jazzimilated.
Lisp, Jazz, Aïkido: http://www.didierverna.info
--
-- http://cube20.org/ -- http://golly.sf.net/ --