LaTeX and \l and generated pdf and suppressing chars
Norbert Preining
norbert at preining.info
Thu Jun 20 16:37:21 CEST 2024
Hi Ulrike,
On Thu, 20 Jun 2024, Ulrike Fischer wrote:
> Well I do with Adobe reader on windows. Copy&paste is a processor
> feature, and the exact result depends on the viewer you use. They
Ouch indeed. Interesting to see that okular does the "right"
thing in this case.
> Yes I know this. But you have to accept reality: the PDF has a flaw
> and copy&paste won't work correctly. So how are you hoping to repair
> that without changing the encoding?
I cannot repair 2.5M papers from the past. But I can make text
extraction recognize the "suppress" special name and deal with it ;-)
I am not really concerned about copy/paste, but text extraction via
pdfminer, which parses the PDF.
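A minimal sketch of the kind of handling I have in mind, assuming the
extractor can hand over the sequence of glyph names for a text run. It
relies on OT1 building \l as the "suppress" glyph followed by a plain l.
The function name, the name_to_char table, and the idea of working on
glyph names are illustrative assumptions, not pdfminer's actual API.

```python
# Hypothetical post-processing step; none of these names come from pdfminer.
SUPPRESS = "suppress"

# In OT1, \l and \L are typeset as the "suppress" glyph followed by l or L.
COMBINED = {"l": "\u0142", "L": "\u0141"}  # ł, Ł


def glyph_names_to_text(names, name_to_char=None):
    """Turn a sequence of glyph names into text, resolving "suppress".

    name_to_char is an assumed lookup for multi-letter glyph names
    (for example {"space": " "}); single-character names map to themselves,
    and unknown multi-letter names are dropped.
    """
    name_to_char = name_to_char or {}
    out = []
    pending = False  # True right after a "suppress" glyph
    for name in names:
        if name == SUPPRESS:
            pending = True
            continue
        ch = name_to_char.get(name, name if len(name) == 1 else "")
        if pending and ch in COMBINED:
            ch = COMBINED[ch]
        # A "suppress" not followed by l or L is silently discarded.
        pending = False
        out.append(ch)
    return "".join(out)


print(glyph_names_to_text(["W", "a", "suppress", "l", "e", "s", "a"]))
```

Dropping a stray "suppress" instead of emitting a placeholder is a
choice made for the sketch; a real fix might want to log such cases.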
Thanks Ulrike, that all helped me a lot to understand a few of the moving
parts. I guess my next step is forking pdfminer(.six) (whose development
seems to be dead anyway) and trying to fix what I see.
Thanks again, and all the best
Norbert
--
PREINING Norbert https://www.preining.info
arXiv / Cornell University + IFMGA Guide + TU Wien + TeX Live
GPG: 0x860CDC13 fp: F7D8 A928 26E3 16A1 9FA0 ACF0 6CAC A448 860C DC13