[texhax] makeindex breaks up index group on a capitalized entry
geolsoft at mail.ru
Mon Aug 23 15:10:26 CEST 2004
On Mon, Aug 23, 2004 at 12:57:23PM +0200, Stepan Kasal wrote:
> I have extracted the makeindex sources from the huge
> tetex-src-2.0.2.tar.gz, so if you want to experiment with the source,
> you can get it---it's just 61 KB, and makeindex is very independent
> of other tetex components.
and later:
> You can try this:
>
> #define TOLOWER(C) ( (unsigned char)tolower((unsigned char)(C)) )
>
> and thus eliminate the use of isupper().
>
> Or you can hardcode your own version of TOLOWER or first_letter,
> suitable for KOI8.
> This way, you should get a working makeindex program.
I did that and experimented a bit. I had actually
written a small test program:
#include <ctype.h>
#include <stdio.h>
#include <locale.h>

int main(void)
{
    setlocale(LC_ALL, "");
    putchar(tolower('M'));
    return 0;
}
where the parameter to tolower() was actually an uppercase
Cyrillic `M', and it did downcase it (with
LC_ALL=ru_RU.KOI8-R). But when I added:
#define TOLOWER(C) ( (unsigned char)tolower((unsigned char)(C)) )
to the makeindex source, as you recommended, it still did
not help. So I checked the makeindex source you provided,
and this is what I found:
- sortid.c sets current locale only temporarily (for
LC_COLLATE) while doing the sorting, and resets `old'
locale thereafter;
- genind.c does not set the locale at all when generating
the output.
Thus, tolower() does not see the user's locale during index
generation, so the Cyrillic capitals are not downcased, hence
the breaking of the group.
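To make the effect concrete, here is a minimal sketch, not taken from the makeindex source. The helper name lower_in_locale is hypothetical, and I am assuming the KOI8-R byte values 0xED for uppercase Cyrillic `M' and 0xCD for the lowercase one. It shows that tolower() returns a non-ASCII byte unchanged under the "C" locale, and can only map it when LC_CTYPE names a suitable locale that is installed on the system:

```c
#include <ctype.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper (not part of makeindex): lowercase one byte
   under the LC_CTYPE locale named by locale_name.  If that locale
   cannot be set, the byte is returned unchanged. */
static int lower_in_locale(const char *locale_name, unsigned char byte)
{
    int result;

    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return byte;            /* locale not available on this system */

    result = tolower(byte);     /* mapping depends on LC_CTYPE */

    setlocale(LC_CTYPE, "C");   /* return to the program's startup default */
    return result;
}
```

Calling lower_in_locale("C", 0xED) gives back 0xED, while lower_in_locale("ru_RU.KOI8-R", 0xED) should give 0xCD on a system where that locale exists.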
Below I include the patch which seems to solve the
problem for me. Note that it changes the locale for LC_CTYPE
and then resets it back for each of the index entries. I am
not sure whether the locale switch could be moved higher up
(so that it runs once rather than per entry) without breaking
something else. Could you please look at it, or recommend
somebody else to whom I should submit it?
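For reference, the save-and-restore pattern can be sketched in isolation as below. This is only an illustration under my own assumptions, not code from makeindex, and the helper name tolower_env_locale is hypothetical. It relies on two facts about setlocale(): a NULL second argument queries the current locale without changing it, and the returned string may be overwritten by a later setlocale() call, so it is copied before switching:

```c
#include <ctype.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of makeindex): lowercase one byte
   using the LC_CTYPE locale from the environment, then put the
   previous LC_CTYPE locale back. */
static int tolower_env_locale(unsigned char byte)
{
    const char *current = setlocale(LC_CTYPE, NULL);  /* query only */
    char *saved = NULL;
    int result;

    /* Copy the name: a later setlocale() call may overwrite it. */
    if (current != NULL) {
        saved = malloc(strlen(current) + 1);
        if (saved != NULL)
            strcpy(saved, current);
    }

    setlocale(LC_CTYPE, "");    /* switch to the user's locale, if valid */
    result = tolower(byte);

    /* Restore only if the old name was saved successfully. */
    if (saved != NULL) {
        setlocale(LC_CTYPE, saved);
        free(saved);
    }
    return result;
}
```

After tolower_env_locale() returns, setlocale(LC_CTYPE, NULL) reports the same locale as before the call, so code that depends on the previous setting is not affected.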
Many thanks,
Oleg Katsitadze
P.S. I also posted this message to tex-eplain at tug.org,
tex-k at tug.org, and texhax at tug.org, in case somebody has any
comments to make.
--- makeindexk/genind.c 2002-10-02 15:19:22.000000000 +0300
+++ makeindexk.new/genind.c 2004-08-23 15:59:36.000000000 +0300
@@ -28,6 +28,10 @@
#include "mkind.h"
#include "genind.h"
+#ifdef HAVE_LOCALE_H
+#include <locale.h>
+#endif
+
static FIELD_PTR curr = NULL;
static FIELD_PTR prev = NULL;
static FIELD_PTR begin = NULL;
@@ -219,6 +223,10 @@
{
int let;
FIELD_PTR ptr;
+#ifdef HAVE_SETLOCALE
+ char *prev_locale;
+ prev_locale = setlocale(LC_CTYPE, "");
+#endif
if (in_range) {
ptr = curr;
@@ -246,6 +254,10 @@
make_item(NIL);
} else
make_item(delim_t);
+
+#ifdef HAVE_SETLOCALE
+ setlocale(LC_CTYPE, prev_locale);
+#endif
}