ICU4C, FreeBSD 5.3, U_REGEX_MISMATCHED_PAREN: A Fix
I've come up with a fix for the U_REGEX_MISMATCHED_PAREN problem in ICU running on FreeBSD 5.3. I removed an entire type, and some casts to that type. I've tested on FreeBSD, Red Hat 9, and Ubuntu Dapper Drake, all passing the included tests with flying colors.
I haven't figured out why yet, but the cast of doOpenNonCaptureParen to EParseAction at i18n/regexcmp.cpp:345 was yielding 0. While digging into this, I discovered that EParseAction was only used as the type for the single parameter to RegexCompile::doParseActions. (It's used in rbbiscan.cpp, too, but that's unrelated code, with a separate definition of the enum.) To reduce complexity, I tried passing an int32_t to doParseActions and stripping out the casts. That did the trick.
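For anyone who wants the shape of the change without reading the patches, here is a rough sketch. It is not the actual ICU code: the enum name ParseActionTable, the numeric values, and the string return type are all made up for illustration. Only doOpenNonCaptureParen, doParseActions, and int32_t come from the description above.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for the action constants; the real ICU table and
// its values are different.
enum ParseActionTable {
    doOpenNonCaptureParen = 1,  // illustrative value only
    doMismatchedParenErr  = 2   // illustrative name and value
};

// Before (roughly): the parameter had its own enum type, EParseAction, and
// every caller cast the action constant to that type.
// After: the parameter is a plain int32_t, so no cast is needed. An unscoped
// enum value converts to an integer implicitly.
std::string doParseActions(int32_t action) {
    switch (action) {
    case doOpenNonCaptureParen:
        return "open non-capturing group";
    case doMismatchedParenErr:
        return "mismatched paren error";
    default:
        return "unknown action";
    }
}

// Example caller: note the absence of any cast at the call site.
std::string compileOneAction() {
    return doParseActions(doOpenNonCaptureParen);
}
```

The point is only that the call site passes the constant straight through as an integer, so there is no intermediate enum type left for the cast to go wrong on.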
I assume that in 3.8, there will be a more comprehensive solution, but for now, this works.
The patches:
Comments: 1
Weekend Links of Interest
The ICU-Project has opened a ticket for the regex test failure problem I stumbled over back in October.
Mari of Watashi to Tokyo posted some links regarding shojin ryori, a Japanese style of food preparation founded on Buddhist teachings. Tenzo kyokun is a set of instructions from Dogen, founder of the Soto sect of Zen, to the cook at a Zen monastery. Fushukuhanpo is a set of directives, also from Dogen, on the method of taking meals. (There seems to be a lot of folderol involved in eating, when the Way of Eating should be to eat.) Most interesting, though, is the blog A Zen Priest's Kitchen. It's in Japanese, but it looks like fun to try to figure out the recipes.
Comments: 0
ICU Won't Build
Yesterday I embarked on bringing musashi up to date in preparation for moving the mail server. This is part of Operation: My Freaking Inbox. Musashi is a FreeBSD machine, so after doing a cvsup, I ran portupgrade. I haven't worried about updating the entire system before now, since musashi wasn't doing anything more than pushing packets back and forth. The upgrade ran all day, and when it was done only about 42 of the 90 ports it worked on had updated successfully. The others had been skipped or errored out, due largely, it seems, to a test failure in the regex part of the International Components for Unicode, an open source library created by IBM.
For each test in the regex suite, when ICU tries to open the regular expression using uregex_open, the status flag comes back as U_REGEX_MISMATCHED_PAREN. Unfortunately, no one seems to have had this problem before; Google knows nothing of it, and there are no mentions in the ICU or FreeBSD mailing lists.
I downloaded the tarball from SourceForge, and it shows the same problem:
[michael@musashi intltest]$ LD_LIBRARY_PATH=../../lib:../../stubdata:../../tools/ctestfw:$LD_LIBRARY_PATH ./intltest -v regex
-----------------------------------------------
IntlTest (C++) Test Suite for
International Components for Unicode 3.6
-----------------------------------------------
Options:
all (a) : Off
Verbose (v) : On
No error messages (n) : Off
Exhaustive (e) : Off
Leaks (l) : Off
Warn on missing data (w) : Off
-----------------------------------------------
=== Handling test: regex: ===
TestSuite Regex---
TestSuite RegexTest:
RegexTest failure in RegexPattern::compile() at line 451. Status = U_REGEX_MISMATCHED_PAREN
RegexTest failure in RegexPattern::compile() at line 452. Status = U_REGEX_MISMATCHED_PAREN
RegexTest failure in RegexPattern::compile() at line 453. Status = U_REGEX_MISMATCHED_PAREN
[...]
RegexTest failure in RegexPattern::compile() at line 508. Status = U_REGEX_MISMATCHED_PAREN
RegexTest failure in RegexPattern::compile() at line 513. Status = U_REGEX_MISMATCHED_PAREN
RegexTest failure in RegexPattern::compile() at line 514. Status = U_REGEX_MISMATCHED_PAREN
RegexTest failure in RegexPattern::compile() at line 515. Status = U_REGEX_MISMATCHED_PAREN
Inspection of the code yields nothing obvious. Time to send a message to icu-support...
Comments: 0