The German dictionaries from OpenOffice.org lack the compound flags needed to recognise words that are compounds of other words.
German relies heavily on compound words such as "Haustür", which tsearch needs to lexize
as "'tür':1 'haus':1 'haustür':1" so that a search for 'tür' also yields 'haustür'.

Additionally, German has another quirk: the "Fugen-S", an "s" that gets inserted between the parts of a compound word, as in "Vertragsstrafe".
Other such insertions are "n" ("Küchenmesser"), "es" ("Haaresbreite"), and "en" ("Bärendienst").
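
To illustrate the insertions listed above, here is a minimal Python sketch that splits a compound into two known words with an optional linking element in between. It is not taken from tsearch or compound.pl; the names split_compound, LINKERS and known are hypothetical, and the word list is a toy one built from the examples in this README.

```python
# Hypothetical helper for illustration only; not part of tsearch or compound.pl.
# "" stands for plain concatenation, the rest are the insertions named above.
LINKERS = ("", "s", "n", "es", "en")


def split_compound(word, known_words):
    """Return every (head, linker, tail) split of `word` into two known words."""
    word = word.lower()
    splits = []
    for i in range(1, len(word)):
        head = word[:i]
        if head not in known_words:
            continue
        rest = word[i:]
        for linker in LINKERS:
            tail = rest[len(linker):]
            if rest.startswith(linker) and tail in known_words:
                splits.append((head, linker, tail))
    return splits


# Toy word list assembled from the examples in this README.
known = {"haus", "tür", "vertrag", "strafe", "küche", "messer",
         "haar", "breite", "bär", "dienst"}

for compound in ("Haustür", "Vertragsstrafe", "Küchenmesser",
                 "Haaresbreite", "Bärendienst"):
    print(compound, split_compound(compound, known))
```

With this toy list, "Vertragsstrafe" splits into "vertrag" + "s" + "strafe". A real dictionary is far larger and may produce several candidate splits per word.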

compound.pl adds the /_ flag to all words that are parts of compound words.
It additionally adds one of the flags x, y, z for the insertions described above; see the code for more info.
It goes through the input dictionary word by word and checks whether each entry is part of another word,
e.g. 'tür' will be found in 'haustür', thus creating the entry tür/_ or vertrag/_x in the dictionary.
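
The flagging pass described above could look roughly like the following Python sketch. This is not the actual compound.pl code (which is Perl), only an illustration under stated assumptions: the function name flag_compound_parts is hypothetical, the mapping "s" -> "x" is inferred from the vertrag/_x example, and the flags for the other insertions are left out because this README does not say which letter belongs to which insertion. Affix flags already present on real dictionary entries are ignored here.

```python
# Assumption: "s" -> "x" is inferred from the vertrag/_x example in the README.
# The letters used for "n", "es" and "en" are not stated, so they are omitted.
INSERTION_FLAGS = {"s": "x"}


def flag_compound_parts(words, insertion_flags=None):
    """Append '/_' (plus insertion flags) to words that occur inside other words."""
    if insertion_flags is None:
        insertion_flags = INSERTION_FLAGS
    lowered = [w.lower() for w in words]
    known = set(lowered)
    entries = []
    for word in lowered:
        is_part = False
        extra = set()
        for other in lowered:
            if other == word or word not in other:
                continue
            is_part = True  # word is contained in another dictionary entry
            if other.startswith(word):
                rest = other[len(word):]
                # Is the word followed by a linking element and a known word?
                for linker, flag in insertion_flags.items():
                    if rest.startswith(linker) and rest[len(linker):] in known:
                        extra.add(flag)
        if is_part:
            entries.append(word + "/_" + "".join(sorted(extra)))
        else:
            entries.append(word)
    return entries


sample = ["haus", "tür", "haustür", "vertrag", "strafe", "vertragsstrafe"]
print(flag_compound_parts(sample))
# ['haus/_', 'tür/_', 'haustür', 'vertrag/_x', 'strafe/_', 'vertragsstrafe']
```

The nested loop is quadratic in the dictionary size and a plain substring test will also flag accidental matches, so this only shows the idea, not a practical implementation.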

Usage:
compound.pl -i german.dic -o german.dict

where german.dic is the dictionary file from OpenOffice.org processed by my2ispell and german.dict is the compound-extended version.

The additional files are:
german.stop - Stop word file we use
german.aff - Affix file with the correct compound marker and suffix flags for compound insertions
german.dict - Output of compound.pl