
DIHYPH / DITECT Exception Dictionary





D I H Y P H exception writing



An exception may have a minimum length of 3 and a maximum length of 49 characters.

Only letters (no special characters) are permitted in the first two positions
of an exception. In addition, - and ' are accepted in the second position
(e.g. A-bend or c'est).
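
The length and position rules above can be sketched as a small check. This is
a minimal Python sketch, not part of DIHYPH: the function name
check_dihyph_exception is hypothetical, and it assumes that the length limits
count every character of the entry as written, which the text does not state.

```python
MIN_LEN = 3
MAX_LEN = 49


def check_dihyph_exception(entry):
    """Return a list of rule violations found in a DIHYPH exception entry."""
    problems = []
    # Assumption: the limits apply to the entry exactly as written.
    if not MIN_LEN <= len(entry) <= MAX_LEN:
        problems.append("length must be 3 to 49 characters")
    # First position: letters only.
    if not entry[:1].isalpha():
        problems.append("first position must be a letter")
    # Second position: a letter, or one of - and '.
    second = entry[1:2]
    if not (second.isalpha() or second in ("-", "'")):
        problems.append("second position must be a letter, - or '")
    return problems
```

An empty list means that the entry passes these checks; other rules of
exception writing are not covered by the sketch.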

The abbreviation asterisk * may only be placed at the end of a word.
For an exception ending in an asterisk *, every hyphenation up to the * is
applied as specified. All subsequent characters of the word are processed by the
hyphenation logic, starting at the last hyphen found in the exception.
Example: lau-f* ==> lauf  lau-fe  lau-fen  lau-fend  lau-fen-den

but:            ==> lau-fsper-re (and other errors !!)

To avoid such errors caused by exception writing, the program logic of some
languages checks exceptions with fewer than two letters between the last hyphen
and the concluding *: it tests whether these letters are allowed at the start of
a syllable. If not, the last hyphen is corrected automatically:
          lau-fsperre ==> lauf-sperre   (has been corrected)

Example:  Kam-bo-d*   ==> Kam-bod-scha  (has been corrected)

    but:  Kam-bo-ds*  ==> Kam-bo-dscha  (has been accepted )

An exception not ending in * is found if the word to be hyphenated has the same
length as the exception or is up to three characters longer.
Example:  lau-f        ==>  lauf  lau-fe  lau-fen  lau-fend.

but not:                    laufende      (4 characters longer !).

An exception ending in ) applies only if its length equals the word length.
Example:  pre-cede)    ==>  pre-cede

but not:                    prec-e-dence,  pre-ced-ent
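
The three matching rules above (ending in *, ending in ), or no ending) can be
summarized in a short sketch. It is an illustration only, not the DIHYPH
implementation: the function name exception_applies is hypothetical, the
comparison is assumed to be case-sensitive, and only plain letters and hyphens
are handled.

```python
def exception_applies(exception, word):
    """Decide whether a DIHYPH exception is a candidate for `word`."""
    ending = exception[-1]
    # Letters covered by the exception: drop the ending sign and the hyphens.
    letters = exception.rstrip("*)").replace("-", "")
    if not word.startswith(letters):
        return False
    extra = len(word) - len(letters)
    if ending == "*":
        # The rest of the word is left to the hyphenation logic.
        return True
    if ending == ")":
        # Fixed length: the word must not be longer than the exception.
        return extra == 0
    # No ending sign: the word may be up to three characters longer.
    return extra <= 3
```

With the examples above, exception_applies("lau-f", "laufend") is true and
exception_applies("lau-f", "laufende") is false, while "pre-cede)" applies to
precede but not to precedence.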



The following characters are permitted in exceptions
Letters:

xXyY      All letters are allowed in both upper and lower case.

Special characters:

*     exception abbreviation :  lau-f*       ==>  lau-fen-den
)     exception length fixed :  lauf)        ==>  lauf   but: lau-fe
-     hyphen                 :  Tren-nun-gen ==>  Tren-nun-gen
+     hyphen with addition   :  Voll+a*      ==>  Voll-la-den...
#     hyphen with elimination:  Taxie#tje    ==>  Taxi-tje
.     any vowel              :  Zei-t.n      ==>  Zei-ten  Zei-tung
,     any consonant          :  lo-,en       ==>  lo-ben   lo-ten
'     apostrophe             :  c'est

Numbers 1 to 5 may be inserted instead of, or following the hyphen
in order to define the hyphenation and its quality ranking.
Example:                  auto1mo5bile

     or:                  auto-1mo-5bile

The hyphen characters + and # may not be replaced by quality ranking numbers.
Their quality ranking is determined automatically by the program logic.
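
The two notations for quality ranking numbers can be read with a small parser.
The following is a hedged Python sketch; the function name
parse_ranked_exception and the result layout (word, list of position and rank)
are assumptions made for illustration and are not taken from DIHYPH.

```python
def parse_ranked_exception(entry):
    """Return (word, breaks); breaks is a list of (position, rank) pairs.

    `position` is the number of letters before the hyphenation point;
    `rank` is 1 to 5, or None for a plain hyphen without a number.
    """
    letters = []
    breaks = []
    i = 0
    while i < len(entry):
        ch = entry[i]
        if ch == "-":
            nxt = entry[i + 1:i + 2]
            if nxt and nxt in "12345":
                # Number following the hyphen: auto-1mo-5bile
                breaks.append((len(letters), int(nxt)))
                i += 1  # skip the digit as well
            else:
                breaks.append((len(letters), None))
        elif ch in "12345":
            # Number instead of the hyphen: auto1mo5bile
            breaks.append((len(letters), int(ch)))
        else:
            letters.append(ch)
        i += 1
    return "".join(letters), breaks
```

Both auto1mo5bile and auto-1mo-5bile yield the word automobile with
hyphenation points after the 4th and 6th letter, ranked 1 and 5.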

Examples
Exception     Produces the following word splitting
------------------------------------------------

vill+ad       vill-laden   vill-ladung     ( 3rd  L  is added)
              but:         vil-la-den-de   ( > 3 characters follow !)
                           vel-la-den      ( other spelling !)

vall+ad*      vall-la-den-der              ( 3rd  L  is added,
                                            remainder hyph. by logic)

taxie#tje     taxi-tje                     ( elimination  of e )
              but:         te-xiet-je      ( other spelling !)

st.,,-        stopp-end    Stach-eln
              but:         spuc-ken        ( other spelling !)
                                           ( German c-k is changed
                                             to k-k by textsystem !)
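
How the placeholders . (any vowel) and , (any consonant) select words can be
shown with a simplified matcher. This Python sketch rests on assumptions: the
function name apply_exception is hypothetical, the vowel set is illustrative
only (the real set depends on the language), letters are compared without
regard to case, and the signs + # * ) and the ranking numbers are not handled.

```python
# Assumption: illustrative vowel set; the real set is language-dependent.
VOWELS = set("aeiouäöü")


def apply_exception(exception, word):
    """Insert the hyphens of `exception` into `word`; None if it does not match."""
    out = []
    i = 0  # index of the next unmatched character in `word`
    for p in exception:
        if p == "-":
            out.append("-")
            continue
        if i >= len(word):
            return None
        c = word[i]
        if p == ".":
            ok = c.lower() in VOWELS
        elif p == ",":
            ok = c.isalpha() and c.lower() not in VOWELS
        else:
            ok = p.lower() == c.lower()
        if not ok:
            return None
        out.append(c)
        i += 1
    # Characters beyond the exception are appended unchanged.
    return ("".join(out) + word[i:]).rstrip("-")
```

For example, apply_exception("st.,,-", "Stacheln") gives Stach-eln, while the
word spucken is not matched because its second letter differs.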

The special hyphen signs + and # are language-dependent.

Letter addition + only operates in languages with corresponding grammatical
rules, e.g. German.

Character elimination # only works in Dutch.




D I T E C T exception writing



Correct use of upper-case and lower-case initial letters is important.
In the first position of a word only letters are allowed.
In the second or any later position seven other characters are also allowed:
Blank   -   /   *   #   .   '

With the exception of the asterisk (*), dot (.) and slash (/), special characters
within words are treated as part of the word and have no special meaning.
If an abbreviation ends with a period, it must be stored that way; otherwise
DITECT cannot determine whether the period marks the end of an abbreviation
or the end of a sentence.

A word may be stored in the exception
file(s) for the following reasons:              Writing style
--------------------------------------          -------------

1. "abcd" is unknown to DITECT                  abcd

2. "crude"                 is unwanted          crude/
   "crude" or "crudely"    is unwanted          crude#
   "crude" or "crudeness"  is unwanted          crude*

3. "Photo"                 is unwanted          Photo/Foto/
   "Photo" or "Photos"     is unwanted          Photo/Foto/#
   "Photo" or "Photograph" is unwanted          Photo/Foto/*

4. Special expression "qrst" to be              qrst/uvwx/.
   replaced by "uvwx" automatically

Explanations
to 1:
Words unknown to DITECT are stored into exception file(s) one expression
per line.

to 2:
If an expression is unwanted by the user, or DITECT does not recognize it as
incorrect, the user may reject it by storing it in the exception dictionary with
a concluding * # or /
The part of the word preceding * or # may be a complete or an abbreviated word.
With ending # the abbreviation may be followed by at most 2 more letters, while
with ending * the number of following letters is unlimited, e.g.
crude# causes crudely to be marked as incorrect but not crudeness, while
crude* causes all words starting with crude to be marked as incorrect.
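
The three endings can be compared side by side in a short sketch. It is a
minimal Python illustration under assumptions, not DITECT code: the function
name is_rejected is hypothetical, and the limit for # is checked by length
only, without testing that the extra characters are letters.

```python
def is_rejected(entry, word):
    """Return True if `word` is marked as incorrect by a rejecting entry."""
    stem, ending = entry[:-1], entry[-1]
    if ending == "/":
        # Only the exact word is rejected.
        return word == stem
    if ending == "#":
        # The stem may be followed by at most 2 more characters.
        return word.startswith(stem) and len(word) - len(stem) <= 2
    if ending == "*":
        # Every word starting with the stem is rejected.
        return word.startswith(stem)
    # Entries without one of these endings do not reject anything.
    return False
```

With crude# the word crudely is rejected and crudeness is not; with crude*
both are rejected.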

to 3:
For every incorrect expression found by DITECT, a list of proposed substitute
words is displayed, and the user may click on one of them to replace it.
But
- when DITECT is unable to display the correct proposal or
- when the user doesn't wish to search a list of several proposals,
such a rejected expression (e.g. Photo) may be extended by a proposal
(e.g. Foto) like this: Photo/Foto/*
With ending * or # the proposal is expanded (if necessary) and is displayed in
the proposal list.
So, for example, for the text word Photoatelier the proposal Fotoatelier is displayed.
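
The expansion of a proposal can be sketched as follows. This is a hedged Python
illustration: the function name propose is hypothetical, entries are assumed to
have exactly the form wrong/proposal/ending, and endings other than an empty
ending, # and * are not handled.

```python
def propose(entry, word):
    """Return the expanded proposal for `word`, or None if the entry does not apply."""
    parts = entry.split("/")
    if len(parts) != 3:
        return None
    wrong, proposal, ending = parts
    if not word.startswith(wrong):
        return None
    rest = word[len(wrong):]
    if ending == "":
        applies = rest == ""          # exact word only
    elif ending == "#":
        applies = len(rest) <= 2      # at most 2 more characters
    elif ending == "*":
        applies = True                # any continuation
    else:
        return None                   # other endings are outside this sketch
    # The remainder of the text word is carried over to the proposal.
    return proposal + rest if applies else None
```

For the entry Photo/Foto/* and the text word Photoatelier the sketch returns
Fotoatelier, matching the example above.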

to 4:
Same as described in 3 above, but now the ending is a period (.) instead of an
asterisk (*), and the first expression "qrst" is automatically replaced with the
second expression "uvwx" by the publishing system.


Sample Exceptions       Meaning:
-----------------       -----------------------------------------------

Dr.                     Correct abbreviation of the word "Doctor"

pj's                    Correct abbreviation

Louisville              Correct name of city unknown to DITECT.

Photograph              Accepted word instead of the following

Photog*                 unwanted writing of "Photog..."

Kusine/Cousine/*        "Kusine" is unwanted,  "Cousine" is proposal.

am Besten/am besten/*   "am Besten" is incorrect, "am besten" proposed.

fc/fan club/.           "fc" is automatically replaced by "fan club" *)

Barbra Streisand        "Barbra" is allowed if followed by "Streisand"
-----------------       ----------------------------------------------- 
*) The replacement must be done by the publishing system, not by DITECT !




DIHYPH / DITECT exception dictionaries



During word processing several exception files may be used simultaneously.

The standard file is always searched first (unless switched off).

DIHYPH file names must always start with the letters dh, e.g. "dhspec".
The DIHYPH standard exception file used automatically is dhex??.txt

DITECT file names must always start with the letters dt, e.g. "dtspec".
The DITECT standard exception file used automatically is dtex??.txt
(?? = language-no.).


The special exception file is searched when a word could not be found in the
standard exception file.
The special exception file name is defined by the user and stored in the array
"exfile[ ]" by the text system (see: DHDEF.C or DHEXT.H), without extension.
The first special exception file name defined is preceded by the key character + or =:
"+filename"     Standard file plus this file are used.

"+"             Standard file only is used.

"=filename"     Only this file is used without standard file.

"="             No exception file is used. 

Several special exception files are separated by a space character:

"+file1 file2" Standard file + file1 + file2 are used in this sequence.

"=file1 file2" Only file1 and file2 are used.

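The key character and the file list can be split as in the following sketch.
It is written in Python purely for illustration (the array "exfile[ ]" itself
belongs to the C interface); the function name parse_exfile and the returned
pair are assumptions.

```python
def parse_exfile(value):
    """Return (use_standard, special_files) for an initial exfile value."""
    key, names = value[:1], value[1:].split()
    if key == "+":
        # Standard file first, then the special files in the given order.
        return True, names
    if key == "=":
        # Special files only; an empty list means no exception file at all.
        return False, names
    raise ValueError("value must start with + or =")
```

For example, parse_exfile("+file1 file2") returns (True, ["file1", "file2"]),
and parse_exfile("=") returns (False, []).
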
Switching exception files on or off during runtime is only possible if the special
exception file(s) are defined before the first call of the exception program, as follows:
"+file1 file2"

To change access to the base or special exception files, the following values may
be set in the array "exfile[ ]" during runtime after the first call:
"++"   Base plus special exception file(s) are used.

"=+"   Only the special exception file(s) are used.

"+"    Only the base exception file is used.

"="    No exception file is used.

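The effect of the four runtime values can be written as a lookup table. This is
a Python sketch for illustration only: the names RUNTIME_SWITCHES and
active_files are hypothetical, and the file name dhex01 in the usage note
assumes a two-digit language number.

```python
# switch value -> (use base file, use special files)
RUNTIME_SWITCHES = {
    "++": (True, True),
    "=+": (False, True),
    "+": (True, False),
    "=": (False, False),
}


def active_files(switch, base_file, special_files):
    """Return the exception files searched for a runtime switch value, in order."""
    use_base, use_special = RUNTIME_SWITCHES[switch]
    files = [base_file] if use_base else []
    if use_special:
        files += special_files
    return files
```

For example, active_files("=+", "dhex01", ["file1", "file2"]) returns
["file1", "file2"].
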
Every change of program (DIHYPH, DITECT or language) is treated as a first call.



DIHYPH / DITECT exception programs:



DMEXCAT.EXE creates an access catalog "file.CAT" for the text file "file.TXT".
It may be used for DIHYPH or DITECT exception files, and it handles
1-byte-code as well as Unicode files.

For Unicode, the following program modules are needed:
DMRDWTUC.C
DMCREPUC.C together with file
DMCREPUC

If only 1-byte-code is used, these three DM????UC files may be replaced by the
dummy file DMCREPUD.C

DMEXCAT is started with the following parameters:

Program call              Meaning
----------------------    ---------------------------------------------
DMEXCAT nn h              for DIHYPH ( b DHEXnn.TXT  =Default )
DMEXCAT nn t              for DITECT ( b DTEXnn.TXT  =Default )
DMEXCAT nn h b 1B-File    for DIHYPH with 1-Byte-Code special file
DMEXCAT nn t u UC-File    for DITECT with Unicode special file
DMEXCAT nn t 8 U8-File    for DITECT with Utf-8-code special file

         | | |   |_____   Name of special exception file (without .txt)
         | | |
         | | |_________   b = 1-Byte-Code exception file (default)
         | |              u = Unicode     exception file
         | |              8 = Utf-8 code  exception file
         | |
         | |___________   h = DIHYPH exception file
         |                t = DITECT exception file
         |
         |_____________   nn= language-no. 1-99
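
A calling program could assemble the parameter list as in the sketch below. It
is a Python illustration under assumptions: the function name dmexcat_args is
hypothetical, the language number is written with two digits because the table
shows nn (the text does not say whether a leading zero is required), and the
file name in the usage note is invented.

```python
def dmexcat_args(language_no, program, code=None, special_file=None):
    """Build the DMEXCAT argument list from the parameters in the table above."""
    if not 1 <= language_no <= 99:
        raise ValueError("language-no. must be 1-99")
    if program not in ("h", "t"):
        raise ValueError("program must be h (DIHYPH) or t (DITECT)")
    # Assumption: two-digit, zero-padded language number.
    args = ["DMEXCAT", "%02d" % language_no, program]
    if special_file is not None:
        code = code or "b"  # b = 1-byte code is the default
        if code not in ("b", "u", "8"):
            raise ValueError("code must be b, u or 8")
        # The special file name is given without .txt
        args += [code, special_file]
    return args
```

For example, dmexcat_args(49, "t", "u", "myfile") returns
["DMEXCAT", "49", "t", "u", "myfile"].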

Before the catalog is created, DMEXCAT.EXE performs the following operations:
exceptions are checked automatically, and wrong words are marked with an error
message enclosed in ' ... '.
Finally an appropriate message is displayed.
Using a text editor's search function, the errors enclosed in ' can then be
jumped to directly and corrected (the 'error message' attached to a word is later
removed by the program automatically).

Duplicate exceptions are suppressed automatically by the program.

When no errors are found, the exceptions are stored in sorted order and a
direct-access catalog (filename.CAT) is created to be used by DIHYPH or DITECT.


DMEXINCT.C may be used by the publishing system to insert a new exception word
into the exception file immediately and to create the catalog automatically.
This tool must not be used by clients when the exception file is on the server!

The header lines of the DMEXINCT.C file show how to use the program.



