DIHYPH / DITECT Unicode-Version




Character coding

The editable code table "dhco??" or "dtco??" is used (?? = language-no. 01-40).
DIHYPH and DITECT work with a 1-byte character code, so 2-byte UNICODE is
converted internally into ISO-8859 code by calling:

DMCREPUC (1, nn, ptu, pti);

If needed, the ISO code is converted back to UNICODE by calling the module:

DMCREPUC (2, nn, ptu, pti);
              |   |    |__ pointer to ISO-code string  (char)
              |   |_______ pointer to UNICODE    " (unsigned short)
              |___________ language-no. (e.g. 02 = UK-English)
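
The two conversion directions can be illustrated with a minimal sketch. It is
not the DMCREPUC source: the names uc_to_iso and iso_to_uc are hypothetical,
the strings are assumed to be 0-terminated and in host byte order, and only
ISO-8859-1 is handled, because its 256 codes equal the first 256 UNICODE code
points. The real module uses the language-no. and the code conversion file
instead of this fixed mapping.

```c
#include <stddef.h>

/* Hypothetical stand-in for direction 1 (UNICODE -> ISO).
   Assumption: ISO-8859-1 only; values above 0xFF become '?'.
   Returns the number of characters written, not counting the final 0. */
size_t uc_to_iso(const unsigned short *ptu, char *pti, size_t cap)
{
    size_t n = 0;
    if (cap == 0) return 0;
    while (ptu[n] != 0 && n + 1 < cap) {
        pti[n] = (ptu[n] <= 0xFF) ? (char)(unsigned char)ptu[n] : '?';
        n++;
    }
    pti[n] = '\0';
    return n;
}

/* Hypothetical stand-in for direction 2 (ISO -> UNICODE).
   Each ISO-8859-1 byte is widened to one unsigned short value. */
size_t iso_to_uc(const char *pti, unsigned short *ptu, size_t cap)
{
    size_t n = 0;
    if (cap == 0) return 0;
    while (pti[n] != '\0' && n + 1 < cap) {
        ptu[n] = (unsigned short)(unsigned char)pti[n];
        n++;
    }
    ptu[n] = 0;
    return n;
}
```

A character without a 1-byte equivalent cannot be restored by the reverse
conversion; in this sketch it comes back as '?'.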

When calling DIHYPH or DITECT, UNICODE strings must not start with a BOM (byte
order mark).
To tell the program whether a UNICODE string is 'LE' (little-endian byte order)
or 'BE' (big-endian byte order), the switch "unibom = 1;" is defined in DHDEF.C:
value 1 means 'BE' and value 2 means 'LE'.
Before implementing the software, be sure to set this switch correctly!
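
If the text arrives as raw bytes that may still carry a BOM, a check like the
following sketch can find the byte order and the number of bytes to skip. The
function names bom_order and bom_length are hypothetical and not part of
DIHYPH / DITECT; the return values 1 and 2 are chosen to match the meaning of
"unibom" given above.

```c
#include <stddef.h>

/* Returns 1 for a big-endian BOM (bytes FE FF), 2 for a little-endian BOM
   (bytes FF FE), and 0 if the buffer does not start with a BOM. */
int bom_order(const unsigned char *raw, size_t len)
{
    if (len >= 2) {
        if (raw[0] == 0xFE && raw[1] == 0xFF) return 1;
        if (raw[0] == 0xFF && raw[1] == 0xFE) return 2;
    }
    return 0;
}

/* Number of leading bytes to skip so that the string starts without BOM. */
size_t bom_length(const unsigned char *raw, size_t len)
{
    return bom_order(raw, len) != 0 ? 2 : 0;
}
```

A result of 0 from bom_order means the byte order has to be known from another
source, since the data itself does not state it.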

Module DMCREPUC.c uses the code conversion file DMCREPUC.



DITECT proposal word list

The proposal word list with UNICODE characters is stored in the array
unsigned short prbufuc[1000] (20 lines of 50 unsigned short values),
while the array char prbuf[1000] holds 20 lines of 50 ISO characters.
The first value in every prbufuc line gives the percentage; the proposal
string starts at the second unsigned short value.
Every proposal word in prbufuc[] ends with CRLF.
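
The layout described above can be read with a sketch like this one. The
function read_proposal and the two constants are hypothetical names; the code
assumes that the values are in host byte order and that CR is the value
0x000D. How unused lines are marked is not stated here, so the sketch does not
try to detect them.

```c
#include <stddef.h>

#define PR_LINES 20   /* lines in prbufuc, as described above */
#define PR_WIDTH 50   /* unsigned short values per line */

/* Copies the proposal of line "line" (0-based) into "word" as a 0-terminated
   string and stores the leading percentage value in *percent.
   Returns the proposal length, or -1 if "line" is out of range or cap is 0. */
int read_proposal(const unsigned short *prbufuc, int line,
                  int *percent, unsigned short *word, size_t cap)
{
    const unsigned short *row;
    size_t i, n = 0;

    if (line < 0 || line >= PR_LINES || cap == 0) return -1;
    row = prbufuc + (size_t)line * PR_WIDTH;
    *percent = (int)row[0];

    /* The proposal starts at the second value and ends at the CR of CRLF. */
    for (i = 1; i < PR_WIDTH && row[i] != 0x000D && n + 1 < cap; i++)
        word[n++] = row[i];
    word[n] = 0;
    return (int)n;
}
```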



DIHYPH / DITECT UNICODE-Programs


UNICODE-programs differ from standard DIHYPH / DITECT as follows:

All UNICODE modules follow the same naming scheme:

dx????uc
 |  | |____  UNICODE version
 |  |
 |  |______  ???? unspecific letters
 |
 |_________  h  = for DIHYPH program
 |_________  t  = for DITECT program
 |_________  m  = for DIHYPH and DITECT
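
A build script or installer could check file names against this scheme with a
small helper such as the sketch below. The function module_kind is a
hypothetical name and not part of the delivered sources.

```c
#include <ctype.h>
#include <string.h>

/* Returns 'h', 't' or 'm' for a name of the form dx????uc, or 0 if the name
   does not follow the scheme. An extension such as ".c" is ignored and the
   comparison is case-insensitive. */
char module_kind(const char *name)
{
    const char *dot = strchr(name, '.');
    size_t len = dot ? (size_t)(dot - name) : strlen(name);
    char x;

    if (len != 8) return 0;                       /* d + x + ???? + uc */
    if (tolower((unsigned char)name[0]) != 'd') return 0;
    if (tolower((unsigned char)name[6]) != 'u' ||
        tolower((unsigned char)name[7]) != 'c') return 0;
    x = (char)tolower((unsigned char)name[1]);
    return (x == 'h' || x == 't' || x == 'm') ? x : 0;
}
```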


To link DIHYPH or DITECT as described in the installation description, the
following programs and files have to be used for UNICODE:

file           instead of     for
__________     __________     _______________________

DHTESTUC.c     DHTEST.c       DIHYPH-testprogram   *)

DTTESTUC.c     DTTEST.c       DITECT-testprogram   *)

DHYPHEUC.c     DHYPH.c        DIHYPH

DTECTEUC.c     DTECT.c        DITECT

DMRDWTUC.c     (additional C-source)              **)

DMCREPUC.c     (additional C-source)

DMCREPUC       (additional code file)

*) used for test programs only!
**) used for Exception and test programs!



DIHYPH has to be called by the text/publishing system with:

rc = DHYPH (luni, nn);
             |    |____  (integer)  language-no.  1 - 40
             |_________  (unsigned short) UNICODE character string.


DITECT has to be called by the text/publishing system with:

rc = DTECT (luni, nn);
             |    |____  (integer)  language-no.  1 - 40
             |_________  (unsigned short) UNICODE character string.
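
A calling system may want to check the arguments before handing them over. The
sketch below shows one possible wrapper. Everything in it is an assumption:
the int return type, the function pointer type di_entry, the wrapper di_call,
the error value DI_BAD_ARGS and the placeholder di_stub, which only stands in
for DHYPH or DTECT so that the sketch compiles without the library.

```c
#include <stddef.h>

/* Assumed shape of DHYPH / DTECT; only "rc = DHYPH (luni, nn);" is documented,
   so the return type and exact parameter types are guesses. */
typedef int (*di_entry)(unsigned short *luni, int nn);

#define DI_BAD_ARGS (-1000)  /* hypothetical value used only by this wrapper */

/* Placeholder standing in for DHYPH or DTECT in this sketch. */
int di_stub(unsigned short *luni, int nn)
{
    (void)luni;
    (void)nn;
    return 0;
}

/* Rejects arguments that contradict the rules above, then calls "entry". */
int di_call(di_entry entry, unsigned short *luni, int nn)
{
    if (entry == NULL || luni == NULL) return DI_BAD_ARGS;
    if (nn < 1 || nn > 40) return DI_BAD_ARGS;       /* language-no. 1 - 40 */
    /* A BOM in either byte order must not start the string. */
    if (luni[0] == 0xFEFF || luni[0] == 0xFFFE) return DI_BAD_ARGS;
    return entry(luni, nn);
}
```

With the library linked, DHYPH or DTECT would be passed in place of di_stub,
provided their real declarations match the assumed type.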




DIHYPH / DITECT test programs:


d?testuc  nn  file
 |        |    |_____  UNICODE file with test words (1 per line)
 |        |
 |        |__________  language-no.  1 - 40
 |
 |___________________  h  = DIHYPH hyphenation
 |___________________  t  = DITECT spelling check
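
Splitting such a file into test words can be sketched as follows. This is not
the code of the test programs: next_word is a hypothetical function, and the
file content is assumed to be already loaded into memory as unsigned short
values in host byte order, without BOM, with lines ending in CR, LF or CRLF.

```c
#include <stddef.h>

/* Extracts the next non-empty line from buf[*pos .. len) into "word"
   (0-terminated) and advances *pos to the end of that line.
   Returns 1 if a word was found, 0 when the input is exhausted.
   Lines longer than cap - 1 values are truncated. */
int next_word(const unsigned short *buf, size_t len, size_t *pos,
              unsigned short *word, size_t cap)
{
    size_t i = *pos, n = 0;
    if (cap == 0) return 0;

    /* Skip line terminators left over from the previous line. */
    while (i < len && (buf[i] == 0x000D || buf[i] == 0x000A)) i++;
    if (i >= len) { *pos = i; return 0; }

    /* Copy values up to the next line terminator or the end of input. */
    while (i < len && buf[i] != 0x000D && buf[i] != 0x000A) {
        if (n + 1 < cap) word[n++] = buf[i];
        i++;
    }
    word[n] = 0;
    *pos = i;
    return 1;
}
```

Calling next_word in a loop until it returns 0 yields one test word per
iteration; empty lines are skipped.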




Contact