MULTEXT - Document MRC 1. MtRecode/Using.





Using MtRecode











Running MtRecode

MtRecode is a Unix-style tool: it reads its input and writes the result of the translation to standard output. It accepts several options, whose order does not matter. Although default values exist, you must give at least one of the -in or -out options. Arguments are case-insensitive.
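As a rough illustration of calling the tool from a script, the Python sketch below builds an mtrecode command line and pipes text through it. The helper names `build_mtrecode_argv` and `recode` are hypothetical, and the sketch assumes that mtrecode reads the text to convert from standard input, which this page does not state explicitly.

```python
import subprocess


def build_mtrecode_argv(in_set=None, out_set=None, extra=()):
    """Build an argument list for mtrecode (hypothetical helper)."""
    # The page requires at least one of -in or -out, even though defaults exist.
    if in_set is None and out_set is None:
        raise ValueError("give at least one of -in or -out")
    argv = ["mtrecode"]
    if in_set is not None:
        argv += ["-in", in_set]
    if out_set is not None:
        argv += ["-out", out_set]
    argv += list(extra)  # e.g. ["-upper"]; option order does not matter
    return argv


def recode(data, in_set=None, out_set=None, extra=()):
    """Pipe bytes through mtrecode and return what it writes to stdout."""
    # Assumption: mtrecode takes the text to convert on standard input.
    result = subprocess.run(
        build_mtrecode_argv(in_set, out_set, extra),
        input=data,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout


print(build_mtrecode_argv(in_set="latin1", out_set="latin2"))
```

Only `build_mtrecode_argv` can be tried without the tool installed; `recode` needs mtrecode on the PATH.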

Full description of options

-in [table_name or file_path]

This option is used to provide the name of the input character set or the file in which a new character set is defined. The following are the character sets provided with MtRecode, any of which can be used with this option:
- iso_646 or ascii for ISO-646
- iso_8859_1 or latin1 or lat1 or l1 for ISO-8859-1
- iso_8859_2 or latin2 or lat2 or l2 for ISO-8859-2
- iso_8859_5 or cyrillic or cyr for ISO-8859-5
- iso_8859_7 or greek for ISO-8859-7
- iso_8859_8 or hebrew for ISO-8859-8
- iso_8859_9 or latin5 or lat5 or l5 for ISO-8859-9
- iso_8859_10 or latin6 or lat6 or l6 for ISO-8859-10
- easy_french or texte or easy_fr for Easy French
- mac or macintosh for Mac Roman

To define your own character set, see how to create a new set in the customising document. You can also use the -overin option to load your new set over another one. The default is ISO_646 (ASCII).

Example
mtrecode -in latin1 ...
mtrecode -in ./my_file
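The naming scheme for the predefined sets can be pictured as a lookup table from each alternative name to its character set. The Python sketch below is only an illustration of that scheme, built from the list above; the function name `resolve_charset` is hypothetical and nothing here describes how MtRecode stores its names internally.

```python
# Alternative names for each predefined character set, as listed for -in.
ALIASES = {
    "ISO-646": ("iso_646", "ascii"),
    "ISO-8859-1": ("iso_8859_1", "latin1", "lat1", "l1"),
    "ISO-8859-2": ("iso_8859_2", "latin2", "lat2", "l2"),
    "ISO-8859-5": ("iso_8859_5", "cyrillic", "cyr"),
    "ISO-8859-7": ("iso_8859_7", "greek"),
    "ISO-8859-8": ("iso_8859_8", "hebrew"),
    "ISO-8859-9": ("iso_8859_9", "latin5", "lat5", "l5"),
    "ISO-8859-10": ("iso_8859_10", "latin6", "lat6", "l6"),
    "Easy French": ("easy_french", "texte", "easy_fr"),
    "Mac Roman": ("mac", "macintosh"),
}

# Invert the table so that every alternative name points at its set.
LOOKUP = {alias: charset for charset, names in ALIASES.items() for alias in names}


def resolve_charset(name):
    """Return the set designated by a predefined name, or None if unknown."""
    # Arguments are case-insensitive, so compare in lowercase.
    return LOOKUP.get(name.lower())


print(resolve_charset("LATIN1"))
print(resolve_charset("./my_file"))
```

A name that resolves to `None` is not one of the predefined sets, as is the case for a file path.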

-out [table_name or file_path]

This option is used to provide the name of the output character set or the file in which a new character set is defined. The character sets provided in the package are the same as those given above for -in. To define your own character set, see how to create a new set in the customising document. You can also use the -overout option to load your new set over another one. The default is ISO_646 (ASCII).

-setlist

With this option, the tool prints the names of all the predefined character sets available for input or output. Each character set has several alternative names, which are given on the same line.

-upper

With this option, the letters returned by the tool are uppercase. You can combine this option with any other option except -lower.

Example
mtrecode -in latin1 -out latin2 -upper

-lower

With this option, the letters returned by the tool are lowercase. You can combine this option with any other option except -upper.

-unacc

With this option, the letters returned by the tool are unaccented. You can combine this option with any other option.
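To make the effect of -upper, -lower and -unacc concrete, here is a small Python sketch that applies comparable transformations to a string. It is an analogy only: it relies on Python's Unicode data, whereas MtRecode takes these properties from its own reference table, so results may differ for some characters. The function names are hypothetical.

```python
import unicodedata


def unaccent(text):
    """Strip accents by decomposing characters and dropping combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def apply_letter_options(text, upper=False, lower=False, unacc=False):
    """Mimic the -upper, -lower and -unacc options on a Python string."""
    # The two case options exclude each other, as on the command line.
    if upper and lower:
        raise ValueError("the lower and upper options cannot be used at the same time")
    if unacc:
        text = unaccent(text)
    if upper:
        text = text.upper()
    if lower:
        text = text.lower()
    return text


print(apply_letter_options("Été", upper=True))
print(apply_letter_options("Été", unacc=True))
```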

-help

Prints help information for MtRecode.

-version

Prints the current version of the tool.

-overin [table_name or file_path]

To load the new input set (the one whose resource is given by the -in option) over a predefined character set, use -overin and give the name of one of the default sets. The default is ISO_646 (ASCII).

Example
mtrecode -in ./my_file -overin latin1
mtrecode -in ./my_file -overin ./my_file_2
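One way to picture loading a set "over" another is a layered table in which the new set's entries take precedence and the underlying set supplies everything else. The Python sketch below follows that reading; it is an interpretation of the wording above, not a description of MtRecode's internals, and the table contents and the name `load_over` are invented for the example.

```python
def load_over(base_set, new_set):
    """Return a table where entries of new_set take precedence over base_set."""
    layered = dict(base_set)  # start from the set named by -overin or -overout
    layered.update(new_set)   # entries from the new set replace or extend it
    return layered


# Toy tables mapping 8-bit codes to character names (invented for the example).
latin1_like = {0x41: "A", 0xE9: "eacute", 0xF1: "ntilde"}
my_set = {0xE9: "ecaron", 0x80: "euro"}

combined = load_over(latin1_like, my_set)
print(combined[0x41])  # kept from the underlying set
print(combined[0xE9])  # taken from the new set
```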

-overout [table_name or file_path]

To load the new output set (the one whose resource is given by the -out option) over a predefined character set, use -overout and give the name of one of the default sets. The default is ISO_646 (ASCII).

-fext [file_path]

This option is used to give a new external character set, whose resource is given by file_path.

-notover

By default, a new external set is loaded over the SGML entities set. This option disables this overload.

-ref [file_path]

This option is used to give a new reference table, whose resource is given by file_path. The way to create a new reference table is described in the customising part of this documentation.

Error messages

As with the functions of the string and ctype libraries, errors are not handled, except in the functions that handle character tables.

WARNING (1): mtstr_lib : Building tree
   redeclared entity : 'entity_spell' !
This entity was already defined.

WARNING (2): mtstr_lib : loading characters
   file 'file_path' (table_name) at line 'line_number' : 
   Bad number of fields or bad format
   Line not loaded !
The format of the indicated line in the indicated file is not the one the program expects.

WARNING (3): mtstr_lib : loading characters
   file 'file_path' (table_name) at line 'line_number' : 
   Bad code format 'code'
   Line not loaded !
The format of the code of a character on the indicated line in the indicated file is not the one the program expects.

WARNING (4): mtstr_lib : loading characters
   file 'file_path' (table_name) at line 'line_number' : 
   Redifined character 'char' !
The indicated character was already loaded. The second definition read is the one that is loaded.

WARNING (5): mtstr_lib : loading characters
   file 'file_path' (table_name) at line 'line_number' : 
   The code 'code' for iso_10646 was used before !
When loading a conversion table, you defined two characters with the same ISO-10646 associated code. This could cause problems for translation.
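The checks behind warnings 4 and 5 can be sketched as a loader that remembers which characters and which ISO-10646 codes it has already seen. The Python below is a simplified model under stated assumptions: the input is a list of (character, code) pairs rather than a real table file, the name `load_pairs` is hypothetical, and it assumes that a line triggering warning 5 is still loaded, which the description above suggests but does not state.

```python
def load_pairs(pairs):
    """Load (character, iso_10646_code) pairs and collect warnings.

    Returns the resulting table and a list of (warning_number, line, item).
    """
    table = {}       # character -> ISO-10646 code
    seen_codes = {}  # ISO-10646 code -> character that last used it
    warnings = []
    for line_number, (char, code) in enumerate(pairs, start=1):
        if char in table:
            # Warning 4: redefined character; the later definition is kept.
            warnings.append((4, line_number, char))
        if code in seen_codes and seen_codes[code] != char:
            # Warning 5: the same ISO-10646 code is used by another character.
            warnings.append((5, line_number, code))
        table[char] = code
        seen_codes[code] = char
    return table, warnings


table, warnings = load_pairs([("a", 0x61), ("b", 0x62), ("a", 0x41), ("c", 0x62)])
print(table)
print(warnings)
```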

WARNING (6): mtstr_lib : loading characters
   file 'file_path' (table_name) at line 'line_number' : 
   Unknown entity : 'code' !
A problem appeared when loading an entity in an Entity Type mapping table.

WARNING (7): mtstr_lib : loading characters
   file 'file_path' (table_name) at line 'line_number' : 
   Unknown code 'code' !
The indicated ISO_10646 code was never defined in the reference table.

WARNING (8): mtstr_lib : mtstr_change_external_table or mtstr_change_external_table
   The table for the 'table_name' set of characters has not been loaded !
The table name you gave to the function mtstr_change_external_table or mtstr_change_external_table does not exist. Load a table with this name (function mtstr_char_set) before asking for it.

WARNING (9): mtstr_lib : loading characters
   file 'file_path' (ref. table) : at line 'line_number' : 
   Bad number of fields or bad format
   Line not loaded !
The number of fields in this line is not correct, or one of the fields is not in the expected format.

WARNING (10): mtstr_lib : loading characters
   file 'file_path' (ref. table) : at line 'line_number' : 
   Redifined character 'code' !
In the reference table, a character is defined more than once (first field).

WARNING (11): mtstr_lib : loading characters
   file 'file_path' (ref. table) : at line 'line_number' : 
   There is no comment for the character : \10646:code\ !
Indicates that there is no comment for one of the characters in the reference table. This is not a problem if you don't use the mtstr_comment function.

WARNING (12): mtstr_lib : loading characters
   file 'file_path' (ref. table) : at line 'line_number' : 
  This code was not defined before : 'code' !
There is a multiply defined character.

WARNING (13): mtstr_lib : loading characters
   file 'file_path' (ref. table) : at line 'line_number' : 
   Problem with the translation of 'code' !
The indicated code is probably not in the expected format.

WARNING (14, 15 or 16): mtstr_lib : loading characters
   file 'file_path' (ref. table) : at line 'line_number' : 
   The code 'code' in upper (or lower or unacc) field was not loaded before !
The indicated code appears in the 'xxx' field (upper, lower or unaccented) of the reference table, but it was never defined in the first field. This warning is currently deactivated.

WARNING (17): mtstr_lib : loading characters
   There is already a table named 'table_name' !
You cannot load a new table over an old one. You must give a new name to a new table.

WARNING (18): mtstr_lib : loading characters
   You must give a file name to load the 'table_name' set of characters !
To load a new table, you must give the path of the file in which it can be found.

WARNING (20): mtstr_lib : loading characters
   file 'file_path' (table_name) at line 'line_number' : 
   Unknown code 'code'
   Line not loaded !
A problem appeared when loading a 10646 coded character.

WARNING (21): mtstr_lib : mtstr_verify_links
   file 'file_path' (table_name) at line 'line_number' : 
   You cannot define more than 256 characters with their code.
   Line not loaded !
For the moment, it is impossible to define a character on more than 8 bits, so the number of characters defined with a code cannot exceed 256.

WARNING (24): mtstr_lib : mtstrh_translate :
   The character 'char' does not belong to 'table_name' !
mtstrh_translate cannot translate a character from the input set to the output set because it does not belong to the input set. For example, ISO-8859-6 is incomplete and has fewer than 256 characters, so you cannot use one of the missing characters.
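The incompleteness of ISO-8859-6 mentioned above can be checked independently of MtRecode. The Python sketch below uses Python's own `iso8859_6` codec, not MtRecode's tables, to list the 8-bit codes that have no character assigned; the function name `undefined_codes` is hypothetical.

```python
def undefined_codes(charset):
    """Return the 8-bit codes that the given Python codec cannot decode."""
    missing = []
    for code in range(256):
        try:
            bytes([code]).decode(charset)
        except UnicodeDecodeError:
            # No character is assigned to this code in the set.
            missing.append(code)
    return missing


missing = undefined_codes("iso8859_6")
print(len(missing), "codes have no character in ISO-8859-6")
print("0xA1 is undefined:", 0xA1 in missing)
```

A byte such as 0xA1 in a text declared as ISO-8859-6 is the kind of input that would lead to this warning.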

WARNING (25): mtstr_lib : mtstrh_translate :
   Unknown character 'char' !
The character belongs neither to the External Table nor to the Current Table.

WARNING (26): mtstr_lib : mtstrh_translate :
   You cannot translate 'char' from 'table_1' to 'table_2' !
It is impossible to translate a character from the input set to the output set. This happens, for example, when a character exists neither in the Current Table nor in the External Table.

WARNING (27): mtstr_lib : mtstrh_translate :
   Bad input name (table_1) or output name (table_2) !
The input set name or output set name does not exist. You must load the sets before using them.

WARNING (28): mtstr_lib : mtstr_verify_char_links :
   Unknown table name : 'table_name' !
table_name is not defined in the current scope.

WARNING (29): mtstr_lib : mtstr_verify_char_properties :
   The character \10646:code\ does not have comment !
Indicates that there is no comment for one of the characters in the Reference Table. This is not a problem if you do not use the mtstr_comment function.

WARNING (31): mtstr_lib : mtstr_verify_char_properties :
   The character &#code; (comment)
   does not have all the properties !
There is a problem with the properties of a character. Look at your Reference Table.

WARNING (36): mtstr_lib : mtrecode :
   The charset 'set_name' does not exist !
The set you gave with the -in or the -out option does not exist.

WARNING (37): mtstr_lib : mtrecode :
   The lower and upper options cannot be used at the same time !
You cannot give the tool both the -upper and -lower options.


Examples

From cyrillic to ascii

Suppose you have a text in Bulgarian and you want to convert it to ASCII in order to mail it to somebody.

You can use mtrecode to convert the characters to ASCII:

mtrecode -in cyrillic -out ascii

You then obtain the text in ASCII, with the Cyrillic characters coded as SGML entities.
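To give an idea of what such an ASCII rendering looks like, the Python sketch below replaces every non-ASCII character with a named entity. It uses the HTML5 entity names shipped with Python as a stand-in for the SGML entity sets; the names MtRecode actually emits come from its own tables and may differ. The function name `to_ascii_entities` and the sample word are chosen for the example.

```python
from html import unescape
from html.entities import html5

# Map each single non-ASCII character to one entity name ending in ";".
ENTITY_FOR = {}
for name, value in sorted(html5.items()):
    if name.endswith(";") and len(value) == 1 and ord(value) > 127:
        ENTITY_FOR.setdefault(value, "&" + name)


def to_ascii_entities(text):
    """Replace non-ASCII characters with entities, leaving ASCII untouched."""
    return "".join(
        ch if ord(ch) < 128 else ENTITY_FOR.get(ch, f"&#{ord(ch)};")
        for ch in text
    )


sample = "България"
encoded = to_ascii_entities(sample)
print(encoded)
print(unescape(encoded) == sample)  # the entities decode back to the original
```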

From latin to cyrillic

Now suppose you have this Cyrillic text, coded with SGML entities, and you want to insert it into a text in latin1.

To return to the Cyrillic characters in order to view the Bulgarian, use:

mtrecode -in latin1 -out cyrillic

You can then see that the French accented letters are translated into SGML entities, since they cannot be represented in the ISO-8859-5 character set, which is used for Bulgarian.
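The behaviour described here, entities becoming Cyrillic characters while unrepresentable accented letters become entities, can be imitated in a few lines of Python. This is a sketch under assumptions: it uses Python's `iso8859_5` codec and HTML entity names in place of MtRecode's tables, and the function name `latin1_to_cyrillic` is hypothetical.

```python
from html import unescape
from html.entities import codepoint2name


def latin1_to_cyrillic(text):
    """Encode text as ISO-8859-5 bytes, using entities where needed."""
    out = bytearray()
    # First turn any entities in the input back into real characters.
    for ch in unescape(text):
        try:
            out += ch.encode("iso8859_5")
        except UnicodeEncodeError:
            # Not representable in ISO-8859-5: fall back to an entity.
            name = codepoint2name.get(ord(ch))
            entity = f"&{name};" if name else f"&#{ord(ch)};"
            out += entity.encode("ascii")
    return bytes(out)


print(latin1_to_cyrillic("café &bcy;"))
```

In the sample, the French "é" has no code in ISO-8859-5 and is written as an entity, while the entity for the Cyrillic letter is turned into a single ISO-8859-5 byte.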





Copyright © Centre National de la Recherche Scientifique, 1996.