EAGLES Text Representation subgroup

Expert Advisory Group on Language Engineering Standards
Corpus Group / Text Representation subgroup



Contact address

Nancy Ide, chair
Laboratoire Parole et Langage
Centre National de la Recherche Scientifique
29, Avenue Robert Schuman
13621 Aix-en-Provence Cedex 1, France

e-mail: ide@univ-aix.fr





The MULTEXT project and the EAGLES subgroup on Text Representation have joined forces to develop a Corpus Encoding Standard (CES) optimally suited for use in language engineering, intended to serve as a widely accepted set of encoding standards for European corpus work. The overall goal is to identify a minimal encoding level that corpora must achieve to be considered standardized, both in terms of descriptive representation (marking of structural and linguistic information) and general architecture (so as to be maximally suited for use in a text database). The CES also provides encoding conventions for more extensive encoding and for linguistic annotation.


Reports

MULTEXT/EAGLES Corpus Encoding Standard: Background and Principles (PostScript)
MULTEXT/EAGLES Corpus Encoding Standard

Other documents

EAGLES Workshop (Madrid, 18-20 Jan. 1996)
Invitation to Text Representation session
Transparencies: PostScript (ca. 600 KB) or PowerPoint (Mac)

Page under construction