Project by Laboratoire Parole et Langage, Aix-en-Provence, France.
SPPAS is a tool that automatically produces phonetic annotations from recorded speech and its transcription. The whole procedure is a succession of automatic steps. The result is a set of TextGrid files. SPPAS is open source software released under the GNU General Public License.
SPPAS currently supports French, English, Italian and Chinese, and other languages can be added easily.
Supported operating systems are Linux, macOS and Windows.
The input file is the following word-tokenized ASCII text:
This is the oriana1.txt file. It is located in the samples/samples-EN directory included in the SPPAS package. The corresponding wav file is oriana1.wav in the same directory.
If an oriana1.hz or oriana1.PitchTier file exists in the same directory as oriana1.wav, the Momel and INTSINT annotations can be activated. This step produces an oriana1.momel.TextGrid file with two point tiers.
The Inter-Pausal Units segmentation takes as input the oriana1.txt and the oriana1.wav files.
The output oriana1.TextGrid file contains one tier named IPU. This segmentation depends on the number of silences indicated in the transcription by newlines and/or the '#' symbol.
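As a rough illustration of how silence markers in the transcription delimit units, the following sketch splits a text on newlines and '#' symbols. It is not SPPAS code: the function name `split_ipus` and the sample text are hypothetical, and the real segmentation also uses the audio file to locate the silences in time.

```python
import re
from typing import List


def split_ipus(transcription: str) -> List[str]:
    """Split a transcription into candidate inter-pausal units.

    Assumption: every newline or '#' marks a silence, and consecutive
    markers count as a single boundary. Empty pieces are dropped.
    """
    pieces = re.split(r"[#\n]+", transcription)
    return [piece.strip() for piece in pieces if piece.strip()]


# Hypothetical transcription, not the content of oriana1.txt.
text = "I never get to sleep #\non the airplane\n"
units = split_ipus(text)
print(len(units), units)
```

Counting the units this way only gives the expected number of speech segments; finding their time boundaries requires the signal.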
The phonetization takes as input the oriana1.TextGrid file. It relies solely on a dictionary, which can be edited manually.
It creates the oriana1-phon.TextGrid and oriana1-tokens.TextGrid files. By convention, spaces separate words, dots separate phones and pipes separate phonetic variants of a word. For example, the sentence
I never get to sleep on the airplane
will be phonetized as:
ay n.eh.v.er g.eh.t|g.ih.t t.uw|t.ix|t.ax s.l.iy.p aa.n|ao.n dh.ax|dh.ah|dh.iy eh.r.p.l.ey.n
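The notation convention (spaces between words, dots between phones, pipes between variants) can be read back with a few string splits. The sketch below is a minimal illustration, not part of SPPAS; the function name `parse_phonetization` is hypothetical.

```python
from typing import List


def parse_phonetization(line: str) -> List[List[List[str]]]:
    """Parse one phonetized line.

    Returns one entry per word; each entry is the list of that word's
    variants, and each variant is a list of phones.
    """
    return [
        [variant.split(".") for variant in word.split("|")]
        for word in line.split()
    ]


phon = ("ay n.eh.v.er g.eh.t|g.ih.t t.uw|t.ix|t.ax s.l.iy.p "
        "aa.n|ao.n dh.ax|dh.ah|dh.iy eh.r.p.l.ey.n")
words = parse_phonetization(phon)
for variants in words:
    # Print the number of variants, then the variants themselves.
    print(len(variants), variants)
```

For the example sentence this yields eight words, of which "get", "to", "on" and "the" carry more than one variant.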
The alignment performs a phonetic segmentation from the oriana1-phon.TextGrid, oriana1-tokens.TextGrid and oriana1.wav files. Alignment is performed on the basis of an acoustic model and on both phonetization and transcription files. The Julius speech recognition engine is used to estimate the temporal boundaries of each word, its phonetization and the temporal boundaries of each phoneme of that word.
It creates the oriana1-phon.palign.TextGrid file, which includes both the phoneme alignments and the word alignments.
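The file names mentioned so far all derive from the base name of the wav file. The sketch below collects them in one place, assuming that every output sits next to the input; the function name `expected_outputs` and the dictionary keys are hypothetical labels, not SPPAS identifiers.

```python
from pathlib import Path
from typing import Dict


def expected_outputs(wav_path: str) -> Dict[str, Path]:
    """Map each step described above to the file it is said to create."""
    wav = Path(wav_path)
    stem = wav.stem  # e.g. "oriana1"
    return {
        "momel": wav.with_name(stem + ".momel.TextGrid"),
        "ipu": wav.with_name(stem + ".TextGrid"),
        "phonetization": wav.with_name(stem + "-phon.TextGrid"),
        "tokens": wav.with_name(stem + "-tokens.TextGrid"),
        "alignment": wav.with_name(stem + "-phon.palign.TextGrid"),
    }


outputs = expected_outputs("samples/samples-EN/oriana1.wav")
for step, path in outputs.items():
    print(step, path.name)
```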
The last automatic step is syllabification. This step is available only for French and Italian.
Syllabification is performed on the basis of the phoneme alignments with a rule-based system. Two main principles are followed: 1. there is one and only one vowel per syllable, and 2. pauses are syllabic boundaries. Additional rules specify how consonant clusters are split.
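The two principles above can be sketched on a plain list of phoneme labels. This is a simplified illustration, not the SPPAS rule set: the vowel inventory, the pause label and the cluster rule (the last consonant before a vowel opens the next syllable) are all assumptions chosen for the example.

```python
from typing import List

VOWELS = {"a", "e", "i", "o", "u"}  # hypothetical vowel inventory
PAUSE = "#"                          # hypothetical pause label


def syllabify_segment(phones: List[str]) -> List[List[str]]:
    """Split a pause-free phone sequence into one-vowel syllables."""
    vowel_idx = [i for i, p in enumerate(phones) if p in VOWELS]
    if not vowel_idx:
        # No vowel: keep the phones together rather than guess.
        return [phones] if phones else []
    starts = [0]
    for prev, nxt in zip(vowel_idx, vowel_idx[1:]):
        # Assumed cluster rule: if consonants separate the two vowels,
        # the last one becomes the onset of the next syllable.
        starts.append(nxt - 1 if nxt - prev > 1 else nxt)
    ends = starts[1:] + [len(phones)]
    return [phones[s:e] for s, e in zip(starts, ends)]


def syllabify(phones: List[str]) -> List[List[str]]:
    """Treat every pause as a syllable boundary, then split each segment."""
    syllables: List[List[str]] = []
    segment: List[str] = []
    for phone in phones + [PAUSE]:  # trailing pause flushes the last segment
        if phone == PAUSE:
            syllables.extend(syllabify_segment(segment))
            segment = []
        else:
            segment.append(phone)
    return syllables


print(syllabify(["p", "a", "r", "t", "i", "#", "l", "a"]))
```

A real rule set would pick the split point inside a cluster according to the phoneme classes involved, which is what language-specific rules provide.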
This is the list of output files:
This is the final result, with TextGrid files merged in Praat: