A research project in computational musicology (1994-98)
Bernard Bel <bel(at)lpl.univ-aix.fr>
Centre National de la Recherche Scientifique
(CNRS, France)
Laboratoire Parole et Langage (LPL,
Université de Provence)
Listen to music (composed by Harm Visser on Bol Processor BP2)
The focus of this project is on musical cognition in the sense of psychologists: it is aimed at understanding the dynamic processes underlying the perception, performance and composition of music, three types of activity that are deeply interwoven in music improvisation.
Splitting across frets on the sarasvati vina
In 1979 I started working in the field of Indian musicology with a team of musicologists, anthropologists and musicians based in New Delhi (the International Society for Traditional Arts Research, a non-profit institution). Our project received support from the International Fund for the Promotion of Culture (UNESCO), the Sangeet Research Academy (SRA, Calcutta), the Ford Foundation in India and the National Centre for the Performing Arts (NCPA, Bombay).
Initial work dealt with the experimental study of raga intonation. In this context I built an electronic keyboard instrument (the Shruti Harmonium) able to handle programmable scale intervals, and a precise real-time melograph (the Melodic Movement Analyzer, MMA). Both are now in operation at NCPA, Bombay.
From 1982 onward I worked with ethnomusicologist Jim Kippen to study computational models of tabla improvisation, notably qa'ida. For this I developed a production-rule system (Bol Processor BP1) which was used in fieldwork on portable Apple IIc computers. One aim of this work was to elicit (and hence to preserve) musical knowledge that so far had been passed on orally from master to disciple; this may be compared to the study of languages for which no grammar and no lexicon are readily available. At this stage we were also outlining hypotheses about the mental processes underlying musical composition/improvisation.
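The idea of a production-rule system generating bol sequences can be illustrated with a minimal sketch. It is not the BP1 formalism: the rule names (S, A, B), the bol strings and the functions derive and language are all hypothetical, chosen only to show how rewriting rules yield candidate productions that an expert could accept or reject.

```python
import random

# Hypothetical toy grammar: rule names and bol strings are illustrative only,
# not taken from BP1 or from any documented qa'ida.
RULES = {
    "S": [["A", "B"], ["A", "A", "B"]],
    "A": [["dha", "ti", "dha", "ge"], ["dha", "ge", "na", "ge"]],
    "B": [["ti", "na", "ke", "na"]],
}


def derive(symbol, rng):
    """Expand a symbol recursively, picking one applicable rule at random."""
    if symbol not in RULES:          # terminal symbol (a bol)
        return [symbol]
    out = []
    for s in rng.choice(RULES[symbol]):
        out.extend(derive(s, rng))
    return out


def language(symbol="S"):
    """Enumerate every sequence this finite, non-recursive grammar can produce."""
    if symbol not in RULES:
        return [[symbol]]
    results = []
    for rhs in RULES[symbol]:
        partial = [[]]
        for s in rhs:
            partial = [p + q for p in partial for q in language(s)]
        results.extend(partial)
    return results


if __name__ == "__main__":
    print(" ".join(derive("S", random.Random(0))))
```

Each call to derive produces one candidate sentence; language lists all of them, which is only feasible because this toy grammar is finite.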
Our research indicated that this "expert system" approach yields interesting results in teaching or demonstration situations, whereas it yields poor results in the real context of musical performance. The problem lies partly in the complexity of performance models, to which we may add (as pointed out by Marvin Minsky) that the most efficient processes are "compiled" by experts so that, for instance, a beginner may find it easier to make his/her knowledge explicit than a trained musician. A solution lies in "decompiling" compositional processes, to the extent that traces of them can be captured in some experimental contexts. However, improvisation seems to involve knowledge that is still more compiled, and therefore inaccessible. A workable hypothesis is that improvisational activity is mainly a matter of recombining discrete elements (pre-compositional material); from that angle, the problem of eliciting improvisation schemata becomes combinatorial and may be investigated with symbolic/numeric machine-learning techniques. We had success experimenting with QAVAID (Question-Answer Validated Analytical Inference Device), an inductive inference program able to build grammars from sample sets of rhythmic improvisation. Similar work in the field of melodic improvisation has been initiated in Pune by H.V. Sahasrabuddhe and Rajeev Upadhye.
Modelling melodic improvisation is more complicated insofar as an acceptable sound output is expected. Whereas it is possible to reduce rhythmic material to sequences of discrete units (bol-s), it is more hazardous to reduce the melodic continuum of raga to a sequence of notes, e.g. the Sargam notation developed by Pt. V.N. Bhatkhande in the early 20th century. Therefore, during the 1980s Joep Bor, Wim van der Meer and Issaro Mott worked out a transcription system that would take note connections into account. The transcription process could then be automated with the aid of the Melodic Movement Analyzer.
It is now conceivable to link these different approaches (to rhythmic and melodic improvisation) with the aim of deriving computational models of music improvisation in North India. These models will help in designing more informative experiments with performing artists and large-scale studies of sound archives, and will eventually serve as a basis for the development of new tools for musical education and computer-aided composition.
In this project we are less interested in designing a prescriptive notation for the memorisation of raga music than in finding a discrete representation that will be suitable as an input to automated learning devices. Assessing improvisational models (see infra) implies a prior assessment of the transcription model, which was not questionable as long as the input data was taken from books.
An interesting idea by Rajeev Upadhye (CDAC, Pune) and H.V. Sahasrabuddhe (Pune University) consists in postulating a hierarchy of melodic intervals based on the principle of consonance to infer automata/grammars from sample sets of melodic data. I would like to contribute to this work in its attempt to deal with more detailed data collected from actual performances and to refine validation methods. In this context it should be possible to assess the validity of the consonance paradigm and to check other hypotheses as well. It will be necessary to improve inductive inference methods by introducing domain-dependent heuristics based on hypotheses about mental representations of musical structures (see for instance the QAVAID project). Hypotheses about musical time have been proposed by Kippen. Other hypotheses concerning motor structures (musical "gestures" in instrumental music) have been elaborated in the context of stringed instrumental music (see for instance works by John Baily).
Automated procedures for the inference of grammars/automata make it possible to start from statements by expert musicians and build a model that they are able to validate by assessing the correctness of new productions; this is similar to a well-known technique in phonetics called analysis by synthesis. Bol Processor BP1 permitted a rigorous evaluation because the machine remained "invisible" in teaching and demonstration situations. In the case of melodic music, however, it will be necessary to design acceptable experimental protocols. I plan to develop techniques for the synthesis of melodic movements so that musicians will be able to assess the relevance of descriptive parameters identified in the analytical process.
(See presentation of BP2)
Work in collaboration with Indian musicologists, computer scientists and musicians has been an opportunity to envisage developments of BP2 in connection with similar work that has been undertaken in India. Of primary interest is the attempt by the CDAC team to develop software assistance for Indian "composers", meaning the young generation of musicians who have access to electronic music environments. The challenge of Indian computational musicology is to provide tools for developing musical ideas in the Indian context, while most current commercial music software is essentially based on 19th-century Western musical concepts. We therefore expect a very fruitful interaction in exchanging ideas about ideal task environments in the present-day music practice of different cultural backgrounds.
This project belongs to cognitive anthropology, more precisely the dialectical method that was outlined by Blacking and made operational by Kippen and myself. The dialectical method consists in setting up an experimental environment in which models (of productions and productive activity) are elaborated and validated by informants (expert musicians).
The work we initiated with North Indian drum players has been a breakthrough in modern ethnomusicology, which had been based on participant observation. The new paradigm is the reference to a computational model. Our experimental method is both cognitive (based on reproducible experiments) and dialectical (with the active participation of experts). It highlights a kind of musical cognition which is not amenable to introspection or verbal description.
This project looks at musical reality from the angle of action sciences, meaning that knowledge acquisition is achieved by transferring data from conversational domains (interactions between human experts and analysts) to the classical systematic domains (technology and formal tools for representing and archiving knowledge). Unlike Jean Molino's tri-partite semiotics of music, it does not imply the existence of musical "works" as products (the so-called "neutral" level). It is entirely aimed at eliciting productive and receptive musical activities. The challenge to the existence of musical works as "frozen" material is also found in the field of written music, as indicated by recent work in the historical anthropology of music.
(See: Portrait of an extra-terrestrian)
The limitations of computational models should be made clear at once. It is known that each raga, as a "melodic species", should be studied in the context of its historical evolution. The fact that we attempt to characterize a compositional type from examples of acceptable productions means that we cannot take these evolutionary factors into account. In addition, raga characterization makes full sense only in connection with neighbouring ragas, which may interact as "attractive" or "contrast-making" entities: should the main ragas be removed from the "map", the entire system would collapse. Although we have taken a few steps towards automatic raga classification, it remains evident that definitions involving an unlimited number of features (i.e., relating to definitions of neighbouring ragas, and so forth) are not amenable to computation.
The Melodic Movement Analyzer (MMA), built in 1982, is hooked to a Fundamental Pitch Extractor (FPE) working on a very rudimentary principle: the sound signal is fed into a set of third-octave, 4th-order filters (Q = 10). Energy levels in each band are then compared, yielding a decision as to which band should contain the meaningful partial.
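The band-comparison principle can be sketched in software as follows. This is only an illustration under stated assumptions, not a description of the FPE hardware: each 4th-order band is approximated by cascading two identical 2nd-order band-pass sections (standard "audio EQ cookbook" coefficients), the lowest centre frequency (110 Hz) and the number of bands are arbitrary, and the decision rule is simply "pick the band with the highest energy". All function names are hypothetical.

```python
import math


def bandpass_coeffs(f0, fs, q):
    """Normalised coefficients of a 2nd-order band-pass section (0 dB peak gain)."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)
    a = (-2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a


def biquad(x, b, a):
    """Apply one 2nd-order section (direct form I) to the sample list x."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y


def band_energies(signal, fs, f_low=110.0, n_bands=10, q=10.0):
    """Return (centre frequency, energy) for bands spaced one third of an octave apart."""
    result = []
    for k in range(n_bands):
        fc = f_low * 2.0 ** (k / 3.0)
        b, a = bandpass_coeffs(fc, fs, q)
        # Assumption: two cascaded 2nd-order sections stand in for one 4th-order filter.
        y = biquad(biquad(signal, b, a), b, a)
        result.append((fc, sum(v * v for v in y)))
    return result


def strongest_band(signal, fs, **kwargs):
    """Simplest possible decision: the centre frequency of the most energetic band."""
    return max(band_energies(signal, fs, **kwargs), key=lambda item: item[1])[0]
```

For a test tone at 220 Hz with a weaker partial at 440 Hz, sampled at 8000 Hz, strongest_band selects the band centred on 220 Hz. A real extractor would need a more careful rule, since the strongest band does not always hold the fundamental.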
Later, Wim van der Meer developed a technique derived from the sub-harmonic summation (SHS) algorithm that makes it possible to deal directly with sampled sounds instead of using the MMA and FPE. Results obtained in this way are generally close to those of the MMA, but computation time remains critical. In addition, the accuracy of this technique is questionable when dealing with male vocalists accompanied by a strong background sound of tanpura. Better results are expected from digital prefiltering of the signal at frequencies predicted from previous pitch values, and from using information about (1) the raga and (2) the instrument or vocal technique. (For instance, it is easy to define the rare contexts in which a one-octave jump is allowed, which eliminates the major errors made by the SHS algorithm.)
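The octave-jump constraint mentioned above can be sketched as a post-processing step on a pitch track. This is a minimal illustration, not the procedure actually used: the function names, the 100-cent tolerance and the allowed predicate (which would encode raga-dependent or instrument-dependent knowledge) are all assumptions.

```python
import math


def cents(f, ref):
    """Interval from ref to f, in cents (1200 cents = one octave)."""
    return 1200.0 * math.log2(f / ref)


def fix_octave_jumps(pitches_hz, allowed=lambda prev, cur, index: False,
                     tol_cents=100.0):
    """Fold back values lying about one octave away from the previous accepted value.

    `allowed` is a hypothetical hook: it should return True in the rare
    contexts where a genuine one-octave jump is acceptable.
    """
    out = []
    for i, f in enumerate(pitches_hz):
        if out:
            prev = out[-1]
            jump = cents(f, prev)
            near_octave = abs(abs(jump) - 1200.0) <= tol_cents
            if near_octave and not allowed(prev, f, i):
                # Upward jump: halve the frequency; downward jump: double it.
                f = f / 2.0 if jump > 0 else f * 2.0
        out.append(f)
    return out


if __name__ == "__main__":
    track = [220, 221, 442, 222, 111, 223]
    print(fix_octave_jumps(track))
```

Because each value is compared with the previously corrected value, a genuine sustained octave leap would be folded back frame after frame unless the allowed predicate recognises it; this is where knowledge about the raga and the instrument would enter.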
The current version of my automata/grammar inference device (QAVAID) runs under Prolog II. A new version will be implemented in compiled Prolog or C++. It will be included in BP2 so that inferred grammars may be checked directly for sound production. (Forthcoming versions of BP2 will also deal with 16-bit sound files, so that acceptable imitations of tabla sounds may be used.)
QAVAID comprises a segmentation algorithm that determines unbreakable "units" found in the structures of improvised items. The success of the inductive process depends critically on the ability to identify the proper units. The efficiency of this algorithm may be improved by introducing automata that recognize all substrings of a given string (factor transducers).
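An automaton recognizing all substrings (factors) of a sequence can be sketched with the standard suffix-automaton construction, in which every state is treated as accepting. This is a swapped-in textbook technique, not QAVAID code and not a transducer; the class name and the sample bol sequence are hypothetical.

```python
class FactorAutomaton:
    """Suffix automaton of a token sequence; accepts exactly its factors."""

    def __init__(self, sequence):
        self.next = [{}]      # transitions of each state
        self.link = [-1]      # suffix links
        self.length = [0]     # length of the longest string reaching each state
        last = 0
        for sym in sequence:
            cur = self._new_state(self.length[last] + 1)
            p = last
            while p != -1 and sym not in self.next[p]:
                self.next[p][sym] = cur
                p = self.link[p]
            if p == -1:
                self.link[cur] = 0
            else:
                q = self.next[p][sym]
                if self.length[p] + 1 == self.length[q]:
                    self.link[cur] = q
                else:
                    # Split state q by cloning it with a shorter length.
                    clone = self._new_state(self.length[p] + 1)
                    self.next[clone] = dict(self.next[q])
                    self.link[clone] = self.link[q]
                    while p != -1 and self.next[p].get(sym) == q:
                        self.next[p][sym] = clone
                        p = self.link[p]
                    self.link[q] = clone
                    self.link[cur] = clone
            last = cur

    def _new_state(self, length):
        self.next.append({})
        self.link.append(-1)
        self.length.append(length)
        return len(self.length) - 1

    def accepts(self, factor):
        """True if `factor` occurs as a contiguous subsequence of the input."""
        state = 0
        for sym in factor:
            if sym not in self.next[state]:
                return False
            state = self.next[state][sym]
        return True


if __name__ == "__main__":
    fa = FactorAutomaton("dha ge na dha ge ti".split())
    print(fa.accepts(["ge", "na", "dha"]), fa.accepts(["na", "ge"]))
```

Construction is linear in the length of the sequence (for a fixed alphabet), and each membership query takes time proportional to the length of the candidate unit, which is what makes such automata attractive for repeated segmentation tests.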
In this and other programs it will be of prime importance to clearly isolate the domain-specific knowledge that has implications for generalization strategies in the domain under study.
The synthesis of melodic movements will be designed for standard MIDI equipment, taking advantage of the possibilities for controlling timbre and envelopes in the instrument. Synthesis of sound files using Csound will also be considered.
The knowledge base for reconstructing accurate melodic lines will be taken from (1) analytical data captured by the MMA (using curve-fitting algorithms) and (2) ad hoc (raga-dependent) interpretation rules proposed by musicians and scholars.
Contact: <bel(at)lpl.univ-aix.fr>