In this discipline, referred to as Music Information Retrieval (MIR for short), the aim is not so much to understand and model music (as in the field of music cognition) as to design robust and effective methods for locating and retrieving musical information, in tasks such as query-by-humming, music recommendation, music recognition, and genre classification.
A common approach in MIR research is to use information-theoretic models to extract information from the musical data, be it the audio recording itself or all kinds of metadata, such as artist or genre classification. With advanced machine learning techniques, and the availability of so-called ‘ground truth’ data (i.e., annotations made by experts that the algorithm uses to decide on the relevance of the results for a certain query), a model for retrieving relevant musical information is constructed. Overall, this approach is based on the assumption that all relevant information is present in the data and that it can, in principle, be extracted from that data (data-oriented approach).
Several alternatives have been proposed, such as models based on perception-based signal processing or mimetic and gesture-based queries. However, with regard to the cognitive aspects of MIR (the perspective of the listener), some information might be implicit in the data or not present at all. Especially in the design of similarity measures (e.g., ‘search for songs that sound like X’) it quickly becomes clear that not all required information is present in the data. Extending state-of-the-art MIR techniques with recent findings from music cognition therefore seems a natural next step in improving (exploratory) search engines for music and audio (cognition-based approach) (cf. Honing, 2010).
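To make the limitation of a purely data-oriented similarity measure concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the three-dimensional feature vectors, the song names, the choice of cosine similarity, and the "ground truth" set of songs that listeners judged similar are all hypothetical and are not taken from any real MIR system or dataset.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_id, features):
    """Rank all other songs by feature similarity to the query song."""
    query = features[query_id]
    scores = [
        (song_id, cosine_similarity(query, vector))
        for song_id, vector in features.items()
        if song_id != query_id
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return [song_id for song_id, _ in scores]

def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k results that the annotations mark as relevant."""
    top = ranked[:k]
    if not top:
        return 0.0
    return sum(1 for song_id in top if song_id in relevant) / len(top)

# Hypothetical audio features (made-up numbers, not real descriptors).
features = {
    "song_x": [0.9, 0.1, 0.3],
    "song_a": [0.8, 0.2, 0.3],
    "song_b": [0.1, 0.9, 0.5],
    "song_c": [0.7, 0.1, 0.4],
}

# Hypothetical ground truth: listeners judged song_a and song_b similar to song_x.
ground_truth = {"song_x": {"song_a", "song_b"}}

ranked = rank_by_similarity("song_x", features)
score = precision_at_k(ranked, ground_truth["song_x"], k=2)
print(ranked, score)
```

In this toy setup the ranking retrieves song_a but places song_b last, even though the hypothetical listeners judged it similar: whatever made song_b sound like song_x to them is simply not encoded in the feature vectors. That is the kind of gap a cognition-based approach would try to address.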
A creative paper that discusses the differences and overlaps between the two fields in dialog form is about to appear in the proceedings of the upcoming ISMIR conference. Emmanuel Bigand, a well-known music cognition researcher, and Jean-Julien Aucouturier, a MIR researcher, wrote a fictitious dialog:
“Mel is a MIR researcher (the audio type) who's always been convinced that his field of research had something to contribute to the study of music cognition. His feeling, however, hasn't been much shared by the reviewers of the many psychology journals he tried submitting his views to. Their critics, rejecting his data as irrelevant, have frustrated him - the more he tried to rebut, the more defensive both sides of the debate became. He was close to give up his hopes of interdisciplinary dialog when, in one final and desperate rejection letter, he sensed an unusual touch of interest in the editor's response. She, a cognitive psychologist named Ann, was clearly open to discussion. This was the opportunity that Mel had always hoped for: clarifying what psychologists really think of audio MIR, correcting misconceptions that he himself made about cognition, and maybe, developing a vision of how both fields could work together. The following is the imaginary dialog that ensued. Meet Dr Mel Cepstrum, the MIR researcher, and Prof. Ann Ova, the psychologist.”
Aucouturier, J., & Bigand, E. (2012). Mel Cepstrum & Ann Ova: The Difficult Dialog Between MIR and Music Cognition. Proceedings of the 13th International Society for Music Information Retrieval Conference, 397-402.
Honing, H. (2010). Lure(d) into listening: The potential of cognition-based music information retrieval. Empirical Musicology Review, 5(4), 121-126.
Volk, A., & Honingh, A. (Eds.) (2012). Special Issue: Mathematical and Computational Approaches to Music: Three Methodological Reflections. Journal of Mathematics and Music, 6(2). doi:10.1080/17459737.2012.704154