Showing posts with label theory testing.

Sunday, July 22, 2012

Is replication an issue in music cognition?

This week the 12th International Conference on Music Perception and Cognition (ICMPC) is being held in Thessaloniki, Greece. For a full week, hundreds of researchers will present their latest work in a dense program with five parallel sessions and four keynotes. Slightly overdone perhaps, but it shows the still-growing international interest in music cognition as a research topic.

On the first day there will be a symposium on 'Replication'. By way of introduction, below is a blog entry that was originally published in May 2010:

"In the last few years Web-based experiments have become an attractive alternative to lab-based experiments. Besides the advantages of versatility and the ecological validity of the results, Web-based experiments can potentially reach a much larger, more varied, and intrinsically motivated participant pool. Especially in the domain of music perception and cognition it is important to probe a wide variety of participants, with different levels of training and different cultural backgrounds.

Nevertheless, getting research published that takes advantage of the Internet is not straightforward. An important reason for the conservatism of some journals in publishing results obtained with Web-based experiments is the issue of replicability. Especially in the fields of experimental psychology and psychophysics there are serious concerns about the (apparent) lack of control one has in Web experiments as opposed to those performed in the laboratory. Whereas in the lab most relevant factors, including all technical issues, are under the control of the experimenter (i.e., afford high internal validity), it is argued that Web experiments lack this important foundation of experimental psychology. This lack of control often makes it difficult to convince university review panels to give permission when there is little insight into the environment in which participants do these experiments, and it has led some high-impact journals to adopt a policy of not publishing Web-based studies, thereby discouraging Web experiments from being performed at all (cf. Honing & Reips, 2008). Nevertheless, it is important to stress that if an effect is found despite the limited control in Web-based experiments over the home environment and the technological variance caused by the Internet, then the argument for that effect and its generalizability is even stronger.

The latter issue was recently discussed in Nature Methods by researchers from the Universities of Giessen and Münster, Germany (see reference below and [modified] figure above). In fact, the authors make the opposite argument: standardization should be seen as a cause of, rather than a cure for, poor reproducibility of experimental outcomes. Their study showed that environmental standardization can contribute to spurious and conflicting findings in the literature. Würbel and colleagues conclude that, to generate results that are likely to be reproducible in other laboratories, environmental conditions should be systematically varied rather than rigidly standardized.

As such, the variance caused by Web-based setups (as discussed above) might actually yield experimental results with a much higher external validity than previously thought."

Richter, S., Garner, J., Auer, C., Kunert, J., & Würbel, H. (2010). Systematic variation improves reproducibility of animal experiments. Nature Methods, 7(3), 167-168. DOI: 10.1038/nmeth0310-167

Honing, H., & Reips, U.-D. (2008). Web-based versus lab-based studies: A response to Kendall (2008). Empirical Musicology Review, 3(2), 73-77.

Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22(11), 1359-1366. DOI: 10.1177/0956797611417632

Monday, October 10, 2011

A history of music cognition?

One of the pioneers in the field that would come to be called music cognition was H. Christopher Longuet-Higgins (1923-2004). Not only was Longuet-Higgins one of the founders of the cognitive sciences (he coined the term in 1973), but as early as 1971 he formulated, together with Mark Steedman, the first computer model of music perception. That early work was followed in 1976 by a full-fledged alternative in the journal Nature, seven years before the more widely known but, according to Longuet-Higgins, less precisely formulated Generative Theory of Tonal Music of Lerdahl and Jackendoff. In a review in Nature in 1983 he wrote, somewhat sourly:
‘Lerdahl and Jackendoff are, it seems, in favor of constructing a formally precise theory of music, in principle but not in practice.’
Although Lerdahl and Jackendoff’s book was far more precise than any musicological discussion found in the leading journals, the importance of formalization cannot be overestimated. Notwithstanding all our musicological knowledge, many fundamental concepts are in fact treated as axioms; musicologists are, after all, eager to tackle far more interesting matters than basic notions like tempo, meter or syncopation, to name a few. But these axioms are not in actual fact understood, in the sense that we are not (as yet) able to formalize them sufficiently to explain them to a computer. This is still the challenge of ‘computer modelling’ (and of recent initiatives such as computational humanities) – a challenge that Longuet-Higgins was one of the first to take up [excerpt from Honing, 2011].

Longuet-Higgins, H. C. (1983). All in theory — the analysis of music. Nature, 304(5921), 93. DOI: 10.1038/304093a0

Longuet-Higgins, H. C. (1976). Perception of melodies. Nature, 263(5579), 646-653. DOI: 10.1038/263646a0

Honing, H. (2011). The illiterate listener. On music cognition, musicality and methodology. Amsterdam: Amsterdam University Press.

Monday, July 05, 2010

Standardization cause of poor reproducibility?

In the last few years Web-based experiments have become an attractive alternative to lab-based experiments. Besides the advantages of versatility and the ecological validity of the results, Web-based experiments can potentially reach a much larger, more varied, and intrinsically motivated participant pool. Especially in the domain of music perception and cognition it is important to probe a wide variety of participants, with different levels of training and different cultural backgrounds.

Nevertheless, getting research published that takes advantage of the Internet is not straightforward. An important reason for the conservatism of some journals in publishing results obtained with Web-based experiments is the issue of replicability. Especially in the fields of experimental psychology and psychophysics there are serious concerns about the (apparent) lack of control one has in Web experiments as opposed to those performed in the laboratory. Whereas in the lab most relevant factors, including all technical issues, are under the control of the experimenter (i.e., afford high internal validity), it is argued that Web experiments lack this important foundation of experimental psychology. This lack of control often makes it difficult to convince university review panels to give permission when there is little insight into the environment in which participants do these experiments, and it has led some high-impact journals to adopt a policy of not publishing Web-based studies, thereby discouraging Web experiments from being performed at all (cf. Honing & Ladinig, 2008; Honing & Reips, 2008). Nevertheless, it is important to stress that if an effect is found despite the limited control in Web-based experiments over the home environment and the technological variance caused by the Internet, then the argument for that effect and its generalizability is even stronger.

The latter issue was recently discussed in Nature Methods by researchers from the Universities of Giessen and Münster, Germany (see reference below and [modified] figure above). In fact, the authors make the opposite argument: standardization should be seen as a cause of, rather than a cure for, poor reproducibility of experimental outcomes. Their study showed that environmental standardization can contribute to spurious and conflicting findings in the literature. Würbel and colleagues conclude that, to generate results that are likely to be reproducible in other laboratories, environmental conditions should be systematically varied rather than rigidly standardized.
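The logic of this argument can be illustrated with a toy simulation (a sketch of the idea only; all numbers are invented and do not reflect Richter et al.'s actual design): if part of a measured effect depends on the local environment, a lab that standardizes on a single condition samples that environmental modulation only once, whereas a lab that systematically varies conditions averages over it, so effect estimates agree better across labs.

```python
import random

random.seed(1)

TRUE_EFFECT = 1.0  # hypothetical underlying treatment effect

def spread_across_labs(conditions_per_lab, n_labs=200, subjects_per_condition=12):
    """Return the spread (max - min) of effect estimates across labs.

    Each tested condition modulates the measured effect (an invented
    'gene-environment interaction'); rigid standardization samples this
    modulation only once per lab, heterogenization averages over it.
    """
    estimates = []
    for _ in range(n_labs):
        effects = []
        for _ in range(conditions_per_lab):
            modulation = random.gauss(0.0, 0.5)  # condition-specific bias
            noise = random.gauss(0.0, 0.3 / subjects_per_condition ** 0.5)
            effects.append(TRUE_EFFECT + modulation + noise)
        estimates.append(sum(effects) / len(effects))
    return max(estimates) - min(estimates)

# same total number of subjects per lab (12) in both designs
spread_standardized = spread_across_labs(conditions_per_lab=1, subjects_per_condition=12)
spread_heterogenized = spread_across_labs(conditions_per_lab=6, subjects_per_condition=2)
print(spread_standardized, spread_heterogenized)
```

Under these invented assumptions the heterogenized design produces a markedly smaller between-lab spread, which is the sense in which standardization can be a cause, rather than a cure, of poor reproducibility.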

As such, the variance caused by Web-based setups (as discussed above) might actually yield experimental results with a much higher external validity than previously thought.

Richter, S., Garner, J., Auer, C., Kunert, J., & Würbel, H. (2010). Systematic variation improves reproducibility of animal experiments. Nature Methods, 7(3), 167-168. DOI: 10.1038/nmeth0310-167

Honing, H., & Reips, U.-D. (2008). Web-based versus lab-based studies: A response to Kendall (2008). Empirical Musicology Review, 3(2), 73-77.

Honing, H., & Ladinig, O. (2008). The potential of the Internet for music perception research: A comment on lab-based versus Web-based studies. Empirical Musicology Review, 3(1), 4-7.

Thursday, March 05, 2009

What makes a theory compelling?*

Karl Popper was a philosopher of science who was deeply interested in this question. He tried to distinguish 'science' from 'pseudoscience', but became more and more dissatisfied with the idea that the empirical method (supporting a theory with observations and experiments) could effectively mark this distinction. He sometimes used the example of astrology “with its stupendous mass of empirical evidence based on observation”, but also nuanced this by noting that “science often errs, and that pseudoscience may happen to stumble on the truth.”

Next to his well-known work on falsification, Popper began to develop alternative ways to determine the scientific status or quality of a theory. He wrote the complex yet intriguing sentence: “confirmations [of a theory] should count only if they are the result of risky predictions; that is to say, if, unenlightened by the theory in question, we should have expected an event which was incompatible with the theory — an event which would have refuted the theory” (Popper, 1963).

Popper was especially thrilled with the result of Eddington’s eclipse observations, which in 1919 brought the first important confirmation of Einstein's theory of gravitation. It was a surprising consequence of this theory that light should bend in the presence of large, heavy objects (Einstein was apparently willing to drop his theory if this were not the case). Independent of whether such a prediction turns out to be true or not, Popper considered it an important quality of ‘real science’ to make such ‘risky predictions’. An interesting thought, no?

I still find this an intriguing idea. The notion of ‘risky’ or ‘surprising’ predictions might actually be the beginning of a fruitful alternative to existing model selection techniques, such as goodness-of-fit (which theory predicts the data best?) and simplicity (which theory gives the simplest explanation?). In music cognition, too, goodness-of-fit measures (r-squared, percentage of variance accounted for, and other measures from the experimental psychology toolkit) are often used to confirm a theory. Nevertheless, it is non-trivial to think of theories that make surprising predictions, that is, theories that predict a yet unknown phenomenon as a consequence of their intrinsic structure. If you know of any, let me know!
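To make the goodness-of-fit side of this concrete, here is a minimal sketch of r-squared as a model selection criterion (the data and the two competing 'theories' are invented for illustration):

```python
# Invented example: observed listener ratings and the predictions of two
# hypothetical theories of the same phenomenon.
observed = [2.1, 3.9, 6.2, 8.0, 9.8]   # e.g. listeners' ratings
model_a  = [2.0, 4.0, 6.0, 8.0, 10.0]  # theory A's predictions
model_b  = [3.0, 3.5, 5.0, 8.5, 9.0]   # theory B's predictions

def r_squared(obs, pred):
    """Proportion of variance in obs accounted for by pred."""
    mean = sum(obs) / len(obs)
    ss_tot = sum((o - mean) ** 2 for o in obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    return 1 - ss_res / ss_tot

print(r_squared(observed, model_a))  # theory A fits better
print(r_squared(observed, model_b))
```

By this criterion theory A wins, but, and this is Popper's point, such a measure says nothing about how risky either theory's predictions were before the data came in.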

Popper, K. R. (1963). Conjectures and Refutations. London: Routledge.

* Repeated blog entry from July 23, 2007 (celebrating the finalization of a research proposal with Jan-Willem Romeijn on these topics, hoping to be able to address them head-on ;-)

Monday, October 01, 2007

What makes a theory of music surprising?

Quite a while ago, a fellow musicologist referred to me as a ‘positivist’. As I was, at that time, not too familiar with postmodern parlance, I took it as a compliment (associating it with the Dutch comedy duo De Positivo’s, who were sheer optimism). It turned out that the opposite was meant.

Last weekend in Cologne, where I was invited to speak at the Gesellschaft für Musikforschung, I was reminded of this remark. For some reason the methods associated with positivism, such as those used in the natural and social sciences, still mark a divide in music research between, for instance, the systematic and the historically oriented approaches to music. A divide that seems to be fed by a misunderstanding of many of Popper’s ideas on ‘science’ versus ‘pseudoscience’ (see earlier blog).

While the idea of ‘falsification’ is indeed, as Popper showed in some of his later work, not very useful in historical research —in archeology, for instance, a newly found manuscript can easily falsify a long-established historical theory— this does not make archeology or historical musicology a 'pseudoscience'. In my opinion, it is not so much the inapplicability of the empirical method (and hence of falsification) to historically oriented musicology, but rather the apparent resistance to formulating theories that can be tested, that characterizes the discussion. Is it impossible to make a theory about some aspect of (the history of) music that can be tested (or evaluated), independent of empirical evidence?

I particularly like, in this context, Popper’s idea that a theory can be intrinsically compelling or ‘surprising’, even in the absence of empirical evidence. What is meant here is not ‘surprising’ in the sense that a new fact is found that we did not yet know about, but a prediction that violates our expectations: while, given everything we know, we would expect X, the theory predicts not X but Y. Such a prediction is a consequence of the theory itself (built from intuition, empirical observations, or otherwise). I see no reason why both historical and systematic musicology could not use this as an additional method.

Monday, July 23, 2007

What makes a theory compelling?

Karl Popper was a philosopher of science who was deeply interested in this question. He tried to distinguish 'science' from 'pseudoscience', but became more and more dissatisfied with the idea that the empirical method (supporting a theory with observations and experiments) could effectively mark this distinction. He sometimes used the example of astrology “with its stupendous mass of empirical evidence based on observation”, but also nuanced this by noting that “science often errs, and that pseudoscience may happen to stumble on the truth.”

Next to his well-known work on falsification, Popper began to develop alternative ways to determine the scientific status or quality of a theory. He wrote that “confirmations [of a theory] should count only if they are the result of risky predictions; that is to say, if, unenlightened by the theory in question, we should have expected an event which was incompatible with the theory — an event which would have refuted the theory” (Popper, 1963).

Popper was especially thrilled with the result of Eddington’s eclipse observations, which in 1919 brought the first important confirmation of Einstein's theory of gravitation. It was a surprising consequence of this theory that light should bend in the presence of large, heavy objects (Einstein was apparently willing to drop his theory if this were not the case). Independent of whether such a prediction turns out to be true or not, Popper considered it an important quality of ‘real science’ to make such ‘risky predictions’.

I still find this an intriguing idea. The notion of ‘risky’ or ‘surprising’ predictions might actually be the beginning of a fruitful alternative to existing model selection techniques, such as goodness-of-fit (which theory predicts the data best?) and simplicity (which theory gives the simplest explanation?). In music cognition, too, goodness-of-fit measures (r-squared, percentage of variance accounted for, and other measures from the experimental psychology toolkit) are often used to confirm a theory.* Nevertheless, it is non-trivial to think of (existing) theories in music cognition that make surprising predictions, that is, theories that predict a yet unknown phenomenon as a consequence of their intrinsic structure. (If you know of any, let me know!)

Well, these are still relatively raw ideas. I hope to be able to present them in a more digested format next week at the music perception and cognition conference (SMPC) in Montreal. Looking forward to it!

* If you want to read more on this topic, see here.