What is DOP?
It is a
fortunate coincidence that the P in DOP can be instantiated in various
interesting ways. Standing originally for Data-Oriented
Parsing, DOP has also become known as Data-Oriented
Processing and Data-Oriented Perception.
The underlying idea of DOP is that newly perceived input is understood in terms
of previously perceived input. Or more concretely, DOP analyzes new data by
probabilistically combining fragments from a corpus of previously analyzed
data. This idea has become particularly influential in some of the cognitive
and information sciences, such as machine learning and natural language
processing, and DOP can be seen as a general umbrella that covers most of these
approaches.
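The phrase "probabilistically combining fragments" can be made concrete with a small formula. What follows is only a sketch under assumptions that this text does not itself spell out: the symbols f_i, d and a are introduced here for illustration, and treating the fragment choices as independent is a simplification.

```latex
P(d) = \prod_{i=1}^{n} P(f_i), \qquad P(a) = \sum_{d \,\text{yields}\, a} P(d)
```

Here a derivation d combines the fragments f_1, ..., f_n, and an analysis a that can be built in several different ways collects the probability of all its derivations.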
Although the DOP idea has led to some very successful models for linguistic as
well as for musical and visual processing, it has some cognitive consequences
that not everybody has readily accepted. One consequence is that humans store
previous experiences on a massive scale, a view that was long regarded as highly
controversial. During the last decade, however, a large body of research has
shown that people in fact build a huge fragment memory. In music, people store
an enormous number of musical patterns, and in vision, people have a remarkably
large visual memory, especially with respect to face recognition. In
psycholinguistics, it has been shown that people not only store lexical items,
bigrams and collocations, but also frequent phrases and whole sentences, and
that such units can directly be used for processing new input.
These insights clearly go against the idea that human perception could be
modeled by a system of rules, i.e. a grammar. "All grammars leak" is Edward
Sapir's well-known dictum. There are so many ambiguities, continua and gradient
categories in language, music and vision that only an approach which takes
massively stored previous experiences into account can accurately model human
perception.
So there
is good reason to believe in Data-Oriented Perception. And I also think there
is good reason to extend DOP to other fields of cognitive psychology. For many
cognitive activities, such as manual reaches, arithmetic operations and problem
solving, people store results in memory so that they can be retrieved whenever
needed rather than being computed from scratch (Data-Oriented Psychology). And if
you believe in Jerry Fodor's dictum that "cognitive science is where philosophy
goes when it dies", then DOP could just as well mean Data-Oriented Philosophy.
And why not Data-Oriented Problem solving, Data-Oriented Proof theory, etc.?
Yet there
is one field where common wisdom has it that a system of "rules"
works so well that a DOP approach seems useless. That field is Physics. What
would Data-Oriented Physics look like? Rather
than laws, we would have a corpus of derivations for all known physical
phenomena ("derivations" describe each step in linking laws to
phenomena). New phenomena can then be explained or predicted by combining
subderivations of previous phenomena. But does this make sense if we can do
the same job with laws only? The answer is that we cannot do the same job with
laws only. It is nowadays well known that there are no general (bridge)
principles that link laws to (models of) phenomena. Each physical phenomenon
has its own way of being linked to laws, usually via approximation schemes,
corrections, renormalizations, and the like.
Just take
the well-known exponential-decay phenomenon in radioactive processes. This
phenomenon cannot be derived from the equations of quantum mechanics by some
set of general principles. It can only be approximately derived, for example by
a Markov approximation over a perturbation expansion of Pauli's equation. But
even before you can do this approximation, you first need to create what Nancy
Cartwright called a "theory-friendly" description of the
phenomenon that will bring it into the theory. You will have to know what
boundary conditions can be used, what normalization procedures are valid, and
the like. Thus the laws of quantum mechanics alone don't predict anything. And
the same holds even for the laws of classical mechanics! In order to fit
Newton's equations of motion to an actual phenomenon such as a pendulum, you
need to know which assumptions and approximations should be made at which steps
in the derivation.
Thus for
each phenomenon you have to figure out how it can be linked to the relevant
laws. Fortunately this does not mean that a resulting link is useless for
understanding new phenomena. As every student of physics knows, once you have
learned how to fit Newton's equations to a number of phenomena you can use
certain derivation steps for a range of other phenomena (for example, parts of
the derivation of the pendulum carry over to oscillators). Thomas Kuhn was on
the right track when he emphasized the importance of "exemplars" in
the training of scientists. And I agree with Ronald Giere that scientists possess a
large collection of exemplars which makes it possible for them to recognize a
new situation as "similar" to previous situations.
It is
exactly here that I think Data-Oriented Physics
should come in. Kuhn's exemplars are DOP's derivations from laws to phenomena,
and Giere's notion of similarity is DOP's analysis mechanism that tries to build
new derivations out of previous derivations. The fewer subderivations you need
to derive a new phenomenon, the more similar this phenomenon is to previous
phenomena. (This has a probabilistic correlate in that fewer subderivations
tend to result in a higher probability for the whole derivation.)
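The probabilistic correlate can be illustrated with a toy calculation. This is a minimal sketch, not an actual DOP implementation: the fragment names and counts are invented for illustration, and it assumes that a fragment's probability is simply its relative frequency in the corpus and that a derivation's probability is the product of its fragments' probabilities.

```python
from collections import Counter
from math import prod

# Hypothetical counts of subderivations in a toy corpus (invented for illustration).
fragment_counts = Counter({
    "pendulum_full": 2,       # one large fragment covering the whole derivation
    "newton_setup": 6,        # smaller fragments that must be combined
    "small_angle_step": 4,
    "harmonic_solution": 4,
})


def fragment_probability(name, counts):
    """Relative frequency of one fragment among all fragments (an assumption)."""
    return counts[name] / sum(counts.values())


def derivation_probability(fragments, counts):
    """Probability of a derivation: the product of its fragments' probabilities."""
    return prod(fragment_probability(f, counts) for f in fragments)


# Two ways of deriving the same phenomenon from the corpus.
one_piece = ["pendulum_full"]
three_pieces = ["newton_setup", "small_angle_step", "harmonic_solution"]

p_one = derivation_probability(one_piece, fragment_counts)
p_three = derivation_probability(three_pieces, fragment_counts)

print(f"one fragment:    {p_one}")
print(f"three fragments: {p_three}")
```

Even though each of the three smaller fragments is more frequent than the single large one, multiplying three probabilities below one gives a smaller result, so the derivation that reuses one large piece comes out as the more probable, and in that sense the more "similar", one.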
So yes, I
also believe in Data-Oriented Physics. Physics is not just about discovering
laws or models for phenomena; it is also about discovering derivations from laws to
phenomena, usually via approximations, corrections, boundary conditions and the
like. Finding these derivations can be hard, but once you have found some of
them, you can productively re-use parts of them for explaining and predicting
new phenomena.
Rather
than a minimalist system of laws, science should be viewed as a
"maximalist" system of known phenomena in the light of which future
phenomena are understood. Perhaps this is what W.V.O. Quine envisaged in 1953 when he
wrote: "As an empiricist I continue to think of the conceptual scheme of
science as a tool, ultimately, for predicting future experience in the light of
past experience."
I'll soon
make a case for Data-Oriented Politics!