The Theory of Theories
You
know what they say about theories: everybody’s
got one. In fact, some people
have a theory about
pretty much everything. That’s not one Master Theory of
Everything, mind you…that’s a separate theory about every little
thing under the sun. (To have a Master Theory, you have to be
able to tie all those little theories together.)
But
what is a “theory”? Is
a theory just a story that you can make up about something, being as
fanciful as you like? Or
does a theory at least have to seem like it might be true?
Even more stringently, is a theory something that has to be
rendered in terms of logical and mathematical symbols, and described
in plain language only after the original chicken-scratches have
made the rounds in academia?
A
theory is all of these things.
A theory can be good or bad, fanciful or plausible, true or
false. The only firm
requirements are that it (1) have a subject, and (2) be stated in a
language in terms of which the subject can be coherently described.
Where these criteria hold, the theory can always be
“formalized”, or translated into the symbolic language of logic
and mathematics. Once
formalized, the theory can be subjected to various mathematical
tests for truth and internal consistency.
But
doesn’t that essentially make “theory” synonymous with
“description”? Yes.
A theory is just a description of something.
If we can use the logical implications of this description to
relate the components of that something to other components in
revealing ways, then the theory is said to have “explanatory
power”. And if we can
use the logical implications of the description to make correct
predictions about how that something behaves under various
conditions, then the theory is said to have “predictive power”.
From
a practical standpoint, in what kinds of theories should we be
interested? Most people
would agree that in order to be interesting, a theory should be
about an important subject…a subject involving something of use or
value to us, if even on a purely abstract level.
And most would also agree that in order to help us extract or
maximize that value, the theory must have explanatory or predictive
power. For now, let us
call any theory meeting both of these criteria a “serious”
theory.
Those
interested in serious theories include just about everyone, from
engineers and stockbrokers to doctors, automobile mechanics and
police detectives. Practically
anyone who gives advice, solves problems or builds things that
function needs a serious theory from which to work.
But three groups who are especially interested in serious
theories are scientists, mathematicians and philosophers.
These are the groups which place the strictest requirements
on the theories they use and construct.
While
there are important similarities among the kinds of theories dealt
with by scientists, mathematicians and philosophers, there are
important differences as well.
The most important differences involve the subject matter of
the theories. Scientists
like to base their theories on experiment and observation of the
real world…not on perceptions themselves, but on what they regard
as concrete “objects of the senses”.
That is, they like their theories to be empirical.
Mathematicians, on the other hand, like their theories to be
essentially rational…to be based on logical inference
regarding abstract mathematical objects existing in the mind,
independently of the senses. And
philosophers like to pursue broad theories of reality aimed at
relating these two kinds of object.
(This actually mandates a third kind of object, the
infocognitive syntactic operator…but another time.)
Of
the three kinds of theory, by far the lion’s share of popular
reportage is commanded by theories of science.
Unfortunately, this presents a problem.
For while science owes a huge debt to philosophy and
mathematics – it can be characterized as the child of the former
and the sibling of the latter – it does not even treat them as its
equals. It treats its
parent, philosophy, as unworthy of consideration.
And although it tolerates and uses mathematics at its
convenience, relying on mathematical reasoning at almost every turn,
it acknowledges the remarkable obedience of objective reality to
mathematical principles as little more than a cosmic “lucky
break”.
Science
is able to enjoy its meretricious relationship with mathematics
precisely because of its queenly dismissal of philosophy.
By refusing to consider the philosophical relationship
between the abstract and the concrete on the supposed grounds that
philosophy is inherently impractical and unproductive, it reserves
the right to ignore that relationship even while exploiting it in
the construction of scientific theories.
And exploit the relationship it certainly does!
There is a scientific platitude stating that if one cannot
put a number to one’s data, then one can prove nothing at all.
But insofar as numbers are arithmetically and algebraically
related by various mathematical structures, the platitude amounts to
a thinly veiled affirmation of the mathematical basis of knowledge.
Although
scientists like to think that everything is open to scientific
investigation, they have a rule that explicitly allows them to
screen out certain facts. This
rule is called the scientific method.
Essentially, the scientific method says that every
scientist’s job is to (1) observe something in the world, (2)
invent a theory to fit the observations, (3) use the theory to make
predictions, (4) experimentally or observationally test the
predictions, (5) modify the theory in light of any new findings, and
(6) repeat the cycle from step 3 onward.
But while this method is very effective for gathering facts
that match its underlying assumptions, it is worthless for gathering
those that do not.
In
fact, if we regard the scientific method as a theory about the
nature and acquisition of scientific knowledge (and we can), it is
not a theory of knowledge in general.
It is only a theory of things accessible to the senses.
Worse yet, it is a theory only of sensible things that have
two further attributes: they are non-universal and can therefore be
distinguished from the rest of sensory reality, and they can be seen
by multiple observers who are able to “replicate” each other’s
observations under like conditions.
Needless to say, there is no reason to assume that these
attributes are necessary even in the sensory realm.
The first describes nothing general enough to coincide with
reality as a whole – for example, the homogeneous medium of which
reality consists, or an abstract mathematical principle that is
everywhere true – and the second describes nothing that is either
subjective, like human consciousness, or objective but rare and
unpredictable…e.g. ghosts, UFOs and yetis, of which jokes are made
but which may, given the number of individual witnesses reporting
them, correspond to real phenomena.
The
fact that the scientific method does not permit the investigation of
abstract mathematical principles is especially embarrassing in light
of one of its more crucial steps: “invent a theory to fit the
observations.” A
theory happens to be a logical and/or mathematical construct whose
basic elements of description are mathematical units and
relationships. If the
scientific method were interpreted as a blanket description of
reality, which is all too often the case, the result would go
something like this: “Reality consists of all and only that to
which we can apply a protocol which cannot be applied to its own
(mathematical) ingredients and is therefore unreal.”
Mandating the use of “unreality” to describe
“reality” is rather questionable in anyone’s protocol.
What
about mathematics itself? The
fact is, science is not the only walled city in the intellectual
landscape. With equal
and opposite prejudice, the mutually exclusionary methods of
mathematics and science guarantee their continued separation despite
the (erstwhile) best efforts of philosophy.
While science hides behind the scientific method, which
effectively excludes from investigation its own mathematical
ingredients, mathematics
divides itself into “pure” and “applied” branches and
explicitly divorces the “pure” branch from the real world.
Notice that this makes “applied” synonymous with
“impure”. Although
the field of applied mathematics by definition contains every
practical use to which mathematics has ever been put, it is viewed
as “not quite mathematics” and therefore beneath the
consideration of any “pure” mathematician.
In
place of the scientific method, pure mathematics relies on a
principle called the axiomatic method.
The axiomatic method begins with a small number of
self-evident statements called axioms and a few rules of
inference through which new statements, called theorems,
can be derived from existing statements.
In a way parallel to the scientific method, the axiomatic
method says that every mathematician’s job is to (1) conceptualize
a class of mathematical objects; (2) isolate its basic elements, its
most general and self-evident principles, and the rules by which its
truths can be derived from those principles; (3) use those
principles and rules to derive theorems, define new objects, and
formulate new propositions about the extended set of theorems and
objects; (4) prove or disprove those propositions; (5) where the
proposition is true, make it a theorem and add it to the theory; and
(6) repeat from step 3 onwards.
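The derive-and-repeat cycle of steps 3 through 6 can be sketched in a few lines of code. This is only a toy illustration under stated assumptions: statements are bare labels, the single rule of inference is modus ponens over "if these premises, then this conclusion" pairs, and the names `derive_theorems`, `AXIOMS` and `RULES` are hypothetical, introduced here for the example.

```python
def derive_theorems(axioms, rules):
    """Close a set of axioms under a simple rule of inference.

    axioms: set of statement labels accepted without proof.
    rules:  list of (premises, conclusion) pairs, read as
            "if every premise is a theorem, so is the conclusion".
    """
    theorems = set(axioms)
    changed = True
    while changed:  # step 6: repeat until nothing new can be derived
        changed = False
        for premises, conclusion in rules:
            already_known = conclusion in theorems
            provable = all(p in theorems for p in premises)
            if provable and not already_known:
                theorems.add(conclusion)  # step 5: add the new theorem
                changed = True
    return theorems


# Hypothetical toy theory: two axioms and three candidate derivations.
AXIOMS = {"A", "B"}
RULES = [
    (("A", "B"), "C"),  # from A and B, derive C
    (("C",), "D"),      # from C, derive D
    (("E",), "F"),      # E is never established, so F is never derived
]

theorems = derive_theorems(AXIOMS, RULES)
print(sorted(theorems))
```

In this sketch "F" never becomes a theorem, because nothing in the theory establishes its premise "E"; the method only ever reaches what follows from the axioms it started with.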
The
scientific and axiomatic methods are like mirror images of each
other, but located in opposite domains.
Just replace “observe” with “conceptualize” and
“something in the world” with “a class of mathematical objects”,
and the analogy practically completes itself.
Little wonder, then, that scientists and mathematicians often
profess mutual respect. However,
this conceals an imbalance. For
while the activity of the mathematician is integral to the
scientific method, that of the scientist is irrelevant to
mathematics (except for the kind of scientist called a “computer
scientist”, who plays the role of ambassador between the two
realms). At least in
principle, the mathematician is more necessary to science than the
scientist is to mathematics.
As
a philosopher might put it, the scientist and the mathematician work
on opposite sides of the Cartesian divider between mental and
physical reality. If
the scientist stays on his own side of the divider and merely
accepts what the mathematician chooses to throw across, the
mathematician does just fine. On
the other hand, if the mathematician does not throw across what the
scientist needs, then the scientist is in trouble.
Without the mathematician’s functions and equations from
which to build scientific theories, the scientist would be confined
to little more than taxonomy. As
far as making quantitative predictions was concerned, he or she
might as well be guessing the number of jellybeans in a candy jar.
From
this, one might be tempted to theorize that the axiomatic method
does not suffer from the same kind of inadequacy as does the
scientific method…that it, and it alone, is sufficient to discover
all of the abstract truths rightfully claimed as “mathematical”.
But alas, that would be too convenient.
In 1931, an Austrian mathematical logician named Kurt Gödel
proved that there are true mathematical statements that cannot be
proven by means of the axiomatic method.
Such statements are called “undecidable”.
Gödel’s finding rocked the intellectual world to such an
extent that even today, mathematicians, scientists and philosophers
alike are struggling to figure out how best to weave the loose
thread of undecidability into the seamless fabric of reality.
To
demonstrate the existence of undecidability, Gödel used a simple
trick called self-reference.
Consider the statement “this sentence is false.”
It is easy to dress this statement up as a logical formula.
Aside from being true or false, what else could such a
formula say about itself? Could
it pronounce itself, say, unprovable?
Let’s try it: “This formula is unprovable”.
If the given formula is in fact unprovable, then it is true
and therefore a theorem. Unfortunately,
the axiomatic method cannot recognize it as such without a proof.
On the other hand, suppose it is provable.
Then it is self-apparently false (because its provability
belies what it says of itself) and yet true (because provable
without respect to content)! It
seems that we still have the makings of a paradox…a statement that
is “unprovably provable” and therefore absurd.
But
what if we now introduce a distinction between levels of
proof? For example,
what if we define a metalanguage as a language used to talk
about, analyze or prove things regarding statements in a lower-level
object language, and call the base level of Gödel’s
formula the “object” level and the higher (proof) level
the “metalanguage” level?
Now we have one of two things: a statement that can be
metalinguistically proven to be linguistically unprovable, and thus
recognized as a theorem conveying valuable information about the
limitations of the object language, or a statement that cannot be
metalinguistically proven to be linguistically unprovable,
which, though uninformative, is at least no paradox.
Voilà: self-reference without paradox!
It turns out that “this formula is unprovable” can
be translated into a generic example of an undecidable mathematical
truth. Because the
associated reasoning involves a metalanguage of mathematics, it is
called “metamathematical”.
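For readers who want to see the symbols, the self-referential formula is usually written as follows in textbook notation. The notation is an assumption of this sketch rather than part of the essay: T stands for the axiomatic system (the object language), Prov_T for its provability predicate, and the corner brackets for the numeric code by which T refers to the formula G.

```latex
% G asserts its own unprovability in T
T \vdash \; G \leftrightarrow \neg\,\mathrm{Prov}_T\!\left(\ulcorner G \urcorner\right)

% If T is consistent, G cannot be proven in T
\text{if } T \text{ is consistent, then } T \nvdash G
```

The second line is established by reasoning about T from outside, that is, in the metalanguage, which is why G can be recognized as true even though T itself cannot prove it.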
It
would be bad enough if undecidability were the only thing
inaccessible to the scientific and axiomatic methods together. But
the problem does not end there.
As we noted above, mathematical truth is only one of the
things that the scientific method cannot touch.
The others include not only rare and unpredictable phenomena
that cannot be easily captured by microscopes, telescopes and other
scientific instruments, but things that are too large or too small
to be captured, like the whole universe and the tiniest of subatomic
particles; things that are “too universal” and therefore
indiscernible, like the homogeneous medium of which reality
consists; and things that are “too subjective”, like human
consciousness, human emotions, and so-called “pure qualities” or
qualia. Because
mathematics has thus far offered no means of compensating for these
scientific blind spots, they continue to mark holes in our picture
of scientific and mathematical reality.
But
mathematics has its own problems.
Whereas science suffers from the problems just described –
those of indiscernibility and induction, nonreplicability and
subjectivity – mathematics suffers from undecidability.
It therefore seems natural to ask whether there might be any
other inherent weaknesses in the combined methodology of math and
science. There are
indeed. Known as the Lowenheim-Skolem
theorem and the Duhem-Quine thesis, they are the
respective stock-in-trade of disciplines called model theory and
the philosophy of science (like any parent, philosophy always
gets the last word). These
weaknesses have to do with ambiguity…with the difficulty of
telling whether a given theory applies to one thing or another, or
whether one theory is “truer” than another with respect to what
both theories purport to describe.
But
before giving an account of Lowenheim-Skolem and Duhem-Quine, we
need a brief introduction to model theory.
Model theory is part of the logic of “formalized
theories”, a branch of mathematics dealing rather
self-referentially with the structure and interpretation of theories
that have been couched in the symbolic notation of mathematical
logic…that is, in the kind of mind-numbing chicken-scratches that
everyone but a mathematician loves to hate.
Since any worthwhile theory can be formalized, model theory
is a sine qua non of meaningful theorization.
Let’s
make this short and punchy. We start with propositional
logic, which consists of nothing but tautological, always-true
relationships among sentences represented by single variables.
Then we move to predicate logic, which considers the
content of these sentential variables…what the sentences actually
say. In general, these
sentences use symbols called quantifiers to assign attributes
to variables semantically representing mathematical or real-world
objects. Such
assignments are called “predicates”.
Next, we consider theories, which are complex
predicates that break down into systems of related predicates; the universes
of theories, which are the mathematical or real-world systems
described by the theories; and the descriptive correspondences
themselves, which are called interpretations.
A model of a theory is any interpretation under which
all of the theory’s statements are true.
If we refer to a theory as an object language and to
its referent as an object universe, the intervening model can
only be described and validated in a metalanguage of the
language-universe complex.
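A small finite example may make the idea of a model more concrete. The sketch below is an illustration under stated assumptions, not a piece of formal model theory: the "theory" is three statements about an ordering relation, each written as a Python function, and an interpretation is a finite universe paired with a relation on it. All names (`THEORY`, `is_model`, `beats` and so on) are hypothetical and introduced only for this example.

```python
def irreflexive(universe, relation):
    """Statement 1: nothing precedes itself."""
    return all((x, x) not in relation for x in universe)


def transitive(universe, relation):
    """Statement 2: if x precedes y and y precedes z, then x precedes z."""
    return all(
        (x, z) in relation
        for (x, y) in relation
        for (w, z) in relation
        if y == w
    )


def total(universe, relation):
    """Statement 3: of any two distinct objects, one precedes the other."""
    return all(
        x == y or (x, y) in relation or (y, x) in relation
        for x in universe
        for y in universe
    )


# The theory: a strict total ordering, given as a list of statements.
THEORY = [irreflexive, transitive, total]


def is_model(universe, relation, theory):
    """An interpretation is a model if every statement comes out true."""
    return all(statement(universe, relation) for statement in theory)


# Interpretation 1: three numbers under "less than".
numbers = {1, 2, 3}
less_than = {(1, 2), (1, 3), (2, 3)}

# Interpretation 2: two letters under alphabetical order.
letters = {"a", "b"}
alphabetical = {("a", "b")}

# Interpretation 3: a rock-paper-scissors style cycle on the same numbers.
beats = {(1, 2), (2, 3), (3, 1)}

print(is_model(numbers, less_than, THEORY))     # True
print(is_model(letters, alphabetical, THEORY))  # True
print(is_model(numbers, beats, THEORY))         # False: not transitive
```

Note that the checking is done by code that stands outside both the theory and its universes, which is the role the text assigns to the metalanguage. Note also that two quite different universes satisfy the same three statements.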
Though
formulated in the mathematical and scientific realms respectively,
Lowenheim-Skolem and Duhem-Quine can be thought of as opposite sides
of the same model-theoretic coin.
Lowenheim-Skolem says that a theory cannot in general
distinguish between two different models; for example, any true
first-order theory about the numeric relationship of points on a
continuous line segment can also be interpreted in a countable
universe, one no larger than the set of integers (counting numbers). On
the other hand, Duhem-Quine says that two theories cannot in general
be distinguished on the basis of any observation statement regarding
the universe.
Just
to get a rudimentary feel for the subject, let’s take a closer
look at the Duhem-Quine Thesis.
Observation statements, the raw data of science, are
statements that can be proven true or false by observation or
experiment. But
observation is not independent of theory; an observation is always
interpreted in some theoretical context. So an experiment in physics
is not merely an observation, but the interpretation of an
observation. This leads
to the Duhem Thesis, which states that scientific
observations and experiments cannot invalidate isolated hypotheses,
but only whole sets of theoretical statements at once.
This is because a theory T composed of various laws {Li},
i=1,2,3,… almost never entails an observation statement except in
conjunction with various auxiliary hypotheses {Aj},
j=1,2,3,… . Thus, an
observation statement at most disproves the complex {Li+Aj}.
To
take a well-known historical example, let T = {L1,L2,L3}
be Newton’s three laws of motion, and suppose that these laws seem
to entail the observable consequence that the orbit of the planet
Uranus is O. But in
fact, Newton’s laws alone do not determine the orbit of Uranus.
We must also consider things like the presence or absence of
other forces, other nearby bodies that might exert appreciable
gravitational influence on Uranus, and so on.
Accordingly, determining the orbit of Uranus requires
auxiliary hypotheses like A1 = “only gravitational
forces act on the planets”, A2 = “the total number of
solar planets, including Uranus, is 7,” et cetera.
So if the orbit in question is found to differ from the
predicted value O, then instead of simply invalidating the theory T
of Newtonian mechanics, this observation invalidates the entire
complex of laws and auxiliary hypotheses {L1,L2,L3;A1,A2,…}.
It would follow that at least one element of this complex is
false, but which one? Is
there any 100% sure way to decide?
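The logical situation can be spelled out by brute force. The sketch below assumes, as the text does, that the conjunction of all five hypotheses entails the predicted orbit O, so that observing not-O shows only that the conjunction is false. It then enumerates every assignment of true and false to the individual hypotheses that is compatible with that finding. The names `HYPOTHESES`, `compatible_assignments` and `sole_culprits` are hypothetical, chosen for this illustration.

```python
from itertools import product

# The complex of laws and auxiliary hypotheses from the Uranus example.
HYPOTHESES = ["L1", "L2", "L3", "A1", "A2"]


def compatible_assignments(hypotheses):
    """Truth assignments that survive a failed prediction.

    Assumption: all hypotheses together entail O, and not-O was observed,
    so the only thing ruled out is "every hypothesis is true".
    """
    survivors = []
    for values in product([True, False], repeat=len(hypotheses)):
        if not all(values):
            survivors.append(dict(zip(hypotheses, values)))
    return survivors


assignments = compatible_assignments(HYPOTHESES)
print(len(assignments))  # 31 of the 32 possible assignments remain

# For which hypotheses is "this one alone is false" still a live option?
sole_culprits = [
    h
    for h in HYPOTHESES
    if any(
        not a[h] and all(v for k, v in a.items() if k != h)
        for a in assignments
    )
]
print(sole_culprits)  # every hypothesis qualifies
```

The failed prediction eliminates exactly one assignment out of thirty-two, and logic alone leaves each law and each auxiliary hypothesis equally eligible for the blame.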
As
it turned out, the weak link in this example was the hypothesis A2
= “the total number of solar planets, including Uranus, is 7”.
In fact, there turned out to be an additional large planet,
Neptune, which was subsequently sought and located precisely because
this hypothesis (A2) seemed open to doubt.
But unfortunately, there is no general rule for making such
decisions. Suppose we
have two theories T1 and T2 that predict
observations O and not-O respectively.
Then an experiment is crucial with respect to T1
and T2 if it generates exactly one of the two observation
statements O or not-O. Duhem’s
arguments show that in general, one cannot count on finding such an
experiment or observation. In
place of crucial observations, Duhem cites le bon sens (good
sense), a non-logical faculty by means of which scientists
supposedly decide such issues.
Regarding the nature of this faculty, there is in principle
nothing that rules out personal taste and cultural bias.
That scientists prefer lofty appeals to Occam’s razor,
while mathematicians employ justificative terms like beauty and
elegance, does not exclude less savory influences.
So
much for Duhem; now what about Quine?
The Quine thesis breaks down into two related theses.
The first says that there is no distinction between analytic
statements (e.g. definitions) and synthetic statements (e.g.
empirical claims), and thus that the Duhem thesis applies equally to
the so-called a priori disciplines.
To make sense of this, we need to know the difference between
analytic and synthetic statements.
Analytic statements are supposed to be true by their meanings
alone, matters of empirical fact notwithstanding, while synthetic
statements amount to empirical facts themselves.
Since analytic statements are necessarily true statements of
the kind found in logic and mathematics, while synthetic statements
are contingently true statements of the kind found in science,
Quine’s first thesis posits a kind of equivalence between
mathematics and science. In
particular, it says that epistemological claims about the sciences
should apply to mathematics as well, and that Duhem’s thesis
should thus apply to both.
Quine’s
second thesis involves the concept of reductionism.
Reductionism is the claim that statements about some subject
can be reduced to, or fully explained in terms of, statements about
some (usually more basic) subject.
For example, to pursue chemical reductionism with respect to
the mind is to claim that mental processes are really no more than
biochemical interactions. Specifically,
Quine breaks from Duhem in holding that not all theoretical claims,
i.e. theories, can be reduced to observation statements.
But then empirical observations “underdetermine” theories
and cannot decide between them.
This leads to a concept known as Quine’s holism;
because no observation can reveal which member(s) of a set of
theoretical statements should be re-evaluated, the re-evaluation of
some statements entails the re-evaluation of all.
Quine
combined his two theses as follows.
First, he noted that a reduction is essentially an analytic
statement to the effect that one theory, e.g. a theory of mind, is
defined on another theory, e.g. a theory of chemistry.
Next, he noted that if there are no analytic statements, then
reductions are impossible. From
this, he concluded that his two theses were essentially identical.
But although the resulting unified thesis resembled Duhem’s,
it differed in scope. For whereas Duhem had applied his own thesis
only to physical theories, and perhaps only to theoretical
hypotheses rather than theories with directly observable
consequences, Quine applied his version to the entirety of human
knowledge, including mathematics.
If we sweep this rather important distinction under the rug,
we get the so-called “Duhem-Quine thesis”.
Because
the Duhem-Quine thesis implies that scientific theories are
underdetermined by physical evidence, it is sometimes called the Underdetermination
Thesis. Specifically,
it says that because the addition of new auxiliary hypotheses, e.g.
conditionals involving “if…then” statements, would enable each
of two distinct theories on the same scientific or mathematical
topic to accommodate any new piece of evidence, no physical
observation could ever decide between them.
The
messages of Duhem-Quine and Lowenheim-Skolem are as follows:
universes do not uniquely determine theories according to empirical
laws of scientific observation, and theories do not uniquely
determine universes according to rational laws of mathematics.
The model-theoretic correspondence between theories and their
universes is subject to ambiguity in both directions.
If we add this descriptive kind of ambiguity to ambiguities
of measurement, e.g. the Heisenberg Uncertainty Principle that
governs the subatomic scale of reality, and the internal theoretical
ambiguity captured by undecidability, we see that ambiguity is an
inescapable ingredient of our knowledge of the world.
It seems that math and science are…well, inexact sciences.
How,
then, can we ever form a true picture of reality?
There may be a way. For
example, we could begin with the premise that such a picture exists,
if only as a “limit” of theorization (ignoring for now the
matter of showing that such a limit exists).
Then we could educe categorical relationships involving the
logical properties of this limit to arrive at a description of
reality in terms of reality itself.
In other words, we could build a self-referential theory of
reality whose variables represent reality itself, and whose
relationships are logical tautologies.
Then we could add an instructive twist.
Since logic consists of the rules of thought, i.e. of mind,
what we would really be doing is interpreting reality in a generic
theory of mind based on logic.
By definition, the result would be a cognitive-theoretic
model of the universe.
Gödel
used the term incompleteness to describe that property of
axiomatic systems due to which they contain undecidable statements.
Essentially, he showed that all sufficiently powerful
axiomatic systems are incomplete by showing that if they were not,
they would be inconsistent.
Saying that a theory is “inconsistent” amounts to saying
that it contains one or more irresolvable paradoxes.
Unfortunately, since any such paradox destroys the
distinction between true and false with respect to the
theory, the entire theory is crippled by the inclusion of a single
one. This makes
consistency a primary necessity in the construction of theories,
giving it priority over proof and prediction.
A cognitive-theoretic model of the universe would place
scientific and mathematical reality in a self-consistent logical
environment, there to await resolutions for its most intractable
paradoxes.
For
example, modern physics is bedeviled by paradoxes involving the
origin and directionality of time, the collapse of the quantum wave
function, quantum nonlocality, and the containment problem of
cosmology. Were someone
to present a simple, elegant theory resolving these paradoxes
without sacrificing the benefits of existing theories, the
resolutions would carry more weight than any number of predictions.
Similarly, any theory and model conservatively resolving the
self-inclusion paradoxes besetting the mathematical theory of sets,
which underlies almost every other kind of mathematics, could demand
acceptance on that basis alone.
Wherever there is an intractable scientific or mathematical
paradox, there is dire need of a theory and model to resolve it.
If
such a theory and model exist – and for the sake of human
knowledge, they had better exist – they use a logical
metalanguage with sufficient expressive power to characterize and
analyze the limitations of science and mathematics, and are
therefore philosophical and metamathematical in nature.
This is because no lower level of discourse is capable of
uniting two disciplines that exclude each other’s content as
thoroughly as do science and mathematics.
Now
here’s the bottom line: such a theory and model do indeed exist.
But for now, let us satisfy ourselves with having glimpsed
the rainbow under which this theoretic pot of gold awaits us.
© 2001 by Christopher Michael Langan (All Rights Reserved)