Thursday, May 23, 2013

Mathematical Theorems

Let’s play a game. Suppose you start with the following string:


And you can transform this string according to four rules:

  1. You can append an U to the end of every string that ends with an I;
  2. Ever substring of the form Mx (where x is any other substring) can be replaced by Mxx;
  3. Every instance of III can be replaced with an U;
  4. Every instance of UU can be erased from the string.

So, for instance, the following transformations are valid:

MI → MII (Rule 2)

MII → MIIII (Rule 2)


MIIIIU → MIUU (Rule 3)

MIUU → MI (Rule 4)

You can, as you see, apply the rules in any order to transform your string. Now the question is: “Starting with the string MI and using only those four rules, is it possible to arrive at the string MU?”


Mathematically speaking, this game is a simple case of the derivation of a mathematical theorem. A mathematical theorem is nothing more than something that has been shown to be true. In this game, each string you can produce starting with MI and applying the rule is a theorem in and of itself.

Generally, in any mathematical systems you start with a set of axioms. In the case of the previous game, the only axiom is MI and the rules. The axioms are theorems that are fundamental, or independent. They are assumed to be true before we can start speaking about the mathematical system we want. They are the board of the game. So, for instance, when I was talking about the Peano Axioms, they were the basic theorems that define the Natural Numbers. Without them, we’re not talking about numbers at all.

So the axioms are the stuff that tells you what you’re talking about in the first place. They’re absolutely necessary, in that they’re fundamental. It’s like saying to a friend, “Well, it’s delicious, and red, and I love it!” How on earth are they going to know you’re talking about apples if you don’t tell them? You could be a vampire. You need the axioms to explain what you’re talking about.

Then, you need a set of rules to operate on your axioms. It’s impossible to operate on axioms alone unless some of them are also rules. For instance, in the Peano Axioms we have the Successorship Operation, which is a rule you apply on a Natural Number to get another one (and this rule is defined by the axioms that contain it). And you invent other rules, like repeated succession (which is simply addition) based on many applications of the rules and theorems you know.

Let’s get back to the MU-Puzzle above. Suppose I use numbers to represent the number of Is in a given theorem. So, for instance, the theorem M3U is the same as MIIIU. Then repeated applications of the above four rules gives me a new rule, that’s not independent from them but can be used as a heuristic: for any number n > 3, MnU → M(n-3)U.

Let’s give a practical example, again with the Natural Numbers. Peano Axiom 7 states that if S(m) = S(n) then m = n. Since we know that addition is just repeated succession, we can prove the heuristic rule that for all Natural k, m + k = n + k. That’s just applying Axiom 7 (which is in fact also Rule) k times.

Based on this intuitive understanding above, then, we can talk about what it means to prove a theorem:

A theorem T is provable within a mathematical system if and only if you can arrive at T starting from some theorem A by applying any number of rules and other theorems known to be true sequentially on A.

Once you prove a theorem, you can use it to prove other theorems. And you see, this agrees with the definition I gave when talking about Mathematical Proofs: a mathematical proof is a set of steps which are known to produce true things from other true things applied sequentially on a statement known to be true.

I have given many examples of this. For instance, when talking about Addition and its properties, I used the Peano Axioms and the rule I defined when talking about Addition itself to prove those properties. And with that, I proved rather trivial things, like that m + n = n + m. And yet, now that I proved it, I can use it to prove other things, like I did with Multiplication.

A mathematical theorem is a statement that’s known to be true. It can be used to prove other statements. So mathematical proofs, even the most complicated ones, are in fact just the result of the repeated application of the known theorems.


Based on this intuitive and technical understanding of what a theorem is, I’m going to just show an interesting theorem that’s easily proved: there is an infinite number of prime numbers.

Now, primes have all sorts of really neat properties of which I’m going to speak at some point. But I want to prove that there is an infinite number of primes. I will use a method of proof called proof by contradiction: I’ll assume something is true (i.e. postulate an axiom that I don’t know yet is true) and then I’m going to apply the theorems and rules I know on it. If I arrive at a conclusion I know to be false, then that means the axiom must be false (because applying theorems and rules on true axioms always yields new theorems). We call these tentative axioms hypotheses.

The axiom I’ll postulate is this: there is a finite number of primes. Those primes are 2, 3, 5, 7, 11, 13, 17, 19, 23, …, p. So, in this scenario, p is the largest prime.

Now I’ll invent a number k, which equals [(2 * 3 * 5 * 7 * 11 * … * p) + 1]. That is, this number equals all primes multiplied together, plus one. Since k > p and p is by hypothesis the largest prime, k must not be prime.

However, if I divide k by each and every prime, I’ll always have a remainder of exactly 1. Since k isn’t divided by any of the primes in the list, it’s either a prime or composite of primes that are not in the list. Since we arrived at a contradiction, that must mean our hypothesis (there is a finite number of prime numbers) is false.

So, you see, I used a theorem in a proof: the theorem that every non-prime number is divisible by at least one pair of primes. Applying that on the hypothesis I arrived at a conclusion that was false (according to the hypothesis), which means the hypothesis itself must’ve been false.


So why am I talking about this now? Wasn’t my post about mathematical proofs enough?

Yes, to talk about proofs it was. But here I’m talking about what the proofs are about: theorems. The true things proofs are supposed to prove. And mainly, this post serves as a teaser for the next one: there are theorems in all mathematical systems that are true but cannot be proved within that system no matter which combination of theorems and/or rules you use.

I’m talking about Gödel, yeah, and more fun stuff like that. Seeya next time.


Oh, and by the way, I’m not going to prove to you the MU theorem. Good luck with it, though!