The Collatz Conjecture: A simple Yet Devious Problem

mathsnumber-theory

A problem by many names

Imagine a hailstone in a storm. As it's thrown around higher in the clouds collecting moisture it grows larger. Eventually it collides with other hailstones, breaking in two, and falls closer to the ground. You may ask - will it ever collide with the Earth?

There is a similar problem in mathematics known by many names - the Collatz Conjecture, the 3x+13x+1 problem, the Hailstone Problem etc. - which is defined as follows. For all positive integers xx where the iterative function gkg^k is defined as

gk+1(x)={gk(x)2,gk(x) is even3gk(x)+1,gk(x) is oddg^{k+1}(x) = \begin{cases} \frac{g^k(x)}{2}, \text{$g^k(x)$ is even} \\ 3g^k(x) + 1 , \text{$g^k(x)$ is odd} \end{cases}

and g1(x)=xg^1(x) = x, is there some iteration kk where gk(x)=1g^k(x)=1? For example consider x=3x=3. Applying the rules we have:

31051684213 \rightarrow 10 \rightarrow 5 \rightarrow 16 \rightarrow 8 \rightarrow 4 \rightarrow 2 \rightarrow 1

After k=7k=7 iterations we arrive at g7(3)=1g^7(3)=1 which then enters a loop.

Does this happen for all numbers?

It's unknown if for all numbers applying these rules will end up at one - making it one of mathematics' most famous open problems! Due to the simple problem statement it's easy to implement in a script and verify for the first few thousand numbers that it holds. In fact it has been verified true for all numbers up to 2682^{68} - a number over twenty digits long. Despite both this and statistical evidence suggesting the conjecture to be true, there is yet a definitive proof that it holds for all numbers.

You may ask - how may the conjecture be false? Well, one way would be finding an integer xx that enters a cycle that does not contain the number one. Another way is there is a number that when applied to the rules will never each one - a number perhaps so large the rate at which it's halved is less than the rate it increases by.

You may also ask is there something special about 3x+13x+1? Would this occur for other rules? It happens there is something special: consider instead applying 5x+15x+1 when xx is odd for x=5x=5 and see what behavior occurs.

Some interesting observations if the conjecture is true

My favorite observation is to think about the conjecture in reverse. Imagine you were to write all positive integers on a piece of paper. For each you draw a line to the next number in the sequence after applying the rules. Because the conjecture is true each number will have a path to the number one. Moreover this path will be unique - there will be no two paths from the same number that lead to 11 (minus the 4214 \rightarrow 2 \rightarrow 1 loop.)

Now suppose instead of walking this path from the number to one, you start at one. For any positive integer there is exactly one path (or application of the inverse rules - either double the number or subtract one and divide by three if the result is an integer) that will generate it. We can write this sequence using two symbols - \uparrow for doubling and \downarrow for subtracting by one and dividing by three - from right-to-left. For example to generate the number three we can represent the sequence as:

3=Collatz3 = \downarrow \uparrow \downarrow \uparrow \uparrow \uparrow \uparrow_{Collatz}

In the same way that there's a unique representation for each number or a unique prime factorization of it, there is a unique way to generate numbers using these rules. And while you may observe you can also generate any number by starting at one and either adding one and repeating or stop, the mystery of the conjecture is why exactly these rules would result in this behavior.

Another observation is the branches that form by applying the reverse rules. For any number xx we can always double it so there will be an infinite number of nodes representing numbers of the form 2nx2^nx. However it's not always the case that x1x-1 is divisible by three and could branch off into a second branch.

The conjecture states that all numbers lie somewhere on an infinite artery of the form 2nx2^nx where xx is odd. Following this artery to xx the rule 3x+13x+1 takes us to some new artery of the form 2nx2^nx'. This repeats until we reach the central artery 2n(1)2^n(1) in which we end up at the number one.

Why is it so difficult to prove?

Given the conjecture's algorithmic formulation one may ask if results from algorithmic complexity theory may provide insight into the problem. In 1972 Conway's Unpredictable Iterations showed a generalization of the Collatz Conjecture could be represented as the halting problem. That is, there is no algorithm that can decide if the number xx will either reach one or infinitely, other than by applying the conjecture's rules and seeing the result.

Another relationship within complexity theory is with the Busy Beaver problem. The nthn^{th} Busy Beaver problem (BB(n)BB(n)) asks, given a Turing machine with nn possible states and an infinite tape filled with zeros, what is a transition table over the nn states that will:

  1. Terminate in a finite number of steps, and
  2. Perform write operations on the tape to write the maximum number of ones (which is defined as BB(n)BB(n))

Essentially BB(n)BB(n) asks for what is the longest running - but will terminate - algorithm using only nn states. Only up to BB(4)BB(4) is known as adding more states allows for more complex algorithms (and thus more iterations) are possible. In 1979 Erdős' Unconventional Problems in Number Theory conjectured a relationship between BB(15)BB(15) and the Collatz Conjecture. In 2024 Sterin and Woods showed that finding BB(15)BB(15) is at least as hard as proving this relationship (preprint).

What does a solution provide?

In closing it is surprising that a simply stated problem seems to be so difficult. Other than being an interesting problem (made even more so by its popularity) there is much consequence of proving the conjecture true or false. Unlike problems such as the Riemann hypothesis which would provide us more insight into the locations of prime numbers, solving the Collatz conjecture doesn't seem to be consequential. However while the implications of a solution itself may not be of much value the new tools, techniques, and insights gained may provide new understandings in surprising areas of mathematics.