Note: This article is in progress. The assignment is due Wednesday the 13th, so you can count on it being done sometime around then!
This problem comes to us by way of Tibor Radó's 1962 paper on the Busy Beaver function. A brief formulation goes like this: Suppose you define the productivity of a Turing machine with n states as the number of 1's written to the Turing machine's tape after it successfully halts. Represent the productivity as a function p(n). Call the machine with n states that maximizes p(n) the Busy Beaver Machine. Radó proved that no Turing machine can compute the value of p(n)^{1}.
Our primary interest in the problem is to find the Busy Beaver Machine for n=5 in order to win a contest for class: who can write the 5state Turing machine that halts and writes the most ones to an initially blank tape?
For an intuition about the magnitudes involved in this problem, previous research turns up the current leader in the
n=5 case at
Pascal Michel's page. This particular machine (found in 1989 by Heiner Marxen and Jürgen Buntrock) produces 4,098 marks before halting after 47,176,870 steps.
Formalizing the problem a bit
The Busy Beaver problem, being a subset of Turing literature, requires a bit further definition. Our Turing machine has the following form:

0 
1 
1 
S_{new} D I_{new} 
… 
… 
n 
The instructions arranged into rows corresponding to the states of the Turing machine, and into columns based on the contents of the cell at the tape's current location. The instruction cell consists of three components: what to write at the current location, which way to move the tape, and what state to transition to next.
First approach: the stochastic method
The stochastic method builds random Turing machines and tests them. Because the
Halting Problem is also not solvable, we run the Turing machine for an arbitrary amount of time and stop the machine if it hasn't stopped itself by then.
The stochastic method is a rather naive method. Since it's not an exhaustive search of the state space, it's not possible to determine whether
p(5) has been maximized or not. The best we can do is run the stochastic machine builder, keeping track of the most performant candidates, until we run out of time for the contest.
It's important to understand the size of the search space. Suppose we examine each cell in the 2×
n table individually and assign it an instruction. We choose S
_{new}, D, and G independently for each instruction, giving 2×2×5 = 20 different operating instructions. Add to that the halt instruction and the total is 21 instructions, for a total of 10
^{21} combinations!
If the stochastic method finds any solution at all, it's by sheer luck. My python process has been running several hours, and the best it's come up with is 5state Turing machine that produces 1 mark on the tape and halts. Obviously, even a human can do better than this!
Intuitions about behavior may improve our chances
Intuition 1: Avoiding Isomorphisms
The intuition here is that much of the state space is populated by clones—that is, equivalent Turing machines. Any machine
M' which is equivalent to the halting machine
M also halts and additionally produces the same number of marks on the tape. If a bruteforce method could avoid generating all the isomorphisms of a halting machine, the state space to be searched might be significantly reduced.
This approach, done without care, may produce a crippling effect on memory usage as each halting machine is stored so the algorithm can avoid producing its isomorphisms. Additionally, if the generator randomly generates new machines and then attempts to prove to itself whether it's an isomorphism of a known machine is likely to be more costly than simply running the machines themselves.
What's needed is an algorithm for enumerating the nonisomorphic machines with
n states that doesn't involve storing history.
Intuition 2: Graph Theory
Enter graph theory. By reformulating the question in terms of graph theory, we can take advantage of the powerful methods and proofs already developed in this domain. To simplify the machine representation, let's only examine the potential instruction sequences instead of the entire machine. Let's build a directed graph of the states. We'll represent the special HALT instruction as an extra node, giving n + 1 nodes to the state graph. We'll represent state transitions with colored edges to indicate that the state transitions are not fungible.
First, some obvious observations on the necessary nature of this graph. The state graph should be connected, with no islands. By this I mean from any node it should be possible to reach the halt node. Otherwise the machine has effectively fewer than
n states, and it's been proven that
p(n) > p(n') where
n > n'. This constraint guarantees that it is at least plausible that the machine will eventually reach the special halt node. Additionally, this helps the machine avoid some obvious infinite loops—for example, a state where both outgoing edges lead back to itself. The graph may be cyclic without necessarily causing the machine to run forever—the state graph doesn't encompass enough information to determine this (and in fact this is a good thingwe're hoping by simplifying the problem we can make it tractable. We already know the problem is not directly soluble, we're just trying to find a practical approximation).
Fixing the state graph is a big win. For a given nstate graph, there are now only O(n
^{2}) machines (in reality, this number is slightly smaller—at least one of the instructions is the halt instruction, which has no second component). The next question in combinatorial analysis is: how many nonisomorphic, connected graphs of n+1 nodes, where n of the nodes have two distinct outgoing edges, and the special halt node has no outgoing edges?
Finally, the last question is: how do we generate these graphs?
Intuition n: The optimal number of HALTs
The question of how many HALT instructions to include in the machine is intuitively related to the bridgecrossing question. Look at it this way: we must start at state 0, and to get the best of the machine we should try to use every state possible.
In some cases, this is an even number of crossings, in some cases, an odd number. This affects the number of HALT instructions that need to be available (or does it? After all, we're not restricted to crossing a bridge only once—we're restricted to only crossing bridges that exist. That is, we can't modify the machine after it starts running).
This most likely will turn out to be a false intuition.
1.
http://plato.stanford.edu/entries/turingmachine/