Turing completeness, weird machines, Twitter, and muddled terminology

First off, an apology to the reader: I normally spend a bit of effort to make my blog posts readable / polished, but I am under quite a few time constraints at the moment, so the following will be held to lesser standards of writing than usual.

A discussion arose on Twitter after I tweeted that the use of the term "Turing-complete" in academic exploit papers is wrong. During that discussion, it emerged that there are more misunderstandings of terms that play into this. Correcting these things on Twitter will not work (how I long for the days of useful mailing lists), so I ended up writing a short text. Pastebin is not great for archiving posts either, so for lack of a better place to put it, here it comes:

Our misuse of "Turing completeness" and "weird machine" is harmful and confusing

1. "Turing completeness"

TC refers to computability (in terms of simulating other computers), and is well-defined there. It means "can simulate any Turing machine". There are a few things to keep in mind about TC:

None of the machines we use day-to-day are TC in the absence of I/O. TC requires infinite memory.
TC arises really quickly; all you need is one instruction that subtracts and performs a conditional branch if not zero instruction. It also arises in quite minimal recurrence equations randomly (see the fact that game of life is Turing-complete).

For the exploitation case, the desirable outcome is "we got integer arithmetic and arbitrary read and write". Turing-completeness says nothing about exploitability as exploitability has little to do with computation and much more to do with "what sort of states are reachable given the possible interactions with my target". These are two very distinct questions.

Font renderers are Turing-complete, Javascript is, and considering that if I/O is present, part of the computation may actually be performed on the attacker side, IMAP is as well.

It is simply a complete mis-use of terms. Even the paper that first used it just used it to say "we eyeballed the instructions we got, and they look like we can probably do everything we'd want to do":

The set of gadgets we describe is Turing complete by inspection, so return-oriented programs can do anything possible with x86 code.

Let's not mis-use a precisely defined term for something else just because saying "TC" makes us feel like we are doing real computer science.

Exploitation is about reachable states (and transitions) much more than about computability.

2. "Weird machines"

Weird machines are one of the most tragically misunderstood abstractions (which, if people understand it properly, helps greatly in reasoning about exploitation).

The point of weird machine research is *not* about showing that everything is Turing complete. The point of weird machine research is that when any finite state automaton is simulated, and when that simulation gets corrupted, a new machine emerges, with it's own instruction set. It is this instruction set that gets programmed in attacks. Constraining the state transitions (and hence the reachable states) of a weird machine is what makes exploitation impossible. The computational power (in the TC sense) is secondary.

The key insight that would have saved us all from the dreary experience of reading 25 tit-for-tat-ROP-and-bad-mitigation papers is that if you do not constrain what the weird machines that emerge on a memory corruption can do, your mitigation is probably not going to help. Most mitigations blacklist a particular weird machine program, without constraining the machine's capabilities.

Weird machines are the proper framework in which to reason about pretty much all exploits that are not side-channel based or missing-auth-based.

Now, unfortunately, the abstraction was well-understood intuitively by a few people that had done a lot of exploitation, but not made precise or put in writing. As a consequence, other researchers heard the term, and "filled it with their imagination": They then used the term wherever they found a surprising method of computation in random things by using them as designed. This made the term even harder to understand - in the same way that nobody will be able to understand "Turing complete" by just reading ROP papers.

I am particularly passionate about this part because I spent a few months putting the informally-defined weird machine onto sound footing and into precise definitions to writing to prevent the term from getting even more muddled :-)

To summarize:

The use of "Turing completeness" in ROP papers is an abuse; the use of that term does not correspond to the use of the term in theoretical CS. If we are going to use rigorous terms, we should make sure we use them correctly. I see students and researchers be confused about the actual meaning all the time now.
The ROP-tit-for-tat-mitigation paper deluge (and the subsequent unhealthy focus on CFI) could have been avoided if people had reasoned about ROP things at the right abstraction levels (weird machines). They would have had the chance, because there were quite a few conversations between the authors of the tit-for-tat-papers and people that understood things properly :-), but my suspicion is that there was no incentive to reason on an abstraction level that messes with your ability to salami-slice more papers.
It was made harder to understand the concept of weird machines because people that had not properly understood weird machines started calling everything that looked vaguely interesting "weird machines". Nobody was sitting on any review boards to stop them :-), so now that term (which also has a precise definition) is used in a wrong manner, too.

My larger point is: We have so much handwavy and muddled terminology in the papers in this field, it is harmful to young researchers, other fields, and PhD students. The terminology is confusing, often ill-defined (what is a gadget?), terms that have a precise meaning in general CS get used to mean something completely else without anybody explaining it (Turing complete).

This creates unnecessary and harmful confusion, and it should be fixed.