Fcd needs several custom LLVM transformation passes to generate decent output. Generating pseudocode happens in four more or less distinct steps:

  1. Lifting machine code to LLVM IR;
  2. “Pre-optimizing” the module using mostly generic passes (GVN, DSE, instruction combining, CFG simplification);
  3. Running the “customizable optimization pipeline”, which has most of the fcd-specific passes, and in which the user may insert passes using the --opt option;
  4. Actually producing pseudocode (which is itself a multi-step operation as well).

Fcd implements several passes that are meant to be run in the customizable optimization pipeline. Notably, these passes serve the purposes of:

  • recovering a function’s actual arguments by analyzing its IR with the active calling convention in mind;
  • simplifying conditions as expressed by the combination of x86 CPU flags;
  • identifying local variables;
  • eliminating dead loads using the MemorySSA analysis (though I assume that this will eventually be done by LLVM);
  • simplifying no-op pointer casts;
  • replacing arithmetic right shifts with sign extension operations.

All of these address a specific problem that is often found in the output of fcd’s x86_64 code lifter. While several of these are fairly simple and straightforward, some passes required a significant debugging effort. The SESE loop pass was, up to now, the most difficult pass to get right.

This is because currently, debugging an optimization pass with fcd is a tedious process when problems can’t be made immediately obvious. If your pass can’t discover itself the problems that it might cause, then the next step is using the --print-after-all command switch inherited from LLVM and analyzing each pass’s input and output to figure out which transformation made things go wrong. This is usually easier with passes further down the pass pipeline, as very often, most of the code has been chewed away. However, when you need to write a pass that should be run early, you’re left with a lot of noise to go through.

Currently, fcd’s x86 front-end extends every integer to 64-bit and only uses 64-bit math. However, things get ugly and messy when the original program used 32-bit maths: the output is riddled with & 2147483647, and some passes fail to identify patterns when inputs are obscured by AND masks. For this reason, I’m working on an “int narrowing” pass, that notably uses LLVM’s DemandedBits analysis to figure out when values don’t actually need to be 64-bits.

Enter bugpoint

Bugpoint is a tool that takes IR code that triggers a bug in a pass, and automaticaly strips it down until everything in the sample is necessary to trigger it. It was written in a way that ensures that it can work without knowing what’s going on with the passes, so it can be used by trained compiler engineers as much as lowly hobbyists like me.

I snubbed bugpoint for the longest time, but this pass that I’m writing seems to be an excellent use case, so I got my hands dirty and set out to make it work.

Joshua Cranmer has a great introduction to using bugpoint with a standalone (i.e. not opt) tool, and this is exactly what we need for fcd.

Generating IR to work with

The first step is to generate IR code ready to be tested. Before today, fcd could only generate an IR dump right after the lifting process and before the “preoptimization pass” (where stable LLVM passes simplify the lifted code as much as they can), which meant that might still be a lot of work to repeat every time bugpoint tries something.

To solve this problem, the --module-in (-m) and --module-out (-n) options can now be specified multiple times. The gist is that the number of times you specify --module-out determines at which decompilation step fcd will dump its module, and the number of times you specify --module-in determines at which step it would resume with that module. In general, a module obtained with -m specified M times should be loaded with -n specified M times as well.

For one occurrence of -n, the module will be dumped right after lifting. For two, it will be dumped after pre-optimizations. For three, it will be dumped after the customizable optimization pipeline. It can’t be specified four times because the next step is the AST production.

For one occurrence of -m, the optimizations will resume just before pre-optimizations. For two, it they will resume just before the customizable pipeline. For three, it will resume just before AST generation. It can’t be specified four times either.

Specifying which passes to run

Another problem with fcd (before now) is that the pass pipeline was fairly inflexible. It had a single customization point about a third of the way in. You could add passes but you couldn’t remove any, so this would also cause bugpoint to waste a lot of time going through passes that are unlikely to be broken.

Fcd now solves this problem by introducing the --opt-pipeline option. This option can be used in three ways:

  • when not specified (or when explicitly set to default), fcd behaves the same as it did before, and still allows specifying more passes with -opt (-O);
  • when set to a string, the string is whitespace-separated into pass names and fcd will run these passes only;
  • when set to the empty string, fcd will start your $EDITOR with a pre-populated list of default passes and allow you to customize it any way you need it.

The second behavior is used with bugpoint.

Using a debug calling convention

Since bugpoint doesn’t know anything about fcd’s calling convention system, it may end up trying to delete instructions that are essential to argument identification. When you want to try against a single pass that doesn’t depend on argument identification, you can get spurious crashes and end up with a module containing a single function whose only instruction is the unreachable terminator.

To work around this issue, fcd now implements the anyarch/noargs calling convention. It doesn’t match any executable but it can be passed as a command-line parameter. This calling convention lets passes assume that functions have no arguments and no returns by performing no analysis, so bugpoint can’t break it.

Running bugpoint

These new enhancements make it possible to use bugpoint to figure out problems with fcd’s custom passes. First, you need to generate a module for bugpoint to play with:

$ fcd -n -n offending-program > offending-program.ll

After this, you need to pass the path of a program that bugpoint will launch, passing the IR file as a parameter. Since this needs to be a path, we can’t just put an fcd invocation with parameters. We have to write a script instead, but it’s fairly straightforward:

fcd -m -m -n -n -n --opt-pipeline="intnarrowing verify" --cc=x86_64/sysv $@

As a side note, this command makes me wish that LLVM’s CommandLine API supported getopt-style short options. -mmnnn looks more natural.

Here, we use -m -m to specify that our module should be passed directly to the custom optimization pipeline; we use -n -n -n to stop before AST generation takes place (since we only care to see if the custom optimization pipeline worked).

--opt-pipeline only has the intnarrowing pass and the verifier pass. The verifier pass, by default, will terminate the process with a non-zero status if verification fails, which is exactly what bugpoint is looking for. (It would also catch crashes, but the pass doesn’t crash.)

Then, bugpoint can be started with this command:

$ bugpoint --compile-custom --compile-command=./fcd.sh offending-program.ll

In my case, bugpoint reduced offending-program.ll into a tiny function:

define void @main() {
  %0 = zext i32 undef to i64
  store i64 %0, i64* undef, align 8
  %1 = sub nsw i64 %0, undef
  %2 = and i64 %1, 2147483648
  br i1 undef, label %"400643", label %"4006f5"

"4006f5":                                         ; preds = %entry
  ret void

"400643":                                         ; preds = %entry

And indeed, when fed this input, the intnarrowing pass produces incorrect output. With a sample this small, it’s much easier to see what’s going wrong.