What Just Happened?

Sometimes, my computer freezes for a moment. It might be after I pressed a button or typed some text. It might not have an obvious cause – just a background daemon doing something at its scheduled time. It’s not usually a problem – the system goes back to normal after a few seconds. However, it’s annoying, and it takes away my ability to use my computer as a tool for those few seconds. It’s a tiny usability bug.

However, this bug is harder to fix than most. For starters, there’s not always a clear reason for the freeze. If I just clicked on something, I might assume that the program that received the click is
doing some processing. But is it really, or is it doing blocking I/O and waiting for the kernel to load a fragmented file from disk? Or is it trying to make a network connection and waiting for my (sometimes slow) Internet connection to return? If there’s no obvious cause, the problem is even worse.

This post is part of a new idea I had to blog about things I wish I could do, but I probably will never do because I don’t have that much time. In the ideal case, someone else would take the idea and do it, but at the very least I’ll have them down in bits so I can look back years later and say, “yeah, that was a good idea. Too bad I was busy.”

Today’s idea is a way to debug problems like the ones above. Like any nice open source software developer, I want to be able to either fix programs that are causing problems or send in good bug reports. But with strange bugs like the above, that’s hard. First of all, I don’t really know what program is causing the problem, although in some cases I can make educated guesses. But even if I did know which program was the problem, how should I tell someone to reproduce it? “Run the program for 2 hours or so, then click this button and see if you get a pause. If not, keep clicking.” I think the best way to get information on something like this is to be profiling the code at the time the slowdown happens.

Therefore, the way to find and catch all of the small, subtle performance bugs like that is to be profiling all the time. What I want is a daemon that sits on my system just running oprofile.

To keep its memory usage down, it only needs to keep logs for the last, say, 30 seconds of my computer usage. But when I hit a pause, I want to be able to press a button to save those last 30 seconds of logs to a file somewhere.

Basically, what I want is a “what just happened?” button for my computer. It should include profiling information and enough information about input events that I know what might plausibly have
caused it. It should also know something about inter-process communication and networking, for the same reason. I want to be able to get all of this output in text form, so I can send it as an email to the mailing list of my favorite open source project, or in graph form, if I just want to visualize my own system.

Once you think of it as a “what just happened?” button, other ideas pop into your head. For instance, what was the top function on the call stack in every program during the times we’re logging? I’m not sure if we can even record that, but it would be interesting information.

It is a key feature of the idea that the daemon is always running, because you don’t know when a small pause like that will happen. The button is worthwhile because it lets you look backwards in time to find bugs that you don’t know how to reproduce yet. Therefore it would have to be pretty efficient. I’ve run oprofile, and I haven’t noticed any performance problems, so I’m not worried about that.

How many bugs would this really catch? I don’t know. There’s only one way to find that out. But this project would lower the barrier to finding and fixing hard bugs, which can only be a good thing. It might even make it easier for non-programmers to file useful bug reports, which would be really good. And most importantly, I think it has enough potential to be worth a try.

Advertisements

A JIT for Guile

About two years ago, I started writing a JIT compiler for GNU Guile. I found that the things I thought would be hard were actually pretty easy, and the hardest part was something I thought would not be a problem.

Why JIT?

Programming language implementations fall on a spectrum. On one end are implementations that do very little preprocessing of a program, but are relatively slow to execute it. The slowest implementation would leave each part of the program as text until it had to execute it – there would be no preprocessing, but it would have to do a lot of work for each statement or expression. On the other end are implementations that are very fast to execute programs, but might need a long time to preprocess them before they can do that. The classic example of these are C or Fortran compilers, which do a lot of up-front analysis, but generate programs that run very quickly.

Implementations of high-level languages generally hit points in between those. A slow interpreter might convert each program to a syntax tree, and then interpret each syntax tree. A faster one might convert a program to bytecode and interpret that. (That’s what Guile does right now.) “JIT” can mean many different sorts of implementations, but the sort I’m talking about is the next step beyond interpreting bytecode – take a bytecode program and convert it into machine code that does what the interpreter would do if it were interpreting the bytecode. The resulting program runs faster because when the CPU is executing the bytecode interpreter, it needs to do some indirection on each instruction in order to determine what code to run, but with machine code it doesn’t.

Writing the JIT Compiler

So a JIT compiler for Guile is a program that walks through a string of bytecode and, for each bytecode instruction, writes some machine code to a buffer. Then it packages up that buffer as a function that can be called by other Scheme and C functions. I had two main jobs – write a program to do that (the actual JIT compiler), and modify the rest of Guile to use it.

First of all, the program needed to generate assembly code, so I needed an assembler. I evaluated Nanojit, SpiderMonkey, GNU Lightning, GNU Libjit, and DynASM. Some were well documented and some were poorly documented. SpiderMonkey is actually a full JavaScript implementation, not just an assembler, and it would take a lot of work to adapt it to Guile. Even within the assemblers there was variation in how much work they do vs. the speed of the generated code – Lightning generates machine code by appending blocks of pre-written code, and Libjit actually works with RTL and does register allocation itself. I experimented with Lightning, but settled on Libjit, because it supported several processor architectures, was well documented, and did register allocation – I like being on the fast end of the implementation spectrum.

The structure of the compiler is very simple – it’s a big switch statement, just like an interpreter. Here’s what it looks like:

for (objcodep = SCM_C_OBJCODE_BASE (objcode);
     objcodep - SCM_C_OBJCODE_BASE (objcode) < objcodelen;
     objcodep++)
{
  switch ((enum scm_opcode)*objcodep) {
    case scm_op_nop:
    break; /* the easiest opcode to implement. */
    case scm_op_assert_nargs_ee:
    ... cases for more instructions down here ...
  }
  ip = jit_insn_add(function, ip, ipup);
}

(The second-to-last line is there because it’s important to keep the VM’s instruction pointer up to date, so JITted code can interoperate with interpreted bytecode.) I implemented cases for enough instructions to compile

(lambda () 0)

It’s a simple function that returns zero, but it’s enough for me to check that the JIT compiler works, and enough for me to work on integrating the JIT compiler with the rest of Guile.

Connecting the JIT to Guile

Connecting the JIT to Guile was mostly straightforward. I made the bytecode interpreter call the JIT compiler on each piece of bytecode at the beginning. If the JIT interpreter worked, the bytecode interpreter would use the machine code it generated instead of interpreting the bytecode. I gave it a failure mode – if the JIT compiler failed to compile a function, the bytecode interpreter would work as usual. This let me test the JIT easily even when it couldn’t compile many functions. I stepped through the program a few times with GDB and made sure that it did, in fact, call JIT-generated machine code.

I also needed a place to store this JIT code, so I didn’t have to re-generate it each time I ran the same piece of machine code. The byte code is stored in ‘struct objcode’ objects. They take four bytes, and I didn’t want to make them bigger because that would put more pressure on the CPU cache. Luckily, the old ‘struct objcode’s didn’t need all four of their bytes: depending on the source of the objcode, either the third or the fourth byte would have information, but never both. So I could store all of that information in the third byte, and leave the fourth byte available for native code. With all of these pieces in place, I had a working JIT compiler in Guile.

Everything’s Easy … Almost

At this point I thought all of the technical challenges would be over. All that was left was to expand the JIT compiler to include all of the other bytecode instructions, and I would be done.

Unfortunately, that would be a bad way to set up the JIT compiler. Each part of the switch statement in the JIT compiler contained exactly the same information as the equivalent part of the switch statement in the bytecode interpreter. This would be a gigantic duplication of code, and every time we updated the bytecode format we would have to update both the bytecode interpreter and the JIT compiler. That seemed inelegant, so I wanted to do something better: generate them both from the same source.

I emailed the Guile development mailing list, and we discussed two possibilities: either generate the JIT compiler from the bytecode interpreter, or generate them both from some third source. It sounded to me like the maintainers did not want to generate the bytecode from anything, so that left generating the JIT compiler from C code, which required parsing C.

Current Status

And that is where I left the JIT compiler, almost two years ago. I did not have a C parser, and I still don’t. I would like Guile to have one in the long term anyway, because it could let it generate FFI bindings automatically, but I haven’t written it yet.

However, in recent mailing list posts, it appears that either people have changed their minds, or I was wrong before – it now seems fine to generate the bytecode interpreter and JIT compiler from some third format. There has also been some work by Stefan Tampe on generating C code from an S-expression representation. That means that it might be possible to finally add a JIT to Guile soon.

Coming Soon

So what happened in those two years? Lots of things! Guile has many compilers, and I’ve worked on several of them. But more on that in another post.