Sat
Jul 10

Python snippet: nicer Python tracebacks

Python exceptions are really helpful but they're also rather ugly. Here's a mean one I encountered today. Don't try to interpret it -- it's only here to show off its ugliness:

I wanted it to look more like this:

So I wrote some code to do that. Specifically, the exception formatter I wrote does the following:

  1. Prints only the 5 deepest stack frames (older frames are usually noise)
  2. Prints code from only the 3 deepest stack frames (same reasoning)
  3. Prints as little of the file name as possible -- the .py suffix is removed, as is any directory information above the current working directory
  4. Prints file names and method names lined up, one after the other. Note that the messy traceback above almost does this already, but that's just luck.
  5. If Pygments is installed, syntax-highlights the code.

All of the above is, of course, configurable.

Here's the code:

bettertb.py

Usage:

import bettertb
bettertb.run(your_main_function, arg1, arg2, ...)

This is a pretty quick and dirty work in progress, so email me (nicholas at lardcave) if you have suggestions, fixes, etc. The first major problem is that getting tracebacks is now more rewarding than having the code finish successfully.

Sat
Oct 24

Texdown: LaTeX simplified. A lot.

Like all students, I hate LaTeX. It's old and outdated and it looks funny and it smells and... aww, I can't go on. I love LaTeX. I love the way it's a standard, and I love the way Leslie Lamport's book about LaTeX is illustrated with anthropomorphic animals. Most of all I love the look of the PDFs it generates.

Despite being awesome, LaTeX shares a couple of annoying characteristics with HTML:

  1. Like HTML, it's hard to syntax highlight. LaTeX is really a very flexible programming language, and that means that doing syntax highlight "right" means you have to do something pretty-much equivalent to actually running LaTeX. Here's an example of Vim doing it wrong:

    This is from Vim. Emacs is a bit better, but not perfect -- I'll get to that below. The point here is that Vim has dutifully syntax highlighted all the LaTeX commands. But is that really what you want? For me, the answer is usually "no". The trouble with syntax highlighting in markup languages is that usually the syntax is much less important than the semantics but, because the markup commands are all mixed up in the text, highlighting the markup just makes the text even harder to read. I want "semantics highlighting".

  2. Like HTML, it's really verbose. Have a look at that picture above. When I read that my actual thought process goes something like: "Syntax highlighting LaTeX is textit -- oh, wait, italics -- hard." Inline markup is distracting, so the less of it the better.

In the land of HTML, where every problem has been solved, these problems have been solved with specialised markup languages like Markdown, reStructuredText, and a million subtly-different Wiki formats. All these little languages work the same way: someone made a list of the top 15 or 20 things that you would want to do with text, and then created a language which lets you do those things really easily. So in Markdown, for example, you do "emphasis" (almost certainly rendered as italics) *like this*. This is an improvement over standard HTML <em>italics</em>, and a vast improvement over LaTeX \textit{italics}.

The other cool thing about these little languages is that they are easy to make meaningful highlight modes for. In Markdown, * means "use emphasis", and it's the only thing that could possibly mean "use emphasis" in that document (unless you drop down to HTML). So you can quite easily add a highlight mode for Markdown emphasis in your text editor, and it will be right 100% of the time.

So I took this idea, ran with it, and wrote Texdown. Texdown has nothing to do with Markdown except sharing a name and the idea of replacing a complex language with a miniature one. There is syntax for common LaTeX things -- cite, ref, textit, textbf, texttt, and so on. It comes with a Vim highlight mode.

There is also a very simple extension system, written in Python, which lets you define custom "block extensions". This is best described by example. I frequently want to insert small code snippets into my writing. I normally do this using a LaTeX verbatim environment. But for various reasons (pagination, ease of reference) I tend to end up wanting the code inside a labelled figure. Then, of course, you can't really have a figure without a caption. So I end up writing quite a lot of mark-up every time I want to talk about some code. In Texdown, any contiguous collection of indented lines is called a "block". If you annotate the first line of the block by ending it with "!!macroname", then your custom function, macro_macroname, will be called during parsing with the contents of the block. Here's how it looks for code descriptions:

This calls the macro named "floatcode". You will see that the macro name in the above is coloured rather faintly, making it difficult to notice. This is because you usually will need to write this once and not change it. Here is the LaTeX code produced:

There are a couple of other ways to extend Texdown. Full details are on the Texdown main page.

Preemptively-answered questions:

Q. Where is the source code again?

A. lardcave.net/texdown for the information page, code.nyloncactus.com/hg/texdown for the source.

Q. I use Emacs, which has an actually decent LaTeX mode. So I don't need this, right?

A. Maybe you do, and maybe you don't. Emacs is much better at this than Vim, but you still get an awful lot of clutter -- it makes guesses about what stuff inside LaTeX formatting commands should look like, but it doesn't remove the commands themselves. If you're happy with that, great.

Q. I use LyX / some other WYSIWYG editor. So I don't need this, right?

A. Right. If you like LyX. I couldn't bear it when I tried it, but that was ages ago and it has (surely) improved.

Q. Why didn't you use Restructured Text?

A. I don't like the syntax.

Q. So this is a markup language on top of a markup language on top of a markup language?

A. Yes, but PostScript is also a (programmable, extensible) markup language. It's turtles all the way down.

Q. I like the concept, but I don't like Texdown very much.

A. In that case I have you half-brainwashed already, and if you are a more-than-casual LaTeX user you should definitely try out a couple of other macro languages for LaTeX, or write your own. It's worth it. Texdown solves a very personal set of problems, so it's definitely not for everybody, but the minilanguages-for-LaTeX concept is much bigger than Texdown.

As far as other markup languages go, you may be interested in looking at reStructuredText or asciidoc.

Q. You only did this because you don't know LaTeX well enough to make it look nice!

A. Yes.

Q. All of this stuff could be done using LaTeX macros and it would be just as easy to edit!

A. Maybe. I doubt it.

Q. Script language programmer!

A. LaTeX nerd!

Tue
Oct 6

Passing file descriptors through Unix-domain datagram sockets

This caused me no end of trouble, and Google wasn't much help:

The example consists of two files. socketmaster.c opens a file you pass it and starts listening on a Unix-domain datagram socket. socketslave.c sends a single-character request to socketmaster's datagram socket. Upon receipt of the request, socketmaster sends the file descriptor of the open file to socketslave's datagram socket. Socketslave reads a small amount of data from the file, displays it, and exits. Run socketslave again and you get the next portion of the file, proving that the file pointer is being shared between the (persistent) socketmaster and the (transient) socketslave.

socketmaster.c

socketslave.c

Thanks to the following three links which contain code to do almost, but not quite, what I want: kragen-hacks; blackhats; docs.hp.com.

Sun
Sep 27

Writing asynchronous device drivers with synchronous code

This post is about device drivers and owes a lot to papers written by Leonid Ryzhyk.

Most devices are asynchronous. You tell the device to do something, then some time later it lets you know that it's finished. The typical programming model for asynchronous devices uses callbacks, something like this:

read_bytes():
    write_to_device_register(READ_DATA)

device_irq_handler():
    if(read_from_device_register(STATUS) & DATA_READY)
        ...

The problem with this model is that it's kind of hard to read. It's particularly annoying if you are doing something which involves lots of interaction with the device, such as initialisation:

void init(int state)
{
    register_callback(init, state);
    static int counter;

    try_again:
    switch(state) {
        case 1:
            write_to_device_register(SETUP1, ...);
            break;
        case 2:
            if(there was an error) {
                counter += 1;
                if (counter < 10) {
                    goto try_again;
                } else {
                    die();
                }
            }
            write_to_device_register(SETUP2, ...);
            break;
        case 4:
            handle possible errors;
        default:
            panic();
    }
}

void device_irq_handler(void)
{
    if(callback registered) {
        callback();
    }
}

It would be much easier to read the code if the device behaved synchronously. Then you could just write this:

void init(void) 
{
    int counter;

    for(counter = 0; counter < 10; counter++) {
        write_to_device_register(SETUP1, ...);
        if(no error encountered);
            break;
    }

    if(error_encountered) {
        die();
    }

    write_to_device_register(SETUP2, ...);
    handle possible errors;
}

So how do you make an asynchronous device behave synchronously? There are lots of ways. The most obvious way is simply to poll the device. This may even be feasible for some devices in some situations (for example, when initialising the device), but, in general it's not a feasible option, because the cycles consumed by polling the device could be better spent doing something (anything!) else.

The second way is to use threads. Under this approach, write_to_device_register() would go to sleep on some synchronisation primitive. The IRQ handler would wake up any thread sleeping on the synch primitive. This approach is pretty much what the Linux kernel does.

Threads have their problems: they really require a lot of thinking to make sure you get both correctness and performance out of the device driver you're writing. For example, Linux allows multiple threads to execute device driver code simultaneously. Multi-threaded drivers are signficant source of device driver bugs.

But I wasn't at the stage of writing concurrency bugs. I just didn't want to manage more than one thread. So instead I abused the pre-processor to create the illusion of synchronous device operation while secretly using callbacks.

Here's how that device init function looks using this approach:

EC_FUNCTION(init)
    static int counter;

    EC_START
    for(counter = 0; counter < 10; counter++) {
        EC_WRITE(SETUP1, ...);
        if(no_error)
            break;
    }

    if(error) {
        die();
    }

    EC_WRITE(SETUP2, ...);
    handle_possible_errors();
    EC_END
EC_FUNCTION_END

This has the same block structure as the threading version, and is pretty easy to read (once you get over the yucky pre-processor macros). But what is that tell-tale "static" doing in front of "int counter"?

That's right, this is just syntactic sugar over a callback-based model. After preprocessing, it looks something like this:

void init(struct device_info *info)
{
    static int counter;

    static void *__label = &&label0;
    goto *__label;

label0:
    for(counter = 0; counter < 10; counter++) {
        __label = &&label1;
        device_write(SETUP1, ..., init, info);
        goto end;
label1:
        if(no_error)
            break;
    }

    if(error) {
        die();
    }

    __label = &&label2;
    device_write(SETUP2, ..., init, info);
    goto end;
label2:
    handle_possible_errors();
    __label = &&label0;
end:
    ;
}

Hideous, huh? The good thing is, you only need to get this sort of thing right once, and then you don't need to see it again.

This approach has some really good points: firstly, you get complete control over where you could be pre-empted in a very intuitive way: when you interact with the device. Any internal state manipulation has no chance of being corrupted. It's not perfect, though: as you can see, you can't use local variables. If you do want to make use of local variables, you need to add some code to save and restore the stack -- essentially continuations.

You can download my preprocessor macros here: event_chain.h.

If you find this sort of thing interesting you should look at Leonid Ryzhyk's publications, particularly "Dingo: taming device drivers".

EDIT: This page gives another summary of the same technique (thanks to Benno for the link)

Sat
Aug 1

Recording and replaying events with Android

For benchmarking, I wanted to be able to replay an event sequence. For example, I wanted a test to open the browser, browse to a Web page, scroll around on the page, and then return to the application launcher. Android offers lots of ways to do this, but because I like systems programming (and haven't touched Java in years) I picked quite a low-level one: capturing and injecting Linux input device events.

Android uses the Linux input system in the standard way, which means events from the keyboard, touchpad, and so forth are available at /dev/input/eventX, where X is some number starting from 0. It is quite straightforward to record these events to a file -- each event arrives as a struct input_event, which is (on Android) a 128-byte structure containing a timestamp, an event type, and a payload.

Recording events is easy. What about playing them back? I started dreaming up something that would rename the /dev/input/eventX files and create PTYs for each, but then I discovered that there is a far easier way: thanks to a feature added early in the 2.6 kernel series, you can just inject events by writing struct input_events back to the relevant files!

This ease of debugging really is an advantage of using a fully-fledged kernel rather than a custom or cut-down operating system.

Anyway, the end result is two programs, "record" and "replay". The "record" program reads events from /dev/input/eventX and writes them to a file named /sdcard/events. The "replay" program reads this file and replays the events in real time.

The programs read from and write to /dev/input/eventX, even though /dev/uinput should in theory work -- it doesn't appear to on the ADP1.

Both programs are quite "proof of concept", and probably only work on the HTC Dream for now, but they are very simple and easy to adapt.

Update 30/Dec/2009: Aaron made some changes to "replay" which should give much smoother results on OpenMoko phones.

injectevents.tar.gz (6k)

Mon
Jul 27

GOSUB with Python byte code

This blog post is about some of the terrible ways you can abuse Python byte code: specifically, by using it to implement lightweight function calls ("GOSUB") directly in byte code. Doing this is obviously stupid and disgusting, and (spoiler) regular Python function calls are fast enough to make this hack not worth the effort. But trying it was quite interesting.

We don't need a new namespace, or exception handling, or any of the other nice features that Python functions provide. In fact the only thing we need is support for 1) jumping to an arbitrary location and then 2) returning to wherever we came from later. That is, we need GOSUB.

Jumping to arbitrary locations is very easy, since Python bytecode has a JUMP_ABSOLUTE opcode. Unfortunately the argument to this opcode must be an offset calculated at compilation time -- there is no way to change the address of a JUMP_ABSOLUTE statement from within the bytecode itself. This makes returning to wherever we came from later impossible with JUMP_ABSOLUTE -- unless we only ever call the subroutine from one location, which isn't true in general.

Fortunately, more devious people than I think about these problems: Phillip J. Eby has discovered that a certain opcode, END_FINALLY, normally used at the end of Python finally blocks, does its work by taking the address to return to from the stack. If we put the location we want to jump to on the stack and call END_FINALLY, we can jump wherever we want. Effectively, it's a computed GOTO. Sadly, END_FINALLY was designed as part of the exception handling machinery and not as part of the dodgy-hack-supporting machinery. As a result, the code is a little more complicated than you might think -- but not much.

This hack can be nicely abused to implement GOSUB, like so:

          0 LOAD_CONST               4 (12)
          3 LOAD_CONST               1 (37)
          6 SETUP_LOOP               3 (to 12)
    >>    9 JUMP_ABSOLUTE           14
         12 POP_BLOCK           
         13 RETURN_VALUE        
    >>   14 LOAD_CONST               2 (2)
         17 INPLACE_ADD         
         18 ROT_TWO             
         19 LOAD_CONST               3 (32)
         22 END_FINALLY

Here we are calling a subroutine that adds 2 to the value on the top of the stack. First we push a value to return to after the subroutine finishes onto the stack -- in this case we return to byte 12. Then we push an argument to the subroutine, in this case 37. SETUP_LOOP creates a block for looping or exception handling, and is required to make the END_FINALLY hack work. We then jump directly to the subroutine at byte 14.

In the subroutine we add 2 to whatever is at the top of the stack. Our stack then contains the return address, followed by the result of our calculation. We're about to jump to the return address, so we want that at the top of the stack. ROT_TWO does that for us by swapping the top items at the top of the stack. We then LOAD_CONST a magic value that is part of the END_FINALLY hack -- it is WHY_CONTINUE, a number that has special meaning to the CPython interpreter. It pops WHY_CONTINUE off the stack, which instructs it to continue execution at the value at the top of the stack -- which is now our return value.

Execution then resumes at byte 12, which removes the block we created with SETUP_LOOP, and then returns the value at the top of the stack, which is the return code from our subroutine. In this case the function returns the value 39.

There are some problems with this approach:

  • POP_BLOCK, in addition to removing a block from the block stack, also returns the stack level to the level that it was when SETUP_LOOP was called. So if the subroutine returns any values on the stack you have to be careful. One simple way to deal with this is to have subroutines that take as many arguments as they return (i.e. 1).
  • BREAK_LOOP now has useless (but well-defined) behaviour: it acts as if the subroutine has immediately returned, but doesn't provide any way of letting the caller know what happened. BREAK_LOOP still works if the subroutine sets up an actual (real) loop by pushing another block to the block stack. It should actually be possible to "fix" this by having SETUP_LOOP target an exception-handling block, essentially giving the subroutine a choice of two different locations to return to -- taking the nastiness to the next level!
  • The CPython block stack is only 20 levels deep, which is pretty reasonable if you think about it as being equivalent to 20 levels of indentation. :) This means that as things currently stand, subroutine calls can't be nested very far. There is a very easy way to fix this one, though.
  • All this is an incredibly seedy hack-on-top-of-a-hack. It relies on a side-effect of END_FINALLY that is only guaranteed to work in the canonical C implementations of Python, and even then only versions 2.3 and upward.

Adding an extra opcode would make this much nicer. There was some talk back in 2006 about implementing a JUMP_TOP opcode, which would simply jump to the bytecode position at the top of the stack. Unfortunately, the motivation for adding it was to add a "switch" statement to Python, and nowadays that doesn't seem very likely.

Fortunately, there is no real motivation to use this hack. Running a simple test (10000000 iterations, on a 2.2Ghz MacBook) gives the following results:

MethodTime in seconds
GOSUB2.78
CALL_FUNCTION, hand-written3.00
CALL_FUNCTION, Python3.67

In these results, "GOSUB" is the hand-written GOSUB hack discussed above, and the other two use the regular Python function call mechanism: one with hand-written bytecode, and the other written in regular Python and compiled in the normal way. Even with ten million calls there isn't that huge a speed improvement to make GOSUB worthwhile.

Taking screenshots with an Android ADP1

There are lots of tutorials around the 'net to take screenshots using DDMS. Unfortunately using DDMS makes it difficult to take automated screenshots for testing purposes.

Fortunately Android uses reasonably-standard Unix conventions for access to devices, so you can read the data from /dev/graphics/fb0. The full procedure for getting a screenshot is:

adb pull /dev/graphics/fb0 fb0
python rgb565torgb888.py <fb0 >fb0.888
convert -depth 8 -size 320x480 RGB:fb0.888 fb0.png

Imagemagick (and thus convert) doesn't seem to deal with the 16-bit colour (565) image format, so I wrote a program to expand the data to 24-bit (888) before passing it to convert. You can use mine (rgb565torgb888.py, 311 bytes) or any other one you may find out there. :)