Oct 23

Texdown: LaTeX simplified. A lot.

Like all students, I hate LaTeX. It's old and outdated and it looks funny and it smells and... aww, I can't go on. I love LaTeX. I love the way it's a standard, and I love the way Leslie Lamport's book about LaTeX is illustrated with anthropomorphic animals. Most of all I love the look of the PDFs it generates.

Despite being awesome, LaTeX shares a couple of annoying characteristics with HTML:

  1. Like HTML, it's hard to syntax highlight. LaTeX is really a very flexible programming language, and that means that doing syntax highlighting "right" requires something pretty much equivalent to actually running LaTeX. Here's an example of Vim doing it wrong:

    This is from Vim. Emacs is a bit better, but not perfect -- I'll get to that below. The point here is that Vim has dutifully syntax highlighted all the LaTeX commands. But is that really what you want? For me, the answer is usually "no". The trouble with syntax highlighting in markup languages is that usually the syntax is much less important than the semantics but, because the markup commands are all mixed up in the text, highlighting the markup just makes the text even harder to read. I want "semantics highlighting".

  2. Like HTML, it's really verbose. Have a look at that picture above. When I read that, my actual thought process goes something like: "Syntax highlighting LaTeX is textit -- oh, wait, italics -- hard." Inline markup is distracting, so the less of it the better.

In the land of HTML, where every problem has been solved, these problems have been solved with specialised markup languages like Markdown, reStructuredText, and a million subtly-different Wiki formats. All these little languages work the same way: someone made a list of the top 15 or 20 things that you would want to do with text, and then created a language which lets you do those things really easily. So in Markdown, for example, you do "emphasis" (almost certainly rendered as italics) *like this*. This is an improvement over standard HTML <em>italics</em>, and a vast improvement over LaTeX \textit{italics}.

The other cool thing about these little languages is that they are easy to make meaningful highlight modes for. In Markdown, * means "use emphasis", and it's the only thing that could possibly mean "use emphasis" in that document (unless you drop down to HTML). So you can quite easily add a highlight mode for Markdown emphasis in your text editor, and it will be right 100% of the time.
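As a rough illustration of why this is easy, here is a minimal Python sketch of an emphasis matcher. It is not taken from any real editor's highlight mode or from Texdown, and the names EMPHASIS and emphasis_spans are made up for the example. It assumes emphasis is a single pair of asterisks on one line, and it ignores the rest of Markdown's rules (underscores, backslash escapes, bold).

```python
import re

# Assumed rule for this sketch: one '*', then text that neither starts nor
# ends with whitespace and contains no '*', then a closing '*'.
# Doubled asterisks (bold) are deliberately not matched.
EMPHASIS = re.compile(r"(?<!\*)\*(?!\s|\*)([^*\n]+?)(?<!\s)\*(?!\*)")


def emphasis_spans(text):
    """Return (start, end, inner_text) for each *emphasis* run in text."""
    return [(m.start(), m.end(), m.group(1)) for m in EMPHASIS.finditer(text)]


if __name__ == "__main__":
    for start, end, inner in emphasis_spans("You do emphasis *like this*."):
        print(start, end, inner)
```

One regular expression is enough because the marker is unambiguous. Doing the same for \textit{...} would already mean tracking nested braces, and that is before anyone redefines the command.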

So I took this idea, ran with it, and wrote Texdown. Texdown has nothing to do with Markdown except sharing half a name and the idea of replacing a complex language with a miniature one. There is syntax for common LaTeX things -- cite, ref, textit, textbf, texttt, and so on. It comes with a Vim highlight mode.

There is also a very simple extension system, written in Python, which lets you define custom "block extensions". This is best described by example. I frequently want to insert small code snippets into my writing. I normally do this using a LaTeX verbatim environment. But for various reasons (pagination, ease of reference) I tend to end up wanting the code inside a labelled figure. Then, of course, you can't really have a figure without a caption. So I end up writing quite a lot of markup every time I want to talk about some code.

In Texdown, any contiguous collection of indented lines is called a "block". If you annotate the first line of the block by ending it with "!!macroname", then your custom function, macro_macroname, will be called during parsing with the contents of the block. Here's how it looks for code descriptions:

This calls the macro named "floatcode". You will see that the macro name above is coloured rather faintly, making it difficult to notice. This is because you will usually write it once and never change it. Here is the LaTeX code produced:

There are a couple of other ways to extend Texdown. Full details are on the Texdown main page.
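To make the block-extension idea concrete, here is a minimal Python sketch of the dispatch described above. It is not Texdown's actual code. The names expand_blocks, MACROS and macro_verbatim are hypothetical, and so are two details the description leaves open: that the macro receives the block as a list of lines, and that the first line keeps its text once the "!!name" annotation is stripped.

```python
import re
import textwrap

# Matches a line ending in "!!name"; group 1 is the line without the annotation.
MARKER = re.compile(r"^(.*?)\s*!!(\w+)\s*$")


def macro_verbatim(lines):
    """Hypothetical macro: wrap the block in a LaTeX verbatim environment."""
    body = textwrap.dedent("\n".join(lines)).split("\n")
    return ["\\begin{verbatim}"] + body + ["\\end{verbatim}"]


# Assumed lookup table; the post says the function is named macro_<name>.
MACROS = {"verbatim": macro_verbatim}


def is_indented(line):
    """A block line starts with a space or tab and is not blank."""
    return line[:1] in (" ", "\t") and line.strip() != ""


def expand_blocks(text, macros=MACROS):
    """Run the named macro over every annotated block of indented lines."""
    lines = text.split("\n")
    out = []
    i = 0
    while i < len(lines):
        if not is_indented(lines[i]):
            out.append(lines[i])
            i += 1
            continue
        # Collect the contiguous run of indented lines: one "block".
        j = i
        while j < len(lines) and is_indented(lines[j]):
            j += 1
        block = lines[i:j]
        match = MARKER.match(block[0])
        if match and match.group(2) in macros:
            block[0] = match.group(1)  # strip the "!!name" annotation
            out.extend(macros[match.group(2)](block))
        else:
            out.extend(block)  # unannotated or unknown macro: leave as-is
        i = j
    return "\n".join(out)


if __name__ == "__main__":
    print(expand_blocks("Some code:\n    x = 1 !!verbatim\n    y = 2\nDone."))
```

Blocks with no annotation, or with a macro name that is not registered, pass through unchanged in this sketch; a real implementation might prefer to report an error for the latter.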

Preemptively-answered questions:

Q. Where is the source code again?

A. lardcave.net/texdown for the information page, code.nyloncactus.com/hg/texdown for the source.

Q. I use Emacs, which has an actually decent LaTeX mode. So I don't need this, right?

A. Maybe you do, and maybe you don't. Emacs is much better at this than Vim, but you still get an awful lot of clutter -- it makes guesses about what stuff inside LaTeX formatting commands should look like, but it doesn't remove the commands themselves. If you're happy with that, great.

Q. I use LyX / some other WYSIWYG editor. So I don't need this, right?

A. Right. If you like LyX. I couldn't bear it when I tried it, but that was ages ago and it has (surely) improved.

Q. Why didn't you use reStructuredText?

A. I don't like the syntax.

Q. So this is a markup language on top of a markup language on top of a markup language?

A. Yes, but PostScript is also a (programmable, extensible) markup language. It's turtles all the way down.

Q. I like the concept, but I don't like Texdown very much.

A. In that case I have you half-brainwashed already, and if you are a more-than-casual LaTeX user you should definitely try out a couple of other macro languages for LaTeX, or write your own. It's worth it. Texdown solves a very personal set of problems, so it's definitely not for everybody, but the minilanguages-for-LaTeX concept is much bigger than Texdown.

As far as other markup languages go, you may be interested in looking at reStructuredText or AsciiDoc.

Q. You only did this because you don't know LaTeX well enough to make it look nice!

A. Yes.

Q. All of this stuff could be done using LaTeX macros and it would be just as easy to edit!

A. Maybe. I doubt it.

Q. Script language programmer!

A. LaTeX nerd!