Building Desed, the sed debugger

Written on 2020-04-21.

I made Desed, a sed demystifier and debugger in Rust. This is how.

When we were assigned homework to build a simple binary search algorithm, I thought I'd do something interesting. One can do such a thing in ten minutes in any sensible language. But most of the things modern programming languages provide are in fact unnecessary. Convenient I/O handling? Asynchronous code? Arrays? Numbers? We can do without any of this.

Sed is a "stream editor for filtering and transforming text". How could anyone use this to program? I always thought the same, until mironimous showed me otherwise.

It turns out sed is Turing complete. Someone even wrote Tetris in it.

Sed can substitute text with regular expressions and run substitutions only if a condition (another regex) is true. Most importantly, it can branch somewhere else based on whether the last substitution was successful.

This makes sed a Turing-complete language. However, if you have ever tried to write a program in sed, or even just a more complex filter, you will have found that sed is hard. It's really easy to make a mistake, and the one-letter commands don't exactly help. This is why GNU sed added --debug, which annotates what sed does during execution and sends it to stdout. This annotation looks like this:

COMMAND:   x
PATTERN: 3\n5
HOLD:    \n0 1 3
COMMAND:   b x
COMMAND:   :x
COMMAND:   s/$/%;%90123456789/
MATCHED REGEX REGISTERS
  regex[0] = 3-3 ''
PATTERN: 3\n5%;%90123456789

The sed debug info actually gives you just enough information to write a full-featured debugger. We need to take the debug output, parse it and store it. The debug dump contains every state sed has been in during execution, so we can build a debugger that supports stepping forward (and backwards!), displaying variables and so on.
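To make the "parse it and store it" step concrete, here is a minimal sketch in Rust. It is not Desed's actual code. The `State` struct and `parse_debug_output` function are hypothetical names, and the sketch assumes only what the excerpt above shows: each `COMMAND:` line starts a new step, and any `PATTERN:` or `HOLD:` lines that follow describe the state after that command. Other lines (such as the regex register dump) are ignored, and leading whitespace in the pattern and hold space is dropped for simplicity.

```rust
/// One snapshot of sed's state, recorded after a command has run.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct State {
    /// Command text as printed after `COMMAND:`.
    pub command: String,
    /// Pattern space after the command (escapes kept as printed).
    pub pattern: String,
    /// Hold space after the command (escapes kept as printed).
    pub hold: String,
}

/// Turn a debug dump into a list of states, one per executed command.
/// Assumption: PATTERN:/HOLD: lines belong to the most recent COMMAND: line,
/// and a space that is not reprinted is unchanged.
pub fn parse_debug_output(dump: &str) -> Vec<State> {
    let mut states = Vec::new();
    let mut pattern = String::new();
    let mut hold = String::new();
    let mut pending: Option<String> = None;

    for line in dump.lines() {
        if let Some(rest) = line.strip_prefix("COMMAND:") {
            // A new command begins, so the previous one is complete.
            if let Some(command) = pending.take() {
                states.push(State {
                    command,
                    pattern: pattern.clone(),
                    hold: hold.clone(),
                });
            }
            pending = Some(rest.trim().to_string());
        } else if let Some(rest) = line.strip_prefix("PATTERN:") {
            pattern = rest.trim_start().to_string();
        } else if let Some(rest) = line.strip_prefix("HOLD:") {
            hold = rest.trim_start().to_string();
        }
        // Anything else (e.g. MATCHED REGEX REGISTERS) is skipped here.
    }

    // Flush the last command of the dump.
    if let Some(command) = pending {
        states.push(State { command, pattern, hold });
    }
    states
}
```

With every state stored in a `Vec`, stepping forward or backward is just moving an index up or down, which is why stepping backwards comes almost for free.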

However, there is one issue with using just sed --debug to capture debug state: it doesn't tell you which source code line it is currently executing. So I actually had to code a subset of sed in Rust to be able to infer which line sed is currently executing and proudly display that to the user. Thankfully sed is a rather simple language, so there aren't many places where things could go wrong. I just need to build a map of labels (sed allows one to specify labels and later jump to them) and then keep track of whether the last substitution was successful, so I know what to do on conditional jumps.
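The label map and the conditional-jump bookkeeping can be sketched roughly as follows. This is an illustration under simplifying assumptions, not Desed's implementation: the names `build_label_map` and `next_line` are hypothetical, the program is assumed to hold exactly one unaddressed command per line, and the "last substitution succeeded" flag is passed in by the caller rather than tracked (real sed also resets that flag when a conditional branch is taken or a new input line is read).

```rust
use std::collections::HashMap;

/// Map each label (a line of the form `:name`) to its line index.
pub fn build_label_map(program: &[&str]) -> HashMap<String, usize> {
    program
        .iter()
        .enumerate()
        .filter_map(|(i, cmd)| {
            cmd.trim()
                .strip_prefix(':')
                .map(|label| (label.trim().to_string(), i))
        })
        .collect()
}

/// Infer which line runs after `current`.
/// Returns `None` when execution reaches the end of the script.
/// Simplification: an unknown label is also reported as `None`.
pub fn next_line(
    program: &[&str],
    labels: &HashMap<String, usize>,
    current: usize,
    last_subst_ok: bool,
) -> Option<usize> {
    let cmd = program.get(current)?.trim();
    let mut chars = cmd.chars();

    let take_jump = match chars.next() {
        Some('b') => true,           // unconditional branch
        Some('t') => last_subst_ok,  // branch if a substitution succeeded
        Some('T') => !last_subst_ok, // GNU extension: branch if it did not
        _ => false,
    };

    if take_jump {
        let label = chars.as_str().trim();
        if label.is_empty() {
            // A branch without a label jumps to the end of the script.
            return None;
        }
        return labels.get(label).copied();
    }

    // Any other command simply falls through to the next line.
    let next = current + 1;
    if next < program.len() {
        Some(next)
    } else {
        None
    }
}
```

Walking the list of executed commands through `next_line` one step at a time yields the source line to highlight at each point of the recorded execution.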

Sed has pretty good interface design, so I was able to figure out what to do very quickly. Thank you, sed developers, for making sed as accessible as possible.

So I went ahead and built a debugger in Rust. I decided to use tui-rs (with the crossterm backend) for the TUI, since it was something I've seen many other projects use and it looked really good. As it turns out, I made a good choice on this one. While the documentation could certainly use some work, the crate is so well designed that you can build a working TUI in one afternoon without any prior knowledge whatsoever.

The debugger is available on GitHub and as a Rust crate. It supports breakpoints, stepping through code (even backwards) and hot reload, where you can edit your code and reload the debugger by pressing one key, keeping all the state of the debugger intact. It actually makes writing sed almost enjoyable.

