“I know, I’ll use regular expressions”

Plain language: The snark can approach to within 3% of the boojum.

Here the “3” in “3%” is an instance-specific result, so it makes sense to code something like the following:

The snark can approach to within %d% of the boojum.

But of course, the trailing % sign is not displayed, because % introduces a format specifier in printf-like functions in several languages. What do we do? Double it, of course!

The snark can approach to within %d%% of the boojum.

Now it displays correctly! We can now copy-paste the result into LaTeX for publishing.
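To make the doubling concrete, here is a minimal sketch in Python, whose % operator follows printf-style rules. The choice of Python and the value 3 are assumptions for illustration only; the idea carries over to any printf-like function.

```python
# Python's % operator follows printf-style rules, so it stands in here
# for "printf and friends". The value 3 is a placeholder result.
template = "The snark can approach to within %d%% of the boojum."

# %d is replaced by the number; %% collapses to a single literal percent sign.
sentence = template % 3
print(sentence)  # The snark can approach to within 3% of the boojum.
```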

But wait a minute. Why not have the program print the text with LaTeX markup directly, instead of doing the copy-paste work ourselves? If we send the preceding output straight into the LaTeX file:

The snark can approach to within %d%% of the boojum.

We get the LaTeX output:

The snark can approach to within 3

What happened? Of course, in addition to being a format specifier in printf and friends, % is also a comment character in LaTeX, and blocks everything downstream on that line. To make LaTeX display a % character, it needs to be escaped with a backslash. Let’s escape it then:

The snark can approach to within %d\%% of the boojum.

Still doesn’t work, and it’s a new mode of failure now — we need both the backslash and the percent symbols in the output, and they keep getting in each other’s escape route. OK, so we do both: escape the backslash and double the percent:

The snark can approach to within %d\\%% of the boojum.

Finally, it works! The first backslash escapes the second, so a single literal backslash survives, and the doubled percent is read as a single percent sign. As a result, printf displays “\%” in its output. That output is LaTeX’s input, where the backslash prevents the % from being read as a comment character and causes it to be displayed.
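The two layers of escaping can be checked with a short sketch, again assuming Python's printf-style % operator as a stand-in for whichever printf-like function is in use. The variable names are hypothetical.

```python
# Assumption: output is produced with Python's printf-style % operator.
# In the source code, "\\" is one literal backslash and "%%" is one literal percent.
template = "The snark can approach to within %d\\%% of the boojum."
latex_line = template % 3
print(latex_line)  # The snark can approach to within 3\% of the boojum.

# A raw string avoids doubling the backslash, but the percent must still
# be doubled, because that escape belongs to the formatting layer.
raw_template = r"The snark can approach to within %d\%% of the boojum."
assert raw_template % 3 == latex_line
```

The line that reaches LaTeX contains exactly one backslash followed by one percent sign, which is what LaTeX needs in order to typeset a literal %.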

Where we wanted to be: The snark can approach to within 3% of the boojum.

How we got there: The snark can approach to within %d\\%% of the boojum.

For those of you who don’t know Zawinski’s thoughts on regular expressions:

Some people, when confronted with a problem, think “I know, I’ll use regular expressions”. Now they have two problems.


2 thoughts on “I know, I’ll use regular expressions”

  1. Glad you started blogging.

    Funnily enough, I was quoting the exact same jwz quote to my colleague, who said he wanted a box in the application where he could type a regex to validate input. As for escaping input, Lisp has a slightly better way of telling what to evaluate and what not: the backquote (don’t evaluate) and comma (evaluate) method comes in very handy when writing macros. The trouble starts quickly enough if you nest them.
