Andrew Carter edited this page May 24, 2014 · 11 revisions

The Python Preprocessor

DESCRIPTION

The Python preprocessor, also known as pypp, is a text processor that can be used to format files. It is NOT a macro processor. Instead, it preserves spacing and replaces only one string at a time. It uses Python's string formatting operator "%" to make the replacements.
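As a minimal sketch of the "%" operator that the description refers to, the snippet below formats one line against a mapping. The variable names and values are hypothetical and are not taken from pypp itself; only the standard Python behaviour is shown.

```python
# Hypothetical variable table; the names are illustrative only.
variables = {"project": "pypp", "year": 2014}

# Each %(name)s placeholder is looked up in the mapping.
# Text outside the placeholders, including spacing, is left untouched.
line = "    Copyright %(year)s  %(project)s"
result = line % variables

# A literal percent sign has to be written as %%.
escaped = "100%% done" % variables

print(result)
print(escaped)
```

Because the operator only substitutes placeholders, the leading indentation and the double space in the line survive unchanged.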

Directives

The processing includes various directives that detail how the output should be constructed.

All #include, #inside, #if*, and #for directives create a new scope.

All directives allow leading whitespace. If a directive opens a new scope, that whitespace prefixes all lines inside that scope.

#include / #inside

(?P<indent>\s*)[#](?P<directive>include|inside)(?:\s+(?P<name>".*"))?\s*$

Directs the processor to output the contents of another file. If a file name is provided in quotes, the processor switches to that file. If no name is provided, the processor switches to the last file that had an #inside command. Upon reaching the end of the file, the processor switches back to the previous file that had an #include command.

#define / #local

(?P<indent>\s*)[#](?P<directive>define|local)\s+(?:(?P<level>\d+)\s+)?(?P<name>\w+)\s+(?P<value>".*")?\s*$

Defines a variable to be used for replacement. With #define, name is set to value in all scopes except the last level scopes (level defaults to 0). With #local, the variable is defined up to and including scope level, where levels are relative and the top scope is 0.

If value is not supplied, the variable is deleted from those scopes instead.
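To see how the #define / #local pattern splits a line into its parts, the following sketch compiles the pattern as printed in this section and inspects the named groups. The sample lines and the name GREETING are made up for illustration; how pypp itself uses the groups is not shown here.

```python
import re

# The #define / #local pattern as printed in this section.
DEFINE = re.compile(
    r'(?P<indent>\s*)[#](?P<directive>define|local)\s+'
    r'(?:(?P<level>\d+)\s+)?(?P<name>\w+)\s+(?P<value>".*")?\s*$'
)

# A definition with an explicit level and a quoted value.
with_value = DEFINE.match('    #define 1 GREETING "hello"').groupdict()

# No value: per the text above, this form deletes the variable.
without_value = DEFINE.match('#local GREETING ').groupdict()

print(with_value)
print(without_value)
```

Note that the captured value still includes its surrounding quotes. As printed, the pattern also requires at least one whitespace character after name even when value is omitted, so '#local GREETING' with no trailing space does not match.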

#if / #ifn / #ifdef / #ifndef

(?P<indent>\s*)[#](?P<directive>ifn?(?:def)?)(?:\s+(?P<name>\w+))?\s*$

Conditional logic. The #if directive checks if a variable is only an empty string, while #ifdef checks if a variable is in scope. The #ifn* directives do the opposite of their counterparts.
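The directive group in the pattern above is compact, so a quick check of which names it accepts may help. This sketch tests only the name fragment of the pattern against a hypothetical list of candidates.

```python
import re

# Only the directive-name fragment of the pattern above.
IF_NAMES = re.compile(r'ifn?(?:def)?')

candidates = ["if", "ifn", "ifdef", "ifndef", "ifdefn", "else"]

# fullmatch requires the whole candidate to fit the fragment.
accepted = [c for c in candidates if IF_NAMES.fullmatch(c)]

print(accepted)
```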

#else / #elif / #elifn / #elifdef / #elifndef

(?P<indent>\s*)[#](?P<directive>else)\s*$

(?P<indent>\s*)[#](?P<directive>elifn?(?:def)?)(?:\s+(?P<name>\w+))?\s*$

Checks the other case of conditional logic. The #else directive is active only if the preceding #if directive was not, and similarly #elif* are active if the preceding directive was not and the equivalent #if* statement would be active.

#for

(?P<indent>\s*)[#](?P<directive>for)\s+(?:(?P<name>\w+)\s+)?(?P<value>(?:".*"|\w+))\s*$

Iterates over the value. If value is quoted it is treated as a string, otherwise it is treated as a variable name. If the result is a string, it is interpreted as a literal and then that is iterated over instead.

If name is not supplied, the value is expected to be an iterable of dictionaries, and the local stack frame is updated with all the values of each dictionary. Otherwise, if name is supplied, each item is stored in name.
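The two iteration modes can be sketched roughly as follows. This is not pypp's implementation: the function for_frames is hypothetical, a plain dictionary stands in for the stack frame, and parsing the string with ast.literal_eval is an assumption about what "interpreted as a literal" means.

```python
import ast

def for_frames(value, scope, name=None):
    """Yield one scope dictionary per iteration (hypothetical helper)."""
    # Assumption: a string value is parsed as a Python literal.
    items = ast.literal_eval(value) if isinstance(value, str) else value
    for item in items:
        frame = dict(scope)      # copy so the outer scope is not modified
        if name is None:
            frame.update(item)   # item is expected to be a dictionary
        else:
            frame[name] = item   # store the current item under name
        yield frame

# Without a name: each dictionary is merged into the frame.
frames = list(for_frames('[{"x": 1}, {"x": 2}]', {"y": 0}))

# With a name: each item is bound to that name.
named = list(for_frames("[1, 2, 3]", {}, name="i"))

print(frames)
print(named)
```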

#end

(?P<indent>\s*)[#](?P<directive>end)\s*$

This marks the end of control logic, or an #include / #inside.

##

(?P<indent>\s*)[#](?P<directive>\s)(?P<value>.*)$

The processor will process this line once, and then run again on the results instead of outputting it. If the resulting line has line breaks in it, it is split into separate lines and each of them is processed. For instance, the yes command could be replicated via

#local yes "y%(\n)s##%%(yes)s"
##%(yes)s

If the line is indented, the indentation prefixes all lines of the first result.
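The two-pass behaviour can be illustrated with plain Python string formatting. This is only a sketch of the idea, assuming that each pass is an ordinary "%" substitution and that the result is split on newlines; the scope contents are hypothetical.

```python
# Hypothetical scope; "\n" mirrors the special newline variable.
scope = {"name": "world", "\n": "\n"}

# %% survives the first pass as a single %, so %%(name)s
# only becomes a live placeholder on the second pass.
line = "hello %%(name)s%(\n)sbye"

first = line % scope

# The first result contains a line break, so it is split
# and every resulting line is formatted again.
second = [part % scope for part in first.split("\n")]

print(repr(first))
print(second)
```

Doubling the percent sign is what defers a substitution by one pass, which is why the yes example writes %%(yes)s inside the stored value.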

#call

(?P<indent>\s*)[#](?P<directive>call)\s+(?:(?P<return>\w+)\s*=\s*)?(?P<func>\w+)(?P<args>(?:\s+(?:"(?:[^\\"]|\\.)*"|\w+))*)\s*$

The processor will call the function supplied, with arguments given as quoted strings or as variable names. The result of the function will be stored in the return variable, if supplied.
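As a sketch of how the #call pattern separates the return variable, the function name, and the arguments, the snippet below matches one made-up line. The names total, add, and count are hypothetical, and the second pattern ARG is simply the argument alternative lifted out of the full pattern so that the arguments can be listed individually.

```python
import re

# The #call pattern as printed in this section.
CALL = re.compile(
    r'(?P<indent>\s*)[#](?P<directive>call)\s+(?:(?P<return>\w+)\s*=\s*)?'
    r'(?P<func>\w+)(?P<args>(?:\s+(?:"(?:[^\\"]|\\.)*"|\w+))*)\s*$'
)

# The single-argument alternative from the pattern above.
ARG = re.compile(r'"(?:[^\\"]|\\.)*"|\w+')

match = CALL.match(r'#call total = add "a \"quoted\" word" count')

# Split the raw argument text into quoted strings and bare names.
args = ARG.findall(match.group("args"))

print(match.group("return"), match.group("func"))
print(args)
```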

Special Variables

'\n'

A newline character.

Setting this variable does not change the newlines of the file.

__INDENT__

The current level of indentation.

Setting this variable causes all lines to be prefixed with the newly set value.

__DATE__

The date the processing started in '%b %d %Y' format.

__TIME__

The time the processing started in '%H:%M:%S' format.
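The two format strings above are standard strftime patterns. A small sketch with a fixed, made-up start time shows the shape of the resulting text; the month abbreviation assumes the default C locale.

```python
from datetime import datetime

# Hypothetical start time, fixed so the output is reproducible.
started = datetime(2014, 5, 24, 13, 5, 9)

date_text = started.strftime('%b %d %Y')   # abbreviated month, day, year
time_text = started.strftime('%H:%M:%S')   # 24-hour clock

print(date_text)
print(time_text)
```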

__LEVEL__

The current stack level.

Setting this variable has no effect on #define or #local statements.

__FILE__

The path that PyPP used to open the file.

__LINE__

The current line number of the file.

This variable can be set, but only to integral values.

Known Bugs

The processor currently does not handle escape sequences in strings, and it allows quotes inside strings (it only checks for the closing quote at the end).
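The quote behaviour can be seen directly in the #define / #local pattern documented earlier on this page, where the value is matched with a greedy ".*" between quotes. The sketch below uses a made-up line to show that everything up to the last quote is captured as one value.

```python
import re

# The #define / #local pattern as documented on this page.
DEFINE = re.compile(
    r'(?P<indent>\s*)[#](?P<directive>define|local)\s+'
    r'(?:(?P<level>\d+)\s+)?(?P<name>\w+)\s+(?P<value>".*")?\s*$'
)

# Two quoted pieces on one line: the greedy ".*" spans both.
match = DEFINE.match('#define PAIR "left" "right"')
value = match.group("value")

print(value)
```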