So you want to tweak docutils?

Docutils is a very nice Python package that converts text documents (namely, documents written in reStructuredText) to various useful formats (such as LaTeX, OpenOffice and so on). Sometimes the default docutils converters need a little tweaking to fit your needs.

I needed to tweak the LaTeX generated by docutils, and I couldn't find any tutorial on how docutils works internally, so here is what I found out in my brief encounter with it.

Note

This article is based on my own experience of tweaking docutils (I couldn't find the relevant pages in the documentation), so this post is in no way authoritative.

How to tweak the output of rst2latex (for example)

You need to create a module that defines your own Writer, which must inherit from docutils.writers.latex2e.Writer. The LaTeX content is actually generated by Writer.translator_class, which is a subclass of NodeVisitor.

Here is my Writer, which renders the code-block directive as a verbatim block in LaTeX.

# -*- coding: utf-8 -*-

from docutils import nodes

from docutils.writers.latex2e import Writer as LatexWriter, LaTeXTranslator

class TweakedLatexWriter(LatexWriter):

  def __init__(self):
    super().__init__()
    self.translator_class = TweakedTranslator

class TweakedTranslator(LaTeXTranslator):

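  def visit_literal_block(self, node):
    # The code-block directive produces a literal_block node with the 'code' class.
    if 'code' in node.attributes['classes']:
      # Make sure the upquote package is loaded in the preamble.
      self.requirements['upquote'] = '\\usepackage{upquote}'
      self.out.append('\n\\begin{verbatim}\n%s\n\\end{verbatim}\n' % node.astext())
      # The node is fully rendered: skip its children and its depart method.
      raise nodes.SkipNode
    # Every other literal block is rendered by the default translator.
    super().visit_literal_block(node)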

The basics are simple enough: create your own writer, and make that writer use your own translator.
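To actually run a document through the tweaked writer, something like the sketch below should work. It repeats the two classes from above so that it is self-contained, and passes a writer instance to docutils.core.publish_string. The SOURCE text is just a made-up sample, and the code-block directive is used without a language argument on the assumption that this avoids needing Pygments; I haven't checked this against every docutils version.

```python
from docutils import nodes
from docutils.core import publish_string
from docutils.writers.latex2e import LaTeXTranslator, Writer as LatexWriter


class TweakedTranslator(LaTeXTranslator):

    def visit_literal_block(self, node):
        if 'code' in node.attributes['classes']:
            self.requirements['upquote'] = '\\usepackage{upquote}'
            self.out.append(
                '\n\\begin{verbatim}\n%s\n\\end{verbatim}\n' % node.astext())
            raise nodes.SkipNode
        super().visit_literal_block(node)


class TweakedLatexWriter(LatexWriter):

    def __init__(self):
        super().__init__()
        self.translator_class = TweakedTranslator


# Made-up sample document, only for illustration.
SOURCE = """\
Demo
====

.. code-block::

   print('hello')
"""

# Pass a writer *instance*; 'unicode' asks docutils for a str, not bytes.
latex = publish_string(
    SOURCE,
    writer=TweakedLatexWriter(),
    settings_overrides={'output_encoding': 'unicode'},
)
print(latex)
```

The printed LaTeX should contain the code wrapped in a verbatim environment instead of whatever the stock writer would emit for a literal block.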

How NodeVisitor works

NodeVisitor works in the following way when rendering contents (I guess there are other node visitors that work differently).

Visiting a node is implemented in the Node.walkabout method.

  1. The relevant method names are derived from the node's class name: for a node class named foo, these are visit_foo and depart_foo. If the NodeVisitor does not define such a method, a default method is called instead.
  2. The visit method is called.
  3. walkabout is called for every child of the current node.
  4. The depart method is called.

You can suppress almost every step of this algorithm by raising the appropriate exception from the visit method. For example, raising nodes.SkipNode skips the children of the current node (and its depart method too), while nodes.SkipChildren skips only the children.
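The traversal described above can be sketched in a few lines of plain Python. This is a simplified model, not the actual docutils code: the Node, TraceVisitor and SkipNode classes here are stand-ins I made up, and the default_visit / default_departure names are only assumptions about what the fallback methods might be called.

```python
class SkipNode(Exception):
    """Raised by a visit method to skip the children and the depart call."""


class Node:
    def __init__(self, *children):
        self.children = list(children)

    def walkabout(self, visitor):
        name = type(self).__name__
        # 1. Derive the method names from the node class name,
        #    falling back to the visitor's default methods.
        visit = getattr(visitor, 'visit_' + name, visitor.default_visit)
        depart = getattr(visitor, 'depart_' + name, visitor.default_departure)
        try:
            visit(self)                # 2. call the visit method
        except SkipNode:
            return                     # skip children and departure
        for child in self.children:    # 3. recurse into every child
            child.walkabout(visitor)
        depart(self)                   # 4. call the depart method


# Toy node classes; the class name decides which methods get called.
class document(Node): pass
class paragraph(Node): pass
class literal_block(Node): pass
class Text(Node): pass


class TraceVisitor:
    def __init__(self):
        self.trace = []

    def default_visit(self, node):
        self.trace.append('visit ' + type(node).__name__)

    def default_departure(self, node):
        self.trace.append('depart ' + type(node).__name__)

    def visit_literal_block(self, node):
        self.trace.append('visit literal_block (custom)')
        raise SkipNode


tree = document(paragraph(Text()), literal_block(Text()))
visitor = TraceVisitor()
tree.walkabout(visitor)
print('\n'.join(visitor.trace))
```

Note that the Text child of literal_block never shows up in the trace, and neither does a depart for literal_block, because the custom visit method raised SkipNode.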