Literate Programming with org-babel and noweb

What is literate programming and why do I use it?

For the past few years I have been experimenting with a concept called literate programming, in my Emacs configuration among other things. The core idea, pioneered by Donald Knuth in 1984, is to write a document intended for humans to read (essentially documentation) that can also be transformed into running code. The code itself is embedded in the documentation as examples. [^1]

I use this technique in my personal Emacs configuration. [^2] I am using an Emacs major mode called org-mode, along with a plugin for org-mode called org-babel, to write my config in the org markup language. The explanation of how each customization should work is interspersed with code blocks, like so:

    
    * Require evil-mode package using use-package

    I use [[https://github.com/emacs-evil/evil][evil-mode]] to apply Vim-like [[https://unix.stackexchange.com/questions/57705/what-is-a-modeless-vs-a-modal-editor][modal editing]] instead of Emacs' obtuse keybindings.

    #+BEGIN_SRC emacs-lisp
      (use-package evil
        :ensure t)
    #+END_SRC

    

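If you save this to a file, open the file in Emacs, and run org-babel-tangle, it should produce an executable .el file with the same base name as the file you saved the markup in. This assumes tangling is enabled for the block, for example with a :tangle yes header argument or a file-level header-args property, since org does not tangle blocks by default.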

You can see here that I have included some formatting in this example. The asterisk (*) indicates a heading. I have also linked to a few sources, and I've included the code to accomplish the task (in this case installing a plugin) right alongside the explanation.

In this simple example, you can see the two main benefits of literate programming:

  1. The code and the documentation are in the same place

  2. The documentation can be nicely formatted and include helpful links

These two things make literate programming a really helpful tool for writing code. Part of the reason I have opted for this, specifically for my Emacs configs, is that while I do tinker with my configs quite a bit, large sections of them are completely static. I set them up once and don't really think about them until there is some kind of problem, sometimes years later.

That means it's very easy for me to add something and completely forget about it until I need to make a change. Literate programming encourages me to shift my mindset from writing code to make Emacs work to writing documentation (for my future self) first and then writing code. That helps me a lot, because it keeps me in the discipline of explaining myself.

What is noweb and what problems does it have?

This is all well and good, but there are some problems. The first thing that I ran into is that if I wanted to document a large block of code that made sense as a unit, there wasn't a really good way to do that. For example:

    
    * Factorial function

    This is a simple function to compute the factorial of a number.

    #+BEGIN_SRC js
      function factorial(number) {
          if(number == 0) return 1;

          return number * factorial(number - 1);
      }
    #+END_SRC

    

If I want to break this function up, I can, but it's ugly. In org mode, extracting source code from a text file is done by a tool called org-babel, which does a lot of things but essentially scans the document for code blocks and concatenates them into a file in the order it reads them, top to bottom. So I might break this function up like this:

    
    * Factorial function

    This is a simple function to compute the factorial of a number.

    #+BEGIN_SRC js
      function factorial(number) {
    #+END_SRC

    We will be implementing this algorithm recursively, so we must first establish our base case to stop the recursion. In this case, the factorial of 0 is always 1, so if the number provided is 0, then we should return 1 and stop.

    #+BEGIN_SRC js
      if(number == 0) return 1;
    #+END_SRC

    If the number provided is not 0, the factorial is defined as the number times the factorial of the previous integer, so we accomplish that by multiplying our current number by the result of recursing into this function again with the previous integer.

    #+BEGIN_SRC js
        return number * factorial(number - 1);
    #+END_SRC

    And we must remember to close our function's opening `{` with its pair.

    #+BEGIN_SRC js
        }
    #+END_SRC


    

But this is pretty ugly. It's strange that we open a function in the first block and then have to close it in a separate code block further below. We could have put that closing curly brace in the same code block as the last line of code, but that would be equally odd to me.
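To make the top-to-bottom concatenation concrete, here is a minimal JavaScript sketch of the idea. It is a toy model under stated assumptions, not org-babel's actual implementation: the `tangle` function and its regular expressions are hypothetical, and it ignores header arguments entirely.

```javascript
// Toy model of tangling: collect the bodies of #+BEGIN_SRC blocks in
// document order and join them. This is NOT how org-babel is implemented;
// `tangle` is a hypothetical helper for illustration only.
function tangle(orgText) {
  const bodies = [];
  let current = null; // lines of the block being read, or null outside a block

  for (const line of orgText.split("\n")) {
    if (/^\s*#\+BEGIN_SRC\b/i.test(line)) {
      current = []; // a new source block starts
    } else if (/^\s*#\+END_SRC\b/i.test(line)) {
      if (current !== null) bodies.push(current.join("\n"));
      current = null; // back to prose
    } else if (current !== null) {
      current.push(line); // a line of code inside the block
    }
    // anything else is prose or a heading and is skipped
  }
  return bodies.join("\n");
}

const doc = [
  "* Factorial function",
  "#+BEGIN_SRC js",
  "function factorial(number) {",
  "#+END_SRC",
  "Prose explaining the base case and the recursive case.",
  "#+BEGIN_SRC js",
  "  if (number == 0) return 1;",
  "  return number * factorial(number - 1);",
  "}",
  "#+END_SRC",
].join("\n");

const tangled = tangle(doc);
console.log(tangled);
```

Because the blocks are simply joined in reading order, the opening and closing braces only line up if the document presents them in exactly the order the interpreter expects.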

Another problem is that concatenating the code blocks together top to bottom means that you must always present the code in the same order in which the compiler or interpreter will read it, and that isn't always helpful. For example, my org-mode config block:

    
    
    (use-package org
      :ensure t
      :config
      (defvar org-directory nil) ; Set this in your local.org file!
      (defvar org-jira-link "") ; Set this in your local.org file!
      (setq todo-org "todo.org")
      (setq professional-org "professional.org")
      (setq personal-org "personal.org")
      (setq school-org "school.org")
      (setq notes-org "notes.org")
      (setq inbox-org "inbox.org")
      (setq project-org "project.org")
      (setq reviews-org "reviews.org")
      (setq meetings-org "meetings.org")
      (setq interruption-org "interruption.org")
      (setq contact-log-org "contact-log.org")
      (setq one_on_one_topics-org "one-on-one-topics.org")
      (defun org-concat-org-directory (fileName)
        (concat org-directory fileName))
      (defun setup-org-agenda-files ()
        (add-to-list 'org-agenda-files (org-concat-org-directory todo-org))
        (add-to-list 'org-agenda-files (org-concat-org-directory professional-org))
        (add-to-list 'org-agenda-files (org-concat-org-directory personal-org))
        (add-to-list 'org-agenda-files (org-concat-org-directory school-org))
        (add-to-list 'org-agenda-files (org-concat-org-directory notes-org))
        (add-to-list 'org-agenda-files (org-concat-org-directory inbox-org))
        (add-to-list 'org-agenda-files (org-concat-org-directory project-org))
        (add-to-list 'org-agenda-files (org-concat-org-directory meetings-org))
        (add-to-list 'org-agenda-files (org-concat-org-directory interruption-org))
        (add-to-list 'org-agenda-files (org-concat-org-directory contact-log-org)))
      (setup-org-agenda-files)
      `("t" ; A "key" to use as a hotkey in the template selection UI
        "Todo" ; A description for the template
        entry ; A type, usually entry
        (file ,(concat org-directory inbox-org)) ; A function that takes
      					; some input, which must
      					; resolve to a string, so
      					; it must be interpreted!
        "* TODO %?\n  %i\n  %a") ; An actual template string
      (setq org-todo-capture-template
            `("t"
      	"Todo"
      	entry
      	(file ,(concat org-directory inbox-org))
      	"* TODO %?\n  %i\n  %a"))
      (setq org-interruption-capture-template
            `("i"
      	"interruption"
      	entry
      	(file+datetree ,(concat org-directory interruption-org))
      	"* Interrupted by %?\n%t"))
      (setq org-note-capture-template
            `("n"
      	"Note to self"
      	entry
      	(file+headline ,(concat org-directory notes-org) "Note to Self")
      	"* Note: %?\nEntered on %U\n  %i\n  %a"))
      (setq org-contact-capture-template
            `("c"
      	"contact"
      	entry
      	(file+datetree ,(concat org-directory contact-log-org))
      	"* Contacted by: %\\1%?
      					  :PROPERTIES:
      					  :NAME:       %^{Name}
      					  :COMPANY:    %^{Company}
      					  :HEADHUNTER: %^{Headhunter|Y|N}
      					  :SOURCE:     %^{Source|LinkedIn|Phone|Email}
      					  :END:"))
      (setq org-one-on-one-capture-template
            `("wo"
      	"one on one topics"
      	plain ; also unsure what plain actually means
      	(file+function ,(concat org-directory one_on_one_topics-org) org-week-datetree)
      	"*** %?")) ; note the 3 asterisks.  Would be nice to figure out how to do that without but eh.
      (setq org-query-capture-template
            `("wQ"
      	"Database Query"
      	entry
      	(file ,(concat org-directory inbox-org))
      	"* %\\2%?
      				:PROPERTIES:
      				:DATABASE: %^{database|STATIC_TABLES|TENANTS}
      				:TICKET:   %^{ticket}
      				:TYPE:     %^{type|DATA|POST_MIGRATION}
      				:END:
      				#+BEGIN_SRC sql :tangle %\\2-%\\1-%\\3.txt
      				#+END_SRC
      				"))
      (setq org-jira-ticket-capture-template
            `("wj"
      	"Jira ticket"
      	entry
      	(file ,(concat org-directory inbox-org))
      	,(concat "* TODO %\\1%?
      				[[" org-jira-link "%^{ticket}][%\\1]]")))
      (setq org-meeting-minute-capture-template
            `("wm"
      	"Meeting notes"
      	entry
      	(file+datetree ,(concat org-directory meetings-org))
      	"* %?\n%U\n"))
      (setq org-emacs-tweak-capture-template
            `("e"
      	"Emacs tweak"
      	entry
      	(file+headline ,(concat org-directory school-org) "Emacs Config Changes")
      	"* %?\nEntered on %U\n  %i\n  %a"))
      (setq org-capture-templates
            `(,org-todo-capture-template
      	,org-note-capture-template
      	,org-interruption-capture-template
      	,org-contact-capture-template
      	,org-emacs-tweak-capture-template
      	("w" "Templates around office/work stuff")
      	,org-one-on-one-capture-template
      	,org-query-capture-template
      	,org-jira-ticket-capture-template
      	,org-meeting-minute-capture-template))
      (setq org-agenda-span 14)
      (setq org-refile-targets (quote ((nil :maxlevel . 5)
      				 (org-agenda-files :maxlevel . 5))))
      (setq org-refile-use-outline-path 'file)
      (setq org-log-into-drawer "LOGBOOK")
      (setq org-todo-keywords
            '((sequence "TODO(t)" "WAITING(w)" "|" "DONE(d)" "CANCELED(c)")))
      (setq org-log-repeat nil)
      (defun org-month-datetree()
        (org-datetree-find-date-create (calendar-current-date))
        ;; Kill the line because this date tree will involve a subheading for the week
        (kill-line))
      (defun org-week-datetree()
        (org-datetree-find-iso-week-create (calendar-current-date))
        ;; Kill the line because this date tree will involve a subheading for the day
        (kill-line))
      (defun org-insert-src-block (src-code-type)
        "Insert a `SRC-CODE-TYPE' type source code block in org-mode."
        (interactive
         (let ((src-code-types
      	  '(
      	    "emacs-lisp"
      	    "python"
      	    "C"
      	    "sh"
      	    "js"
      	    "sql"
      	    "latex"
      	    "lisp"
      	    "org"
      	    "scheme" )))
           (list (ido-completing-read "Source code type: " src-code-types))))
        (progn
          (newline-and-indent)
          (insert (format "#+BEGIN_SRC %s\n" src-code-type))
          (newline-and-indent)
          (insert "#+END_SRC\n")
          (previous-line 2)
          (org-edit-src-code)))
      (evil-define-key 'normal org-mode-map (kbd "<localleader> d s") 'org-schedule)
      (evil-define-key 'normal org-mode-map (kbd "<localleader> d d") 'org-deadline)

      (evil-define-key 'normal org-mode-map (kbd "<localleader> s r") 'org-refile)
      (evil-define-key 'normal org-mode-map (kbd "<localleader> s n") 'org-narrow-to-subtree)
      (evil-define-key 'normal org-mode-map (kbd "<localleader> s a") 'org-archive-subtree-default)
      (evil-define-key 'normal org-mode-map (kbd "<localleader> s w") 'widen)
      (evil-define-key 'normal org-mode-map (kbd "<localleader> s h") 'org-promote)
      (evil-define-key 'normal org-mode-map (kbd "<localleader> s l") 'org-demote)

      (evil-define-key 'normal org-mode-map (kbd "<localleader> p") 'org-priority)

      (evil-define-key 'normal org-mode-map (kbd "<localleader> C i") 'org-clock-in)
      (evil-define-key 'normal org-mode-map (kbd "<localleader> C o") 'org-clock-out)

      (evil-define-key 'normal org-mode-map (kbd "<localleader> T T") 'org-todo)
      (evil-define-key 'normal org-mode-map (kbd "<localleader> b t") 'org-babel-tangle)

      (evil-define-key 'normal org-mode-map (kbd "<localleader> i l") 'org-insert-link)
      (evil-define-key 'normal org-mode-map (kbd "<localleader> i i") 'org-insert-item)
      (evil-define-key 'normal org-mode-map (kbd "<localleader> i t") 'org-set-tags-command)
      (evil-define-key 'normal org-mode-map (kbd "<localleader> i T t") 'org-table-create)
      (evil-define-key 'normal org-mode-map (kbd "<localleader> i T r") 'org-table-insert-row)
      (evil-define-key 'normal org-mode-map (kbd "<localleader> i T c") 'org-table-insert-column)
      (evil-define-key 'normal org-mode-map (kbd "<localleader> i s") 'org-insert-src-block)

      (evil-define-key 'normal org-mode-map (kbd "<localleader> <return>") 'org-open-at-point)

      (evil-define-key 'edit 'org-mode-map (kbd "<M-return>") 'org-insert-item)

      (evil-global-set-key 'normal (kbd "<leader> a o a") 'org-agenda)
      (evil-global-set-key 'normal (kbd "<leader> a o c") 'org-capture))

    

This is a massive code block, and I'd love to present just the use-package call, explain what org-mode is, then do all the configuration separately, perhaps afterward.

Noweb solves this problem. org-babel (the literate programming tool for org-mode) supports a form of Noweb syntax that we can leverage here[^3]!

So to go back to our JavaScript example, our code block could be rewritten to use the noweb syntax something like this:

    
    
    * Factorial function

    This is a simple function to compute the factorial of a number.

    #+BEGIN_SRC js
      function factorial(number) {
         <<factorial-base-case>>
         <<factorial-recursive-case>>
      }
    #+END_SRC

    We will be implementing this algorithm recursively, so we must first establish our base case to stop the recursion. In this case, the factorial of 0 is always 1, so if the number provided is 0, then we should return 1 and stop.

    #+BEGIN_SRC js :tangle no :noweb-ref factorial-base-case
      if(number == 0) return 1;
    #+END_SRC

    If the number provided is not 0, the factorial is defined as the number times the factorial of the previous integer, so we accomplish that by multiplying our current number by the result of recursing into this function again with the previous integer.

    #+BEGIN_SRC js :tangle no :noweb-ref factorial-recursive-case
        return number * factorial(number - 1);
    #+END_SRC


    

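Now the function is syntactically complete in the first block, and the two blocks that follow show (and their documentation explains) how the function is written. For the references to expand, noweb expansion has to be enabled for the outer block, for example with a :noweb yes header argument or a file-level header-args property. I think this is pretty neat, but it has one major problem.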
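As a rough mental model of what the expansion does, here is a small JavaScript sketch. It is an illustration under stated assumptions, not org-babel's code: `expandNoweb` is a hypothetical helper, it only handles references that sit alone on a line, it has no cycle detection, and throwing on an unknown name is a choice made for this sketch.

```javascript
// Toy model of noweb expansion: every line consisting of <<name>> is
// replaced by the body registered under that name, keeping the line's
// indentation. `expandNoweb` is hypothetical, not part of org-babel.
function expandNoweb(body, namedBlocks) {
  return body.replace(/^([ \t]*)<<([^<>\n]+)>>[ \t]*$/gm, (match, indent, name) => {
    if (!namedBlocks.has(name)) {
      throw new Error(`Undefined noweb reference: ${name}`);
    }
    // Expand references nested inside the referenced body, then re-indent
    // every resulting line to match the position of the reference.
    return expandNoweb(namedBlocks.get(name), namedBlocks)
      .split("\n")
      .map((line) => indent + line)
      .join("\n");
  });
}

const namedBlocks = new Map([
  ["factorial-base-case", "if (number == 0) return 1;"],
  ["factorial-recursive-case", "return number * factorial(number - 1);"],
]);

const template = [
  "function factorial(number) {",
  "  <<factorial-base-case>>",
  "  <<factorial-recursive-case>>",
  "}",
].join("\n");

const expanded = expandNoweb(template, namedBlocks);
console.log(expanded);
```

The outer block acts as a template, and the named blocks can appear anywhere in the document, in whatever order reads best.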

What is the solution to this problem with noweb?

The major problem I see with this is that it involves manually adding a :noweb-ref [name] to each code block. I think it's easy to overdo it with these. The more references you have in your document, the more complexity you are adding to how the tangling process works and the more likely you are to have a mistake somewhere.

Luckily, there is a solution! You can give a ref to things other than just code blocks. You can actually give a ref to a heading of a document! If you do, all source blocks underneath that heading (and its children) will be concatenated together, then inserted wherever you reference the heading's ref name. For example:

    
    * Factorial function

    This is a simple function to compute the factorial of a number.

    #+BEGIN_SRC js
      function factorial(number) {
         <<factorial-implementation>>
      }
    #+END_SRC

    ** Factorial implementation
    :PROPERTIES:
    :header-args: :noweb-ref factorial-implementation
    :END:

    We will be implementing this algorithm recursively, so we must first establish our base case to stop the recursion. In this case, the factorial of 0 is always 1, so if the number provided is 0, then we should return 1 and stop.

    #+BEGIN_SRC js :tangle no
      if(number == 0) return 1;
    #+END_SRC

    If the number provided is not 0, the factorial is defined as the number times the factorial of the previous integer, so we accomplish that by multiplying our current number by the result of recursing into this function again with the previous integer.

    #+BEGIN_SRC js :tangle no
        return number * factorial(number - 1);
    #+END_SRC

    

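This will tangle properly if you run org-babel-tangle, as long as tangling and noweb expansion are enabled for the outer block (for example through :tangle yes and :noweb yes header arguments or a file-level header-args property). For a more robust example, take a look at the org-mode configuration I referenced earlier in my configs.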
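The heading-level ref can be pictured as grouping: every block under the heading inherits the same ref name, and blocks sharing a name are joined in document order before being substituted. A minimal JavaScript sketch of that grouping step follows; `collectByRef` and the shape of the block objects are hypothetical, and joining with a newline is an assumption of this sketch.

```javascript
// Toy model: blocks that share a noweb-ref name are concatenated in
// document order. `collectByRef` and the { ref, body } shape are
// hypothetical and only illustrate the grouping idea.
function collectByRef(blocks) {
  const byRef = new Map();
  for (const { ref, body } of blocks) {
    // Append to an existing entry, or start a new one for this ref name
    byRef.set(ref, byRef.has(ref) ? byRef.get(ref) + "\n" + body : body);
  }
  return byRef;
}

// Both blocks live under the "Factorial implementation" heading, so both
// carry the ref name set in that heading's property drawer.
const blocksUnderHeading = [
  { ref: "factorial-implementation", body: "if (number == 0) return 1;" },
  { ref: "factorial-implementation", body: "return number * factorial(number - 1);" },
];

const merged = collectByRef(blocksUnderHeading).get("factorial-implementation");
console.log(merged);
```

With this model, adding a third block under the heading needs no new ref name: it simply becomes the next piece of the concatenated body.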

Footnotes

[^1]: Literate Programming on Wikipedia

[^2]: My emacs config

[^3]: org babel noweb

