Victor Hugo compared his multifarious work to the ocean: against an immense horizon of ideas half glimpsed, he launched into successive projects as they floated pêle-mêle on the surface of his imagination, unsure (or so he said) whether he would ever emerge from their depths.
His adventures on the page were more contained. He worked on Les Misérables in two phases over a decade. On each page of his manuscript, he wrote in lines of eight or so words on the right-hand side, using the left for many later revisions and interpolations.
Source: Gallica
A page at a time, he plunged ahead into the book and then retraced his steps only to go once more beneath the surface, again a page at a time.
What is Hugo?
My reproduction of the manuscript here owes as much to Hugo, the software, as to Hugo, the writer. It is prompted less by my admiration for a great nineteenth-century writer than by the more prosaic question of how one might alternate today between work contained in a manuscript notebook and a reproducible version, like a typescript, or a public archive, like a website or a blog.
Hugo is an open-source static site generator, that is, a tool for organizing individual pieces of ‘content’ into a website without the need to write these in the languages on which the web depends, namely HTML and CSS. It is well, though at times tersely, documented and is widely used by self-confident developers. It is an alternative, for instance, to WordPress, which is a content management system, also open-source. Jekyll, on the other hand, is an alternative open-source static site generator and was used to build Barack Obama’s fundraising platform in the US elections of 2012. This site has been made using Hugo with the Cocoa theme.
The humanities user…
Why might writers in the humanities make use of a system like this? The advantages I see in doing so are three:
- this approach allows ‘content’ to be used and reused in many contexts and for different purposes: thus, Hugo converts a file in a plain-text format called Markdown into one in HTML, which is also plain-text
- it is a means of ensuring the future accessibility of the computer files in which ‘content’ is stored, something that is less assured in the case of a file in a binary format, like a PDF, which mixes content with instructions readable only by a computer (‘machine-readable’); it is easy to go from Markdown to PDF, less easy to perform the same conversion in reverse
- and it allows one to take one’s private ‘content’ and make it publicly available in ways that are relatively straightforward: this is specifically what Hugo helps you to do
… and Markdown
Markdown was originally developed by John Gruber and its rationale was to simplify as far as possible the process of formatting a computer file for display, as Gruber says:
The overriding design goal for Markdown’s formatting syntax is to make it as readable as possible. The idea is that a Markdown-formatted document should be publishable as-is, as plain text, without looking like it’s been marked up with tags or formatting instructions.
An editor like Brackets allows you both to write and to edit documents in Markdown, and in turn to preview the content as it would appear in print.
Source: Brackets (screenshot)
In other words, a file in Markdown can be rendered by an editor so as to be available for printing, with the structured elements clearly displayed, e.g. where inline markup generates italics.
It can just as readily be re-used as the basis of a webpage, through conversion into HTML with a tool like Hugo. In brief, content can be used and re-used to different ends.
And because it is a plain-text format, unlike a file in a binary format generated using a word-processing application like MS Word, there is practically no limit to future access to the computer file in which it is contained: each and every element of the file is human-readable and is thus largely exempt from the constraints of particular computer architectures or languages, now or in the future.
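The contrast between plain-text and binary files can be checked directly. What follows is a minimal sketch in Python rather than a definitive test: the helper `looks_like_plain_text` is a hypothetical name of my own, and it assumes that ‘plain text’ means bytes that decode as UTF-8 and contain no control characters other than tabs and line breaks.

```python
def looks_like_plain_text(data: bytes) -> bool:
    """Return True if the bytes decode as UTF-8 and hold no stray control characters."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    allowed = {"\n", "\r", "\t"}
    return all(ch in allowed or ch.isprintable() for ch in text)


# A short Markdown document, encoded as it would be stored on disk.
markdown_bytes = "# Les Misérables\n\nEn 1815, M. Myriel était évêque de Digne.\n".encode("utf-8")

# Bytes of the kind that open a JPEG file (illustrative only, not a real image).
jpeg_like_bytes = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00"

print(looks_like_plain_text(markdown_bytes))
print(looks_like_plain_text(jpeg_like_bytes))
```

The Markdown bytes pass the check and the image-like bytes fail it, which is the practical meaning of ‘human-readable’ here: any editor, on any system, can open the first and show you every character it contains.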
Writing in practice: structured documents
Hugo’s manuscript page was visibly organized: he used roman numerals to designate chapters, sometimes wrote their titles in ink of a different colour, and indented new paragraphs, in addition to all of the separate ways in which he marked new material for insertion. For each new book, he included a list of the chapters of which it was composed, as well as its title.
Needless to say, Markdown implies the production of documents in the form of computer files, rather than manuscript pages. It also calls for the use of a text editor, rather than a word processing program. Another such editor is Atom, an open-source project that is well documented. This is one of the points of contrast between Hugo and WordPress: in the latter, you create content and edit your material using a dedicated interface within a web browser.
Source: Atom (screenshot)
One of the advantages of Markdown is that it is designed to produce structured documents, in which the function of distinct elements is clearly encoded in the file. Thus, a document is made up of ‘content’ and of ‘markup’, the latter consisting of tags and other identifiers that designate component parts of the content. A structured document is in plain text when the content and markup alike are human-readable and are composed using a pre-determined set of characters.
So, you use minimal code to designate a heading and other elements:
# This is a level-one heading
This is a paragraph with a *link* to [Hugo](http://gohugo.io)
Here is an unordered list:
And here is a block quotation:
>En 1815, M. Charles-François-Bienvenu Myriel était évêque de Digne.
>C'était un vieillard d'environ soixante-quinze ans; il occupait le siège
>de Digne depuis 1806.
A paragraph, by contrast, doesn’t bear any specific code in Markdown, but is simply delimited by being followed by a blank line.
This practice is different from how a word-processor is normally used, where one typically applies a format to a given element of a document, e.g. by applying bold to a heading for display, rather than designating its structural function. The advantage of a structured document, where, say, a heading or a list is designated as such, is that information about the structure can be re-used, just as much as the content: a single Markdown file can give rise to a PDF for printing, or to a webpage via Hugo, with the structure being displayed as appropriate in each case. Using Pandoc, a file in Markdown can be converted into Word.
So, when a Markdown file is converted into HTML using Hugo, the program draws on information concerning structural elements and marks up the content accordingly for display in a web-browser, using tags as distinct from Markdown code, and where elements can be nested within other elements:
<h1>This is a level-one heading</h1>
<p>This is a paragraph with a <em>link</em> to <a href="http://gohugo.io">Hugo</a></p>
<p>Here is an unordered list:</p>
<p>And here is a block quotation:</p>
<blockquote>
<p>En 1815, M. Charles-François-Bienvenu Myriel était évêque de Digne. C'était un vieillard d'environ soixante-quinze ans; il occupait le siège de Digne depuis 1806.</p>
</blockquote>
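To make the conversion less mysterious, here is a minimal sketch in Python of how a program might turn a small subset of Markdown into HTML. It is not how Hugo itself works internally: the function names `render_inline` and `render_markdown` are hypothetical, and the sketch assumes that blocks are separated by blank lines and handles only headings, block quotations, paragraphs, emphasis and links.

```python
import html
import re


def render_inline(text: str) -> str:
    """Convert *emphasis* and [label](url) links inside one block of text."""
    text = html.escape(text, quote=False)
    text = re.sub(r"\[([^\]]+)\]\(([^)\s]+)\)", r'<a href="\2">\1</a>', text)
    return re.sub(r"\*([^*]+)\*", r"<em>\1</em>", text)


def render_markdown(source: str) -> str:
    """Render a very small subset of Markdown: headings, block quotations, paragraphs."""
    output = []
    # Blocks are assumed to be separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", source.strip()):
        lines = [line.strip() for line in block.splitlines()]
        if not lines:
            continue
        heading = re.match(r"(#{1,6})\s+(.*)", lines[0])
        if heading and len(lines) == 1:
            level = len(heading.group(1))
            output.append(f"<h{level}>{render_inline(heading.group(2))}</h{level}>")
        elif all(line.startswith(">") for line in lines):
            # A paragraph nested inside a block quotation.
            quoted = " ".join(line[1:].strip() for line in lines)
            output.append(f"<blockquote><p>{render_inline(quoted)}</p></blockquote>")
        else:
            output.append(f"<p>{render_inline(' '.join(lines))}</p>")
    return "\n".join(output)


sample = """# This is a level-one heading

This is a paragraph with a *link* to [Hugo](http://gohugo.io)

>En 1815, M. Charles-François-Bienvenu Myriel était évêque de Digne.
>C'était un vieillard d'environ soixante-quinze ans; il occupait le siège
>de Digne depuis 1806.
"""

print(render_markdown(sample))
```

The point of the sketch is that the program never needs to guess what a heading or a quotation is: the Markdown file says so, and the converter simply translates one notation into another.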
Because Markdown is structured, it is easy to read in an editor — headings are clearly distinguished from paragraphs, for instance. Because the formatting that denotes the structure is minimal, it is unobtrusive. Because both content and formatting are directly visible in the editor, Markdown is transparent.
In Atom, the right-hand part of the screen displays any headings that the document may contain; in other words, its organization into sections is visible in the Outline tab. Because one can move from section to section using these links, it is possible to use a text editor like this even when writing quite long documents.
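An outline of this kind is possible only because headings are marked as such in the file. The following Python sketch shows one way to extract them. It is not how Atom builds its Outline tab: the function name `outline` is hypothetical, and the sketch assumes headings written with leading `#` characters and skips anything inside a fenced code block.

```python
import re

FENCE = "`" * 3  # the marker that opens and closes a fenced code block


def outline(markdown_text: str) -> list[tuple[int, str]]:
    """Return (level, title) pairs for each '#'-style heading, in document order."""
    headings = []
    in_fence = False
    for line in markdown_text.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # ignore '#' lines inside code blocks
            continue
        match = re.match(r"(#{1,6})\s+(.+)", line)
        if match and not in_fence:
            headings.append((len(match.group(1)), match.group(2).strip()))
    return headings


# A placeholder document, not a file from this site.
sample_document = """# Writing in practice

Some text.

## Structured documents

More text.

## Tools
"""

for level, title in outline(sample_document):
    print("  " * (level - 1) + title)
```

Each heading is printed with an indent that reflects its level, which is all that an outline view needs in order to let you jump from section to section.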
Pandoc Markdown is a further version that is well adapted to writing in the humanities, with provision for ordered lists, where items may be marked with uppercase and lowercase letters and roman numerals, in addition to Arabic numerals, and for textual features like small capitals. CommonMark, on the other hand, is a project designed to standardize Markdown, and it helpfully demonstrates that one can learn Markdown in sixty seconds. It is also possible to experiment with writing in CommonMark, to see how it can be rendered as a document for printing or in HTML, and to follow a longer interactive tutorial.
Source: CommonMark (screenshot)
Dedicated tools, like Marked2, allow you to preview a Markdown file and to export it for printing as a PDF; you can also modify the styles used for display of specific elements of the document, e.g. a heading, a block quotation.
Because Markdown is a plain-text format, it doesn’t directly include or represent binary sources, like an image file in JPEG format, the contents of which are not human-readable; a word-processor file, which is itself in binary format, will embed and display the image directly. Rather, Markdown, when used with Hugo, can instruct the program to retrieve these from source.
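In Markdown, an instruction of this kind takes the form `![description](path)`: an exclamation mark, a description in square brackets and the location of the file in round brackets. The Python sketch below shows how such a reference might be turned into the corresponding HTML tag. It is an illustration under stated assumptions, not Hugo's own code: the function name `render_images`, the file path and the description are all hypothetical.

```python
import html
import re

# Matches ![description](path), where the path contains no spaces.
IMAGE = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)\)")


def render_images(markdown_text: str) -> str:
    """Replace Markdown image references with HTML <img> tags."""
    def to_tag(match):
        alt = html.escape(match.group(1), quote=True)
        src = html.escape(match.group(2), quote=True)
        return f'<img src="{src}" alt="{alt}">'
    return IMAGE.sub(to_tag, markdown_text)


# The path and description are placeholders, not files from this site.
line = "![A manuscript page](/images/manuscript-page.jpg)"
print(render_images(line))
```

Note that neither the Markdown nor the HTML contains the image itself: both hold only a human-readable pointer to where the image file is kept.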
Again we see that instructions to the computer in a plain-text file are contained in human-readable text throughout, just like content and markup. Webpages, which are written in HTML, are also plain-text, and contain instructions for the retrieval of image files, for instance, in exactly the same way. This means that you can draw on image files, or movie files, or sound files as you write your content and then rely on a program like Hugo to ensure that instructions for the correct retrieval and display of these are duly incorporated into your webpage. An editor like Brackets can equally incorporate an image file into a format for printing, alongside your structured plain text, as you can see in the screenshot above: note that in the Markdown file itself, the image is not displayed.
Going public: open research
Hugo published Les Misérables in 1862, while he was still in exile in Guernsey. It had a colossal impact on its first appearance and has, of course, lent itself to endless adaptations since that time.
To achieve a comparable outcome with Hugo the site generator, you write all of the content for your site in separate Markdown documents, usually organized in a predetermined directory structure, and use a master, or configuration, file to provide basic information about your site, e.g. its title, its compliance with data protection legislation. It is a static site tool in the sense that what it generates are the individual plain webpages that will make up the site, a factor that contributes to its ease of use. A dynamic system, by contrast, can produce changing content, e.g. as drawn from a database, using a dedicated application located on the server to which the user connects.
A website can be a research tool. It allows you to facilitate open access to your research, by providing links to publications in institutional repositories, or direct access to these and to work in progress. A website is a means of maintaining a blog. A website can also be a means of producing reproducible research, with access in addition to the original data on which it is based, as documented in The Plain Person’s Guide to Plain Text Social Science, where the application of a much wider range of tools to the use and reuse of writing and resources is explained in telling detail. A pro-forma site starter kit for use with Hugo also exists, which is called… Victor Hugo.
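The arrangement of files can be sketched in a few lines of Python. This is an illustration only, with several assumptions: the site title, address and post are invented, the helper `scaffold` is a hypothetical name, and the configuration file is called `hugo.toml` as in recent versions of Hugo (older versions look for `config.toml`). In practice, Hugo's own command-line tool creates this structure for you.

```python
from pathlib import Path
import tempfile

# A minimal configuration file: basic information about the site (placeholder values).
CONFIG = '''baseURL = "https://example.org/"
languageCode = "en"
title = "A hypothetical research site"
'''

# One page of content: front matter between the '---' lines, then Markdown.
PAGE = '''---
title: "First post"
draft: false
---

This is a paragraph with a *link* to [Hugo](http://gohugo.io)
'''


def scaffold(root: Path) -> list[Path]:
    """Create a configuration file and one content page under root."""
    config_path = root / "hugo.toml"
    page_path = root / "content" / "posts" / "first-post.md"
    page_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(CONFIG, encoding="utf-8")
    page_path.write_text(PAGE, encoding="utf-8")
    return [config_path, page_path]


# Build the structure in a temporary directory and list what was created.
with tempfile.TemporaryDirectory() as tmp:
    for path in scaffold(Path(tmp)):
        print(path.relative_to(tmp))
```

Everything here is plain text: the configuration, the information at the head of the page and the content itself can all be read and edited in the same editor.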
An interactive site that interacts with archives
These tools also allow you to display some of the very rich and diverse resources that major repositories like Gallica now make available. Two pages from Hugo’s manuscript are displayed above — they are retrieved directly from Gallica:
<div style="display: block; "><iframe style="width: 100%; height: 450.6498040612754px; border: 0;" src="https://gallica.bnf.fr/ark:/12148/btv1b6000941q/f33.item.mini"></iframe></div>
Gallica not only provides access to the photographic facsimile; it will also generate this HTML code for you, allowing you to embed it in your own webpage by pasting it directly into your Markdown file. Because Hugo converts Markdown into HTML, it can just as easily pass HTML through unchanged, although recent versions of Hugo do so only if raw HTML is enabled in the site configuration.
Source: Gallica (screenshot)
You select the Sharing option on the left, specify that you wish to display the whole page, select the appropriate dimensions if you wish to do so, copy the code using the button provided, and insert it at the proper point in your document. When a user visits your own page, the relevant content is retrieved at the same time directly from Gallica.
An advantage of having a website of your own developed using tools like these is that it allows you readily to draw on Gallica and other major repositories. The HTML code provided here by Gallica reproduces not only the content but also the control buttons from the original site, allowing the viewer to read on from page to page in the facsimile, as you can see at the head of this webpage. In other words, the embedded content is itself interactive.
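If you wish to embed several pages, the snippet can be generated rather than copied each time. The Python sketch below is modelled on the single example given above and rests on an assumption that I have not verified: that the address pattern, with a document identifier followed by `f` and a page number, holds for other pages and documents. The function name `gallica_embed` and its parameters are hypothetical, and Gallica's own Sharing option remains the authoritative source of the code.

```python
import html


def gallica_embed(document_id: str, page: int, height_px: int = 450) -> str:
    """Build an iframe snippet modelled on the one Gallica generates."""
    # Assumed pattern, inferred from one example: .../ark:/12148/<id>/f<page>.item.mini
    src = f"https://gallica.bnf.fr/ark:/12148/{document_id}/f{page}.item.mini"
    return (
        '<div style="display: block;">'
        f'<iframe style="width: 100%; height: {height_px}px; border: 0;" '
        f'src="{html.escape(src, quote=True)}"></iframe></div>'
    )


# The identifier and page number are those of the manuscript page shown above.
print(gallica_embed("btv1b6000941q", 33))
```

The result is a single line of HTML that can be pasted into a Markdown file in the same way as the code supplied by Gallica.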
Tools and techniques
There is a little more to using Hugo than meets the eye: as well as a plain-text editor like Atom, you must make use of the terminal to pass commands to Hugo and, if you want to create a personal website of your own, you may need to subscribe to a shared hosting service, like Reclaim Hosting. But all that is for another day.