Leanpub is now (since 2018) using a flavor of Markdown called Markua. It is upwardly compatible with Gruber Markdown with the sole exception of accepting embedded HTML. Markua doesn't. MM4 solves the problem, accepting the Markua view that HTML constructs otherwise unhandled by Gruber markdown can be fit into a simplifying syntactic framework.
The MM4 solution is an author-centric view, allowing the user to transform their current Gruber Markdown on an as-needed basis to Markua. The practice established here adopts the view that neither is the last word in text editing and formatting.
Someday, we'll achieve a fully semantic text format
MM4 is a shell function, wrapping the m4 command, with a set of definitions to emit either Gruber Markdown or Markua with a command-line option. The design interface at the moment is:
$ mm4 --define=_markdown file ... # OR
$ mm4 [--define=_markua] file ...
The square brackets indicate the default format. Either the Gruber (Markdown) or Markua format of the output shows up on the stdout.
This is the mm4
function. It uses the m4
command, a macro processor which convert user definitions mapping text arguments into the desired format
mm4 ()
{
m4 -I $(include) {hacks,underscore,local}.m4 $*
}
The copies of hacks and underscore are files I've picked up from m4 sites which make it suitable for all forms of text, defending m4 from treating programming keywords like include, define, substr,… from being gobbled up by the macro-processor.
The file local.m4 will grow to include a defnition for each covered feature.
At the moment, there are two places where I'm pre-processing text in a near-Gruber format into a working Gruber format. After upgrading my process from near-Gruber to Gruber, it will be a small effort, rewriting one or two text-emitting macros to convert the near-Gruber to Markua. Since gnu m4 already tests it's behavior with builtins gnu and unix it seems reasonable to appropriate markdown and markua in a similar fashion.
First, Gruber doesn't include files from other sources into the output stream. Since I'm writing the ShellFunctions book at the moment, there are places where I'd like to include a shell fragment, followed by it's resulting output. A few years ago I'd picked up a generic include processor, whose syntax admits @include ../path/to/thisfile.suf
for example. Of course, it's a recursive process allowing nested includes.
Also, Gruber doesn't do anything special with the names or labels of it's sections. So, I'd pre-process those as well, turning something like
### Level Three Header
into
<a name="level-three-header">
### Level Three Header
</a>
which allows in-document reference to this section. At the moment, since Markua is a work in progress, here's what this would look like in Markua:
{id: level-three-header}
### Level Three Header
In the next paragraph in the Markua Manual, it says:
> This syntax is intended to be reminiscent of HTML anchor tags, not of h1
titles in Markua
At the moment, I handle this with a shell function labify
, basically an awk script to recognize the leading ##s and emit the prior text above.
Since my threshold of pain is three, these two instances suggest the threshold isn't far off in upgrades to Markdown anticipated by Markua.
At the moment, these repairs are in separate locations in the process, both in terms of implementation language and processes. Here's the function which MaKes the HyperText:
mkht ()
{
: convert a {default, the newest} Markdown into HTML copy
set -- ${1:-$(ls -t *.md| sed 1q)}
set $1 ../ch$(chno)_$1 ${1%.md}.{html,inc}
app_trace $*
backup_lib mklib
newest $3 $1 *.sh *.run $(llib)/references.txt && return
app_trace "about to: markdown $1 > $2"
include $1 | tee $4 | labify | tee $2 | markdown - > $3
}
This is the crucial line:
include $1 | tee $4 | labify | tee $2 | markdown - > $3
The second set
command assigns these positional parameters:
The include process resolves the @include
directives in the *.md file, leaving the result in the *.inc file, adding the name
labels, leaving the result in the parent directory, where it becomes the source of the Leanpub source document, and further runs (Gruber) markdown
on the input, with the result in the local *.html file
The newest
command, a bit of make
process expect the *.html file to be the newest of its arguments, the *.md file, any *.{sh,run} files in the directory and a file of references in the local library.
Using mm4 to combine these two repairs in a single process would look like:
mm4 $1 | tee $2 | markdown - > $3
This not only simplifies combining these two features, but in the spirit of anticipating other features,
The @include
feature is directly absorbed into mm4 with the _include
m4 builtin. If you are somewhat familiar with m4, you know about it's include
builtin. mm4 uses a set of hacks which map all the builtin definitions onto an underscore prefix-named definition. This frees up ordinary and common programming words to not be consumed by the processor.
The labify
feature will have to find it's way into the definitions. It's interface will look like this:
_namify( text to namify)
I decided namify
better fit the usage. A header macro which whose interface is:
_H( [label ,] Heading text)
Since namify
is probably trivial (in Markua, certainly) the header
macro needs more careful specification.
There are two arguments: an optional label, specifying the heading level, followed by the text to display. Here's a first cut at the requirements.
label | means |
---|---|
the very first encounter | Assume First Level |
absent | use current header level |
N, e.g. 1,2,3 | Nth level header |
+, "push" | Increment the level |
-, "pop" | decrement the level by one |
-N | decrement the level by N |
"recall" (speculative) | increment the level counter at this level |
#{1,N) (an RE) | Nth Level, assumes Markdown/Markua |
*{1,N) (an RE) | Nth Level, assumes OrgMode |
A question remains: are the push
and pop
operations, pre- or post- operations. For the moment, I think push is a POST operation, and pop is PRE. It may be possible for a user to specify, or have other words, like "first", "last", at a given level, where "recall" would be an alternate value of "first".
The absent, blank, or empty level would allow a header, named HDR, say to be used list this:
HDR( #, First first-level header)
HDR( Second first-level header)
HDR( Third first-level header)
HDR( Fourth first-level header)
Then, suppose you were about to import a bunch of headers and text from a file discussing 3rd level properties, where these are the headers, with their own text;
HDR( one good thing)
HDR( a better thing)
HDR( the best thing)
HDR( the absolute best)
Simply insert the latter file into the buffer of the first:
HDR( #, First first-level header)
HDR( Second first-level header)
HDR( push, Third first-level header)
HDR( one good thing)
HDR( a better thing)
HDR( the best thing)
HDR( the absolute best)
HDR( pop, Fourth first-level header)
The importance of a consistent header treatment allows the repeated use, not just the copy-paste operation, of this approach
HDR( #, First first-level header)
HDR( Second first-level header)
HDR( push, Third first-level header)
INCLUDE( discussing 3rd level properties.txt)
HDR( recall, Fourth first-level header)
Note, use of "recall" is probably indicated in this case, since the author of the enclosing file needn't know the structure of the included file. This may require storing the current count by file as well as by level.
It appears we'll need a character repeater since text handling doesn't seem to have one. Here is that macro, a bit noisy, because of the @{
... }@
quoting mechanism invented in hacks.m4
_define(@{_repeated}@,@{_
_ifelse(@{$1}@, @{0}@, , @{$1}@, @{1}@, @{$2}@,
@{$2@{}@$0(_decr($1),@{$2}@)}@)}@)