Syntax-Highlighting from a Markdown source in Go using Chroma
In a previous post I described how to create syntax-highlighted HTML from a markdown source using Go. The code for the post can be found here.
However, I recently wrote a few posts about Kotlin and plan to write about Rust and maybe C in the future, and the library I have used for syntax highlighting so far supports only a very limited set of languages.
Luckily, there is now a Go library based on the fantastic Pygments syntax-highlighting library for Python. It is called chroma, and this post shows an example of how to create syntax-highlighted HTML from a markdown source using chroma.
Chroma is quite powerful. It supports a large number of languages and styles, so you can format code just the way you want it. It’s also straightforward to use, so working with it has been a pleasure.
For the markdown-to-html conversion we will again use blackfriday.
So let’s get started!
Implementation
The basic structure of the implementation stays the same. We load a markdown file and convert it to HTML. Then we search for the parts containing code and replace them with the highlighted code, which is generated using chroma.
The template we will render the code to is the following:
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-us" lang="en-us">
<head>
<title>Syntax Highlighting from Markdown with Chroma</title>
{{.Style}}
</head>
<body>
{{.Content}}
</body>
</html>
Content is where the actual markdown converted to HTML will be rendered, and Style is where we will add the CSS generated by chroma for the style we want to use in this example.
Before we get to that, a few preparatory steps are needed, starting with loading the markdown file:
func main() {
	// load markdown file
	mdFile, err := ioutil.ReadFile("./example.md")
	if err != nil {
		log.Fatal(err)
	}
Parsing the template file:
	// parse the template file
	t, err := template.ParseFiles("./template.html")
	if err != nil {
		log.Fatal(err)
	}
And converting the markdown to HTML:
	// convert markdown to html
	// (named htmlSrc so it does not shadow chroma's html formatter package used below)
	htmlSrc := blackfriday.MarkdownCommon(mdFile)
Now we encounter the first chroma-related change in the implementation. With the old syntax highlighting, I manually copied in the CSS for the style I wanted. Chroma has built-in functionality for generating it dynamically:
	// write css
	hlbuf := bytes.Buffer{}
	hlw := bufio.NewWriter(&hlbuf)
	formatter := html.New(html.WithClasses())
	if err := formatter.WriteCSS(hlw, styles.MonokaiLight); err != nil {
		log.Fatal(err)
	}
	hlw.Flush()
This snippet creates the CSS for the style we want to use in this example, called MonokaiLight. The content of hlbuf will be written to the Style variable in the template later on.
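As a side note on the buffering above: a *bytes.Buffer already satisfies io.Writer, so the bufio wrapper and the Flush call are optional. The sketch below illustrates the pattern with the standard library only. writeCSS is a hypothetical stand-in for a call like formatter.WriteCSS, and the CSS rule it writes is made up for the example.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"log"
)

// writeCSS is a hypothetical stand-in for a call such as formatter.WriteCSS:
// it writes some CSS to any io.Writer.
func writeCSS(w io.Writer) error {
	_, err := io.WriteString(w, ".chroma { background-color: #fafafa }\n")
	return err
}

// styleTag collects the generated CSS in memory and wraps it in a <style> element.
func styleTag() (string, error) {
	var buf bytes.Buffer // *bytes.Buffer implements io.Writer, so no Flush is needed
	if err := writeCSS(&buf); err != nil {
		return "", err
	}
	return "<style>" + buf.String() + "</style>", nil
}

func main() {
	tag, err := styleTag()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Print(tag)
}
```

Either variant works. Writing straight into the buffer just removes one step that is easy to forget, since a missing Flush would silently leave the buffer empty.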
Alright, so now we get to the actual syntax highlighting part. The idea is to find code-parts of the form:
<pre><code class="language-go">
...some Code...
</code></pre>
Once we find such a code-part using the goquery library, we parse out the language used from the class and try to syntax-highlight the code inside the <code> tags, replacing the old content with the new, highlighted content.
We create a replaceCodeParts function, which takes the converted HTML and returns a string containing the HTML with highlighted code parts.
First, we read in the converted HTML and create a goquery document from it, which we can use to search for code-parts:
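// replaceCodeParts takes the HTML produced from the markdown source and
// returns it with all code-parts replaced by highlighted versions.
func replaceCodeParts(htmlSrc []byte) (string, error) {
	byteReader := bytes.NewReader(htmlSrc)
	doc, err := goquery.NewDocumentFromReader(byteReader)
	if err != nil {
		return "", err
	}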
Then we use a goquery selector to find the code-parts we are interested in:
	// find code-parts via selector and replace them with highlighted versions
	doc.Find("code[class*=\"language-\"]").Each(func(i int, s *goquery.Selection) {
		...
	})
Now comes the actual highlighting code. First, we parse the language to use and select the correct lexer. Keep in mind that I omitted any error-handling code here, but almost all of the following steps can fail and need to be handled accordingly. The code example on GitHub has proper error handling included.
		class, _ := s.Attr("class")
		lang := strings.TrimPrefix(class, "language-")
		lexer := lexers.Get(lang)
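The TrimPrefix call above assumes that the class attribute contains nothing but language-xyz. As a minimal sketch for elements that carry several classes, the hypothetical helper languageFromClass (not part of the original code) picks out the first language-prefixed class using only the standard library:

```go
package main

import (
	"fmt"
	"strings"
)

// languageFromClass returns the language named by the first "language-xyz"
// class in a class attribute, and false if there is none.
func languageFromClass(class string) (string, bool) {
	const prefix = "language-"
	for _, c := range strings.Fields(class) {
		if strings.HasPrefix(c, prefix) && len(c) > len(prefix) {
			return strings.TrimPrefix(c, prefix), true
		}
	}
	return "", false
}

func main() {
	for _, class := range []string{"language-go", "hljs language-kotlin", "plain"} {
		lang, ok := languageFromClass(class)
		fmt.Printf("%q -> %q (found: %v)\n", class, lang, ok)
	}
}
```

When no language is found, or when lexers.Get returns nil for an unknown name, one option is to fall back to a plain-text lexer such as chroma's lexers.Fallback instead of failing.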
Now we have the correct lexer, which is necessary so that our code is tokenized correctly. Next, we do just that: we grab the code from the goquery Selection and tokenize it:
		oldCode := s.Text() // s.Text() already returns a string
		iterator, _ := lexer.Tokenise(nil, oldCode)
Now, all that’s left is to instantiate a formatter. In our case, we want to output HTML, but chroma provides other options as well. The formatter is the part of chroma that actually generates the highlighted output, based on the code input and the lexer used.
		formatter := html.New(html.WithClasses())
		b := bytes.Buffer{}
		buf := bufio.NewWriter(&b)
		// use the same style the CSS was generated for
		formatter.Format(buf, styles.MonokaiLight, iterator)
		buf.Flush()
		s.SetHtml(b.String())
The above snippet creates the HTML formatter with the WithClasses option, which means that we don’t want inline CSS, but rather want to use classes. This also means that we need to include the CSS somewhere (which we already did at the beginning of this example). Then we format the code and write it to our buffer.
Once that is done, the content of the buffer is written to the Selection, thus replacing the previous content with our new, syntax-highlighted code.
After replacing the code, all that’s left is to render the modified document back to HTML and return it to the caller:
	// "result" instead of "new", which would shadow the built-in function
	result, err := doc.Html()
	if err != nil {
		return "", err
	}
	return result, nil
}
OK, all we have to do now is call the function and create the output HTML in the main function:
	// replace code-parts with syntax-highlighted parts
	replaced, err := replaceCodeParts(htmlSrc)
	if err != nil {
		log.Fatal(err)
	}
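	// write html output
	// Style is template.HTML (not template.CSS) so that the <style> tags
	// are emitted as markup instead of being escaped
	if err := t.Execute(os.Stdout, struct {
		Content template.HTML
		Style   template.HTML
	}{
		Content: template.HTML(replaced),
		Style:   template.HTML("<style>" + hlbuf.String() + "</style>"),
	}); err != nil {
		log.Fatal(err)
	}
}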
Nothing fancy is happening here. We call the function with the HTML input and execute our template with the CSS created above and our new HTML.
That’s it. You can find the full code here.
Conclusion
The chroma library is fantastic. Back when I created the first implementation of this, I also contemplated just biting the bullet and using Pygments, accepting the Python dependency for my blog generator, but decided against it despite the limitations of the old implementation.
I’m very happy there is now a native Go option for full-featured syntax highlighting, and if you’re reading this post, you already see the chroma version of the blog’s syntax highlighting in action. :)