Helper to publish HTML documents.

Description

The goal of the project is to post-process existing HTML files (such as those produced by a tool like asciidoctor) in order to optimise the published files.

The features include:

  • Move html files (while preserving relative links between files).

  • Reorganize resources (images, javascript, css, …) into sub-folders.

  • Add a hash to the resource names based on their content (to enable client-side caching).

  • Combine multiple files into a complete site (based on Antora Default UI) by adding navigation capabilities.

  • Keep a catalog of the published files (in order to work on redirection later on).

The main entry point is: HtmlPublishHelper.publishHtmlFile(ConfigurationHolder)

The ConfigurationHolder parameter is used to configure the behavior.

A first example

Imagine you have several files (case1/index.html, case2/page1.html, case2/sub/page2.html in the example below) that share resources located in other folders (assets/ or imgs/):

inputFolder
├── assets
│   ├── empty.js
│   ├── file.css
│   └── page.css
├── case1
│   └── index.html
├── case2
│   ├── page1.html
│   └── sub
│       └── page2.html
└── imgs
    ├── image.png
    └── img.svg

Imagine you want to publish case1/index.html as page.html with the resources used in the page renamed (with a hash based on their content) and grouped into a subfolder (static/).

This configuration object can be used:

ConfigurationHolder config = new ConfigurationHolder()
        .inputRootFolder(Paths.get("inputFolder"))
        .outputRootFolder(Paths.get("outputFolder"))
        .addPage(new ConfigurationPage()
                .input("case1/index.html")
                .output("page.html"))
        .options(new ConfigurationOptions()
                .resourcesRewriteStrategy(RewriteStrategy.SHORT_SHA1_SUFFIX)
                .cssOutputFolder("static/css/")
                .javascriptOutputFolder("static/scripts/")
                .imagesOutputFolder("static/images/")
                .fontOutputFolder("static/fonts/"));

You will get the following structure:

outputFolder
├── page.html
└── static
    ├── css
    │   └── file_22ae394.css
    ├── images
    │   ├── image_a61d85b.png
    │   └── img_61336b7.svg
    └── scripts
        └── empty_b716e1f.js

Configuration Holder

The ConfigurationHolder class holds all the configuration for the publication logic.

Input / output

The inputRootFolder defines where the files used as input are located. The outputRootFolder is mandatory and defines where the output will be written.

Options

  • clearOutputRootFolder: indicates whether the outputRootFolder must be deleted before publishing starts.

  • linkToIndexHtmlStrategy: controls how the links to index.html are written, either as a link to the parent folder or as a link to the file.

  • pagesBaseFolder: when folders are used, the approach of the path-order project with the pages.yaml files can be used to control page ordering. If those yaml files are not in the same folder as the *.html files (imagine keeping them next to your AsciiDoc sources), the pagesBaseFolder option can be used to define the relative path where the pages.yaml files can be found.
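To make the two ways of writing a link to index.html concrete, here is a minimal, self-contained sketch. It does not use the library: the class IndexLinks, the enum Style and its constant names are hypothetical and only illustrate the idea behind linkToIndexHtmlStrategy. Links carrying a fragment or a query string are not handled.

```java
class IndexLinks {

    // Hypothetical names, not the library's own constants.
    enum Style { TO_PARENT_FOLDER, TO_FILE }

    // True when the link points to a file named exactly index.html.
    static boolean isIndex(String href) {
        return href.equals("index.html") || href.endsWith("/index.html");
    }

    // Writes a link either unchanged (TO_FILE) or as a link to the folder
    // that contains the index.html file (TO_PARENT_FOLDER).
    static String write(String href, Style style) {
        if (style == Style.TO_FILE || !isIndex(href)) {
            return href;
        }
        String folder = href.substring(0, href.length() - "index.html".length());
        return folder.isEmpty() ? "./" : folder;
    }

    public static void main(String[] args) {
        System.out.println(write("case1/index.html", Style.TO_PARENT_FOLDER)); // case1/
        System.out.println(write("case1/index.html", Style.TO_FILE));          // case1/index.html
    }
}
```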

Resources options

When the resources (images, scripts, css files …) are copied to the output folder, some modifications are applied.

The location is changed and controlled with:

  • imagesOutputFolder for images

  • javascriptOutputFolder for scripts

  • cssOutputFolder for css files

  • fontOutputFolder for fonts

In order to enable client-side caching, the URL of each resource must be unique and derived from its content, so that a changed file gets a new URL. This can be controlled with resourcesRewriteStrategy.
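The idea of a content-based name can be sketched with the JDK alone. The class ContentHashNames below is hypothetical and is not the library's implementation. It assumes that a strategy like SHORT_SHA1_SUFFIX appends the first 7 hexadecimal characters of the SHA-1 digest before the file extension, which matches the shape of the names in the first example (file_22ae394.css) but is not confirmed by this page.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class ContentHashNames {

    // First 7 hex characters of the SHA-1 digest of the content.
    // The length 7 is an assumption based on the example output names.
    static String shortSha1(byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.substring(0, 7);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
    }

    // Inserts the hash before the extension: file.css -> file_<hash>.css
    static String hashedName(String fileName, byte[] content) {
        int dot = fileName.lastIndexOf('.');
        String base = dot < 0 ? fileName : fileName.substring(0, dot);
        String extension = dot < 0 ? "" : fileName.substring(dot);
        return base + "_" + shortSha1(content) + extension;
    }

    public static void main(String[] args) {
        byte[] css = "body { margin: 0; }".getBytes(StandardCharsets.UTF_8);
        // The printed name changes whenever the content changes.
        System.out.println(hashedName("file.css", css));
    }
}
```

Because the name depends only on the content, an unchanged file keeps its URL between two publications and can be served with a long cache lifetime.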

Complete site options

When several HTML files are created by a tool like asciidoctor, each of them is an independent unit.

With the completeSite option it is possible to modify them to add some navigation capabilities (tree of pages, breadcrumb, next/previous links, …).

In addition, when this option is activated, the following settings are available:

  • includeDefaultCss: indicates if the default css (site.css) is included during publication of the complete site. See Antora Default UI

  • includeOriginalCss: indicates if the css from the original document has to be preserved

  • includeDefaultJs: indicates if the default javascript (site.js) is included during publication of the complete site. See Antora Default UI

  • includeOriginalJs: indicates if the javascript files and the inline <script> sections from the original document have to be preserved

  • createToc: controls whether the table of contents is created during publication. Note that the default javascript also creates the table of contents dynamically

  • siteName: name of the site, if omitted the title is computed depending on the siteHomePath value. If the value is referencing a local page, its title is used. If the value is a distant URL, the name of the inputRootFolder is used

  • siteHomePath: path to the home of the site. It can be either an absolute URL, or a page relative to the inputRootFolder. When nothing is specified the first page of the tree is used

  • footer: footer of the site
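As an illustration of the navigation capabilities mentioned above, the sketch below derives next/previous links from a page tree by flattening it depth-first. This is only one plausible way to do it: the classes PageNavigation and Page are hypothetical and the library may compute its navigation differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PageNavigation {

    // Hypothetical minimal page tree node: a path and its child pages.
    static final class Page {
        final String path;
        final List<Page> children;

        Page(String path, Page... children) {
            this.path = path;
            this.children = Arrays.asList(children);
        }
    }

    // Depth-first, parent-before-children reading order of the tree.
    static List<String> flatten(Page root) {
        List<String> order = new ArrayList<>();
        collect(root, order);
        return order;
    }

    private static void collect(Page page, List<String> order) {
        order.add(page.path);
        for (Page child : page.children) {
            collect(child, order);
        }
    }

    // Returns the page before the given one, or null for the first page.
    static String previous(List<String> order, String path) {
        int index = order.indexOf(path);
        return index > 0 ? order.get(index - 1) : null;
    }

    // Returns the page after the given one, or null for the last page.
    static String next(List<String> order, String path) {
        int index = order.indexOf(path);
        return index >= 0 && index < order.size() - 1 ? order.get(index + 1) : null;
    }

    public static void main(String[] args) {
        Page root = new Page("index.html",
                new Page("case1/index.html"),
                new Page("case2/page1.html", new Page("case2/sub/page2.html")));
        List<String> order = flatten(root);
        System.out.println(next(order, "case1/index.html"));     // case2/page1.html
        System.out.println(previous(order, "case1/index.html")); // index.html
    }
}
```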

Pages

Inside the inputRootFolder it is possible to define a tree of pages. This is useful to move or rename pages. In case of a complete site publication, the page tree is also used for the navigation.

For each page definition a folder or a single file can be referenced. In case of a folder all *.html files in the folder will be considered. With includeChildFolders it is possible to control if files in sub-folders will be traversed or not. The indexHandling option controls how the file index.html inside a folder is considered.

If the output is defined and differs from the input, the page will be renamed (or moved). All links in the other pages will be adjusted.
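The link adjustment can be pictured as recomputing a relative path between two output locations. The sketch below uses only java.nio.file; the class RelativeLinks and its method are hypothetical and do not show how the library does it internally. Both arguments are assumed to be paths relative to the same root folder.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class RelativeLinks {

    // Computes the relative link to write inside fromPage so that it
    // points to toFile. Both paths are relative to the same root folder.
    static String relativeLink(String fromPage, String toFile) {
        Path target = Paths.get(toFile).normalize();
        Path folder = Paths.get(fromPage).normalize().getParent();
        Path relative = folder == null ? target : folder.relativize(target);
        // Always use forward slashes, also when running on Windows.
        return relative.toString().replace('\\', '/');
    }

    public static void main(String[] args) {
        // Before the move, case2/page1.html links to case1/index.html:
        System.out.println(relativeLink("case2/page1.html", "case1/index.html")); // ../case1/index.html
        // After case1/index.html is published as page.html:
        System.out.println(relativeLink("case2/page1.html", "page.html"));        // ../page.html
    }
}
```

The same computation applies to the links inside the moved page itself: its resources keep their location, but the page from which the link starts has changed.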

A title for each page can be specified. The title can also be derived from the HTML content (with titleSelector). When nothing is specified, the content inside <title>..</title> is used.
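The fallback to the <title> element can be sketched as follows. This is a simplified illustration and not the library's code: the class TitleFallback, its method names and the fallback parameter are hypothetical, a regular expression stands in for a real HTML parser, and the titleSelector case is left out because it needs one.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TitleFallback {

    // Simplification: a regular expression instead of a real HTML parser.
    private static final Pattern TITLE = Pattern.compile(
            "<title[^>]*>(.*?)</title>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the trimmed text inside <title>..</title>, if present.
    static Optional<String> htmlTitle(String html) {
        Matcher matcher = TITLE.matcher(html);
        return matcher.find() ? Optional.of(matcher.group(1).trim()) : Optional.empty();
    }

    // An explicitly configured title wins; otherwise the <title> content
    // is used; the fallback value covers pages without a <title> element.
    static String resolveTitle(String configuredTitle, String html, String fallback) {
        if (configuredTitle != null && !configuredTitle.isEmpty()) {
            return configuredTitle;
        }
        return htmlTitle(html).orElse(fallback);
    }

    public static void main(String[] args) {
        String html = "<html><head><title> Case 1 </title></head><body></body></html>";
        System.out.println(resolveTitle(null, html, "Untitled"));       // Case 1
        System.out.println(resolveTitle("My page", html, "Untitled"));  // My page
    }
}
```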

When a complete site is published, the sitePageSelector option tells which part of the page should be selected. By default the content inside <body>..</body> is used.

The options can be specified for each page. The defaultPageOptions inside ConfigurationHolder allows you to specify options applied to all pages.

Catalog

The file catalog is a way to keep track of all the published pages over time. This functionality can be useful to ensure that redirects are present. You can define multiple catalogs if you have to.

You can define:

  • The folder that is used as input for the catalog

  • The strategy used to compute the catalog (either based on the defined page tree or with a folder scan)

  • The outputFile where the output content is stored

  • The outputAction to specify the action that should be performed when the output is stored.
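To give an idea of what a folder-scan strategy involves, the sketch below lists every *.html file under a folder as a sorted list of relative paths. It relies on the JDK only; the class FolderScanCatalog is hypothetical, and neither the catalog file format nor the behavior of outputAction is shown here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FolderScanCatalog {

    // Lists all *.html files below the folder, as sorted paths relative
    // to that folder and written with forward slashes.
    static List<String> scan(Path folder) throws IOException {
        try (Stream<Path> files = Files.walk(folder)) {
            return files
                    .filter(Files::isRegularFile)
                    .filter(path -> path.getFileName().toString().endsWith(".html"))
                    .map(path -> folder.relativize(path).toString().replace('\\', '/'))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Scans the folder given as first argument, or the current folder.
        Path folder = Paths.get(args.length > 0 ? args[0] : ".");
        scan(folder).forEach(System.out::println);
    }
}
```

Comparing the list produced for a new publication with a previously stored one shows which pages disappeared and may therefore need a redirect.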

Download

The library is hosted on Maven Central.

Listing 1. Maven coordinates of the library
<dependency>
  <groupId>fr.jmini.utils</groupId>
  <artifactId>html-publish-helper</artifactId>
  <version>2.2.1</version>
</dependency>

Source Code

As for any Java project, the source code of the library is available in the src/ folder.

Build

This project is using gradle.

Command to build the sources locally:

./gradlew build

Command to deploy to your local maven repository:

./gradlew publishToMavenLocal

Command to build the documentation page:

./gradlew asciidoctor

The output of this command is an HTML page located at <git repo root>/build/docs/html5/index.html.

For project maintainers

signing.gnupg.keyName and signing.gnupg.passphrase are expected to be set in your local gradle.properties file to be able to sign. sonatypeUser and sonatypePassword are expected to be set in order to be able to publish to a distant repository.

Command to build and publish the result to maven central:

./gradlew publishToSonatype

Command to upload the documentation page on GitHub pages:

./gradlew gitPublishPush

Command to perform a release:

./gradlew release -Prelease.useAutomaticVersion=true

Using ssh-agent

Some tasks need to push to the remote git repository (the release task or the update of the gh-pages branch). If they fail with an error like this:

org.eclipse.jgit.api.errors.TransportException: ... Permission denied (publickey).

Then ssh-agent can be used.

eval `ssh-agent -s`
ssh-add ~/.ssh/id_rsa

(source for this approach)

Get in touch

You can also contact me on Twitter: @j2r2b

License