Description

Htmlchecker is a Java program that checks the content of your HTML pages and produces a report. It focuses on the content (for example, broken links) rather than on the form (whether your HTML is well-formed). It is available as a command-line tool or as a Maven plugin.

Htmlchecker scans the .html files in a given directory and verifies whether certain rules are violated. A report with all warnings and errors can be generated as an HTML or XML file.

Checked rules

The following rules ensure that local files referenced in attributes of certain HTML tags are present. A missing file is reported as an error.
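
As a rough illustration of what such a rule verifies, the sketch below resolves a relative link against the directory of the page that contains it and tests whether the target file exists. It is not htmlchecker's implementation: the function name local_link_exists is hypothetical, and the sketch ignores anchors, query strings and absolute URLs.

```shell
# Return 0 if the file referenced by a relative href exists, 1 otherwise.
# Hypothetical helper for illustration only; not part of htmlchecker.
local_link_exists() {
  page="$1"   # page containing the link, e.g. site/index.html
  href="$2"   # value of the href attribute, e.g. page2.html
  # A relative href is resolved against the directory of the referencing page.
  [ -f "$(dirname "$page")/$href" ]
}
```

For example, local_link_exists site/index.html page2.html succeeds only if site/page2.html is present.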

Output report

The HTML report looks like this:

HTML Report

See an example for the site-example project: example-report.html.

The XML report looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<issues>
    <issue
        name="LOCAL_A_TAG_RULE"
        severity="Error"
        message="File &apos;page2.html&apos; (relative to &apos;index.html&apos;) ......"
        category="Local"
        summary="Local file should be present"
        explanation="All local files referenced by the `href` attribute ......"
        location="....../site-example/src/main/resources/site/index.html"
        lineNumber="15"
    />
    <!-- ..... -->
</issues>

See an example for the site-example project: example-report.xml.
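
Because each issue carries a severity attribute, a quick summary can be extracted from the XML report with standard tools. The helper below is a minimal sketch: count_issues is a hypothetical name, and only the Error severity value is confirmed by the sample above.

```shell
# Count the issues of a given severity in an htmlchecker XML report.
# Assumes every issue has a severity="..." attribute, as in the sample report.
count_issues() {
  report_file="$1"
  severity="$2"
  # grep -o prints each match on its own line, so wc -l counts occurrences
  # even when several <issue> elements share a single line.
  grep -o "severity=\"${severity}\"" "$report_file" | wc -l | tr -d ' '
}
```

For example, count_issues example-report.xml Error prints the number of errors recorded in that report.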

Download

Since version 1.2.0, a compiled version of the project is available on Maven Central.

You can use Maven or Gradle to fetch the dependencies you need, depending on your use case. You can also use the Maven plugin directly as a step of your build.

For the command-line tool, a zip file is also available: htmlchecker-cli-1.2.4-SNAPSHOT-bin.zip. It contains all the required dependencies in a lib subfolder, along with basic startup scripts.

Run the tool

As a Maven plugin

In the plugins section of the build section of your pom.xml, you can add something like this:

Maven plugin usage example
<plugin>
  <groupId>fr.jmini.htmlchecker</groupId>
  <artifactId>htmlchecker-maven-plugin</artifactId>
  <version>1.2.3-SNAPSHOT</version>
  <executions>
    <execution>
      <id>generate-html-report</id>
      <phase>test</phase>
      <goals>
        <goal>generate-report</goal>
      </goals>
      <configuration>
        <sourceDirectory>${site.location}</sourceDirectory>
        <outputType>html</outputType>
        <outputFile>${project.build.directory}${file.separator}myreport.html</outputFile>
        <srcPathPrefix>${site.github}</srcPathPrefix>
      </configuration>
    </execution>
  </executions>
</plugin>

Possible configuration parameters for the Maven plugin:

sourceDirectory

Directory to scan (default value is ${project.basedir}).

outputType

Type of report to create. Possible values:

  • html (default)

  • xml

outputFile

Path of the file where the report will be written (default value is ${project.build.directory}/report.html).

srcPathPrefix

Local or remote path to the source directory. If not set (default), a relative path to the local file is computed. For the site-example project in this repository, the remote path to view the files on GitHub is: https://github.com/jmini/htmlchecker/blob/master/site-example/src/main/resources/site/.

enableOnlyRules

Only check for these rules. Ignored if not set (default).

disableRules

Disable the list of rules. Ignored if not set (default).

enableRules

Enable the list of rules. Ignored if not set (default).

enableCategories

Run all rules of certain categories. Ignored if not set (default).

noWarnings

Only check for errors; ignore warnings. Possible values:

  • false (default)

  • true

allWarnings

Check all warnings, including those off by default. Possible values:

  • false (default)

  • true

waringsAreErrors

Treat all warnings as errors. Possible values:

  • false (default)

  • true
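
To make the effect of srcPathPrefix from the list above more concrete, here is a sketch of how a prefix and a file path relative to the scanned directory could be combined into a link to the source file. The exact joining logic inside the tool is an assumption, and source_link is a hypothetical helper.

```shell
# Build a link to a source file from a path prefix and a path relative
# to the scanned directory.
# Assumption: the prefix and the relative path are joined as plain strings.
source_link() {
  prefix="$1"
  relative_path="$2"
  # Strip one trailing slash (if any) so the result has exactly one separator.
  printf '%s/%s\n' "${prefix%/}" "$relative_path"
}
```

With the GitHub prefix given for the site-example project and the relative path index.html, this prints the GitHub URL of index.html.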

From the command line

After unzipping the archive, run htmlchecker.cmd or htmlchecker.sh, depending on your operating system.

Run the command line tool example
htmlchecker --html myreport.html site-example/src/main/resources/site/

Here is the complete list of options:

Complete list of options for htmlchecker
usage: htmlchecker [flags] <directory>
 -h,--help                     Usage information, help message.
 -v,--version                  Output version information.
 -p,--profile                  Measure time every rule takes to complete.
 -l,--list                     Lists lint rules with a short, summary
                               explanation.
 -b,--web <port>               Run in the background, as a website.
                               (default port: 8380)
 -r,--rules                    Prints a Markdown dump of the program's
                               rules.
 -s,--show <RULE[s]>           Lists a verbose rule explanation.
 -c,--check <RULE[s]>          Only check for these rules.
 -d,--disable <RULE[s]>        Disable the list of rules.
 -e,--enable <RULE[s]>         Enable the list of rules.
 -y,--category <CATEGORY[s]>   Run all rules of a certain category.
 -w,--nowarn                   Only check for errors; ignore warnings.
 -Wall,--Wall                  Check all warnings, including those off by
                               default.
 -Werror,--Werror              Treat all warnings as errors.
 -q,--quiet                    Don't output any progress or reports.
 -t,--html <filename>          Create an HTML report.
 -x,--xml <filename>           Create an XML (!!) report.
 -j,--jenkins-xml <filename>   Create an XML Jenkins format (!!) report.
 -sp,--srcpath <PATH-PREFIX>   Local or remote path to the source
                               directory, if not set a relative path to
                               the local file will be computed.

<RULE[s]> should be comma separated, without spaces.
<PATH-PREFIX>:
Links to the source code files will use this value as value as prefix.
Possible values:
 - relative path: ../../my-project/
 - absolute path: file:///C:/work/my-project/
 - online path: https://github.com/selesse/jxlint/blob/master/jxlint-impl/

Exit Status:
0                     Success
1                     Failed
2                     Command line error
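
In a script or CI job, the exit status listed above can be used to decide how to proceed. The sketch below only maps the three documented codes to a message; describe_exit_status is a hypothetical helper name, and any other code is reported as unexpected.

```shell
# Translate an htmlchecker exit status into a short message.
# The three known codes are taken from the "Exit Status" part of the help text.
describe_exit_status() {
  case "$1" in
    0) echo "success" ;;
    1) echo "failed" ;;
    2) echo "command line error" ;;
    *) echo "unexpected exit status: $1" ;;
  esac
}
```

Right after running htmlchecker, calling describe_exit_status "$?" prints the outcome of the run.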

Jenkins integration

Since version 1.2.0, an additional export format is available: jenkins-xml. The generated XML file can be parsed by Jenkins using the Warnings Next Generation Plugin (tested with version 5.3.0).

In your build configuration, in the "Post-build Actions" section, add a "Record compiler warnings and static analysis results" item:

Record compiler warnings and static analysis results configuration

Configure the item with the following values:

  • Tool: Warnings Plugin Native Format

  • Report File Pattern: target/htmlchecker-report.xml (example value; it must match the output file created when you run the tool)

  • Custom ID: htmlchecker (optional)

  • Custom Name: HTML Checker (optional)

Here is the same configuration as a Jenkins pipeline script:

Jenkins pipeline syntax
recordIssues(tools: [issues(id: 'htmlchecker', name: 'HTML Checker', pattern: 'target/htmlchecker-report.xml')])
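
For the report file pattern to match, the build has to produce the report first. The following sketch wraps the documented --jenkins-xml flag. The HTMLCHECKER variable and the function name are hypothetical, and creating the parent directory is only a precaution in case the tool does not create it itself.

```shell
# Generate a Jenkins-format report at the path the Jenkins job expects.
# HTMLCHECKER is a hypothetical variable pointing at the start script;
# it falls back to "htmlchecker" on the PATH.
generate_jenkins_report() {
  site_dir="$1"
  report="${2:-target/htmlchecker-report.xml}"
  # Precaution: make sure the output directory exists before running the tool.
  mkdir -p "$(dirname "$report")"
  "${HTMLCHECKER:-htmlchecker}" --jenkins-xml "$report" "$site_dir"
}
```

Calling generate_jenkins_report site-example/src/main/resources/site/ writes the report to target/htmlchecker-report.xml, the example value used in the configuration above.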

Source Code

The project is hosted on GitHub: jmini/htmlchecker

Build

This project is built with Gradle.

Command to build the sources locally:

./gradlew build

Command to deploy to your local Maven repository:

./gradlew publishToMavenLocal

Command to build the documentation page:

./gradlew asciidoctor

The output of this command is an HTML page located at <git repo root>/build/docs/html5/index.html.

For project maintainers

To sign the artifacts, signing.gnupg.keyName and signing.gnupg.passphrase are expected to be set in your local gradle.properties file. To publish to a remote repository, sonatypeUser and sonatypePassword are expected to be set as well.

Command to build and publish the result to Maven Central:

./gradlew publishToSonatype

Command to upload the documentation page to GitHub Pages:

./gradlew gitPublishPush

Command to perform a release:

./gradlew release -Prelease.useAutomaticVersion=true

Using ssh-agent

Some tasks require pushing to the remote Git repository (the release task, or updating the gh-pages branch). If they fail with an error like this:

org.eclipse.jgit.api.errors.TransportException: ... Permission denied (publickey).

then ssh-agent can be used:

eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_rsa

(source for this approach)

Get in touch

Use the htmlchecker issue tracker on GitHub.

You can also contact me on Twitter: @j2r2b

License