Java library for matching text ranges
URLs
- Project home (this page)
Description
Java library to find:
- text ranges (defined by a start string and an end string) that can be nested inside each other.
- a specific string sequence outside a range.
Text range
First example
In the following text, you want to match ranges that start with ( and end with ):
5 + (4 + (1 + 2) / 3 - 5) * 10 / (3 + 2)
The first range is expected to be (4 + (1 + 2) / 3 - 5).
The result (4 + (1 + 2), which pairs the first opening bracket with the first closing bracket, is wrong.
The result (4 + (1 + 2) / 3 - 5) * 10 / (3 + 2), which pairs the first opening bracket with the last closing bracket, is also wrong.
Corresponding Java code with SubstringFinder:
String text = "5 + (4 + (1 + 2) / 3 - 5) * 10 / (3 + 2)";
SubstringFinder finder = SubstringFinder.define("(", ")");
Optional<Range> findRange = finder.nextRange(text);
if (findRange.isPresent()) {
    Range range = findRange.get();
    String substring = text.substring(range.getRangeStart(), range.getRangeEnd());
    assertEquals("(4 + (1 + 2) / 3 - 5)", substring);
}
Where SubstringFinder corresponds to this imported class: fr.jmini.utils.substringfinder.SubstringFinder.
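To illustrate why neither the first nor the last closing bracket is the right match, here is a minimal, self-contained sketch of the usual depth-counting approach. It is not the library's implementation: the class NestedRangeSketch and its firstRange method are hypothetical names, and the sketch assumes non-empty, distinct start and end strings.

```java
import java.util.Optional;

// Hypothetical helper for illustration only; not part of substring-finder.
final class NestedRangeSketch {

    /** Returns {start, endExclusive} of the first balanced range, or empty if there is none. */
    static Optional<int[]> firstRange(String text, String open, String close) {
        int start = text.indexOf(open);
        if (start < 0) {
            return Optional.empty();
        }
        int depth = 0;
        int i = start;
        while (i < text.length()) {
            if (text.startsWith(open, i)) {
                depth++; // entering a (possibly nested) range
                i += open.length();
            } else if (text.startsWith(close, i)) {
                depth--; // leaving the innermost open range
                i += close.length();
                if (depth == 0) {
                    return Optional.of(new int[] {start, i});
                }
            } else {
                i++;
            }
        }
        return Optional.empty(); // the first range is never closed
    }

    public static void main(String[] args) {
        String text = "5 + (4 + (1 + 2) / 3 - 5) * 10 / (3 + 2)";
        firstRange(text, "(", ")")
            .ifPresent(r -> System.out.println(text.substring(r[0], r[1])));
    }
}
```

The range is closed only when the depth returns to zero, which is what selects the second closing bracket in the text above.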
Second example
Find the correct range, delimited by { and }, that corresponds to the body of the main method:
public static void main(String[] args) {
    if(args != null) {
        for (String s : args) {
            printArg(s);
        }
    }
}
public static void printArg(String arg) {
    System.out.println("Arg: " + arg);
}
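As a rough illustration of what "the correct range" means here, the following self-contained sketch collects every top-level { ... } block of a text. The class TopLevelBracesSketch and the method topLevelBlocks are hypothetical and do not use the library; braces inside string literals are not handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of substring-finder.
final class TopLevelBracesSketch {

    /** Collects every top-level {...} block, delimiters included. */
    static List<String> topLevelBlocks(String text) {
        List<String> blocks = new ArrayList<>();
        int depth = 0;
        int start = -1;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '{') {
                if (depth == 0) {
                    start = i; // a new top-level block begins here
                }
                depth++;
            } else if (c == '}' && depth > 0) {
                depth--;
                if (depth == 0) {
                    blocks.add(text.substring(start, i + 1));
                }
            }
        }
        return blocks;
    }

    public static void main(String[] args) {
        String text = "void a() { if (x) { y(); } }\nvoid b() { z(); }";
        for (String block : topLevelBlocks(text)) {
            System.out.println(block);
        }
    }
}
```

Applied to the snippet above, the first block returned is the whole body of main, including the nested if and for blocks, and the second is the body of printArg.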
Exclude another range
Consider this example:
package tmp;
@SomeAnnotation(arg1="value ;-)", arg2="other value")
public class SomeClass {
}
If you would like to find the content of the @SomeAnnotation annotation, you can define your range like this:
- start: @SomeAnnotation(
- end: )
But in this case you also need to exclude the content between the quotes (" .. "), so that the ) of the ;-) inside the string is not matched as the end of the range.
String text = ""
+ "package tmp;\n"
+ " \n"
+ "@SomeAnnotation(arg1=\"value ;-)\", arg2=\"other value\")\n"
+ "public class SomeClass {\n"
+ "}\n";
SubstringFinder finder = SubstringFinder.define("@SomeAnnotation(", ")", "\"", "\"");
Optional<Range> findRange = finder.nextRange(text);
if (findRange.isPresent()) {
    Range range = findRange.get();
    String substring = text.substring(range.getRangeStart(), range.getRangeEnd());
    assertEquals("@SomeAnnotation(arg1=\"value ;-)\", arg2=\"other value\")", substring);
}
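The idea of skipping an excluded region while searching for the end of a range can be sketched without the library as follows. This is only an illustration under simplifying assumptions: ExcludeRangeSketch and firstRange are hypothetical names, nesting of the main range is ignored, and escaped quotes are not handled.

```java
import java.util.Optional;

// Hypothetical helper for illustration only; not part of substring-finder.
final class ExcludeRangeSketch {

    /**
     * Returns the first range going from start to the next end that is not
     * located inside an excluded region (exclStart .. exclEnd).
     */
    static Optional<String> firstRange(String text, String start, String end,
                                       String exclStart, String exclEnd) {
        int from = text.indexOf(start);
        if (from < 0) {
            return Optional.empty();
        }
        int i = from + start.length();
        while (i < text.length()) {
            if (text.startsWith(exclStart, i)) {
                // Jump over the whole excluded region, for example a quoted string.
                int close = text.indexOf(exclEnd, i + exclStart.length());
                if (close < 0) {
                    return Optional.empty(); // excluded region is never closed
                }
                i = close + exclEnd.length();
            } else if (text.startsWith(end, i)) {
                return Optional.of(text.substring(from, i + end.length()));
            } else {
                i++;
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String text = "@SomeAnnotation(arg1=\"value ;-)\", arg2=\"other value\")";
        System.out.println(firstRange(text, "@SomeAnnotation(", ")", "\"", "\"").orElse("<none>"));
    }
}
```

Without the jump over the quoted strings, the search would stop at the ) of ;-) and return a truncated range.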
String positions outside a range
First example
In the following text, you want to find each comma , that is outside the ranges delimited by single quotes ':
'Hello,world',5,true
The two correct matches are:
- Between ' and 5
- Between 5 and t
The first comma should be ignored because it is between ' and '.
Corresponding Java code with PositionFinder:
String text = "'Hello,world',5,true";
PositionFinder finder = PositionFinder.define(",", "'", "'");
List<Integer> findPositions = finder.indexesOf(text);
assertEquals(2, findPositions.size(), "size");
assertEquals("'Hello,world'", text.substring(0, findPositions.get(0)));
assertEquals("5", text.substring(findPositions.get(0) + 1, findPositions.get(1)));
assertEquals("true", text.substring(findPositions.get(1) + 1));
Where PositionFinder corresponds to this imported class: fr.jmini.utils.substringfinder.PositionFinder.
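For readers who want to see the underlying idea, here is a minimal sketch that does not use the library: it toggles an "inside quotes" flag and records only the separators seen while the flag is off. OutsidePositionsSketch and indexesOutsideQuotes are hypothetical names, and the sketch assumes single-character delimiters with no escaping.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of substring-finder.
final class OutsidePositionsSketch {

    /** Returns the indexes of target that are not between a pair of quote characters. */
    static List<Integer> indexesOutsideQuotes(String text, char target, char quote) {
        List<Integer> positions = new ArrayList<>();
        boolean insideQuotes = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == quote) {
                insideQuotes = !insideQuotes; // opening and closing quote are the same character
            } else if (c == target && !insideQuotes) {
                positions.add(i);
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        // prints [13, 15]
        System.out.println(indexesOutsideQuotes("'Hello,world',5,true", ',', '\''));
    }
}
```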
First position outside a range
This is similar to the previous example, but here you are only interested in the first position of the comma , outside the ranges delimited by ( and ):
lorem(Hello,world),ipsum
Corresponding Java code with PositionFinder:
String text = "lorem(Hello,world),ipsum";
PositionFinder finder = PositionFinder.define(",", "(", ")");
Optional<Integer> findPosition = finder.indexOf(text);
assertEquals(true, findPosition.isPresent(), "isPresent");
assertEquals("lorem(Hello,world)", text.substring(0, findPosition.get()));
assertEquals("ipsum", text.substring(findPosition.get() + 1));
Where PositionFinder corresponds to this imported class: fr.jmini.utils.substringfinder.PositionFinder.
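When the opening and closing delimiters differ, a depth counter can replace the on/off flag, and the scan can stop at the first hit. The sketch below is only an illustration with hypothetical names (FirstOutsidePositionSketch, firstIndexOutside); whether the library treats nested excluded ranges the same way is an assumption.

```java
import java.util.OptionalInt;

// Hypothetical helper for illustration only; not part of substring-finder.
final class FirstOutsidePositionSketch {

    /** Returns the first index of target that is not enclosed by open .. close. */
    static OptionalInt firstIndexOutside(String text, char target, char open, char close) {
        int depth = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == open) {
                depth++;
            } else if (c == close && depth > 0) {
                depth--;
            } else if (c == target && depth == 0) {
                return OptionalInt.of(i); // stop at the first match outside any range
            }
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        // prints OptionalInt[18]
        System.out.println(firstIndexOutside("lorem(Hello,world),ipsum", ',', '(', ')'));
    }
}
```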
Download
Starting with version 1.0.1, the library is hosted on Maven Central.
<dependency>
<groupId>fr.jmini.utils</groupId>
<artifactId>substring-finder</artifactId>
<version>1.1.1</version>
</dependency>
Build
This project uses Gradle.
Command to build the sources locally:
./gradlew build
Command to deploy to your local Maven repository:
./gradlew publishToMavenLocal
Command to build the documentation page:
./gradlew asciidoctor
The output of this command is an HTML page located at <git repo root>/build/docs/html5/index.html.
For project maintainers
signing.gnupg.keyName and signing.gnupg.passphrase are expected to be set in your local gradle.properties file so that the artifacts can be signed.
sonatypeUser and sonatypePassword are expected to be set so that you can publish to a remote repository.
Command to build and publish the result to Maven Central:
./gradlew publishToSonatype
Command to upload the documentation page on GitHub pages:
./gradlew gitPublishPush
Command to perform a release:
./gradlew release -Prelease.useAutomaticVersion=true
Using ssh-agent
Some tasks need to push to the remote Git repository (the release task, or the task updating the gh-pages branch).
If they fail with an error like this:
org.eclipse.jgit.api.errors.TransportException: ... Permission denied (publickey).
Then ssh-agent can be used.
eval `ssh-agent -s`
ssh-add ~/.ssh/id_rsa
(source for this approach)
Get in touch
Use the issue tracker on GitHub.
You can also contact me on Twitter: @j2r2b
License
Code is under the Eclipse Public License - v 2.0. Documentation and slides are under the Creative Commons BY-SA 4.0 license.