Class SemanticLineBreaker

java.lang.Object
network.ike.tools.linebreak.SemanticLineBreaker

public class SemanticLineBreaker extends Object
Reformats AsciiDoc files to use semantic linefeeds.

Line breaks are placed at logical boundaries — sentences, clauses, asides, introductions, and compound-sentence joints — producing source text that is easier to diff, edit, and reason about.

Uses AsciidoctorJ to parse the document AST, identifying paragraph blocks that contain prose. Only those blocks are reformatted; delimited blocks (listings, diagrams, tables, passthroughs, etc.) are never touched.

Default breaking rules (in priority order):

  1. Sentence ends: . ? ! followed by a space and uppercase letter
  2. Closing quote after sentence: ." ?" !" followed by space and uppercase
  3. Em-dash (Unicode —, U+2014) followed by space
  4. Em-dash (AsciiDoc --) surrounded by spaces
  5. Semicolon followed by space
  6. Colon followed by space (guarded against URLs, times, definition lists)
  7. Comma followed by coordinating conjunction (and, but, or, yet, so, nor)
  8. Simple comma clause break (optional, threshold-gated)

Use --sentences-only to restrict breaking to sentence boundaries (rules 1-2 above).
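
To make the rule list concrete, here is a minimal sketch of rules 1, 2, and 5 together with a sentences-only switch. It is an illustration under stated assumptions, not this class's actual implementation: the class name SentenceBreakSketch, the method breakLines, and both regular expressions are hypothetical, and the sketch ignores the abbreviation guard and the AST-based paragraph selection described above.

```java
import java.util.regex.Pattern;

// Hypothetical sketch; not part of network.ike.tools.linebreak.
final class SentenceBreakSketch {

    // Rules 1 and 2: . ? or ! (optionally followed by a closing double quote),
    // then a single space, then an uppercase letter (checked by lookahead).
    private static final Pattern SENTENCE_END =
            Pattern.compile("([.?!]\"?) (?=\\p{Lu})");

    // Rule 5: semicolon followed by a space.
    private static final Pattern SEMICOLON = Pattern.compile("; ");

    // Replaces the space at each matched boundary with a newline.
    // When sentencesOnly is true, only rules 1 and 2 are applied.
    static String breakLines(String paragraph, boolean sentencesOnly) {
        String result = SENTENCE_END.matcher(paragraph).replaceAll("$1\n");
        if (!sentencesOnly) {
            result = SEMICOLON.matcher(result).replaceAll(";\n");
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "He asked, \"Why?\" Nobody knew; the log was empty. Try again.";
        System.out.println(breakLines(text, false));
    }
}
```

With sentencesOnly set to true, the semicolon in the sample text stays on the same line as the words around it, which mirrors the effect described for --sentences-only.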

Hard line breaks (" +" at end of line) are preserved. Abbreviations (Dr., Mr., e.g., i.e., etc.) are recognized and not treated as sentence ends.
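
The two guards above can be sketched as small predicates. This is an assumption-laden illustration rather than the class's real logic: AbbreviationGuardSketch, endsWithAbbreviation, and hasHardBreak are hypothetical names, and the abbreviation set contains only the examples listed above.

```java
import java.util.Set;

// Hypothetical sketch; not part of network.ike.tools.linebreak.
final class AbbreviationGuardSketch {

    // Only the abbreviations named in the description; a real list would be longer.
    private static final Set<String> ABBREVIATIONS =
            Set.of("Dr.", "Mr.", "e.g.", "i.e.", "etc.");

    // Takes the text up to and including a candidate period and reports
    // whether its last space-delimited token is a known abbreviation.
    static boolean endsWithAbbreviation(String textUpToPeriod) {
        int start = textUpToPeriod.lastIndexOf(' ') + 1;
        // Drop leading punctuation such as an opening parenthesis.
        String token = textUpToPeriod.substring(start).replaceFirst("^[^\\p{L}]+", "");
        return ABBREVIATIONS.contains(token);
    }

    // An AsciiDoc hard line break is a space and a plus sign at the end of a line.
    static boolean hasHardBreak(String line) {
        return line.endsWith(" +");
    }

    public static void main(String[] args) {
        System.out.println(endsWithAbbreviation("Ask Dr."));
        System.out.println(hasHardBreak("first line +"));
    }
}
```

A breaker built this way would skip a candidate sentence end when endsWithAbbreviation returns true, and would leave any line for which hasHardBreak returns true ending exactly where it does.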

  • Constructor Details

    • SemanticLineBreaker

      public SemanticLineBreaker()
      Creates a semantic line breaker instance.
  • Method Details

    • main

      public static void main(String[] args) throws IOException
      Command-line entry point for the semantic linebreak reformatter.
      Parameters:
      args - command-line arguments specifying input files and options
      Throws:
      IOException - if an I/O error occurs while reading or writing files