Extracting text from another item can sometimes introduce formatting issues, such as extra blank lines or leading/trailing whitespace on each line.
This commonly occurs when extracting from HTML elements or XML documents.
The String regular expressions below can fix these issues.
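(?m) = multi-line mode, in which ^ and $ match at the start and end of each line rather than only at the start and end of the whole string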
The following removes the leading/trailing whitespace from each line in the string.
String text = node.getTextContent().replaceAll("(?m)^[\\s&&[^\n]]+|[\\s&&[^\n]]+$", "");
Example: The quick brown fox jumps over the lazy dog. Result: The quick brown fox jumps over the lazy dog.
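As a minimal, self-contained sketch of the same idea, the class below wraps the expression in a helper method. The names LineTrimmer and trimLines are hypothetical and chosen only for illustration. The character class [\s&&[^\n]] matches any whitespace except a newline, so line breaks are preserved. The trailing class is written without a "+" inside the brackets, because a "+" there would be treated as a literal character and stripped from the end of lines.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class LineTrimmer {

    // (?m) turns on multi-line mode so ^ and $ apply to every line.
    // [\s&&[^\n]] is "whitespace, but not a newline".
    private static final Pattern EDGE_WHITESPACE =
            Pattern.compile("(?m)^[\\s&&[^\n]]+|[\\s&&[^\n]]+$");

    // Removes leading and trailing whitespace from each line of the text,
    // leaving the newline characters themselves in place.
    static String trimLines(String text) {
        return EDGE_WHITESPACE.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        String raw = "  The quick brown fox \n\t jumps over the lazy dog.  ";
        System.out.println(trimLines(raw));
    }
}
```

Compiling the pattern once is a small design choice: String.replaceAll recompiles the expression on every call, which may matter when many nodes are processed.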
The following removes extra blank lines from