public class TextCollector extends BaseCollector implements Collector
Inherited fields: DOM, parsedItems
Constructor and Description |
---|
TextCollector(String textBasedHTMLSource): Sets the HTML source property. |
Modifier and Type | Method and Description |
---|---|
protected void | collectText(String htmlTag): Selects the plain text of an HTML document by HTML tag name, then splits that text into sentences and words. |
void | parse(): Collects the sentences and words using the default value. |
void | parseByRule(CollectorSelector rule): Collects the sentences and words according to the given rule identifier. |
Inherited methods: collectAttributeValueBy, collectAttributeValueBy, getItems, resetParsedItems
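The collectText summary describes two steps: select the plain text of a named HTML tag, then split it into sentences and words. The sketch below illustrates that idea with the Java standard library only. It is a hypothetical stand-in, not this library's implementation: the class name TextCollectorSketch, its three helper methods, and the regex-based splitting rules are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in; NOT the actual TextCollector implementation.
public class TextCollectorSketch {

    // Returns the plain text found inside every <htmlTag>...</htmlTag> element.
    public static String clearText(String html, String htmlTag) {
        String tag = Pattern.quote(htmlTag);
        Pattern element = Pattern.compile(
                "<" + tag + "\\b[^>]*>(.*?)</" + tag + "\\s*>",
                Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
        Matcher m = element.matcher(html);
        StringBuilder text = new StringBuilder();
        while (m.find()) {
            text.append(m.group(1)).append(' ');
        }
        // Drop nested markup, then collapse runs of whitespace.
        return text.toString()
                .replaceAll("<[^>]+>", " ")
                .replaceAll("\\s+", " ")
                .trim();
    }

    // Assumed rule: a sentence ends at '.', '!' or '?' followed by whitespace.
    public static List<String> sentences(String text) {
        List<String> result = new ArrayList<>();
        for (String s : text.split("(?<=[.!?])\\s+")) {
            if (!s.isEmpty()) {
                result.add(s);
            }
        }
        return result;
    }

    // Assumed rule: a word is a run of letters, digits, or apostrophes.
    public static List<String> words(String text) {
        List<String> result = new ArrayList<>();
        for (String w : text.split("[^\\p{L}\\p{N}']+")) {
            if (!w.isEmpty()) {
                result.add(w);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<html><head><title>Hello world. This is <b>a</b> test!</title></head></html>";
        String text = clearText(html, "title");
        System.out.println(text);            // Hello world. This is a test!
        System.out.println(sentences(text)); // [Hello world., This is a test!]
        System.out.println(words(text));     // [Hello, world, This, is, a, test]
    }
}
```

A regex is enough for a flat tag such as "title", but it does not handle nested elements of the same name or malformed markup; a real HTML parser would be the safer choice there.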
public TextCollector(String textBasedHTMLSource)
textBasedHTMLSource - Regular text-based HTML code as a String.
public void parse()
public void parseByRule(CollectorSelector rule)
Specified by: parseByRule in interface Collector
rule - The ID of the rule.
protected void collectText(String htmlTag)
htmlTag - The HTML tag as a String, for example "title".
Copyright © 2015. All rights reserved.