public class TextCollector extends BaseCollector implements Collector
Inherited fields: DOM, parsedItems

| Constructor and Description |
|---|
| TextCollector(String textBasedHTMLSource): Sets the HTML source property. |

| Modifier and Type | Method and Description |
|---|---|
| protected void | collectText(String htmlTag): Selects the plain text of an HTML document by the name of the HTML tag, then splits the text into sentences and words. |
| void | parse(): Collects the sentences and words by the default value. |
| void | parseByRule(CollectorSelector rule): Collects the sentences and words by the rule identifier. |
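
The behavior described for collectText (select the text of one tag, then split it into sentences and words) might be approximated as in the sketch below. It is a standalone illustration, not the library's implementation: the class TextCollectorSketch and its methods textOfTag, sentences, and words are hypothetical names, a regular expression stands in for whatever DOM parsing the real class performs, and sentence splitting is assumed to follow the JDK's BreakIterator rules for English.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; none of these names come from the documented library.
class TextCollectorSketch {

    // Returns the text inside the first <tag>...</tag> pair, with nested
    // markup removed and whitespace collapsed. Assumption: a regex is good
    // enough for simple, well-formed input; real HTML needs a proper parser.
    public static String textOfTag(String html, String tag) {
        String name = Pattern.quote(tag);
        Pattern p = Pattern.compile(
                "<" + name + "(\\s[^>]*)?>(.*?)</" + name + ">",
                Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
        Matcher m = p.matcher(html);
        if (!m.find()) {
            return "";
        }
        return m.group(2)
                .replaceAll("<[^>]+>", " ")
                .replaceAll("\\s+", " ")
                .trim();
    }

    // Splits text into sentences using the JDK sentence BreakIterator.
    public static List<String> sentences(String text) {
        List<String> out = new ArrayList<>();
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        it.setText(text);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String s = text.substring(start, end).trim();
            if (!s.isEmpty()) {
                out.add(s);
            }
        }
        return out;
    }

    // Splits text into words on any run of characters that is not a
    // letter, a digit, or an apostrophe.
    public static List<String> words(String text) {
        List<String> out = new ArrayList<>();
        for (String w : text.split("[^\\p{L}\\p{N}']+")) {
            if (!w.isEmpty()) {
                out.add(w);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String html = "<html><head><title>Hello world. This is a test.</title></head>"
                + "<body><p>Body text.</p></body></html>";
        String text = textOfTag(html, "title");
        System.out.println(sentences(text));
        System.out.println(words(text));
    }
}
```

Keeping tag selection separate from sentence and word splitting mirrors the two steps named in the method description, so either step can be replaced (for example, by a real HTML parser) without touching the other.
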

Inherited methods: collectAttributeValueBy, collectAttributeValueBy, getItems, resetParsedItems

public TextCollector(String textBasedHTMLSource)
Parameters: textBasedHTMLSource - Regular text-based HTML code as a String.

public void parse()

public void parseByRule(CollectorSelector rule)
Specified by: parseByRule in interface Collector
Parameters: rule - The ID of the rule.

protected void collectText(String htmlTag)
Parameters: htmlTag - The HTML tag as a String, for example "title".

Copyright © 2015. All rights reserved.