Class ExtractionWordsType
java.lang.Object
    net.webpdf.wsclient.schema.operation.BaseExtractionType
        net.webpdf.wsclient.schema.operation.ExtractionWordsType
public class ExtractionWordsType extends BaseExtractionType
Extracts all words from the PDF document, together with page and position information.
Generates a plain-text (ASCII), XML, or JSON file that is returned as the result of the web service call. For each word found, the file contains the page number and the X-axis and Y-axis coordinates of the word. When the TEXT output format is selected, only the text of each word is output, separated by line breaks.
Java class for ExtractionWordsType complex type.
The following schema fragment specifies the expected content contained within this class.
    <complexType name="ExtractionWordsType">
      <complexContent>
        <extension base="{http://schema.webpdf.de/1.0/operation}BaseExtractionType">
          <attribute name="removePunctuation" type="{http://www.w3.org/2001/XMLSchema}boolean" default="false" />
          <attribute name="delimitAfterPunctuation" type="{http://www.w3.org/2001/XMLSchema}boolean" default="true" />
          <attribute name="extendedSequenceCharacters" type="{http://www.w3.org/2001/XMLSchema}boolean" default="false" />
        </extension>
      </complexContent>
    </complexType>
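The schema defaults above explain why the fields are nullable Boolean wrappers while the getters return a primitive boolean. The following is a minimal sketch of that accessor pattern, as it is typically generated by JAXB for attributes with defaults. ExtractionWordsSketch is a hypothetical stand-in written for illustration, not the webPDF class, and the fallback behavior of the getters is an assumption based on the schema defaults; only two of the three attributes are shown.

```java
// Hypothetical stand-in, NOT net.webpdf.wsclient.schema.operation.ExtractionWordsType.
// It only mirrors the method names and schema defaults documented on this page.
class ExtractionWordsSketch {
    // null means "attribute not set explicitly"
    protected Boolean removePunctuation;
    protected Boolean delimitAfterPunctuation;

    // Assumed behavior: fall back to the schema default (false) when unset.
    public boolean isRemovePunctuation() {
        return removePunctuation != null && removePunctuation;
    }

    public void setRemovePunctuation(boolean value) {
        removePunctuation = value;
    }

    public boolean isSetRemovePunctuation() {
        return removePunctuation != null;
    }

    public void unsetRemovePunctuation() {
        removePunctuation = null;
    }

    // Assumed behavior: fall back to the schema default (true) when unset.
    public boolean isDelimitAfterPunctuation() {
        return delimitAfterPunctuation == null || delimitAfterPunctuation;
    }

    public void setDelimitAfterPunctuation(boolean value) {
        delimitAfterPunctuation = value;
    }

    public boolean isSetDelimitAfterPunctuation() {
        return delimitAfterPunctuation != null;
    }

    public void unsetDelimitAfterPunctuation() {
        delimitAfterPunctuation = null;
    }

    public static void main(String[] args) {
        ExtractionWordsSketch words = new ExtractionWordsSketch();
        System.out.println(words.isDelimitAfterPunctuation()); // true (default, not set)
        System.out.println(words.isSetDelimitAfterPunctuation()); // false

        words.setRemovePunctuation(true);
        System.out.println(words.isSetRemovePunctuation()); // true

        words.unsetRemovePunctuation();
        System.out.println(words.isRemovePunctuation()); // false (back to default)
    }
}
```

Under this reading, isSet...() distinguishes "explicitly set to the default value" from "never set", which a primitive getter alone cannot express.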
-
Field Summary
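Fields
Modifier and Type    Field                         Description
protected Boolean    delimitAfterPunctuation       If set to true, a new word is started after each punctuation mark.
protected Boolean    extendedSequenceCharacters    Specifies whether quotation marks and apostrophes are handled the same way as brackets.
protected Boolean    removePunctuation             Specifies whether punctuation marks are included in the export or explicitly removed.
-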
Fields inherited from class net.webpdf.wsclient.schema.operation.BaseExtractionType
fileFormat, pages
-
Constructor Summary
Constructors
Constructor              Description
ExtractionWordsType()
-
Method Summary
Modifier and Type    Method                                           Description
boolean              isDelimitAfterPunctuation()                      If set to true, a new word is started after each punctuation mark.
boolean              isExtendedSequenceCharacters()                   Specifies whether quotation marks and apostrophes are handled the same way as brackets.
boolean              isRemovePunctuation()                            Specifies whether punctuation marks are included in the export or explicitly removed.
boolean              isSetDelimitAfterPunctuation()
boolean              isSetExtendedSequenceCharacters()
boolean              isSetRemovePunctuation()
void                 setDelimitAfterPunctuation(boolean value)        Sets the value of the delimitAfterPunctuation property.
void                 setExtendedSequenceCharacters(boolean value)     Sets the value of the extendedSequenceCharacters property.
void                 setRemovePunctuation(boolean value)              Sets the value of the removePunctuation property.
void                 unsetDelimitAfterPunctuation()
void                 unsetExtendedSequenceCharacters()
void                 unsetRemovePunctuation()
-
Methods inherited from class net.webpdf.wsclient.schema.operation.BaseExtractionType
getFileFormat, getPages, isSetFileFormat, isSetPages, setFileFormat, setPages
-
Field Detail
-
removePunctuation
protected Boolean removePunctuation
Specifies whether punctuation marks are included in the export or explicitly removed.
-
delimitAfterPunctuation
protected Boolean delimitAfterPunctuation
If this attribute is set to true, a new word is started after each punctuation mark.
-
extendedSequenceCharacters
protected Boolean extendedSequenceCharacters
Specifies whether quotation marks and apostrophes are handled the same way as brackets (such as parentheses and square brackets), i.e., whether they are placed before the word they enclose.
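To make the interplay of removePunctuation and delimitAfterPunctuation more concrete, here is a toy sketch of one possible reading of the two descriptions above. It is not webPDF's extraction logic, which runs on the server and is not documented on this page. The class WordSplitSketch, its methods, and the definition of "punctuation" (any character that is neither a letter, a digit, nor whitespace) are all assumptions made for illustration. extendedSequenceCharacters is left out of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only; hypothetical names, not part of the webPDF client library.
class WordSplitSketch {
    // Assumption: punctuation is anything that is not a letter, digit, or whitespace.
    static boolean isPunctuation(char c) {
        return !Character.isLetterOrDigit(c) && !Character.isWhitespace(c);
    }

    static List<String> split(String text,
                              boolean removePunctuation,
                              boolean delimitAfterPunctuation) {
        List<String> words = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (Character.isWhitespace(c)) {
                flush(current, words);           // whitespace always ends a word
            } else if (isPunctuation(c)) {
                if (!removePunctuation) {
                    current.append(c);           // keep the mark with the current word
                }
                if (delimitAfterPunctuation) {
                    flush(current, words);       // start a new word after the mark
                }
            } else {
                current.append(c);
            }
        }
        flush(current, words);
        return words;
    }

    // Moves the pending word (if any) into the result list.
    static void flush(StringBuilder current, List<String> words) {
        if (current.length() > 0) {
            words.add(current.toString());
            current.setLength(0);
        }
    }

    public static void main(String[] args) {
        String text = "foo,bar baz.";
        System.out.println(split(text, false, true));  // [foo,, bar, baz.]
        System.out.println(split(text, false, false)); // [foo,bar, baz.]
        System.out.println(split(text, true, true));   // [foo, bar, baz]
    }
}
```

In this reading, the schema defaults (removePunctuation="false", delimitAfterPunctuation="true") correspond to the first call in main.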
-
Method Detail
-
isRemovePunctuation
public boolean isRemovePunctuation()
Specifies whether punctuation marks are included in the export or explicitly removed.
Returns:
    possible object is Boolean
-
setRemovePunctuation
public void setRemovePunctuation(boolean value)
Sets the value of the removePunctuation property.
Parameters:
    value - allowed object is Boolean
See Also:
    isRemovePunctuation()
-
isSetRemovePunctuation
public boolean isSetRemovePunctuation()
-
unsetRemovePunctuation
public void unsetRemovePunctuation()
-
isDelimitAfterPunctuation
public boolean isDelimitAfterPunctuation()
If this attribute is set to true, a new word is started after each punctuation mark.
Returns:
    possible object is Boolean
-
setDelimitAfterPunctuation
public void setDelimitAfterPunctuation(boolean value)
Sets the value of the delimitAfterPunctuation property.
Parameters:
    value - allowed object is Boolean
See Also:
    isDelimitAfterPunctuation()
-
isSetDelimitAfterPunctuation
public boolean isSetDelimitAfterPunctuation()
-
unsetDelimitAfterPunctuation
public void unsetDelimitAfterPunctuation()
-
isExtendedSequenceCharacters
public boolean isExtendedSequenceCharacters()
Specifies whether quotation marks and apostrophes are handled the same way as brackets (such as parentheses and square brackets), i.e., whether they are placed before the word they enclose.
Returns:
    possible object is Boolean
-
setExtendedSequenceCharacters
public void setExtendedSequenceCharacters(boolean value)
Sets the value of the extendedSequenceCharacters property.
Parameters:
    value - allowed object is Boolean
See Also:
    isExtendedSequenceCharacters()
-
isSetExtendedSequenceCharacters
public boolean isSetExtendedSequenceCharacters()
-
unsetExtendedSequenceCharacters
public void unsetExtendedSequenceCharacters()
-