Class ExtractionWordsType


  • public class ExtractionWordsType
    extends BaseExtractionType
     Extracts all words from the PDF document, together with page and position information.
     
     Generates an ASCII text, XML, or JSON file that is returned as the result of the web service call. For each word found, the file contains the page number and the X and Y coordinates of the word. When the TEXT output format is selected, only the text of each word is output, separated by line breaks.
     

    Java class for ExtractionWordsType complex type.

    The following schema fragment specifies the expected content contained within this class.

    
     <complexType name="ExtractionWordsType">
       <complexContent>
         <extension base="{http://schema.webpdf.de/1.0/operation}BaseExtractionType">
           <attribute name="removePunctuation" type="{http://www.w3.org/2001/XMLSchema}boolean" default="false" />
           <attribute name="delimitAfterPunctuation" type="{http://www.w3.org/2001/XMLSchema}boolean" default="true" />
           <attribute name="extendedSequenceCharacters" type="{http://www.w3.org/2001/XMLSchema}boolean" default="false" />
         </extension>
       </complexContent>
     </complexType>
     
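The attribute defaults in the schema fragment above can be illustrated with a small stand-in class. This is only a sketch: `WordsDefaultsSketch` is a hypothetical class written for this example, not part of the library, and it assumes the usual JAXB convention that an unset attribute is held as null and the getter falls back to the schema default.

```java
// Hypothetical stand-in for illustration; NOT the generated ExtractionWordsType class.
class WordsDefaultsSketch {
    // null means "attribute not set" (assumed JAXB convention)
    private Boolean removePunctuation;          // schema default: false
    private Boolean delimitAfterPunctuation;    // schema default: true
    private Boolean extendedSequenceCharacters; // schema default: false

    public boolean isRemovePunctuation() {
        // Fall back to the schema default (false) when unset
        return removePunctuation != null && removePunctuation;
    }

    public boolean isDelimitAfterPunctuation() {
        // Fall back to the schema default (true) when unset
        return delimitAfterPunctuation == null || delimitAfterPunctuation;
    }

    public boolean isExtendedSequenceCharacters() {
        // Fall back to the schema default (false) when unset
        return extendedSequenceCharacters != null && extendedSequenceCharacters;
    }

    public void setDelimitAfterPunctuation(boolean value) {
        this.delimitAfterPunctuation = value;
    }

    public static void main(String[] args) {
        WordsDefaultsSketch options = new WordsDefaultsSketch();
        System.out.println(options.isRemovePunctuation());          // false
        System.out.println(options.isDelimitAfterPunctuation());    // true
        System.out.println(options.isExtendedSequenceCharacters()); // false

        options.setDelimitAfterPunctuation(false);
        System.out.println(options.isDelimitAfterPunctuation());    // false
    }
}
```

Under these assumptions, only delimitAfterPunctuation reads as true on a freshly constructed object, because it is the only attribute whose schema default is true.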
    • Field Detail

      • removePunctuation

        protected Boolean removePunctuation
      • delimitAfterPunctuation

        protected Boolean delimitAfterPunctuation
      • extendedSequenceCharacters

        protected Boolean extendedSequenceCharacters
    • Constructor Detail

      • ExtractionWordsType

        public ExtractionWordsType()
    • Method Detail

      • isRemovePunctuation

        public boolean isRemovePunctuation()
        Gets the value of the removePunctuation property.
        Returns:
        the value of the removePunctuation attribute (schema default: false)
      • setRemovePunctuation

        public void setRemovePunctuation(boolean value)
        Sets the value of the removePunctuation property.
        Parameters:
        value - the new value of the removePunctuation attribute
      • isSetRemovePunctuation

        public boolean isSetRemovePunctuation()
        Reports whether a value has been explicitly set for the removePunctuation property.
      • unsetRemovePunctuation

        public void unsetRemovePunctuation()
        Clears any explicitly set value of the removePunctuation property.
      • isDelimitAfterPunctuation

        public boolean isDelimitAfterPunctuation()
        Gets the value of the delimitAfterPunctuation property.
        Returns:
        the value of the delimitAfterPunctuation attribute (schema default: true)
      • setDelimitAfterPunctuation

        public void setDelimitAfterPunctuation(boolean value)
        Sets the value of the delimitAfterPunctuation property.
        Parameters:
        value - the new value of the delimitAfterPunctuation attribute
      • isSetDelimitAfterPunctuation

        public boolean isSetDelimitAfterPunctuation()
        Reports whether a value has been explicitly set for the delimitAfterPunctuation property.
      • unsetDelimitAfterPunctuation

        public void unsetDelimitAfterPunctuation()
        Clears any explicitly set value of the delimitAfterPunctuation property.
      • isExtendedSequenceCharacters

        public boolean isExtendedSequenceCharacters()
        Gets the value of the extendedSequenceCharacters property.
        Returns:
        the value of the extendedSequenceCharacters attribute (schema default: false)
      • setExtendedSequenceCharacters

        public void setExtendedSequenceCharacters(boolean value)
        Sets the value of the extendedSequenceCharacters property.
        Parameters:
        value - the new value of the extendedSequenceCharacters attribute
      • isSetExtendedSequenceCharacters

        public boolean isSetExtendedSequenceCharacters()
        Reports whether a value has been explicitly set for the extendedSequenceCharacters property.
      • unsetExtendedSequenceCharacters

        public void unsetExtendedSequenceCharacters()
        Clears any explicitly set value of the extendedSequenceCharacters property.
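
Each property above comes with four accessors (is, set, isSet, unset). The following sketch mimics how such a quartet typically interacts. `IsSetLifecycleSketch` is a hypothetical stand-in written for this example, not the library class, and the null-backed field is an assumption based on the `protected Boolean` fields listed under Field Detail.

```java
// Hypothetical stand-in for illustration; NOT the generated ExtractionWordsType class.
class IsSetLifecycleSketch {
    // null means "no explicit value" (assumption based on the Boolean field type)
    private Boolean removePunctuation;

    public boolean isRemovePunctuation() {
        // Unset falls back to the schema default, which is false for this attribute
        return removePunctuation != null && removePunctuation;
    }

    public void setRemovePunctuation(boolean value) {
        this.removePunctuation = value;
    }

    public boolean isSetRemovePunctuation() {
        return removePunctuation != null;
    }

    public void unsetRemovePunctuation() {
        this.removePunctuation = null;
    }

    public static void main(String[] args) {
        IsSetLifecycleSketch s = new IsSetLifecycleSketch();
        System.out.println(s.isSetRemovePunctuation()); // false: nothing set yet

        s.setRemovePunctuation(false);
        System.out.println(s.isSetRemovePunctuation()); // true: set explicitly, even to the default

        s.setRemovePunctuation(true);
        System.out.println(s.isRemovePunctuation());    // true

        s.unsetRemovePunctuation();
        System.out.println(s.isSetRemovePunctuation()); // false
        System.out.println(s.isRemovePunctuation());    // false: back to the default
    }
}
```

The point of the isSet/unset pair in this sketch is that it distinguishes "explicitly set to the default value" from "never set", which the primitive boolean getter alone cannot express.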