Class GenericWordState

  • All Implemented Interfaces:
    ITokenizerState, IWordState
    Direct Known Subclasses:
    CsvWordState, ExpressionWordState

    public class GenericWordState
    extends Object
    implements IWordState
    A word state reads a word from a scanner and returns it as a token. As with other states, the tokenizer hands the job of reading over to this state based on an initial character. The tokenizer therefore decides which characters may begin a word, while this state determines which characters may appear as the second or later character of a word. These sets typically differ; in particular, digits commonly appear inside a word but not as its initial character.

    By default, the following characters may appear in a word. The method setWordChars() allows customizing this.

     From    To
     'a'     'z'
     'A'     'Z'
     '0'     '9'

    as well as the minus sign, underscore, and apostrophe.
     
    • Constructor Summary

      Constructors 
      Constructor Description
      GenericWordState()
      Constructs a word state with the default set of characters admissible inside a word (as described in the class comment).
    • Constructor Detail

      • GenericWordState

        public GenericWordState()
                         throws Exception
        Constructs a word state with the default set of characters admissible inside a word (as described in the class comment).
        Throws:
        Exception
    • Method Detail

      • nextToken

        public Token nextToken​(IScanner scanner,
                               ITokenizer tokenizer)
                        throws Exception
        Reads a word from the scanner and returns it as a token.
        Specified by:
        nextToken in interface ITokenizerState
        Parameters:
        scanner - The scanner that supplies the text to be tokenized.
        tokenizer - The tokenizer that controls the process.
        Returns:
        The next token from the top of the stream.
        Throws:
        Exception
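
        The division of labor between tokenizer and word state can be illustrated with a small standalone sketch. It does not use this library's IScanner or Token types; WordReadSketch, isWordChar and readWord are hypothetical names introduced only for illustration, and the sketch assumes that the first character is accepted without a check because the tokenizer has already chosen it.

```java
// Hypothetical sketch, not the library's implementation.
final class WordReadSketch {

    // Default word characters as listed in the class description.
    static boolean isWordChar(char c) {
        return (c >= 'a' && c <= 'z')
            || (c >= 'A' && c <= 'Z')
            || (c >= '0' && c <= '9')
            || c == '-' || c == '_' || c == '\'';
    }

    // Reads the word that begins at index start. The first character is
    // accepted unconditionally (assumption: the tokenizer already chose it);
    // every later character must be a word character.
    static String readWord(String text, int start) {
        if (start >= text.length()) {
            return "";
        }
        int end = start + 1;
        while (end < text.length() && isWordChar(text.charAt(end))) {
            end++;
        }
        return text.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println(readWord("x2-y'_ = 1", 0));  // x2-y'_
        System.out.println(readWord("$total + 1", 0));  // $total
    }
}
```

        In the second call, '$' is not a word character, yet it starts the word because the sketch leaves the choice of the initial character to the caller, as the tokenizer does for this state.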
      • setWordChars

        public void setWordChars​(int fromSymbol,
                                 int toSymbol,
                                 boolean enable)
                          throws Exception
        Enables or disables the characters in the given range as valid characters for the part of a word after the first character. Note that the tokenizer, not this state, determines which characters are valid as the first character of a word.
        Specified by:
        setWordChars in interface IWordState
        Parameters:
        fromSymbol - Character code of the first character in the range.
        toSymbol - Character code of the last character in the range.
        enable - true if this state should treat characters in the given range as word characters; false if it should not.
        Throws:
        Exception
      • clearWordChars

        public void clearWordChars()
        Clears all word character definitions.
        Specified by:
        clearWordChars in interface IWordState
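
        The effect of setWordChars and clearWordChars can be pictured as edits to a lookup table of character codes. The sketch below is a minimal standalone model under stated assumptions, not the library's source: WordCharTableSketch and isWordChar are hypothetical names, the table is backed by java.util.BitSet, and both ends of a range are assumed to be inclusive, as the From/To table in the class description suggests.

```java
import java.util.BitSet;

// Hypothetical model of the word-character table; not the library's source.
final class WordCharTableSketch {
    private final BitSet wordChars = new BitSet();

    // Installs the defaults listed in the class description.
    WordCharTableSketch() {
        setWordChars('a', 'z', true);
        setWordChars('A', 'Z', true);
        setWordChars('0', '9', true);
        setWordChars('-', '-', true);
        setWordChars('_', '_', true);
        setWordChars('\'', '\'', true);
    }

    // Assumption: both ends of the range are inclusive.
    void setWordChars(int fromSymbol, int toSymbol, boolean enable) {
        if (fromSymbol < 0 || toSymbol < fromSymbol) {
            throw new IllegalArgumentException("invalid range");
        }
        wordChars.set(fromSymbol, toSymbol + 1, enable);
    }

    // Removes every definition, including the defaults.
    void clearWordChars() {
        wordChars.clear();
    }

    boolean isWordChar(int symbol) {
        return symbol >= 0 && wordChars.get(symbol);
    }

    public static void main(String[] args) {
        WordCharTableSketch table = new WordCharTableSketch();
        System.out.println(table.isWordChar('7'));       // true
        table.setWordChars('-', '-', false);             // stop treating '-' as a word character
        System.out.println(table.isWordChar('-'));       // false
        table.setWordChars(0xC0, 0xFF, true);            // add the accented Latin-1 range
        System.out.println(table.isWordChar('\u00e9'));  // true
        table.clearWordChars();
        System.out.println(table.isWordChar('a'));       // false
    }
}
```

        A single character is enabled or disabled by passing the same code as both fromSymbol and toSymbol.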