Class SetExpression

java.lang.Object
dev.yasint.regexsynth.synthesis.SetExpression
All Implemented Interfaces:
Expression

public class SetExpression
extends java.lang.Object
implements Expression
Synthesis :: Mutable Regular Expression Set

Generates a regular expression set (a character class) from individual characters or character ranges. This class handles simple and ranged character class expressions, along with set negation.

  • Constructor Summary

    Constructors 
    Constructor Description
    SetExpression​(boolean negated)  
  • Method Summary

    Modifier and Type Method Description
    void addChar​(int codepoint)
    Add a single hexadecimal/integer codepoint to this set.
    void addRange​(int codepointA, int codepointB)
    Add a range of codepoints to this set.
    SetExpression difference​(SetExpression b)
    Performs a subtraction of two regular expression sets.
    SetExpression intersection​(SetExpression b)
    Performs an intersection of two regular expression sets.
    void negate()
    Negates this set expression.
    java.lang.StringBuilder toRegex()
    Creates a character class expression.
    SetExpression union​(SetExpression b)
    Performs a union of two regular expression sets.
    SetExpression withUnicodeClass​(UnicodeScript block, boolean negated)
    Includes a Unicode block in this set expression.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

    Methods inherited from interface dev.yasint.regexsynth.api.Expression

    debug
  • Constructor Details

  • Method Details

    • negate

      public void negate()
      Negates this set expression.
    • addRange

      public void addRange​(int codepointA, int codepointB)
      Add a range of codepoints to this set. This operation iterates over the given range inclusively and stores each codepoint in the sorted codepoint set.
      Parameters:
      codepointA - start of the range; a Unicode codepoint no lower than 0x000000
      codepointB - end of the range; a Unicode codepoint no higher than 0x10FFFF
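      The inclusive iteration described above can be illustrated with a minimal, standalone sketch. It is not the library's source: the class name CodepointRangeSketch, the use of a TreeSet<Integer> as the sorted codepoint store, and the argument checks are all assumptions made for illustration.

```java
import java.util.TreeSet;

// Hypothetical stand-in for the set's internal storage; not SetExpression's actual code.
class CodepointRangeSketch {

    // Adds every codepoint from 'from' to 'to', both ends included.
    static void addRange(TreeSet<Integer> codepoints, int from, int to) {
        // Assumption: reversed or out-of-range arguments are rejected.
        // The documented method may behave differently.
        if (from > to) {
            throw new IllegalArgumentException("range start exceeds range end");
        }
        if (!Character.isValidCodePoint(from) || !Character.isValidCodePoint(to)) {
            throw new IllegalArgumentException("codepoint outside 0x000000-0x10FFFF");
        }
        for (int cp = from; cp <= to; cp++) {
            codepoints.add(cp); // a sorted set ignores codepoints it already holds
        }
    }

    public static void main(String[] args) {
        TreeSet<Integer> set = new TreeSet<>();
        addRange(set, 0x61, 0x65); // a..e
        addRange(set, 0x63, 0x68); // c..h overlaps the first range
        System.out.println(set.size() + " codepoints stored");
    }
}
```

      Because the store is a sorted set, overlapping ranges collapse naturally: the two calls in main leave the eight codepoints a through h.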
    • addChar

      public void addChar​(int codepoint)
      Add a single hexadecimal/integer codepoint to this set. Callers may invoke this method repeatedly to add further values to the set.
      Parameters:
      codepoint - a Unicode codepoint in the range 0x000000 to 0x10FFFF
    • union

      public SetExpression union​(SetExpression b)
      Performs a union of two regular expression sets. This modifies the source set (this) in place.
      Parameters:
      b - set expression b
      Returns:
      elements that belong to this or b
    • intersection

      public SetExpression intersection​(SetExpression b)
      Performs an intersection of two regular expression sets. This modifies the source set (this) in place.
      Parameters:
      b - set expression b
      Returns:
      elements that belong to this and b
    • difference

      public SetExpression difference​(SetExpression b)
      Performs a subtraction of two regular expression sets. This modifies the source set (this) in place.
      Parameters:
      b - set expression b
      Returns:
      elements that belong to this and not to b
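      The in-place behaviour shared by union, intersection and difference can be sketched with plain java.util collections. This is an illustration under assumptions, not the library's implementation: SetAlgebraSketch and its helper of(...) are hypothetical names, and TreeSet<Integer> stands in for the codepoint storage.

```java
import java.util.TreeSet;

// Hypothetical sketch; TreeSet<Integer> stands in for the set's codepoint storage.
class SetAlgebraSketch {

    // Union: 'a' ends up holding every element of a or b. 'a' itself is mutated.
    static TreeSet<Integer> union(TreeSet<Integer> a, TreeSet<Integer> b) {
        a.addAll(b);
        return a;
    }

    // Intersection: 'a' keeps only the elements also present in b.
    static TreeSet<Integer> intersection(TreeSet<Integer> a, TreeSet<Integer> b) {
        a.retainAll(b);
        return a;
    }

    // Difference: 'a' loses every element that is present in b.
    static TreeSet<Integer> difference(TreeSet<Integer> a, TreeSet<Integer> b) {
        a.removeAll(b);
        return a;
    }

    // Convenience builder for the examples.
    static TreeSet<Integer> of(int... codepoints) {
        TreeSet<Integer> set = new TreeSet<>();
        for (int cp : codepoints) {
            set.add(cp);
        }
        return set;
    }

    public static void main(String[] args) {
        TreeSet<Integer> a = of(0x61, 0x62, 0x63); // a, b, c
        TreeSet<Integer> b = of(0x62, 0x63, 0x64); // b, c, d
        TreeSet<Integer> result = difference(a, b);
        // The returned reference is the mutated source set, not a copy.
        System.out.println((result == a) + " " + a);
    }
}
```

      Since the source set is modified, a caller that still needs the original contents would have to copy the set before invoking any of the three operations.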
    • withUnicodeClass

      public SetExpression withUnicodeClass​(UnicodeScript block, boolean negated)
      Includes a Unicode block in this set expression. Note that when a block is included in a set, it is not checked against the existing ranges; the corresponding syntax is simply appended to the set expression, e.g. [0-9A-Z\p{Arabic}]
      Parameters:
      block - a valid Unicode general category or script block
      negated - whether this Unicode block is negated
      Returns:
      this
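      The "append without checking ranges" behaviour can be pictured with a small string-building sketch. Everything in it is an assumption for illustration: UnicodeClassSketch is a hypothetical class, a plain String replaces the UnicodeScript parameter, and \P{...} is assumed as the negated form because it is the common convention; the library may emit something else.

```java
// Hypothetical sketch of appending a Unicode class to a character class body.
class UnicodeClassSketch {

    // Builds the escape for one Unicode class.
    // Assumption: \p{Name} when not negated, \P{Name} when negated.
    static String unicodeClass(String name, boolean negated) {
        return (negated ? "\\P{" : "\\p{") + name + "}";
    }

    // Appends the escape after the existing class body and wraps it in brackets.
    // No attempt is made to merge the class with the ranges already in 'body'.
    static String charClass(String body, String name, boolean negated) {
        return "[" + body + unicodeClass(name, negated) + "]";
    }

    public static void main(String[] args) {
        System.out.println(charClass("0-9A-Z", "Arabic", false));
    }
}
```

      With the arguments shown in main, the sketch reproduces the example from the description, [0-9A-Z\p{Arabic}].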
    • toRegex

      public java.lang.StringBuilder toRegex()
      Creates a character class expression. The algorithm uses Unicode codepoints to create character class ranges.
      Specified by:
      toRegex in interface Expression
      Returns:
      set expression
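      One way to turn sorted codepoints into character class ranges is to collapse each run of consecutive values into a single start-end pair. The sketch below shows that idea only; it is not the library's algorithm. CharClassSketch and its methods are hypothetical, the escaping covers just a few class metacharacters, and non-printable codepoints are emitted as-is.

```java
import java.util.TreeSet;

// Hypothetical sketch: collapses sorted codepoints into character class ranges.
class CharClassSketch {

    static String toCharClass(TreeSet<Integer> codepoints, boolean negated) {
        StringBuilder sb = new StringBuilder("[");
        if (negated) {
            sb.append('^');
        }
        Integer start = null; // first codepoint of the current run
        int prev = 0;         // last codepoint seen
        for (int cp : codepoints) {
            if (start == null) {
                start = cp;
            } else if (cp != prev + 1) {
                // Gap found: flush the finished run and open a new one.
                appendRun(sb, start, prev);
                start = cp;
            }
            prev = cp;
        }
        if (start != null) {
            appendRun(sb, start, prev);
        }
        return sb.append(']').toString();
    }

    // Emits "a" for a single codepoint, "ab" for two neighbours, "a-d" for longer runs.
    static void appendRun(StringBuilder sb, int from, int to) {
        appendCodepoint(sb, from);
        if (to > from) {
            if (to > from + 1) {
                sb.append('-');
            }
            appendCodepoint(sb, to);
        }
    }

    // Simplified escaping of characters that are special inside a class.
    static void appendCodepoint(StringBuilder sb, int cp) {
        if (cp == '\\' || cp == '[' || cp == ']' || cp == '^' || cp == '-') {
            sb.append('\\');
        }
        sb.appendCodePoint(cp);
    }

    public static void main(String[] args) {
        TreeSet<Integer> set = new TreeSet<>();
        for (int cp = '0'; cp <= '9'; cp++) set.add(cp);
        for (int cp = 'a'; cp <= 'd'; cp++) set.add(cp);
        set.add((int) 'x');
        System.out.println(toCharClass(set, false)); // prints [0-9a-dx]
    }
}
```

      Keeping the codepoints in a sorted set is what makes a single pass sufficient: a run ends exactly when the next value is not the previous value plus one.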