Package dev.yasint.regexsynth.synthesis
Class SetExpression
java.lang.Object
dev.yasint.regexsynth.synthesis.SetExpression
- All Implemented Interfaces:
Expression
public class SetExpression extends java.lang.Object implements Expression
Synthesis :: Mutable Regular Expression Set
Generates a regular expression set (character class) from individual characters or codepoint ranges. This class handles simple character classes and ranged character classes, along with set negation.
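As a rough illustration of the three kinds of expression named above, the sketch below compiles a simple, a ranged, and a negated character class with java.util.regex. It does not use SetExpression itself; the class name CharClassKinds and the concrete patterns are assumptions chosen only for this example.

```java
import java.util.regex.Pattern;

// Illustrative only: hand-written character classes, not output of SetExpression.
final class CharClassKinds {
    // Simple character class: matches exactly one of the listed characters.
    static final Pattern SIMPLE = Pattern.compile("[abc]");
    // Ranged character class: matches one character from either range.
    static final Pattern RANGED = Pattern.compile("[a-f0-9]");
    // Negated character class: matches one character outside the ranges.
    static final Pattern NEGATED = Pattern.compile("[^a-f0-9]");

    // True when the whole input is matched by the pattern.
    static boolean matches(Pattern p, String input) {
        return p.matcher(input).matches();
    }
}
```

For instance, CharClassKinds.matches(CharClassKinds.NEGATED, "z") is true, while the same call with "7" is false.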
-
Constructor Summary
Constructors
Constructor                       Description
SetExpression(boolean negated)
Method Summary
Modifier and Type        Method                                                  Description
void                     addChar(int codepoint)                                  Adds a single hexadecimal/integer codepoint to this set.
void                     addRange(int codepointA, int codepointB)                Adds a range of codepoints to this set.
SetExpression            difference(SetExpression b)                             Performs a subtraction of two regular expression sets.
SetExpression            intersection(SetExpression b)                           Performs an intersection of two regular expression sets.
void                     negate()                                                Negates this set expression.
java.lang.StringBuilder  toRegex()                                               Creates a character class expression.
SetExpression            union(SetExpression b)                                  Performs a union of two regular expression sets.
SetExpression            withUnicodeClass(UnicodeScript block, boolean negated)  Includes a Unicode block in this set expression.
-
Constructor Details
-
SetExpression
public SetExpression(boolean negated)
-
-
Method Details
-
negate
public void negate()
Negates this set expression.
addRange
public void addRange(int codepointA, int codepointB)
Adds a range of codepoints to this set. The operation iterates over the given range inclusively and stores each codepoint in the codepoints sorted set.
Parameters:
codepointA - Unicode codepoint from 0x000000
codepointB - Unicode codepoint up to 0x10FFFF
-
addChar
public void addChar(int codepoint)
Adds a single hexadecimal/integer codepoint to this set. The caller may invoke this method multiple times to add all the required values to the set.
Parameters:
codepoint - 0x000000 - 0x10FFFF
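The behaviour described for addRange and addChar can be sketched with a java.util.TreeSet. This is a minimal standalone sketch, not the SetExpression source: the class name CodepointRangeSketch, the bounds check via Character.isValidCodePoint, and the rejection of reversed ranges are all assumptions made for the example.

```java
import java.util.TreeSet;

// Hypothetical stand-in for the codepoint storage described above.
final class CodepointRangeSketch {
    // Sorted set of codepoints; duplicates are ignored by the set itself.
    final TreeSet<Integer> codepoints = new TreeSet<>();

    // Adds every codepoint from codepointA to codepointB, both ends included.
    void addRange(int codepointA, int codepointB) {
        // Assumption: values outside 0x000000-0x10FFFF are rejected.
        if (!Character.isValidCodePoint(codepointA) || !Character.isValidCodePoint(codepointB)) {
            throw new IllegalArgumentException("codepoint outside 0x000000-0x10FFFF");
        }
        // Assumption: a reversed range is treated as a caller error.
        if (codepointA > codepointB) {
            throw new IllegalArgumentException("range start is greater than range end");
        }
        for (int cp = codepointA; cp <= codepointB; cp++) {
            codepoints.add(cp);
        }
    }

    // Adds a single codepoint; may be called repeatedly.
    void addChar(int codepoint) {
        addRange(codepoint, codepoint);
    }
}
```

Calling addRange('a', 'e') followed by addChar('c') leaves five codepoints in the set, since 'c' is already covered by the range.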
-
union
Performs a union of two regular expression sets. It modifies the source set (this) while operating.
Parameters:
b - set expression b
Returns:
elements that belong to this or b
-
intersection
Performs an intersection of two regular expression sets. It modifies the source set (this) while operating.
Parameters:
b - set expression b
Returns:
elements that belong to this and b
-
difference
Performs a subtraction of two regular expression sets. It modifies the source set (this) while operating.
Parameters:
b - set expression b
Returns:
elements that belong to this and not to b
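The in-place semantics of union, intersection, and difference can be illustrated with the standard bulk operations of java.util.TreeSet. The sketch below is an assumption-laden stand-in, not the library code: MutableSetOpsSketch and its of(...) factory are hypothetical, and returning this from each operation is a guess based on the documented return type.

```java
import java.util.TreeSet;

// Hypothetical mutable codepoint set showing the three in-place operations.
final class MutableSetOpsSketch {
    final TreeSet<Integer> codepoints = new TreeSet<>();

    // Convenience factory for the example only.
    static MutableSetOpsSketch of(int... cps) {
        MutableSetOpsSketch s = new MutableSetOpsSketch();
        for (int cp : cps) {
            s.codepoints.add(cp);
        }
        return s;
    }

    // Keeps elements that belong to this or b; mutates this.
    MutableSetOpsSketch union(MutableSetOpsSketch b) {
        codepoints.addAll(b.codepoints);
        return this;
    }

    // Keeps elements that belong to this and b; mutates this.
    MutableSetOpsSketch intersection(MutableSetOpsSketch b) {
        codepoints.retainAll(b.codepoints);
        return this;
    }

    // Keeps elements that belong to this and not to b; mutates this.
    MutableSetOpsSketch difference(MutableSetOpsSketch b) {
        codepoints.removeAll(b.codepoints);
        return this;
    }
}
```

Because each operation mutates the receiver, a caller who still needs the original set would have to copy it first; the argument b is left unchanged in this sketch.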
-
withUnicodeClass
Includes a Unicode block in this set expression. Note that when a block is included in a set, no range checking is performed; the correct syntax is simply appended to the set expression, e.g. [0-9A-Z\p{Arabic}].
Parameters:
negated - whether this Unicode block is negated or not
block - valid Unicode general category / script block
Returns:
this
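The append-only behaviour described above can be sketched as plain string concatenation. This is a hypothetical illustration, not the library method: UnicodeClassSketch and withScript are invented names, and because the sketch is checked with java.util.regex, it spells the script class as \p{IsArabic} (the form documented for that engine) rather than the \p{Arabic} form shown in the example above.

```java
import java.util.regex.Pattern;

// Hypothetical helper: appends a script class to a class body without range checks.
final class UnicodeClassSketch {
    // classBody is the text between '[' and ']', e.g. "0-9A-Z".
    static String withScript(String classBody, String script, boolean negated) {
        // \p{Is...} includes the script, \P{Is...} excludes it (java.util.regex spelling).
        String prefix = negated ? "\\P{Is" : "\\p{Is";
        return classBody + prefix + script + "}";
    }

    // Wraps the body in brackets and compiles it.
    static Pattern compile(String classBody) {
        return Pattern.compile("[" + classBody + "]");
    }
}
```

With the body "0-9A-Z" and the script "Arabic", the compiled class accepts a digit, an uppercase ASCII letter, or an Arabic letter such as U+0628, and nothing is done to detect overlap between the ranges and the script.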
-
toRegex
public java.lang.StringBuilder toRegex()
Creates a character class expression. This algorithm uses Unicode codepoints to create character class ranges.
Specified by:
toRegex in interface Expression
Returns:
set expression
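One way to turn a sorted set of codepoints into character class ranges is to collapse runs of consecutive values. The sketch below shows that idea under stated assumptions; it is not the toRegex implementation. CharClassSketch, toCharClass, and the escaping rule (ASCII letters and digits are written literally, everything else as \x{H}) are choices made for this example.

```java
import java.util.TreeSet;

// Hypothetical sketch: collapse consecutive codepoints into ranges.
final class CharClassSketch {
    static String toCharClass(TreeSet<Integer> codepoints, boolean negated) {
        StringBuilder sb = new StringBuilder(negated ? "[^" : "[");
        boolean open = false; // whether a run is currently being tracked
        int start = 0;
        int prev = 0;
        for (int cp : codepoints) {
            if (!open) {
                start = cp;
                open = true;
            } else if (cp != prev + 1) {
                // Gap found: flush the finished run and start a new one.
                appendRun(sb, start, prev);
                start = cp;
            }
            prev = cp;
        }
        if (open) {
            appendRun(sb, start, prev);
        }
        return sb.append(']').toString();
    }

    // Writes "x" for a single value, "xy" for two, and "x-z" for longer runs.
    private static void appendRun(StringBuilder sb, int lo, int hi) {
        appendCodepoint(sb, lo);
        if (hi > lo) {
            if (hi > lo + 1) {
                sb.append('-');
            }
            appendCodepoint(sb, hi);
        }
    }

    // Assumption: only ASCII letters and digits are emitted unescaped.
    private static void appendCodepoint(StringBuilder sb, int cp) {
        if (cp < 128 && Character.isLetterOrDigit(cp)) {
            sb.appendCodePoint(cp);
        } else {
            sb.append(String.format("\\x{%X}", cp));
        }
    }
}
```

Escaping everything outside ASCII letters and digits is deliberately conservative: it avoids having to reason about which characters are special inside a character class, at the cost of a less readable result.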
-