Class BidiOrder

java.lang.Object
com.lowagie.text.pdf.BidiOrder

public final class BidiOrder extends Object
Reference implementation of the Unicode 3.0 Bidi algorithm.

This implementation is not optimized for performance. It is intended as a reference implementation that closely follows the specification of the Bidirectional Algorithm in The Unicode Standard version 3.0.

Input:
There are two levels of input to the algorithm, since clients may prefer to supply some information from out-of-band sources rather than relying on the default behavior.

  1. Unicode type array
  2. Unicode type array, with an externally supplied base line direction

Output:
Output is separated into several stages as well, to better enable clients to evaluate various aspects of implementation conformance.

  1. levels array over entire paragraph
  2. reordering array over entire paragraph
  3. levels array over line
  4. reordering array over line
Note that for conformance, algorithms are only required to generate correct reordering and character directionality (odd or even levels) over a line. Generating identical level arrays over a line is not required. Bidi explicit format codes (LRE, RLE, LRO, RLO, PDF) and BN can be assigned arbitrary levels and positions as long as the other text matches.
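The directionality half of that conformance rule can be sketched as a parity comparison. This is an illustrative helper, not part of BidiOrder: the class name LevelParity, its methods, and the ignorable mask (marking positions that hold explicit format codes or BN) are all assumptions made for the example. It does not check the reordering, which must also match.

```java
final class LevelParity {
    /** A level is right-to-left when it is odd, left-to-right when it is even. */
    static boolean isRightToLeft(byte level) {
        return (level & 1) != 0;
    }

    /**
     * Returns true when two level arrays give every compared position the
     * same direction. Positions flagged in ignorable (explicit format codes
     * and BN) are skipped, since their levels may be arbitrary.
     */
    static boolean sameDirectionality(byte[] a, byte[] b, boolean[] ignorable) {
        if (a.length != b.length || a.length != ignorable.length) {
            return false;
        }
        for (int i = 0; i < a.length; i++) {
            if (!ignorable[i] && isRightToLeft(a[i]) != isRightToLeft(b[i])) {
                return false;
            }
        }
        return true;
    }
}
```

Levels 0 and 2 compare as equal here because both are even, which is why identical level arrays are not needed for conformance.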

As the algorithm is defined to operate on a single paragraph at a time, this implementation is written to handle single paragraphs. Thus rule P1 is presumed by this implementation: the data provided to the implementation is assumed to be a single paragraph, and either contains no 'B' codes, or a single 'B' code at the end of the input. 'B' is allowed as input to illustrate how the algorithm assigns it a level.
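A caller could check the single-paragraph assumption before constructing the object. The sketch below is a hypothetical helper, not part of this class; the separator code is passed in by the caller so that no particular constant value is assumed.

```java
final class ParagraphCheck {
    /**
     * Returns true when types contains no paragraph separator code, or
     * exactly one that sits in the last position. separatorCode is whatever
     * byte value the caller uses for the 'B' type.
     */
    static boolean isSingleParagraph(byte[] types, byte separatorCode) {
        // Only positions before the last one are inspected; a separator in
        // the final slot is allowed.
        for (int i = 0; i < types.length - 1; i++) {
            if (types[i] == separatorCode) {
                return false;
            }
        }
        return true;
    }
}
```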

Also note that rules L3 and L4 depend on the rendering engine that uses the result of the bidi algorithm. This implementation assumes that the rendering engine expects combining marks in visual order (e.g. to the left of their base character in RTL runs) and that it adjusts the glyphs used to render mirrored characters that are in RTL runs so that they render appropriately.

Author:
Doug Felt
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    static final byte
    Right-to-Left Arabic
    static final byte
    Arabic Number
    static final byte
    Paragraph Separator
    static final byte
    Boundary Neutral
    static final byte
    Common Number Separator
    static final byte
    European Number
    static final byte
    European Number Separator
    static final byte
    European Number Terminator
    static final byte
    Left-to-Right
    static final byte
    Left-to-Right Embedding
    static final byte
    Left-to-Right Override
    static final byte
    Non-Spacing Mark
    static final byte
    Other Neutrals
    static final byte
    Pop Directional Format
    static final byte
    Right-to-Left
    static final byte
    Right-to-Left Embedding
    static final byte
    Right-to-Left Override
    static final byte
    Segment Separator
    static final byte
    Maximum bidi type value.
    static final byte
    Minimum bidi type value.
    static final byte
    Whitespace
  • Constructor Summary

    Constructors
    Constructor
    Description
    BidiOrder(byte[] types)
    Initialize using an array of direction types.
    BidiOrder(byte[] types, byte paragraphEmbeddingLevel)
    Initialize using an array of direction types and an externally supplied paragraph embedding level.
    BidiOrder(char[] text, int offset, int length, byte paragraphEmbeddingLevel)
     
  • Method Summary

    Modifier and Type
    Method
    Description
    byte
    getBaseLevel()
    Return the base level of the paragraph.
    static final byte
    getDirection(char c)
     
    byte[]
    getLevels()
     
    byte[]
    getLevels(int[] linebreaks)
    Return levels array breaking lines at offsets in linebreaks.
    int[]
    getReordering(int[] linebreaks)
    Return reordering array breaking lines at offsets in linebreaks.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

  • Constructor Details

    • BidiOrder

      public BidiOrder(byte[] types)
      Initialize using an array of direction types. Types range from TYPE_MIN to TYPE_MAX inclusive and represent the direction codes of the characters in the text.
      Parameters:
      types - the types array
    • BidiOrder

      public BidiOrder(byte[] types, byte paragraphEmbeddingLevel)
      Initialize using an array of direction types and an externally supplied paragraph embedding level. The embedding level may be -1, 0, or 1. -1 means to apply the default algorithm (rules P2 and P3), 0 is for LTR paragraphs, and 1 is for RTL paragraphs.
      Parameters:
      types - the types array
      paragraphEmbeddingLevel - the externally supplied paragraph embedding level.
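To illustrate what the -1 default means, here is a minimal sketch of rules P2 and P3 as they stand in Unicode 3.0: the first strong character decides the paragraph level. The BidiType enum and the resolve method are hypothetical stand-ins written for this example; they are not the class's byte constants or its internal code.

```java
final class DefaultParagraphLevel {
    /** Hypothetical stand-in for bidi type codes; only the strong types matter here. */
    enum BidiType { L, R, AL, EN, AN, WS, ON }

    /**
     * Returns the paragraph level for the given request. 0 and 1 are used as
     * supplied. Any other value (including -1) is treated in this sketch as a
     * request for the default: find the first strong character (L, R, or AL);
     * the level is 1 if it is R or AL, otherwise 0.
     */
    static byte resolve(BidiType[] types, byte requestedLevel) {
        if (requestedLevel == 0 || requestedLevel == 1) {
            return requestedLevel; // externally supplied: LTR or RTL
        }
        for (BidiType t : types) {
            if (t == BidiType.L) {
                return 0;
            }
            if (t == BidiType.R || t == BidiType.AL) {
                return 1;
            }
        }
        return 0; // no strong character found
    }
}
```

Text that starts with whitespace followed by an Arabic letter resolves to level 1 under the default, while passing 0 forces an LTR paragraph regardless of content.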
    • BidiOrder

      public BidiOrder(char[] text, int offset, int length, byte paragraphEmbeddingLevel)
  • Method Details

    • getDirection

      public static final byte getDirection(char c)
    • getLevels

      public byte[] getLevels()
    • getLevels

      public byte[] getLevels(int[] linebreaks)
      Return levels array breaking lines at offsets in linebreaks.
      Rule L1.

      The returned levels array contains the resolved level for each bidi code passed to the constructor.

      The linebreaks array must include at least one value. The values must be in strictly increasing order (no duplicates) between 1 and the length of the text, inclusive. The last value must be the length of the text.

      Parameters:
      linebreaks - the offsets at which to break the paragraph
      Returns:
      the resolved levels of the text
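The constraints on linebreaks can be checked up front. The helper below is a sketch written for this page, not a method of BidiOrder; the class name Linebreaks and both method names are assumptions.

```java
final class Linebreaks {
    /** Throws IllegalArgumentException if linebreaks violates the documented constraints. */
    static void validate(int[] linebreaks, int textLength) {
        if (linebreaks == null || linebreaks.length == 0) {
            throw new IllegalArgumentException("linebreaks must contain at least one value");
        }
        int previous = 0;
        for (int next : linebreaks) {
            // Starting from 0 means this also rejects any value below 1.
            if (next <= previous) {
                throw new IllegalArgumentException("linebreaks must be strictly increasing and >= 1");
            }
            previous = next;
        }
        // Strictly increasing values ending at textLength cannot exceed it.
        if (previous != textLength) {
            throw new IllegalArgumentException("last linebreak must equal the text length");
        }
    }

    /** The single-line case: one break at the end of the text. */
    static int[] singleLine(int textLength) {
        return new int[] { textLength };
    }
}
```

For a 10-character paragraph, {3, 7, 10} describes three lines, and singleLine(10) describes the whole paragraph as one line.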
    • getReordering

      public int[] getReordering(int[] linebreaks)
      Return reordering array breaking lines at offsets in linebreaks.

      The reordering array maps from a visual index to a logical index. Lines are concatenated from left to right. So for example, the fifth character from the left on the third line is

       getReordering(linebreaks)[linebreaks[1] + 4]
       
      (linebreaks[1] is the position after the last character of the second line, which is also the index of the first character on the third line, and adding four gets the fifth character from the left).

      The linebreaks array must include at least one value. The values must be in strictly increasing order (no duplicates) between 1 and the length of the text, inclusive. The last value must be the length of the text.

      Parameters:
      linebreaks - the offsets at which to break the paragraph.
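The visual-to-logical mapping can be applied as in the sketch below. The class VisualOrder and its methods are hypothetical helpers, and the reordering array in the usage note is written by hand for illustration rather than produced by this class.

```java
final class VisualOrder {
    /** Index into the reordering array for a visual position; line and column are zero-based. */
    static int visualIndex(int[] linebreaks, int line, int column) {
        // The first line starts at 0; every later line starts where the previous one ended.
        int lineStart = (line == 0) ? 0 : linebreaks[line - 1];
        return lineStart + column;
    }

    /** Applies a visual-to-logical map: visual position i shows logical character reordering[i]. */
    static String toVisual(String logical, int[] reordering) {
        StringBuilder visual = new StringBuilder(reordering.length);
        for (int logicalIndex : reordering) {
            visual.append(logical.charAt(logicalIndex));
        }
        return visual.toString();
    }
}
```

With linebreaks {3, 7, 12}, visualIndex(linebreaks, 2, 4) yields 11, matching linebreaks[1] + 4 for the fifth character on the third line. With the hand-written map {0, 1, 2, 5, 4, 3}, toVisual("abcDEF", map) yields "abcFED", where the uppercase letters stand in for a right-to-left run.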
    • getBaseLevel

      public byte getBaseLevel()
      Return the base level of the paragraph.