Unit BGRAUnicode

Description

Implementation of Unicode bidirectional algorithm

Uses

Overview

Structures

  • Packed Record TUnicodeBidiInfo: Bidirectional layout information for one Unicode character
  • Record TUnicodeBracketInfo: Pair of matching brackets

Functions and Procedures

function AnalyzeBidiUnicode(u: PLongWord; ALength: integer; baseDirection: LongWord): TUnicodeBidiArray;
function AnalyzeBidiUnicode(u: PLongWord; ALength: integer; ABidiMode: TFontBidiMode): TUnicodeBidiArray;
function GetUnicodeBidiClass(u: LongWord): TUnicodeBidiClass;
function GetUnicodeBidiClassEx(u: LongWord): TUnicodeBidiClass;
function GetUnicodeBracketInfo(u: LongWord): TUnicodeBracketInfo;
function GetUnicodeCombiningClass(u: LongWord): byte;
function GetUnicodeDisplayOrder(const AInfo: TUnicodeBidiArray): TUnicodeDisplayOrder; overload;
function GetUnicodeDisplayOrder(ABidiInfo: PUnicodeBidiInfo; AStride, ACount: integer): TUnicodeDisplayOrder; overload;
function GetUnicodeDisplayOrder(ALevels: PByte; ACount: integer): TUnicodeDisplayOrder; overload;
function GetUnicodeJoiningType(u: LongWord): TUnicodeJoiningType;
function IsModifierCombiningMark(u: LongWord): boolean;
function IsUnicodeCrLf(u: LongWord): boolean;
function IsUnicodeIsolateOrFormatting(u: LongWord): boolean;
function IsUnicodeMirrored(u: LongWord): boolean;
function IsUnicodeParagraphSeparator(u: LongWord): boolean;
function IsUnicodeSpace(u: LongWord): boolean;
function IsZeroWidthUnicode(u: LongWord): boolean;

Types

PUnicodeBidiInfo = ^TUnicodeBidiInfo;
TFontBidiMode = (...);
TUnicodeBidiArray = packed array of TUnicodeBidiInfo;
TUnicodeBidiClass = (...);
TUnicodeDisplayOrder = array of integer;
TUnicodeJoiningType = (...);

Constants

BIDI_FLAG_COMBINING_LEFT = 1024;
BIDI_FLAG_COMBINING_RIGHT = 2048;
BIDI_FLAG_END_OF_LINE = 8;
BIDI_FLAG_EXPLICIT_END_OF_PARAGRAPH = 4;
BIDI_FLAG_IMPLICIT_END_OF_PARAGRAPH = 2;
BIDI_FLAG_LIGATURE_BOUNDARY = 64;
BIDI_FLAG_LIGATURE_LEFT = 32;
BIDI_FLAG_LIGATURE_RIGHT = 16;
BIDI_FLAG_LIGATURE_TRANSPARENT = 128;
BIDI_FLAG_MIRRORED = 8192;
BIDI_FLAG_MULTICHAR_START = 4096;
BIDI_FLAG_NON_SPACING_MARK = 512;
BIDI_FLAG_REMOVED = 1;
BIDI_FLAG_RTL_SCRIPT = 256;
ubcNeutral = [ubcSegmentSeparator, ubcParagraphSeparator, ubcWhiteSpace, ubcOtherNeutrals];
UNICODE_ARABIC_LETTER_MARK = $061C;
UNICODE_ARABIC_TATWEEL = $0640;
UNICODE_COMBINING_GRAPHEME_JOINER = $034F;
UNICODE_FIRST_STRONG_ISOLATE = $2068;
UNICODE_FULLWIDTH_COMMA = $FF0C;
UNICODE_HORIZONTAL_ELLIPSIS = $2026;
UNICODE_IDEOGRAPHIC_COMMA = $3001;
UNICODE_IDEOGRAPHIC_FULL_STOP = $3002;
UNICODE_INFORMATION_SEPARATOR_FOUR = $001C;
UNICODE_INFORMATION_SEPARATOR_ONE = $001F;
UNICODE_INFORMATION_SEPARATOR_THREE = $001D;
UNICODE_INFORMATION_SEPARATOR_TWO = $001E;
UNICODE_LEFT_TO_RIGHT_EMBEDDING = $202A;
UNICODE_LEFT_TO_RIGHT_ISOLATE = $2066;
UNICODE_LEFT_TO_RIGHT_MARK = $200E;
UNICODE_LEFT_TO_RIGHT_OVERRIDE = $202D;
UNICODE_LINE_SEPARATOR = $2028;
UNICODE_MAX_BIDI_DEPTH = 125;
UNICODE_NEXT_LINE = $0085;
UNICODE_NO_BREAK_SPACE = $A0;
UNICODE_PARAGRAPH_SEPARATOR = $2029;
UNICODE_POP_DIRECTIONAL_FORMATTING = $202C;
UNICODE_POP_DIRECTIONAL_ISOLATE = $2069;
UNICODE_RIGHT_ANGLE_BRACKET = $3009;
UNICODE_RIGHT_POINTING_ANGLE_BRACKET = $232A;
UNICODE_RIGHT_TO_LEFT_EMBEDDING = $202B;
UNICODE_RIGHT_TO_LEFT_ISOLATE = $2067;
UNICODE_RIGHT_TO_LEFT_MARK = $200F;
UNICODE_RIGHT_TO_LEFT_OVERRIDE = $202E;
UNICODE_ZERO_WIDTH_JOINER = $200D;
UNICODE_ZERO_WIDTH_NON_JOINER = $200C;
UNICODE_ZERO_WIDTH_NO_BREAK_SPACE = $FEFF;
UNICODE_ZERO_WIDTH_SPACE = $200B;

Description

Functions and Procedures

function AnalyzeBidiUnicode(u: PLongWord; ALength: integer; baseDirection: LongWord): TUnicodeBidiArray;

Analyzes Unicode text and returns bidi levels for each character. baseDirection must be UNICODE_LEFT_TO_RIGHT_ISOLATE, UNICODE_RIGHT_TO_LEFT_ISOLATE or UNICODE_FIRST_STRONG_ISOLATE.
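
The u parameter points to a buffer of ALength LongWord values. Assuming each entry holds one UTF-32 code point (this page does not say so explicitly), the sketch below shows one way to build such a buffer from a UTF-16 UnicodeString. TCodePoints and ToCodePoints are hypothetical names, not part of the unit, and the call to AnalyzeBidiUnicode appears only as a comment so that the program compiles without the library.

```pascal
program CodePointsDemo;
{$mode objfpc}{$H+}

type
  // Hypothetical helper type: one LongWord per code point (assumed UTF-32).
  TCodePoints = array of LongWord;

// Decodes UTF-16 text into code points, combining valid surrogate pairs.
// Unpaired surrogates are copied through unchanged.
function ToCodePoints(const AText: UnicodeString): TCodePoints;
var
  i, n: integer;
  cp, trail: LongWord;
begin
  Result := nil;
  SetLength(Result, Length(AText));
  i := 1;
  n := 0;
  while i <= Length(AText) do
  begin
    cp := Ord(AText[i]);
    Inc(i);
    if (cp >= $D800) and (cp <= $DBFF) and (i <= Length(AText)) then
    begin
      trail := Ord(AText[i]);
      if (trail >= $DC00) and (trail <= $DFFF) then
      begin
        cp := $10000 + ((cp - $D800) shl 10) + (trail - $DC00);
        Inc(i);
      end;
    end;
    Result[n] := cp;
    Inc(n);
  end;
  SetLength(Result, n);
end;

var
  text: UnicodeString;
  cps: TCodePoints;
begin
  // 'a', a Hebrew letter, and one character outside the BMP (surrogate pair).
  text := 'a';
  text := text + WideChar($05D0) + WideChar($D83D) + WideChar($DE00);
  cps := ToCodePoints(text);
  WriteLn('code points: ', Length(cps));
  // With BGRAUnicode in the uses clause, the buffer could then be analyzed:
  //   bidi := AnalyzeBidiUnicode(@cps[0], Length(cps), UNICODE_FIRST_STRONG_ISOLATE);
end.
```

Passing UNICODE_FIRST_STRONG_ISOLATE in the commented call leaves the choice of base direction to the text itself; the other two accepted values force it.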

function AnalyzeBidiUnicode(u: PLongWord; ALength: integer; ABidiMode: TFontBidiMode): TUnicodeBidiArray;

This item has no description.

function GetUnicodeBidiClass(u: LongWord): TUnicodeBidiClass;

Returns the bidi class of the character as defined by Unicode, which is used to determine text direction

function GetUnicodeBidiClassEx(u: LongWord): TUnicodeBidiClass;

Same as GetUnicodeBidiClass, but can also return the additional classes ubcCombiningLeftToRight and ubcMirroredNeutral

function GetUnicodeBracketInfo(u: LongWord): TUnicodeBracketInfo;

This item has no description.

function GetUnicodeCombiningClass(u: LongWord): byte;

Returns the combining class defined by Unicode for non-spacing marks and combining marks, or 255 if the character is not to be combined

function GetUnicodeDisplayOrder(const AInfo: TUnicodeBidiArray): TUnicodeDisplayOrder; overload;

Determines the display order, assuming the display surface is horizontally infinite

function GetUnicodeDisplayOrder(ABidiInfo: PUnicodeBidiInfo; AStride, ACount: integer): TUnicodeDisplayOrder; overload;

This item has no description.

function GetUnicodeDisplayOrder(ALevels: PByte; ACount: integer): TUnicodeDisplayOrder; overload;

This item has no description.
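
To illustrate what a display order computed from bidi levels looks like, here is a minimal sketch of the reordering rule of the Unicode bidirectional algorithm (rule L2 of UAX #9). It is not the unit's implementation, and whether GetUnicodeDisplayOrder returns exactly this mapping (the logical index of the character shown at each display position) is an assumption. DisplayOrderFromLevels, ReverseRange, OrderToString and TIntArray are hypothetical names.

```pascal
program DisplayOrderDemo;
{$mode objfpc}{$H+}

uses
  SysUtils;

type
  TIntArray = array of integer;

// Reverses the items AOrder[AFirst..ALast] in place.
procedure ReverseRange(var AOrder: TIntArray; AFirst, ALast: integer);
var
  tmp: integer;
begin
  while AFirst < ALast do
  begin
    tmp := AOrder[AFirst];
    AOrder[AFirst] := AOrder[ALast];
    AOrder[ALast] := tmp;
    Inc(AFirst);
    Dec(ALast);
  end;
end;

// Returns, for each display position, the logical index of the character
// shown there (sketch of UAX #9 rule L2, single line, no wrapping).
function DisplayOrderFromLevels(const ALevels: array of byte): TIntArray;
var
  i, runStart, level, maxLevel, minOddLevel: integer;
begin
  Result := nil;
  SetLength(Result, Length(ALevels));
  maxLevel := 0;
  minOddLevel := 255;
  for i := 0 to High(ALevels) do
  begin
    Result[i] := i;
    if ALevels[i] > maxLevel then maxLevel := ALevels[i];
    if Odd(ALevels[i]) and (ALevels[i] < minOddLevel) then
      minOddLevel := ALevels[i];
  end;
  // From the highest level down to the lowest odd level, reverse every
  // contiguous run of characters whose level is at least the current one.
  for level := maxLevel downto minOddLevel do
  begin
    i := 0;
    while i < Length(ALevels) do
      if ALevels[i] >= level then
      begin
        runStart := i;
        while (i < Length(ALevels)) and (ALevels[i] >= level) do Inc(i);
        ReverseRange(Result, runStart, i - 1);
      end
      else
        Inc(i);
  end;
end;

// Formats an order as space-separated indices.
function OrderToString(const AOrder: TIntArray): string;
var
  i: integer;
begin
  Result := '';
  for i := 0 to High(AOrder) do
  begin
    if i > 0 then Result := Result + ' ';
    Result := Result + IntToStr(AOrder[i]);
  end;
end;

begin
  // Left-to-right text with three right-to-left characters in the middle.
  WriteLn(OrderToString(DisplayOrderFromLevels([0, 0, 1, 1, 1, 0]))); // 0 1 4 3 2 5
end.
```

Text with no odd level is left in logical order, because the reversal loop does not run at all in that case.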

function GetUnicodeJoiningType(u: LongWord): TUnicodeJoiningType;

Returns how the letter can be joined to the surrounding letters (for example in Arabic)

function IsModifierCombiningMark(u: LongWord): boolean;

This item has no description.

function IsUnicodeCrLf(u: LongWord): boolean;

This item has no description.

function IsUnicodeIsolateOrFormatting(u: LongWord): boolean;

This item has no description.

function IsUnicodeMirrored(u: LongWord): boolean;

Returns whether the symbol can be mirrored horizontally for right-to-left text

function IsUnicodeParagraphSeparator(u: LongWord): boolean;

This item has no description.

function IsUnicodeSpace(u: LongWord): boolean;

This item has no description.

function IsZeroWidthUnicode(u: LongWord): boolean;

This item has no description.

Types

PUnicodeBidiInfo = ^TUnicodeBidiInfo;

Pointer to a TUnicodeBidiInfo record

TFontBidiMode = (...);

Bidi-mode preference (right-to-left or left-to-right)

Values
  • fbmAuto: Automatic bidi-mode, depending on first letter type
  • fbmLeftToRight: Always left-to-right (but can embed another direction)
  • fbmRightToLeft: Always right-to-left (but can embed another direction)
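
One overload of AnalyzeBidiUnicode takes a TFontBidiMode and the other a baseDirection character. A plausible correspondence between the two, which this page does not confirm, is sketched below. The enumeration and constants are redeclared locally so that the program compiles on its own, and BidiModeToBaseDirection is a hypothetical helper, not part of the unit.

```pascal
program BidiModeDemo;
{$mode objfpc}{$H+}

type
  // Local copy of the documented enumeration (order as listed on this page).
  TFontBidiMode = (fbmAuto, fbmLeftToRight, fbmRightToLeft);

const
  // Values copied from the constant list of this unit.
  UNICODE_LEFT_TO_RIGHT_ISOLATE = $2066;
  UNICODE_RIGHT_TO_LEFT_ISOLATE = $2067;
  UNICODE_FIRST_STRONG_ISOLATE  = $2068;

// Hypothetical helper: maps a bidi-mode preference to a base direction
// character. The mapping is an assumption, not taken from the unit.
function BidiModeToBaseDirection(AMode: TFontBidiMode): LongWord;
begin
  case AMode of
    fbmLeftToRight: Result := UNICODE_LEFT_TO_RIGHT_ISOLATE;
    fbmRightToLeft: Result := UNICODE_RIGHT_TO_LEFT_ISOLATE;
  else
    // fbmAuto: direction depends on the first letter type
    Result := UNICODE_FIRST_STRONG_ISOLATE;
  end;
end;

begin
  WriteLn(BidiModeToBaseDirection(fbmAuto) = UNICODE_FIRST_STRONG_ISOLATE);
  WriteLn(BidiModeToBaseDirection(fbmRightToLeft) = UNICODE_RIGHT_TO_LEFT_ISOLATE);
end.
```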
TUnicodeBidiArray = packed array of TUnicodeBidiInfo;

This item has no description.

TUnicodeBidiClass = (...);

This item has no description.

Values
  • ubcBoundaryNeutral
  • ubcSegmentSeparator
  • ubcParagraphSeparator
  • ubcWhiteSpace
  • ubcOtherNeutrals
  • ubcCommonSeparator
  • ubcNonSpacingMark
  • ubcLeftToRight
  • ubcEuropeanNumber
  • ubcEuropeanNumberSeparator
  • ubcEuropeanNumberTerminator
  • ubcRightToLeft
  • ubcArabicLetter
  • ubcArabicNumber
  • ubcUnknown
  • ubcCombiningLeftToRight: ubcLeftToRight in Mc category
  • ubcMirroredNeutral: ubcOtherNeutrals with Mirrored property
TUnicodeDisplayOrder = array of integer;

This item has no description.

TUnicodeJoiningType = (...);

Joining type of a letter, as returned by GetUnicodeJoiningType

Values
  • ujtNonJoining: U
  • ujtTransparent: T
  • ujtRightJoining: R
  • ujtLeftJoining: L
  • ujtDualJoining: D
  • ujtJoinCausing: C

Constants

BIDI_FLAG_COMBINING_LEFT = 1024;

this letter is to be combined to the left of the previous letter

BIDI_FLAG_COMBINING_RIGHT = 2048;

this letter is to be combined to the right of the previous letter

BIDI_FLAG_END_OF_LINE = 8;

line break <br>

BIDI_FLAG_EXPLICIT_END_OF_PARAGRAPH = 4;

explicit end of paragraph (paragraph spacing below due to paragraph split)

BIDI_FLAG_IMPLICIT_END_OF_PARAGRAPH = 2;

implicit end of paragraph (paragraph spacing below due to end of text)

BIDI_FLAG_LIGATURE_BOUNDARY = 64;

zero-width joiner or non-joiner

BIDI_FLAG_LIGATURE_LEFT = 32;

joins to the letter on the left (possible for joining type L and D)

BIDI_FLAG_LIGATURE_RIGHT = 16;

joins to the letter on the right (possible for joining type R and D)

BIDI_FLAG_LIGATURE_TRANSPARENT = 128;

does not affect ligature

BIDI_FLAG_MIRRORED = 8192;

the glyph is mirrored when in RTL text

BIDI_FLAG_MULTICHAR_START = 4096;

start of a multichar (letter + non spacing marks, non spacing marks)

BIDI_FLAG_NON_SPACING_MARK = 512;

it is a non-spacing mark

BIDI_FLAG_REMOVED = 1;

RLE, LRE, RLO, LRO, PDF and BN are supposed to be removed

BIDI_FLAG_RTL_SCRIPT = 256;

script is written from right to left (Arabic, N'Ko...)
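
The BIDI_FLAG_ constants are distinct powers of two, so several of them can be combined in one integer and tested with a bitwise and. The sketch below shows this on a subset of the constants, redeclared locally so that the program compiles on its own. Holding the flags in a Word is an assumption, and TFlagName, FlagNames, HasFlag and DescribeFlags are hypothetical names, not part of the unit.

```pascal
program BidiFlagsDemo;
{$mode objfpc}{$H+}

const
  // Values copied from the constant list of this unit (subset).
  BIDI_FLAG_REMOVED          = 1;
  BIDI_FLAG_END_OF_LINE      = 8;
  BIDI_FLAG_RTL_SCRIPT       = 256;
  BIDI_FLAG_NON_SPACING_MARK = 512;
  BIDI_FLAG_MIRRORED         = 8192;

type
  // Hypothetical helper record pairing a flag mask with a printable name.
  TFlagName = record
    Mask: Word;
    Name: string;
  end;

const
  FlagNames: array[0..4] of TFlagName = (
    (Mask: BIDI_FLAG_REMOVED;          Name: 'REMOVED'),
    (Mask: BIDI_FLAG_END_OF_LINE;      Name: 'END_OF_LINE'),
    (Mask: BIDI_FLAG_RTL_SCRIPT;       Name: 'RTL_SCRIPT'),
    (Mask: BIDI_FLAG_NON_SPACING_MARK; Name: 'NON_SPACING_MARK'),
    (Mask: BIDI_FLAG_MIRRORED;         Name: 'MIRRORED')
  );

// True if every bit of AMask is set in AFlags.
function HasFlag(AFlags, AMask: Word): boolean;
begin
  Result := (AFlags and AMask) = AMask;
end;

// Lists the names of the flags from FlagNames that are set in AFlags.
function DescribeFlags(AFlags: Word): string;
var
  i: integer;
begin
  Result := '';
  for i := Low(FlagNames) to High(FlagNames) do
    if HasFlag(AFlags, FlagNames[i].Mask) then
    begin
      if Result <> '' then Result := Result + ' ';
      Result := Result + FlagNames[i].Name;
    end;
end;

begin
  WriteLn(DescribeFlags(BIDI_FLAG_RTL_SCRIPT or BIDI_FLAG_MIRRORED));
end.
```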

ubcNeutral = [ubcSegmentSeparator, ubcParagraphSeparator, ubcWhiteSpace, ubcOtherNeutrals];

This item has no description.

UNICODE_ARABIC_LETTER_MARK = $061C;

This item has no description.

UNICODE_ARABIC_TATWEEL = $0640;

Arabic letter: horizontal line that makes a ligature with most letters

UNICODE_COMBINING_GRAPHEME_JOINER = $034F;

This item has no description.

UNICODE_FIRST_STRONG_ISOLATE = $2068;

This item has no description.

UNICODE_FULLWIDTH_COMMA = $FF0C;

This item has no description.

UNICODE_HORIZONTAL_ELLIPSIS = $2026;

This item has no description.

UNICODE_IDEOGRAPHIC_COMMA = $3001;

ideographic punctuation

UNICODE_IDEOGRAPHIC_FULL_STOP = $3002;

This item has no description.

UNICODE_INFORMATION_SEPARATOR_FOUR = $001C;

data separator: end-of-file

UNICODE_INFORMATION_SEPARATOR_ONE = $001F;

data separator: field separator, kind of equivalent to Tab

UNICODE_INFORMATION_SEPARATOR_THREE = $001D;

data separator: section separator

UNICODE_INFORMATION_SEPARATOR_TWO = $001E;

data separator: record separator, kind of equivalent to paragraph separator

UNICODE_LEFT_TO_RIGHT_EMBEDDING = $202A;

characters that split into bidi sub-blocks (called "formatting")

UNICODE_LEFT_TO_RIGHT_ISOLATE = $2066;

characters that split lines into top-level bidi blocks

UNICODE_LEFT_TO_RIGHT_MARK = $200E;

characters that mark direction without splitting the bidi block

UNICODE_LEFT_TO_RIGHT_OVERRIDE = $202D;

This item has no description.

UNICODE_LINE_SEPARATOR = $2028;

equivalent of <br>

UNICODE_MAX_BIDI_DEPTH = 125;

maximum nesting level of isolates and bidi-formatting blocks (char bidi level can actually be higher due to char properties)

UNICODE_NEXT_LINE = $0085;

equivalent of CRLF

UNICODE_NO_BREAK_SPACE = $A0;

This item has no description.

UNICODE_PARAGRAPH_SEPARATOR = $2029;

equivalent of </p>

UNICODE_POP_DIRECTIONAL_FORMATTING = $202C;

This item has no description.

UNICODE_POP_DIRECTIONAL_ISOLATE = $2069;

This item has no description.

UNICODE_RIGHT_ANGLE_BRACKET = $3009;

This item has no description.

UNICODE_RIGHT_POINTING_ANGLE_BRACKET = $232A;

bracket equivalence

UNICODE_RIGHT_TO_LEFT_EMBEDDING = $202B;

This item has no description.

UNICODE_RIGHT_TO_LEFT_ISOLATE = $2067;

This item has no description.

UNICODE_RIGHT_TO_LEFT_MARK = $200F;

This item has no description.

UNICODE_RIGHT_TO_LEFT_OVERRIDE = $202E;

This item has no description.

UNICODE_ZERO_WIDTH_JOINER = $200D;

This item has no description.

UNICODE_ZERO_WIDTH_NON_JOINER = $200C;

This item has no description.

UNICODE_ZERO_WIDTH_NO_BREAK_SPACE = $FEFF;

byte order mark

UNICODE_ZERO_WIDTH_SPACE = $200B;

zero-width