TokenTrieWalker
Walks a provided identifier to try to find one or more matching Components.
Table of Contents
Properties
- $delimitersHandler : mixed
- $deviceData : mixed
- $isLowerCaseData : mixed
- $rootNode : mixed
Methods
- __construct() : mixed
- seekComponents() : mixed
- Loop over the input string character by character and descend down a Trie structure to find matching Components.
- addMatchCandidates() : mixed
- Checks whether any of the match candidates pass their checks and, if so, adds their Components to cFound.
- findNextCharactersMatch() : mixed
- A Node can contain more than one character of the identifier, so this function tries to match the characters that follow. If a match is found, the matched character is returned along with the additional character position.
- getDirectChildNode() : mixed
- getMatchCandidates() : mixed
- getTokenStartNode() : mixed
- hasMatchCandidates() : mixed
- hasTokenStartNode() : mixed
- isSwAllowedAnyMatch() : mixed
Properties
$delimitersHandler
private
mixed $delimitersHandler
$deviceData
private
mixed $deviceData
$isLowerCaseData
private
mixed $isLowerCaseData
$rootNode
private
mixed $rootNode
Methods
__construct()
public
__construct(mixed $deviceData) : mixed
Parameters
- $deviceData : mixed
seekComponents()
Loop over the input string character by character and descend down a Trie structure to find matching Components.
public
seekComponents($identifier) : mixed
The Trie is a shallow Trie built from token values. It expects either a full match between a token in the incoming identifier and a token in the Trie, or a partial starts-with match where a found MatchCandidate permits it.
Although the individual characters(*) are used to walk the Trie, it is easier to think of the input string as a collection of tokens delimited by characters such as spaces and semicolons.
A given branch of the Trie may either be formed from a single token or formed from a chain of adjacent tokens.
At the end of a token, a matching Node may contain some MatchCandidates. These MatchCandidates contain a Component and some additional criteria that must be passed before the Component is added to the "found" collection.
(*) In practice, the code point is used to walk the Trie, not the character itself.
Note 1: See the flow diagram in internal-docs/tokentriewalk.drawio.png
Note 2: Some attempts were made to split this method into multiple methods to improve readability. Overall API performance dropped by around 40,000-50,000 detections per second with that change, so it was reverted.
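The walk described above can be sketched roughly as follows. This is a minimal illustration, not the class's actual implementation: the node layout (`children`, `components`), the function names `insertTokens()` and `seekComponentsSketch()`, and the delimiter set are all assumptions made for the example. It also indexes by single-byte characters rather than code points, and it only handles full-token matches (no MatchCandidate criteria, no starts-with matching).

```php
<?php
// Hypothetical node layout: 'children' maps a character to a child node,
// 'components' lists what is matched when a token ends at this node.
function insertTokens(array &$root, string $tokens, string $component): void
{
    $node = &$root;
    foreach (str_split($tokens) as $char) {
        if (!isset($node['children'][$char])) {
            $node['children'][$char] = ['children' => [], 'components' => []];
        }
        $node = &$node['children'][$char];
    }
    $node['components'][] = $component;
}

// Walks the identifier character by character. A descent starts at the
// beginning of each token and may continue across delimiters, so a branch
// can represent a chain of adjacent tokens (e.g. "windows nt").
function seekComponentsSketch(array $root, string $identifier, string $delimiters = " ;/()"): array
{
    $found = [];
    $length = strlen($identifier);
    $atTokenStart = true;
    for ($start = 0; $start < $length; $start++) {
        if (strpos($delimiters, $identifier[$start]) !== false) {
            $atTokenStart = true; // the next non-delimiter begins a token
            continue;
        }
        if (!$atTokenStart) {
            continue; // only start a descent at the beginning of a token
        }
        $atTokenStart = false;
        $node = $root;
        for ($pos = $start; $pos < $length; $pos++) {
            $char = $identifier[$pos];
            if (!isset($node['children'][$char])) {
                break; // no branch for this character
            }
            $node = $node['children'][$char];
            $next = $pos + 1;
            $atEndOfToken = $next === $length
                || strpos($delimiters, $identifier[$next]) !== false;
            if ($atEndOfToken) {
                // Full-token match: collect whatever this node carries.
                foreach ($node['components'] as $component) {
                    $found[] = $component;
                }
            }
        }
    }
    return $found;
}

$root = ['children' => [], 'components' => []];
insertTokens($root, 'android', 'OS: Android');
insertTokens($root, 'windows nt', 'OS: Windows NT');
insertTokens($root, 'sm-g950f', 'Device: SM-G950F');

$found = seekComponentsSketch($root, 'mozilla/5.0 (linux; android 7.0; sm-g950f)');
// $found is ['OS: Android', 'Device: SM-G950F']
```

Keeping the whole walk in one function mirrors Note 2: in this sketch nothing is delegated to helper functions inside the inner loop.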
Parameters
addMatchCandidates()
Checks whether any of the match candidates pass their checks and, if so, adds their Components to cFound.
private
addMatchCandidates($matchResult, $candidates, $endTokenCharPosition, $tokenPosition, $atEndOfToken) : mixed
Parameters
- $matchResult : IdentifierMatchResult
  The collection to add found components to
- $candidates : array
  The possible candidates to add to cFound
- $endTokenCharPosition : int
- $tokenPosition : int
  The current token position from the identifier
- $atEndOfToken : boolean
  True if the Trie walk used the full token from the identifier. False otherwise.
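To illustrate how `$atEndOfToken` and `$tokenPosition` might gate a candidate, here is a small sketch. The candidate shape (`component`, `allowStartsWith`, `requiredTokenPosition`) and the function name are hypothetical, a plain array stands in for IdentifierMatchResult, and `$endTokenCharPosition` is left out because its role is not documented here.

```php
<?php
// Sketch only: the criteria names below are assumptions for illustration.
function addMatchCandidatesSketch(
    array &$found,
    array $candidates,
    int $tokenPosition,
    bool $atEndOfToken
): void {
    foreach ($candidates as $candidate) {
        // A partial (starts-with) match is only acceptable
        // when the candidate explicitly permits it.
        if (!$atEndOfToken && !$candidate['allowStartsWith']) {
            continue;
        }
        // Optional positional criterion: null means "any token position".
        if ($candidate['requiredTokenPosition'] !== null
            && $candidate['requiredTokenPosition'] !== $tokenPosition) {
            continue;
        }
        $found[] = $candidate['component'];
    }
}

$candidates = [
    ['component' => 'A', 'allowStartsWith' => false, 'requiredTokenPosition' => null],
    ['component' => 'B', 'allowStartsWith' => true,  'requiredTokenPosition' => null],
    ['component' => 'C', 'allowStartsWith' => true,  'requiredTokenPosition' => 0],
];

$partial = [];
addMatchCandidatesSketch($partial, $candidates, 2, false);
// $partial is ['B']: A needs a full token, C needs token position 0
```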
findNextCharactersMatch()
A Node can contain more than one character of the identifier, so this function tries to match the characters that follow. If a match is found, the matched character is returned along with the additional character position.
private
findNextCharactersMatch($node, $identifier, $character, $identifierCurrentPosition) : mixed
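One possible reading of the summary above is that a Node's children can be keyed by a sequence of several characters, and the function reports how many extra positions a match consumes. The sketch below follows that reading only. The node layout, the return shape, and the function name are assumptions, and the `$character` parameter of the real method is omitted because its role is not documented here.

```php
<?php
// Sketch only: assumes child keys may span several characters.
function findNextCharactersMatchSketch(array $node, string $identifier, int $position): ?array
{
    foreach (array_keys($node['children']) as $key) {
        $key = (string) $key; // PHP turns numeric string keys into ints
        $keyLength = strlen($key);
        // Only multi-character keys are of interest here.
        if ($keyLength > 1 && substr($identifier, $position, $keyLength) === $key) {
            return [
                'matched' => $key,
                // positions consumed beyond the current one
                'additionalPositions' => $keyLength - 1,
            ];
        }
    }
    return null; // no multi-character child matches at this position
}

$node = ['children' => ['nt' => [], 'x' => []]];
$match = findNextCharactersMatchSketch($node, 'windows nt 10', 8);
// $match is ['matched' => 'nt', 'additionalPositions' => 1]
```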
Parameters
getDirectChildNode()
private
getDirectChildNode(mixed $node) : mixed
Parameters
- $node : mixed
getMatchCandidates()
private
getMatchCandidates(mixed $node) : mixed
Parameters
- $node : mixed
getTokenStartNode()
private
getTokenStartNode(mixed $node) : mixed
Parameters
- $node : mixed
hasMatchCandidates()
private
hasMatchCandidates(mixed $node) : mixed
Parameters
- $node : mixed
hasTokenStartNode()
private
hasTokenStartNode(mixed $node) : mixed
Parameters
- $node : mixed
isSwAllowedAnyMatch()
private
isSwAllowedAnyMatch(mixed $node) : mixed
Parameters
- $node : mixed