The matching engine runs two parallel tracks — fuzzy name matching and semantic matching — and combines them into a single final score.

Track 1 — Fuzzy Name Matching

Before any comparison happens, names go through several preparation steps.

Cultural Affinity Detection

A machine learning model identifies the cultural origin of a name (Russian, Chinese, French, Arabic, etc.) and applies the appropriate matching logic. For example, “John Smith” and “Smith John” are treated as equivalent under Western naming conventions — that assumption does not apply to Chinese names.

Normalization

Names are stripped of special characters and converted to a standard format: “Timothée Dupont-Giguère” becomes “TIMOTHEE DUPONT GIGUERE”. Legal entity suffixes such as “Ltd.” or “N.P.L.” are also standardized.

Tagging and Weighting

Each part of a name is assigned a semantic role: FIRSTNAME, LASTNAME, MIDDLENAME, ABBREVIATION, and so on. A name like “Mugabe, R G” is parsed so that “Mugabe” is tagged as LASTNAME, “R” as both FIRSTNAME and ABBREVIATION, and “G” as MIDDLENAME/ABBREVIATION. These tags determine how much weight each part carries in the final score — surnames count more than middle names.

Candidate Selection and Scoring

The engine searches the watchlist for potential matches, tolerating a wide range of variations:
  • Inverted or doubled letters, missing letters, similar-sounding letters across languages
  • Split or merged words
  • Common aliases (Robert / Bob)
  • Patronyms, teknonyms, and abbreviations
  • Full transliterations from non-Latin scripts
Each name token is scored individually. Exact matches score 100; fuzzy matches score 75–99 depending on closeness. Missing or unexpected tokens apply downward penalties, and surname mismatches carry a heavier penalty than middle name mismatches.
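The token-level scoring above can be sketched in Python. This is a simplified illustration, not the engine’s actual implementation: the role weights, the similarity function, and the way missing tokens are penalized are all assumptions chosen to mirror the behavior described (exact = 100, fuzzy = 75–99, surnames weighted heaviest).

```python
from difflib import SequenceMatcher

# Illustrative per-role weights (assumption): surnames count more than middle names.
WEIGHTS = {"LASTNAME": 0.5, "FIRSTNAME": 0.35, "MIDDLENAME": 0.15}

def token_score(query: str, candidate: str) -> float:
    """Exact matches score 100; fuzzy matches fall in the 75-99 band."""
    if query == candidate:
        return 100.0
    ratio = SequenceMatcher(None, query, candidate).ratio()
    # Map raw string similarity into the 75-99 fuzzy band.
    return 75.0 + 24.0 * ratio

def name_score(query_tokens: dict, candidate_tokens: dict) -> float:
    """Weighted average of per-token scores, keyed by semantic role."""
    total = 0.0
    for role, weight in WEIGHTS.items():
        q, c = query_tokens.get(role), candidate_tokens.get(role)
        if q and c:
            total += weight * token_score(q, c)
        # A missing token contributes nothing, so a heavier-weighted role
        # (surname) costs more to miss than a middle name.
    return round(total, 1)

print(name_score(
    {"LASTNAME": "MUGABE", "FIRSTNAME": "ROBERT"},
    {"LASTNAME": "MUGABE", "FIRSTNAME": "ROBERT", "MIDDLENAME": "GABRIEL"},
))  # → 85.0: exact surname and first name, penalized for the missing middle name
```

Note how the weight table alone reproduces the asymmetry described above: dropping the surname would cost 50 points here, while dropping the middle name costs only 15.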

Track 2 — Semantic Matching

Running in parallel, this track covers everything that is not a name: date of birth (compared by year, month, and day with configurable tolerance), country, gender, and other identifiers. Each field is scored based on how reliable and relevant it is to the potential match.
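As one example of a semantic field, a date-of-birth comparison with a configurable year tolerance might look like the sketch below. The function name, point values, and default tolerance are illustrative assumptions, not the engine’s actual parameters; the point is that year, month, and day are compared separately.

```python
from datetime import date

def dob_score(query: date, record: date, year_tolerance: int = 1) -> int:
    """Score a date-of-birth field by comparing year, month, and day separately.

    Illustrative point values (assumption), not the engine's real configuration.
    """
    if query == record:
        return 100                      # exact match on all three components
    if abs(query.year - record.year) <= year_tolerance:
        score = 60                      # year within the configured tolerance
        if query.month == record.month:
            score += 20
        if query.day == record.day:
            score += 15
        return score
    return 0                            # outside tolerance: treat as a mismatch

print(dob_score(date(1980, 3, 14), date(1981, 3, 14)))  # → 95 (off-by-one year)
```

Widening `year_tolerance` is the kind of change that trades missed matches for extra false positives, which is why it is exposed as a setting rather than hard-coded.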

Final Stage — Metascore

Both tracks feed into the Metascore, which combines everything into a single number. The weights are configurable based on your data quality:
  • A reliable country match can boost the score by 15–25 points
  • A DOB mismatch applies roughly a 20-point penalty
  • A weak name match (e.g. 89) can be pushed over the threshold if DOB and country both align
  • A perfect name match (100) can be suppressed to ~82 if other data points contradict the watchlist record
This bidirectional adjustment — boosting weak name matches and suppressing contradicted ones — keeps false positives low without sacrificing real hits. The final threshold for what counts as a match is configurable per request.
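The bullet points above can be sketched as a single combining function. The default boost and penalty values, the field names, and the clamping behavior are assumptions chosen to match the ranges quoted above; the real Metascore weights are configurable per deployment.

```python
from typing import Optional

def metascore(name_score: float, country_match: bool, dob_match: Optional[bool],
              country_boost: float = 20.0, dob_penalty: float = 20.0) -> float:
    """Combine the fuzzy name score with secondary fields into one number.

    dob_match is None when the watchlist record carries no DOB, in which
    case that field neither boosts nor penalizes the score.
    """
    score = name_score
    if country_match:
        score += country_boost          # reliable country match: +15-25 points
    if dob_match is False:
        score -= dob_penalty            # DOB mismatch: roughly a 20-point penalty
    return max(0.0, min(100.0, score))  # clamp to the 0-100 range

# A weak name match (89) pushed up by corroborating country and DOB:
print(metascore(89, country_match=True, dob_match=True))     # → 100.0
# A perfect name match suppressed by a contradicting DOB:
print(metascore(100, country_match=False, dob_match=False))  # → 80.0
```

Treating an absent DOB as neutral (rather than as a mismatch) is the design choice that keeps sparse watchlist records from being unfairly penalized.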
If you are seeing an unexpected number of false positives or missed matches, the match threshold and birth-year tolerance are the first settings to review. See AML Key Terms and Concepts for a full breakdown of available filters.