Track 1 — Fuzzy Name Matching
Before any comparison happens, names go through several preparation steps.
Cultural Affinity Detection
A machine learning model identifies the cultural origin of a name (Russian, Chinese, French, Arabic, etc.) and applies the appropriate matching logic. For example, “John Smith” and “Smith John” are treated as equivalent under Western naming conventions; that assumption does not apply to Chinese names.
Normalization
Names are stripped of diacritics and special characters and converted to a standard uppercase format. “Timothée Dupont-Giguère” becomes “TIMOTHEE DUPONT GIGUERE”. Legal entity suffixes such as “Ltd.” or “N.P.L.” are also standardized.
Tagging and Weighting
Each part of a name is assigned a semantic role: FIRSTNAME, LASTNAME, MIDDLENAME, ABBREVIATION, and so on. A name like “Mugabe, R G” is parsed so that “Mugabe” is tagged as LASTNAME, “R” as both FIRSTNAME and ABBREVIATION, and “G” as both MIDDLENAME and ABBREVIATION. These tags determine how much weight each part carries in the final score: surnames count more than middle names.
Candidate Selection and Scoring
The engine searches the watchlist for potential matches, tolerating a wide range of variations:
- Inverted or doubled letters, missing letters, similar-sounding letters across languages
- Split or merged words
- Common aliases (Robert / Bob)
- Patronyms, teknonyms, and abbreviations
- Full transliterations from non-Latin scripts
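The normalization, weighting, and tolerant comparison steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the engine's actual algorithm: the alias table, the role weights, the rule for single-letter abbreviations, and the use of the standard library's `difflib.SequenceMatcher` as the similarity measure are all hypothetical stand-ins. The sketch also assumes that tagging has already happened, so each name arrives as a mapping from role to text.

```python
import re
import unicodedata
from difflib import SequenceMatcher

# Hypothetical alias table; a real one would be far larger.
ALIASES = {"BOB": "ROBERT", "BILL": "WILLIAM"}

# Hypothetical role weights: surnames count more than middle names.
ROLE_WEIGHTS = {"LASTNAME": 0.6, "FIRSTNAME": 0.3, "MIDDLENAME": 0.1}


def normalize(name: str) -> str:
    """Strip diacritics and special characters, collapse spaces, uppercase."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = re.sub(r"[^A-Za-z0-9]+", " ", no_accents)
    return " ".join(cleaned.split()).upper()


def token_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two normalized name parts."""
    a, b = ALIASES.get(a, a), ALIASES.get(b, b)
    # Assumption: a single-letter abbreviation matches any part
    # that starts with the same letter.
    if len(a) == 1 or len(b) == 1:
        return 1.0 if a[:1] == b[:1] else 0.0
    # Stand-in for the engine's fuzzy comparison: tolerates missing,
    # doubled, or swapped letters to some degree.
    return SequenceMatcher(None, a, b).ratio()


def name_score(query: dict[str, str], candidate: dict[str, str]) -> float:
    """Weighted 0-100 score over the roles present in both names."""
    total = 0.0
    weight_sum = 0.0
    for role, weight in ROLE_WEIGHTS.items():
        if role in query and role in candidate:
            sim = token_similarity(normalize(query[role]), normalize(candidate[role]))
            total += weight * sim
            weight_sum += weight
    return round(100 * total / weight_sum, 1) if weight_sum else 0.0


print(normalize("Timothée Dupont-Giguère"))  # TIMOTHEE DUPONT GIGUERE
print(name_score({"FIRSTNAME": "Bob", "LASTNAME": "Mugaeb"},
                 {"FIRSTNAME": "Robert", "LASTNAME": "Mugabe"}))
```

In the second call the alias lookup makes the first names identical, while the swapped letters in the surname lower the score without dropping it to zero. Because the surname carries the largest weight, an error there costs more than the same error in a middle name would.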
Track 2 — Semantic Matching
Running in parallel, this track covers everything that is not a name: date of birth (compared by year, month, and day with configurable tolerance), country, gender, and other identifiers. Each field is scored based on how reliable and relevant it is to the potential match.
Final Stage — Metascore
Both tracks feed into the Metascore, which combines everything into a single number. The weights are configurable based on your data quality:
- A reliable country match can boost the score by 15–25 points
- A DOB mismatch applies roughly a 20-point penalty
- A weak name match (e.g. 89) can be pushed over the threshold if DOB and country both align
- A perfect name match (100) can be suppressed to ~82 if other data points contradict the watchlist record
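To make the interaction between the two tracks concrete, here is a minimal Python sketch of how a name score and semantic field results might be combined. The text does not give the actual Metascore formula, so the additive model, the clamp to the 0–100 range, the `MetascoreConfig` fields, the threshold of 90, and the year-tolerance rule in `dob_matches` are all assumptions made for illustration. Only the boost range and the approximate DOB penalty come from the figures above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class MetascoreConfig:
    country_boost: float = 20.0  # assumed value inside the 15-25 range quoted above
    dob_penalty: float = 20.0    # "roughly a 20-point penalty"
    threshold: float = 90.0      # hypothetical alert threshold


def dob_matches(a: date, b: date, year_tolerance: int = 0) -> bool:
    """Hypothetical tolerance rule: month and day must agree,
    the year may differ by up to year_tolerance."""
    return abs(a.year - b.year) <= year_tolerance and (a.month, a.day) == (b.month, b.day)


def metascore(
    name_score: float,
    dob_match: Optional[bool] = None,      # None means the field is missing
    country_match: Optional[bool] = None,
    cfg: MetascoreConfig = MetascoreConfig(),
) -> float:
    """Assumed additive combination, clamped to the 0-100 range."""
    score = name_score
    if country_match is True:
        score += cfg.country_boost
    if dob_match is False:
        score -= cfg.dob_penalty
    return max(0.0, min(100.0, score))


cfg = MetascoreConfig()

# A weak name match with aligned DOB and country clears the threshold.
dob_ok = dob_matches(date(1980, 5, 1), date(1980, 5, 1))
print(metascore(89, dob_match=dob_ok, country_match=True) >= cfg.threshold)

# A perfect name match is pulled down when the DOB contradicts the record.
print(metascore(100, dob_match=False) >= cfg.threshold)
```

Treating a missing field as `None` rather than as a mismatch keeps incomplete records from being penalized, which is one way of letting data quality drive the weights.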