March 10, 2026

Unstructured to Structured Postal Address Transformation for CBPR+ SR 2026

The Challenge

Transforming a free-text address like "123 Calle Mayor, 28001 Madrid, Spain" into structured fields seems straightforward. But real-world banking data is messy:

  • Disordered components: "SPAIN MADRID CALLE MAYOR 123 28001" — country and city at the start, not the end.
  • Country names in streets: "15 Rue de France, 06000 Nice" — France is part of the street name, not the country.
  • Ambiguous city names: "Portland" exists in both the US and UK. "Valencia" exists in Spain and Venezuela.
  • City-states: "Singapore" is both a country and a city.
  • Missing separators: "JOHN DOE 123 HIGH STREET LONDON EC1A 1BB" — no commas, no clear boundaries.

Multi-Pass Resolution

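The key insight is that address parsing can't be done in a single left-to-right pass: whether a token is a country, a city, or part of a street name depends on the tokens around it. A multi-pass approach works better: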

Pass 1 — Authoritative Country: Check IBAN prefix, agent BIC, clearing system codes. These are highly reliable.
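
As a rough illustration of Pass 1, the sketch below reads a country code from an IBAN prefix and from characters 5-6 of a BIC. The function names and the IBAN-before-BIC precedence are assumptions made for this example, not a description of any particular engine. It only checks the general shape of each identifier, with no IBAN checksum and no lookup against the ISO 3166 list, and it leaves clearing system codes out entirely.

```python
import re
from typing import Optional

# Shape checks only: two letters + two check digits for an IBAN,
# 4 + 2 + 2 (+ optional 3) characters for a BIC.
IBAN_RE = re.compile(r"^[A-Z]{2}\d{2}[A-Z0-9]+$")
BIC_RE = re.compile(r"^[A-Z0-9]{4}[A-Z]{2}[A-Z0-9]{2}([A-Z0-9]{3})?$")


def country_from_iban(iban: str) -> Optional[str]:
    """Return the two-letter prefix of an IBAN, or None if the shape is wrong."""
    compact = iban.replace(" ", "").upper()
    return compact[:2] if IBAN_RE.match(compact) else None


def country_from_bic(bic: str) -> Optional[str]:
    """Return characters 5-6 of an 8- or 11-character BIC, or None."""
    compact = bic.strip().upper()
    return compact[4:6] if BIC_RE.match(compact) else None


def authoritative_country(iban: Optional[str] = None,
                          bic: Optional[str] = None) -> Optional[str]:
    """Try the IBAN first, then the agent BIC (an assumed precedence)."""
    if iban:
        code = country_from_iban(iban)
        if code:
            return code
    if bic:
        return country_from_bic(bic)
    return None
```

For example, `authoritative_country(iban="ES91 2100 0418 4502 0005 1332")` returns `"ES"`, and `authoritative_country(bic="DEUTDEFF")` returns `"DE"`.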

Pass 2 — Text Scan: Extract ALL candidates from the text simultaneously — country names, city names, postcodes, post boxes.
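
A minimal sketch of such a scan might look like the following. The two lookup tables are toy data invented for the example, and the postcode pattern only covers five-digit codes; a real engine would need full reference data and per-country postcode rules. Note that the scan deliberately reports every match, including false ones, and leaves the judgment to the next pass.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical toy gazetteers, for illustration only.
COUNTRIES = {"Spain": "ES", "France": "FR"}
CITIES = ["Madrid", "Nice"]
POSTCODE_RE = re.compile(r"\b\d{5}\b")  # five-digit postcodes only


@dataclass(frozen=True)
class Candidate:
    kind: str    # "country", "city" or "postcode"
    value: str   # ISO code, city name, or postcode text
    start: int   # offsets into the original text
    end: int


def scan(text: str) -> List[Candidate]:
    """Collect every country, city and postcode candidate, ordered by position."""
    found: List[Candidate] = []
    for name, code in COUNTRIES.items():
        for m in re.finditer(rf"\b{re.escape(name)}\b", text, re.IGNORECASE):
            found.append(Candidate("country", code, m.start(), m.end()))
    for name in CITIES:
        for m in re.finditer(rf"\b{re.escape(name)}\b", text, re.IGNORECASE):
            found.append(Candidate("city", name, m.start(), m.end()))
    for m in POSTCODE_RE.finditer(text):
        found.append(Candidate("postcode", m.group(), m.start(), m.end()))
    return sorted(found, key=lambda c: c.start)
```

Running `scan("15 Rue de France, 06000 Nice")` yields three candidates: a country `FR` (from the street name), the postcode `06000`, and the city `Nice`. The country candidate here is the misleading one, which is exactly why the candidates are not trusted individually.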

Pass 3 — Cross-Check: Use the candidates to confirm each other. If "Nice" (a city in France) is found alongside "FR" (a country code), they confirm each other. If "France" appears in "Rue de France" but "FR" is also present as an isolated code, the isolated code wins.
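
One way to express these rules in code is sketched below. The city-to-country table is toy data, and the tie-breaking order (city-confirmed candidates first, then isolated codes ahead of spelled-out names) is an assumption chosen to match the examples above rather than a documented algorithm.

```python
from typing import List, Optional, Tuple

# Hypothetical toy lookup from city name to ISO country code.
CITY_COUNTRY = {"NICE": "FR", "MADRID": "ES", "BRUSSELS": "BE"}


def cross_check(isolated_codes: List[str],
                name_codes: List[str],
                cities: List[str]) -> Tuple[Optional[str], bool]:
    """Pick a country and report whether a city confirmed it.

    isolated_codes: ISO codes found as standalone tokens, e.g. ["FR"]
    name_codes:     ISO codes derived from country names found in the text
    cities:         city names found in the text
    """
    city_codes = {CITY_COUNTRY[c.upper()] for c in cities
                  if c.upper() in CITY_COUNTRY}
    # Isolated codes rank ahead of names, which may be part of a street name.
    ranked = list(isolated_codes) + list(name_codes)
    for code in ranked:
        if code in city_codes:
            return code, True       # confirmed by a city
    if ranked:
        return ranked[0], False     # best unconfirmed candidate
    if len(city_codes) == 1:
        return next(iter(city_codes)), False  # city alone, unambiguous
    return None, False
```

With "15 Rue de France, Brussels BE", the inputs would be `cross_check(["BE"], ["FR"], ["Brussels"])`, which returns `("BE", True)`: the isolated code wins over the name embedded in the street, and the city confirms it.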

Pass 4 — Assign Components: Build the residual string (everything not claimed by country, city, or postcode) and assign it to StrtNm and BldgNm.
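
The residual step can be sketched as follows, assuming earlier passes supply the character spans they have claimed. Splitting a leading or trailing house number into BldgNb is an extra heuristic added for this example; deciding what belongs in BldgNm would need more context than this sketch has.

```python
import re
from typing import Dict, List, Tuple


def residual(text: str, claimed: List[Tuple[int, int]]) -> str:
    """Drop the claimed (start, end) spans and tidy leftover separators."""
    parts, cursor = [], 0
    for start, end in sorted(claimed):
        parts.append(text[cursor:start])  # empty if spans overlap
        cursor = max(cursor, end)
    parts.append(text[cursor:])
    # Collapse runs of commas and whitespace left behind by removed spans.
    return re.sub(r"[,\s]+", " ", " ".join(parts)).strip()


def split_street(res: str) -> Dict[str, str]:
    """Heuristic (assumed): a leading or trailing number is the building number."""
    leading = re.match(r"^(\d+[A-Za-z]?)\s+(.+)$", res)
    if leading:
        return {"BldgNb": leading.group(1), "StrtNm": leading.group(2)}
    trailing = re.match(r"^(.+?)\s+(\d+[A-Za-z]?)$", res)
    if trailing:
        return {"BldgNb": trailing.group(2), "StrtNm": trailing.group(1)}
    return {"StrtNm": res}
```

For "123 Calle Mayor, 28001 Madrid, Spain", once the postcode, city and country spans are removed, the residual is "123 Calle Mayor", which the heuristic splits into BldgNb "123" and StrtNm "Calle Mayor".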

Confidence Scoring

Not every transformation is equally reliable. A confidence score (0-100) determines the outcome:

  • High confidence (≥70): Auto-transform. The result is used without human review.
  • Low confidence (<70): Route to a review queue where an operator can verify and correct.

The scoring model considers the source quality (IBAN is more reliable than text extraction), cross-validation between components, and the number of resolved fields.
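
To make the three inputs concrete, here is a small sketch of a scoring function. Only the 0-100 range and the threshold of 70 come from the text above; the point values and the `Evidence` structure are invented for illustration and would need calibrating against real data.

```python
from dataclasses import dataclass

AUTO_TRANSFORM_THRESHOLD = 70  # from the article

# Hypothetical weights: a structured source scores higher than text extraction.
SOURCE_POINTS = {"iban": 40, "bic": 35, "text": 20}
CROSS_VALIDATION_POINTS = 30
POINTS_PER_FIELD = 10
MAX_FIELDS_COUNTED = 3


@dataclass
class Evidence:
    country_source: str    # "iban", "bic" or "text"
    cross_validated: bool  # did components confirm each other?
    resolved_fields: int   # how many structured fields were populated


def confidence(e: Evidence) -> int:
    """Combine the three signals into a score between 0 and 100."""
    score = SOURCE_POINTS.get(e.country_source, 0)
    if e.cross_validated:
        score += CROSS_VALIDATION_POINTS
    score += min(max(e.resolved_fields, 0), MAX_FIELDS_COUNTED) * POINTS_PER_FIELD
    return min(score, 100)


def route(score: int) -> str:
    """Apply the threshold: auto-transform at 70 or above, otherwise review."""
    return "auto_transform" if score >= AUTO_TRANSFORM_THRESHOLD else "review_queue"
```

Under these assumed weights, an IBAN-sourced country with cross-validation and three resolved fields scores 100 and is auto-transformed, while a text-only country with two resolved fields and no cross-validation scores 40 and goes to the review queue.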

PostalAddress24 Output

The ISO 20022 PostalAddress24 type has strict field lengths:

Field      Max Length      Description
StrtNm     70 chars        Street name
BldgNb     16 chars        Building number
BldgNm     35 chars        Building name
PstCd      16 chars        Post code
TwnNm      35 chars        Town name
Ctry       2 chars         ISO 3166-1 alpha-2
AdrLine    70 chars × 2    Free text (max 2 lines)

The transformation engine must respect these limits while preserving as much address information as possible.
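
One possible policy, sketched below, is to cut an over-long field at a word boundary and carry the remainder into AdrLine, so the information is kept rather than dropped. The limits come from the table above; the overflow policy itself, the function names, and the decision to return any text that still does not fit are assumptions for this example.

```python
from typing import Dict, List, Tuple, Union

# Limits from the table above. Ctry is passed through unchanged here,
# since a country code cannot be meaningfully truncated.
MAX_LEN = {"StrtNm": 70, "BldgNb": 16, "BldgNm": 35, "PstCd": 16, "TwnNm": 35}
ADR_LINE_LEN, ADR_LINE_MAX = 70, 2


def cut_at_word(value: str, limit: int) -> Tuple[str, str]:
    """Split into (head, tail) with len(head) <= limit, preferring a space."""
    if len(value) <= limit:
        return value, ""
    cut = value.rfind(" ", 0, limit + 1)
    if cut <= 0:
        cut = limit  # no usable space, so fall back to a hard cut
    return value[:cut].rstrip(), value[cut:].lstrip()


def fit(fields: Dict[str, str]) -> Tuple[Dict[str, Union[str, List[str]]], str]:
    """Return (fitted fields, text that still did not fit anywhere)."""
    out: Dict[str, Union[str, List[str]]] = {}
    overflow: List[str] = []
    for name, value in fields.items():
        if name not in MAX_LEN:
            out[name] = value
            continue
        head, tail = cut_at_word(value.strip(), MAX_LEN[name])
        out[name] = head
        if tail:
            overflow.append(tail)
    # Pack the overflow into at most two AdrLine entries.
    lines: List[str] = []
    rest = " ".join(overflow)
    while rest and len(lines) < ADR_LINE_MAX:
        head, rest = cut_at_word(rest, ADR_LINE_LEN)
        lines.append(head)
    if lines:
        out["AdrLine"] = lines
    return out, rest
```

A non-empty second return value means some text could not be placed within the limits; in a setup like the one described earlier, that would be a natural reason to send the address to the review queue.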
