Informatica MDM Load and Match Process – Tutorial

Load Process:

Load process is a two-step process:

  • Apply Updates.
  • Apply Inserts.

Updates

  • Load job applies updates for existing records whose.

LAST_UPDATE_DATE (Staging table) > SRC_LUD (XREF table).

  • The update process always updates the XREF table record.
  • The update process may update the Base Object depending on trust:
    • For columns not flagged for trust, update happens if incoming data has new LUD.
    • For columns flagged for trust, load job compares trust weightings of staging table data to trust weightings of existing data in base object to determine what can be updated.
  • If history flag is switched on for the Base Object, then the update process writes to the history tables of Base Object and XREF.

Inserts

  • Load job applies inserts for records that do not exist in the XREF table.
  • ROWID_OBJECT values are generated for the new records.
  • New records are inserted into a base object and XREF with CONSOLIDATION_IND = 4.
  • If history flag is switched on for the Base Object, then the insert process writes to the history tables of Base Object and XREF.

Rejects

  • Referential Integrity is maintained among base objects in the consolidated data model.
  • Rejects will occur in the load process if any records violate the RI constraint.
    • Parent records do not exist.
    • Child records are loaded before the parent records.
    • Lookup has been defined incorrectly.
  • Rejected records are inserted in the reject table of the Staging table.

staging_table_name_REJ

Match Process:

Objectives:

Following are the objectives of this topic:

  • Match & Merge Overview.
  • Match Rules Configuration.
  • Exact Match/Search Strategy.
  • Fuzzy Match/Search Strategy.
  • Match Server Architecture.

Match & Search Strategy:

Match Process:

Match & Merge Overview:

Challenges with identifying duplicate records.

  • Misspellings, typing, and transcription errors.
  • Nicknames.
  • Synonyms.
  • Abbreviations.
  • Foreign and Anglicized words.
  • Prefix and suffix abbreviations.
  • Concatenation or splitting of words.
  • Noise words and punctuation.
  • Casing and character set variations.
  • To merge or link records, MRM needs to know which records are likely duplicates of each other.
  • Match rules tell MRM how to identify likely duplicates.
  • Match rules also tell MRM if two matching records are similar enough to automatically merge/link, or if they should be reviewed by a data steward.

Data Consolidation Options.

  • Merging (merge-style base objects) physically combines the matched records in the base object. Makes the most-current best version of the truth (BVT) available.
  • Linking (link-style base objects) quickly determines the BVT without physically combining the records. Provides much faster overall throughput.

Match/Search Strategy:

Exact

  • Does not allow for any variations in the data in the match columns.
  • Very simple match process, therefore fast.

Fuzzy

  • Allows for variations in spelling, formats, word order, nicknames, synonyms, etc.
  • More complex match process, therefore slower.

High-level process flow for the match process:

Match Path

  • A Match Path represents the base object which will provide data for matching purpose.
  • Traverse the hierarchy between records across multiple base objects or within a single base object.
  • Foreign Key Relationships between tables are used to traverse the relationships.
  • Parent-to-child or child-to-parent relationships can be specified.

Match Path – Check for Missing Children

  • By default, MDM does an inner join between the base objects defined in the Match Path.
  • The join, therefore, excludes rows that don’t have corresponding rows in the joined tables.
  • To include those records, switch on “Check for Missing Children” – MDM will then do an outer join instead of an inner join.

Get Free Informatica MDM Materials
Match Path – Inter Table

Match Path – Intra Table

Match Column:

  • A match column contains an identifying characteristic of the base object to be consolidated
  • Each base object can have multiple match columns
  • Examples:

1. Full Name.

2. Generation

3. Address

4. Phone

5. Email

  • Provider column(s) is the base object columns that provide the data for the match column:
  • Can be a single column or a concatenation of columns
  • Must be a VARCHAR / CHAR column to concatenate
  • Date column is also supported for matching

Each match column is based on one or more columns

  • From base object.
  • Or from X-ref (in some cases).
  • Or from child base object (in some cases).

Exact Match/Search Strategy

Steps for defining Exact Match Rules

  • Select Match/Search Strategy = Exact
  • Define Match Path
  • Define Match Columns
  • Create at least one Match Rule Set
  • Create Match Rules for Match Rule Set(s)

Match Columns

  • A match column contains an identifying characteristic of the base object record to be consolidated
  • Exact Match Columns:
    • Does not make allowance for any variations in data content
    • Records will match if they have identical values in the match columns used in match rules

Match Rule & Match Rule Set

Match Rules are grouped into Match Rule Sets.

  • Can define multiple rule sets.
  • Only one match rule set can be active at any point in time.

Match rule defines the combination of columns that constitute a match.

Match Rule – Auto property

  • Match rules are flagged either for auto merge/auto link or for manual merge/link.
  • Matches resulting from auto merge/auto link rules will result in the records being automatically merged/linked to the system when the auto merge/auto link batch job runs.
  • Matches resulting from manual merge/link rules will be queued for review by a data steward.

Match Rule & Match Rule Set

  • Every checked Match Column adds an “And” condition.
  • Every new Rule is an “Or” condition.
  • Net result of match is that the same 2 records will only match once, on the first match rule that they match on.

Match Rule – Null Matching

  • By default, NULL is not regarded as being the same as NULL.
  • NULL Matches NULL: Use this flag to specify the match columns in a match rule that should be regarded as matches even if the 2 values being compared are both NULL.
  • NULL Matches non-NULL: Use this flag to specify the match columns in a match rule that should be regarded as matches when one of the values being compared is NULL and the other is not.

Match Rule – Non-Equal Matching

  • Specifies that 2 records are a match if they do not have the same values in the non-equal match column.
  • Reverses whatever would/would NOT have matched without Non-equal match.
  • If using non-equal match, then MUST switch on Validate Matches property in Base Object Advanced Properties.

Match Rule – Segment Matching

  • Allows a match rule to be limited to a specific subset of data.
  • Different match rules can use different segment values.

Match Rule

Fuzzy Match/Search Strategy:

Steps for defining Fuzzy Match Rules

  • Select Match/Search Strategy = Fuzzy
  • Choose a Population
  • Define Match Path
  • Define Match Key
  • Define Match Columns
  • Create at least one Match Rule Set & choose Search Level
  • Create Match Rules for Match Rule Set(s)

Population

  • Population is intended to addresses the name distribution problem.
  • Common family names in each population skew the data and query performance.

E.g. Smith, Williams in English-speaking populations.

  • Each population also has a large number of uncommon names that tend to have the most error and variability.
  • Match needs to account for both of these situations in the way that the keys are built, to give optimal search performance for both.
  • Defines how to identify matches within a particular population and language.
  • Defines how to build keys and perform searches on name and address.
  • Supports a specific set of match purposes.

Match Key

  • Match key is used to search for match candidates.
  • It is a fixed-length, compressed, and encoded value.
  • Built from a combination of the words and numbers in a name or address.
  • For one name or address, multiple SSA match keys are generated.
  • Match Key Properties:
    • Key Type.
    • Key Width.
    • Path Component.
    • Match Column Contents.

Match Key – Key Type

  • The match key type describes important characteristics about a column to MDM Hub.
  • Should be based on the main identifying data in your base object.
  • For standard population, the options are:

Match Key – Key Width

  • Determines the degree of variance that will be supported in the key values.
  • Represents tradeoff between match precision and the space used by match key records.

Match Key – Path Component

  • Contains the column that forms the basis for defining the Match Key.
  • Can be any table defined in the Match Path.

Match Key – Match Column Contents

  • The column(s) from Path Component that provides data to the Match Key.

Match Key – Example

  • Key Type = Organization_Name;
  • Key Width = Standard;
  • Path Component = Customer
  • Match Column Contents = Full_Name

Match Key:

Match Column

  • A match column contains an identifying characteristic of the base object record to be consolidated.
  • Can be a fuzzy column or an exact column.
  • Fuzzy Match Column.
    • The column name you choose defines the type of data that the match expects that column to contain.
    • Examples: Person Name, Address Part 1, Address Part 2, etc.
  • Exact Match Column.
    • Acts as a filter in the match.
    • Can have additional properties when used in match rules like Null match, Non-equal match, and segment match.

Match Column:

Match Rule Set

  • They are logical grouping of Match Rules that collectively act on a base object for identifying duplicates.
  • Multiple rule sets can be defined for a base object.
  • Only one rule set can be active at any point in time.
  • Each rule set has a Search Level and can comprise of one or more Match Rules.

Match Rule Set – Search Level

Determines how many match candidates are returned in the search phase of match process.

Match Rule Set – Search Level Examples

Key Type = Organization_Name; Key Width = Standard;

Record to be Matched = “ELIZABETH S O’BRIAN”

Match Rule Set:

Match Rule

  • Determines what constitutes a match during match process
  • Fuzzy Match Rule Properties:
    • Match Purpose
    • Match Level
    • Accept Limit Adjustment

Match Rule – Match Purpose

  • Determines the fields that will be used in the match
    • Different fields are required fields for different purposes
    • There are also optional fields for each purpose that can help improve the match
  • Determines the importance accorded to each field

Match Rule – Match Level

  • Determines how precise the match is i.e. how similar a candidate record is to the queued record to be considered a match
  • Supported match levels are:
    • Conservative:    Tight Matching
    • Typical: Appropriate for most matches
    • Loose:   Allows more variance in the values being matched

Match Rule – Accept Limit Adjustment

  • Determines the acceptability of a match for the specified match level.
  • The Accept Limit Adjustment allows a coarse adjustment to what is considered to be a match for this match rule:
    • A positive adjustment results in tighter matching.
    • A negative adjustment results in looser matching.

Match Rule:

Match Rule – Syntax Used in Rule Description

Match Server Architecture

Match server is multi-threaded

  • Can configure how many threads MDM Hub will create for matching.
  • If not configured, 4 threads will be created regardless of the number of CPUs on the machine.

Multiple match servers can be configured

  • Allows match jobs to be run in parallel. A single match job is not load balanced across multiple match servers.
  • MDM Hub will assign match jobs to available match servers on a round robin basis

Merge Process:

Objectives

Following are the objectives of this topic:

  • Configure Merge Settings
  • Describe Immutable Source Systems
  • Describe Distinct Systems
  • Describe the Un-Merge Process

Merge Process

Merge

  • Consolidation process of two matched records in the Base Object
  • Merge can be Auto-Merge or Manual-Merge depending on the degree of matching.

Immutable Source Systems

  • An immutable source means that the source system is seen as a distinct source
  • All records coming from this source always have a consolidation indicator of 1
  • If two immutable records must be merged, then a data steward needs to perform a manual verification in order to allow that change. The  data steward will have to choose the key that remains

Distinct Systems

  • Records from source marked as Distinct will not merge amongst themselves

Un-Merge Process

  • By default, unmerging parent records does not unmerge associated child records
  • Unmerge Child When Parent Unmerges option allows you to specify what happens if records in the parent base object are unmerged
  • Pre-Requisites for enabling this option are:
    • The parent-child relationship must already be configured in the child base object.
    • The foreign key column in the child base object must be a match-enabled column.

From this blog you have gone through Match process and to have knowledge in Informatica MDM Batch Viewer.

Enroll for Informatica MDM Training Course

Nitesh

Nitesh

Author

Bonjour. A curious dreamer enchanted by various languages, I write towards making technology seem fun here at Asha24.