logo for information-management-architect.com
leftimage for information-management-architect.com

Database Mapping

Need to evolve database mapping requirements for an information management requirements specification and want practical timesaving suggestions?

Data mapping is a key component of an information managementDatabase Mapping requirements specification, which specifies business transformation rules that should be applied to source data.

What is the data requirements analysis process?

The following summarizes the key steps:
  • The data modeler gains understanding of the data requirements and documents this in the form of a logical data model, which confirms the required structure and design;
  • The data analyst gains understanding of each of the source systems that contain the required information and documents this in the form of a data movement model. They also document interface requirements by identifying any existing or new interfaces needed to support the data requirements;
  • The data analyst examines data quality and prepares a data profile documenting key quality issues that will need to be handled; and
  • The data-mapping document finalizes the process.
What is a database mapping document?

A data mapping document is frequently called a source to target map and is sometimes completed in a spreadsheet or in an automated mapping tool. For each source system and interface, it specifies:
  • Source input definition;
  • Target/output details;
  • Business and data transformation rules;
  • Absolute data quality requirements; and
  • Optional data quality requirements.
The mapping document is completed in the requirements analysis data
movement analysis phase. It is a living document that is used as input to the architecture and design phase. It is intended to capture business rules, data flow mapping and data movement requirements.

The following provides more details on the contents of a database mapping document:

Source input definition specifies:
  • The requirement id that is the basis for the mapping (This is to help ensure that we are not mapping data that is not required);
  • The interface id, or input file number, that is the basis for the mapping (This is a direct reference to the interface id on the data movement model and helps ensure that we have not missed any interfaces);
  • Source input file name, which identifies the interface name;
  • Input field name, which identifies the interface column name or field name;
  • Data type and field length, which is  the technical description of the column or field name e.g. Numeric(4), Char(25), Varchar(16) Date; and
  • Field format, which is required for date and numeric fields e.g. ddmmyyyy, dd/mm/yy, mon-yy, 9999.99, 99999.
Target output specifies:
  • The target entity number (each entity on the logical data model should be identified with a unique number);
  • Entity name;
  • Attribute name;
  • Attribute definition, this should be the same definition that is documented in the logical data model and should also be included in the metadata repository;
  • Attribute data type;
  • Attribute format, which is required for date and numeric fields e.g. ddmmyyyy, dd/mm/yy, mon-yy, 9999.99, 99999; and
  • Nulls allowed, indicates if null values are allowed in this attribute.
Transforms specifies:
  • Mapping transformations or business rules e.g.  "straight mapping", "sysdate + 1", "if source value is null then set to 7", "If value is 999999999 populate field with 01/01/2000";
  • Track changes thru history, specifies how changes to this attribute should be handled e.g. should the data be overwritten, a new record be created, or a current record value be updated?
  • Referential integrity, identifies referential integrity considerations e.g. key value must exist in key column of table xxx, do not populate if column s is null.
Absolute data quality requirements

This specifies data quality requirements that must be met including:
  • Upper value limit, specifies the maximum allowable value for the target;
  • Lower value limit, specifies the minimum allowable value for the target; and
  • Valid values list, specifies a valid list of values for the target.
Optional data quality requirements

This specifies optional data quality requirements that should be met including:
  • Valid value look-up, e.g. if a lookup file or table is used to validate the values for this target, specify the source of this lookup.  Values not contained in this file or table will be rejected;
  • Default value, if nulls are not allowed and source field is null what default value should be substituted to prevent this record from rejecting?
  • Comments/questions—is used during data mapping to record questions/concerns. These should all be addressed before the requirements specification is complete

Sounds like a lot of detail

Yes it is. Information management projects are data centric and most of the requirements are data related—It is very important  that these requirements be captured early in the project to ensure no misunderstanding.

Summary…

Information management projects are data centric projects. Database mapping is a key component of requirements analysis and needs to be completed by an experienced data analyst early in the project.




footer for Information management page