Database Mapping
Need
to evolve database mapping requirements for an information management
requirements specification and want practical timesaving suggestions?
Data mapping is a key component of an information management
requirements specification,
which specifies business transformation rules that should be applied to
source data.
What is the
data requirements analysis process?
The following summarizes the key steps:
- The
data modeler gains understanding of the data requirements and documents
this in the form of a logical data model, which confirms the required
structure and design;
- The data analyst gains understanding of
each of the source systems that contain the required information and
documents this in the form of a data movement model. They also document
interface requirements by identifying any existing or new interfaces
needed to support the data requirements;
- The data analyst
examines data quality and prepares a data profile documenting key
quality issues that will need to be handled; and
- The data-mapping document finalizes the process.
What is
a database mapping document?
A
data mapping document is frequently called a source to target map and
is sometimes completed in a spreadsheet or in an automated mapping
tool. For each source system and interface, it specifies:
- Source input definition;
- Target/output details;
- Business and data transformation rules;
- Absolute data quality requirements; and
- Optional data quality requirements.
The
mapping document is completed in the requirements analysis
data
movement analysis phase. It is a living document that is used as input
to the architecture and
design phase. It is intended to capture business rules, data flow
mapping and data movement requirements.
The following provides more details on the contents of a database
mapping
document:
Source input
definition specifies:
- The requirement id that is the basis for the
mapping (This is to help ensure that we are not mapping data that is
not required);
- The
interface id, or input file number, that is the basis for the mapping
(This is a direct reference to the interface id on the data movement
model and helps ensure that we have not missed any interfaces);
- Source input file name, which identifies the
interface name;
- Input field name, which identifies the
interface column name or field name;
- Data
type and field length, which is the technical description of
the
column or field name e.g. Numeric(4), Char(25), Varchar(16) Date; and
- Field format, which is required for date and
numeric fields e.g. ddmmyyyy, dd/mm/yy, mon-yy, 9999.99, 99999.
Target
output
specifies:
- The target entity number (each entity on the
logical data model should be identified with a unique number);
- Entity name;
- Attribute name;
- Attribute
definition, this should be the same definition that is documented in
the logical data model and should also be included in the metadata
repository;
- Attribute data type;
- Attribute format, which is required for date
and numeric fields e.g. ddmmyyyy, dd/mm/yy, mon-yy, 9999.99, 99999; and
- Nulls allowed, indicates if null values are
allowed in this attribute.
Transforms
specifies:
- Mapping
transformations or business rules e.g. "straight mapping",
"sysdate + 1", "if source value is null then set to 7", "If value is
999999999 populate field with 01/01/2000";
- Track changes thru
history, specifies how changes to this attribute should be handled
e.g. should the data be overwritten, a new record be created,
or a
current record value be updated?
- Referential integrity,
identifies referential integrity considerations e.g. key value must
exist in key column of table xxx, do not populate if column s is null.
Absolute data
quality requirements
This specifies data quality requirements that must be met
including:
- Upper value limit, specifies the maximum
allowable value for the target;
- Lower value limit, specifies the minimum
allowable value for the target; and
- Valid values list, specifies a valid list of
values for the target.
Optional
data
quality requirements
This specifies optional data quality requirements that should be met
including:
- Valid
value look-up, e.g. if a lookup file or table is used to validate the
values for this target, specify the source of this lookup.
Values
not contained in this file or table will be rejected;
- Default
value, if nulls are not allowed and source field is null what default
value should be substituted to prevent this record from rejecting?
- Comments/questions—is
used during data mapping to record questions/concerns. These should all
be addressed before the requirements specification is complete
Sounds like
a
lot of detail
Yes
it is. Information management projects are data centric and most of the
requirements are data related—It is very important that these
requirements be captured early in the project to ensure no
misunderstanding.
Summary…
Information
management projects are data centric projects. Database mapping is a
key
component of requirements analysis and needs to be completed by an
experienced data analyst early in the project.
|