logo for information-management-architect.com
leftimage for information-management-architect.com

Data Architecture Best Practices

Improve effectiveness with David Bowman’s information management guidelines for data architecture best practices

This site is designed for Information Technology professionals who need to improve effectiveness and want guidance and direction to help teams consistently produce error free results.

It provides information management guidelines for data architecture best practices and roles and responsibilities.

Data Architecture
  • Should focus primarily on data modeling and project deliverables that flow from data modeling activities; and
  • Should provide direction on data architecture best practices including data model standards, modeling tools and the use of conceptual data model, logical data model and physical data model.
Data Architect
  • Should have overall responsibility for the data architecture solution and all data architecture deliverables including data modeling;
  • Should define data architecture requirements as part of requirements specification;
  • Should define source to target mapping and data transformation rules;
  • Should provide technical direction to data modelers, data analysts, data base administrators and metadata specialists; and
  • Should be responsible to the solution architect for the quality of all data architecture deliverables.
Data Architecture Best Practices For Data Models

A data model is a graphical means of documenting data requirements and defining database design specifications. It is used by data architects and data modelers to confirm understanding of data requirements with business owners and provide direction to technical teams who must ultimately built the database structures.

 Enterprise Conceptual Model
  • Should be created as part of an information management strategy; and
  • Should show what classes of information are required to support management and business intelligence purposes.
 Enterprise Logical Model
  • Should be created as part of the information management framework; and
  • Should eventually contain all entities and their relationships and a complete set of documentation. 
Application Logical Model
  • Should be created as part of project requirements analysis;
  • Should be based on the enterprise logical model; and
  • Should show what information is required to support the specific application. If new entities or definitions are discovered during project requirements analysis, they should be added to the enterprise logical data model.
Application Physical Data Models
  • Should be created as part of  architecture and design;
  • Should be derived from the logical data model;
  • Should be created for each environment that will be used by the application e.g. stage, data warehouse and data marts;
  • Should be used to generate data definition language (DDL) for each database;
  • Should specify the detail data base design; and
  • Should always be synchronized 100% with the database.
Application Dimensional Model
  • Should represent facts and dimensions, which are used for reporting and analytical purposes; and
  • Should be a physical model e.g. there is no logical representation of a dimensional model as it is a physical model that is sometimes shown in a logical version in the modeling tool with English like names as opposed to physical names.
Environment Models
  • Should be created for each data base environment e.g. stage, data warehouse, data mart; and
  • Should be used to generate the data definition language (DDL) used to create database schema's and objects for each environment.
Data Architecture Best Practices
  • Should segment data warehouse components such as data integration, reporting or analysis;
  • Should support the Information Technology subject area just like any other business function;
  • Should provide an optimal balance between data storage, data integration and data access e.g. the best structure for data storage is often a poor choice for data access and vice-versa;
  • Should support business requirements;
  • Should provide both audit metadata columns and processing metadata columns to facilitate error detection, error resolution, merge/purge activities, archiving, unloading, and change data capture; and
  • Should be readily available to everyone who needs them
Data Architecture Best Practices for Data Model Standard
  • Should be established so that conceptual, logical and physical data models are designed and structured according to the same rules and guidelines and are applied on a consistent basis; and
  • Should be the subject of project quality assurance checkpoints.
Enterprise Logical Data Model
  • Should be the the starting point for all data models; and
  • Should ensure that all conceptual, logical and physical data models are stored in an approved enterprise data model repository.
Data Modeling Notation

Data models should be designed and illustrated following an approved notation or methodology i.e. applying an approved notation or methodology on a consistent basis leads to easily read and understood data models of high integrity.

Approved Data Model Tools
  • Should be used for designing, building, maintaining and storing data models; and
  • Should be listed in a current version of an enterprise Technology Standards List.
Data Model Object Naming Conventions

Logical Model
  • Should have a standard for model naming;
  • Should use a singular noun phrase for subject area names;
  • Should incorporate and use subject areas in every logical model; and
  • Should provide a definition for each subject area
Entities
  • Should be something about which the business needs to record data;
  • Should have names that are a singular noun phrase that is not abbreviated e.g. customer, product, time, location, vendor, etc. and
  • Should contain a definition and example.
Attributes
  • Should be a named indivisible element of information about an entity;
  • Should describe the characteristics or properties of an entity and the information we want to know about an entity;
  • Should be singular noun phrases that are not abbreviated but are in business language terms; and
  • Should contain a definition and example.
Relationships
  • Should be defined for logical models;
  • Should define business rules by showing the interrelationships between two entities, not one indirectly defined through other entities;
  • Should be shown as lines between the entities;
  • Should be comprised of the entity names, the parent-to-child verb phrase based on the relationship type, and the cardinality e.g. Customer places one or many orders and One and only one customer places order
Volumetric Storage Parameters
  • Should be captured and stored in the logical model;
  • Should show initial volume;
  • Should show growth per month; and
  • Should show maximum volume.
Logical Data Model Guidelines
  • Should have no orphan entities;
  • Should define all entities;
  • Should define all attributes;
  • Should have a primary key identified for each entity;
  • Should identify alternate keys to aid in data access analysis and physical design;
  • Should have at least one “non-primary-key” attribute except for some associative entities;
  • Should identify all subtypes as “inclusive” or “exclusive”
  • Should specify a subtype discriminator for each subtype;
  • Should identify every attribute as either “required” or “not required” and the corresponding relationship should be specified as to whether nulls are either “allowed or not allowed”
  • Should specify every relationship as either “identifying” or “non-identifying”
  • Should specify whether nulls are allowed for each relationship;
  • Should specify a role name if there is more than one valid relationship involving the same entities or if the relationship is recursive;
  • Should assign a domain for every attribute;
  • Should document all known valid values; and
  • Should have a data type for each attribute
Data Architecture Best Practices For Physical Model

Tables
  • Should derive table names from logical entity names; and
  • Should include a definition and example.
Columns Names
  • Should be derived from logical attribute names; and
  • Should include a definition and example for each attribute.
Views
  • Should follow standards for table and column names for views not directly accessed by users;
  • Should be named using natural language, i.e. fully spelled out names for views directly accessed by users; and
  • Should apply standard abbreviations when the name exceeds a length supported by the database management system e.g standard abbreviations should be applied to reduce the name length and the recommended technique for shortening names is to apply standard abbreviations/acronyms from right to leave.
Physical Data Model Guidelines
  • Should have tables;
  • Should have columns;
  • Should have primary keys;
  • Should have foreign keys;
  • Should identify relationships as physical only if an entire table is physical only e.g. any relationships between one table and another table should also be marked as physical only; and
  • Should assign a domain for every column
Data Model Review Process

Logical Data Model Validation
  • Should be documented and approved for every logical model to ensure the model is complete, consistent and compliant with the standard statement;
  • Should identify entities without attributes;
  • Should identify attributes without definitions;
  • Should identify entities without primary keys;
  • Should identify entities without definitions; and
  • Should identify entities without relationships.
Physical Data Model Validation Process
  • Should be documented and approved for every physical model to ensure the model is complete, consistent and compliant with the standard statement;
  • Should identify columns without comments;
  • Should identify tables without primary keys;
  • Should identify redundant indexes;
  • Should identify tables without columns;
  • Should identify columns with default data types; and
  • Should identify columns with different data types.
Summary...

Data Architecture should focus primarily on data modeling and project deliverables that flow from data modeling activities and should provide direction on data architecture best practices including data model standards, modeling tools and the use of conceptual data model, logical data model and physical data model.

This site provided information management guidelines for data architecture best practices and  roles and responsibilities.