Data Warehouse Security

Improve protection with David Bowman’s information management guidelines for data warehouse security

This site is designed for Information Technology professionals who need to improve security and require guidance and direction to help teams consistently implement appropriate security protection.

It provides information management guidelines for data warehouse security.

What Does Security Involve?

A data warehouse has many moving parts, and each one is an opportunity for security issues. Data resides in a source system, where authorized users can access it. It is extracted to a staging area, transformed, and possibly moved to a clearinghouse. It is then loaded into the data warehouse, extracted again, and loaded into a data mart, where end users can access it for analytic or reporting purposes.
Each stage of the data movement process carries its own data warehouse security risk.

Source System Extract

Some organizations have very stringent requirements and will not allow external systems to access source data directly. For example, we may need to extract data from a credit card transaction processing system, but the security policy does not permit direct access. Instead, an extract file must be created by the source system and “pushed” to the staging area. The requirements specification should reference this security policy.

Staging Area

Staging areas are usually a “landing” spot for source data extracts and this is one place where all data is available for a brief period of time.

Requirements should define access rights, if any, and should also include requirements for data disposal so that sensitive data is protected after it leaves the staging area.

Staging areas and/or clearinghouses may also store data that is rejected for quality issues. Sometimes this data remains in the staging area until it is corrected by a new feed and sometimes it is corrected by a data steward. In all cases, security requirements must be defined.

Data Warehouse

Some organizations let users make ad-hoc data extracts directly from the data warehouse and others only allow approved processes to extract data. Access privileges must be specified.

There may be a requirement to aggregate sensitive data. For example, suppose we have a data warehouse that includes personal data about health care prescribers, and we decide to integrate an external data source that identifies all prescriptions issued within a certain sample area. Although the “prescriptions written” data does not identify the caregiver or the patient, it might be possible to determine this if the sample area is very small, e.g. one prescriber and two patients. In this case, we may need to specify a requirement to aggregate data if the sample area falls below a certain threshold.
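
The aggregation rule described above could be sketched roughly as follows. This is only an illustration: the function name, the threshold value, the "OTHER" bucket, and the input shapes (prescription totals per sample area plus the number of prescribers known in each area) are assumptions, not part of any stated policy.

```python
from collections import defaultdict

# Hypothetical threshold: areas with fewer prescribers than this are rolled up.
MIN_PRESCRIBERS = 5


def aggregate_small_areas(prescriptions_by_area, prescribers_by_area,
                          min_prescribers=MIN_PRESCRIBERS):
    """Roll sample areas below the threshold into a single 'OTHER' bucket.

    prescriptions_by_area: dict of area -> number of prescriptions written
    prescribers_by_area:   dict of area -> number of prescribers in that area
    """
    result = defaultdict(int)
    for area, total in prescriptions_by_area.items():
        # An area missing from prescribers_by_area is treated as having zero
        # prescribers, so it is always rolled up (the cautious default).
        if prescribers_by_area.get(area, 0) >= min_prescribers:
            result[area] += total
        else:
            result["OTHER"] += total
    return dict(result)


report = aggregate_small_areas(
    {"North": 100, "East": 3, "West": 4},
    {"North": 12, "East": 1, "West": 2},
)
```

Note that the combined bucket can itself be small, so a real requirement would probably also need to say what happens when the rolled-up group still falls below the threshold.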

Data Marts

Requirements need to specify which users have access to data and any data aggregation security specifications.

Other Data Warehouse Security Considerations
  • Should consider requirements for other sensitive corporate data e.g. human resource data;
  • Should consider test data requirements e.g. a need to “black-out” sensitive data during testing;
  • Should consider security concerns for project documentation e.g. are detailed design specifications of value to competitors and, if so, what are the requirements to prevent unauthorized access; and
  • Should ensure that the requirements specifications refer to the information management data security policy to ensure common understanding of all data warehouse security requirements.
Data Warehouse Security Guidelines

Authentication
  • Access to the network and to each system should be controlled through use of individually owned user accounts and associated confidential authentication key or password;
  • A formal record should be maintained of all access rights, including complete user or account names and group descriptions;
  • Accounts that become inactive or unused should be suspended after sixty (60) days, and, if they remain inactive, deleted after ninety (90) days;
  • Temporary user accounts may have to be set up, e.g., for test purposes. Such accounts should have an expiration date;
  • Users shall be provided, initially and on a reset, with a temporary password that they are required to change immediately. To ensure that calls for password resets are valid, users' identities should be verified using information that only the user would know. Temporary passwords should be conveyed to users in a secure manner;
  • Unneeded or unsecured special accounts should be restricted or removed e.g. Guest, Anonymous, Null, and non-user accounts;
  • The logon procedure should not request or display information during the logon procedure that would aid an unauthorized user;
  • Logon information shall be validated only on completion of all input data. If an error condition arises during login, the system shall not indicate which part of the data is correct or incorrect;
  • The number of grace logins should be set to a maximum of six (6). A grace login allows users to delay changing their password and to log in a fixed number of times, after which they cannot log in until they change it;
  • The number of consecutive unsuccessful logon attempts should be limited to five (5). After 5 consecutive unsuccessful logon attempts: 1) record the unsuccessful attempt; and 2) inactivate the account for an automated timeout/reset period of 30 minutes or greater, or in a manner that requires a manual reset by the system administrator; and
  • User IDs should not give any indication of the user’s privilege level, e.g. manager, or of the application system to which they have access.
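
As an illustration of the unsuccessful logon rule above, the following is a minimal sketch of a lockout tracker. The class and method names are hypothetical, state is held in memory only, and the automated 30 minute timeout is assumed rather than the manual reset option.

```python
from datetime import datetime, timedelta

MAX_FAILED_ATTEMPTS = 5                 # consecutive failures before lockout
LOCKOUT_PERIOD = timedelta(minutes=30)  # automated timeout/reset period


class LoginAttemptTracker:
    """Hypothetical in-memory tracker; a real system would persist this state."""

    def __init__(self):
        self._failures = {}      # user_id -> consecutive failure count
        self._locked_until = {}  # user_id -> time at which the lockout ends
        self.audit_log = []      # (timestamp, user_id, event) tuples

    def is_locked(self, user_id, now):
        until = self._locked_until.get(user_id)
        return until is not None and now < until

    def record_failure(self, user_id, now):
        count = self._failures.get(user_id, 0) + 1
        self._failures[user_id] = count
        # 1) record the unsuccessful attempt
        self.audit_log.append((now, user_id, "failed_logon"))
        if count >= MAX_FAILED_ATTEMPTS:
            # 2) inactivate the account for the timeout period
            self._locked_until[user_id] = now + LOCKOUT_PERIOD
            self._failures[user_id] = 0
            self.audit_log.append((now, user_id, "account_locked"))

    def record_success(self, user_id):
        # A successful logon resets the consecutive failure count.
        self._failures.pop(user_id, None)
```

Passing the current time in as `now` keeps the sketch easy to test; the caller is expected to check `is_locked` before validating credentials.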
Data Warehouse Security Passwords
  • Passwords should have a minimum length of eight (8) characters;
  • Passwords should have a mix of alpha and numeric characters;
  • Passwords should not contain more than two consecutive identical characters;
  • Passwords should not contain any control characters, e.g. Ctrl-C, or blank spaces, which can allow for code/fault injection;
  • Passwords should not be reused for at least six (6) generations of consecutive changes;
  • Passwords should be changed when the system prompts, or at least every sixty (60) days if the system does not prompt for a change. Passwords for applications that use two-factor authentication are not required to expire on a pre-defined schedule;
  • Passwords should not be easily guessed by others or through use of automated tools;
  • Passwords should be stored encrypted, hashed, or with access controls;
  • Password files should be stored separately from the main application system data;
  • Default passwords should be changed following installation of software and patches;
  • An effective password management system or equivalent password management methodology should be used to authenticate users; and
  • All passwords generated on behalf of an individual user should conform to password management standards.
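
The composition rules in this list (length, character mix, repeated characters, and control characters or spaces) lend themselves to an automated check. The sketch below is one possible way to express them; the function name and the wording of the messages are assumptions, and it does not cover reuse history, expiry, or guessability.

```python
import re

MIN_LENGTH = 8


def password_violations(password):
    """Return a list of composition rules that the password breaks."""
    problems = []
    if len(password) < MIN_LENGTH:
        problems.append("shorter than 8 characters")
    has_alpha = any(c.isalpha() for c in password)
    has_digit = any(c.isdigit() for c in password)
    if not (has_alpha and has_digit):
        problems.append("needs both alpha and numeric characters")
    # Three or more identical characters in a row.
    if re.search(r"(.)\1\1", password, flags=re.DOTALL):
        problems.append("more than two consecutive identical characters")
    # Whitespace or ASCII control characters (0-31 and 127).
    if any(c.isspace() or ord(c) < 32 or ord(c) == 127 for c in password):
        problems.append("contains a space or control character")
    return problems
```

An empty list means the password passes these particular checks, e.g. `password_violations("abc12345")` returns `[]`.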
Applications
  • Users of a system should not have unauthorized access to other users’ data;
  • Passwords should not be stored unencrypted on disk, in computer memory, or in any system-based data repository, e.g., the NT Registry;
  • Passwords should not be embedded in macros, scripts, job control language, programs, or files, unless they have been stored encrypted, hashed, or with access controls;
  • Passwords should not be displayed in clear text on the screen when being entered;
  • Application and system output that contains confidential or proprietary data should be routed only to authorized terminals and locations;
  • In initiation scripts and aliases, commands should be executed using fully-qualified command names, for example, full path names;
  • Using the full path name for a command can prevent the execution of malicious code residing in a local directory;
  • Screen savers should be set to activate after a period of fifteen (15) minutes of user inactivity, and should be password protected;
  • Active application sessions should be terminated, unless they can be secured by a screen lock or other protection;
  • Logon credentials should not be cached on the system;
  • Any screen/web page requiring a password entry should be configured to prevent the caching of the entered password;
  • Information should not be transmitted to cell phones in clear text;
  • Though pager messages are hard to intercept, use appropriate caution when transmitting company data;
  • Applications should have controls to validate the integrity of data before it is used as input to the application;
  • Applications should have controls to validate the correct processing of data to detect data integrity errors caused by processing errors or malicious acts;
  • Applications should have controls to validate the integrity of electronically transmitted data to detect corruption or unauthorized changes;
  • Applications should have controls to validate the integrity of processed or stored data;
  • Passwords should not be sent to the user via email unless the password is encrypted using an approved encryption tool such as Entrust;
  • Passwords should be encrypted during transmission;
  • Browser based applications should use 128-bit encryption if they exchange confidential data with the browser; and
  • Browser based applications that require SSL should restrict access to browsers capable of supporting 128-bit, or higher, encryption.
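
One common way to implement the integrity controls for electronically transmitted data mentioned in this list is a keyed message authentication code. The sketch below uses HMAC-SHA256 from the Python standard library; this is a swapped-in technique chosen for illustration, not one the guidelines prescribe, and the function names and shared key handling are assumptions.

```python
import hashlib
import hmac


def sign(payload: bytes, key: bytes) -> str:
    """Compute an HMAC-SHA256 tag that the sender transmits with the payload."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify(payload: bytes, key: bytes, tag: str) -> bool:
    """Recompute the tag on receipt and compare it in constant time."""
    return hmac.compare_digest(sign(payload, key), tag)


# Hypothetical usage: both sides hold the same secret key.
shared_key = b"example-key-for-illustration-only"
extract = b"account_id,amount\n1001,25.00\n"
tag = sign(extract, shared_key)
```

A receiver that calls `verify(extract, shared_key, tag)` gets `True` only if the payload is unchanged; any corruption or unauthorized modification makes the check fail.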
Environments
  • Developers who need to access production systems and applications for program maintenance or repair should be given temporary access, which should be revoked immediately after use;
  • Development and testing should be conducted on non-production machines;
  • There should be separation of development, QA and operational system environments; and
  • Production data should not be used for testing or training purposes. If production data is used as a starting point for creating test data, then appropriate controls should be in place to protect this data, and the data should be effectively depersonalized or altered to manage risk.
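
Where production data is used as a starting point for test data, depersonalization might look something like the sketch below. It replaces sensitive fields with a keyed hash so that the same input always maps to the same token. The field names, function name, and token length are hypothetical, and keyed hashing on its own may not be enough to meet a given depersonalization requirement.

```python
import hashlib
import hmac

# Hypothetical list of fields that must not appear in test data.
SENSITIVE_FIELDS = frozenset({"name", "card_number"})


def depersonalize(record, secret, sensitive_fields=SENSITIVE_FIELDS):
    """Return a copy of record with sensitive values replaced by tokens."""
    masked = {}
    for field, value in record.items():
        if field in sensitive_fields:
            digest = hmac.new(secret, str(value).encode("utf-8"),
                              hashlib.sha256).hexdigest()
            masked[field] = digest[:12]  # shortened token, for readability
        else:
            masked[field] = value
    return masked


test_row = depersonalize(
    {"name": "Ann Example", "card_number": "4111111111111111", "amount": 25},
    secret=b"test-data-masking-key",
)
```

Because the tokens are deterministic for a given secret, joins between masked tables still line up, while the secret itself should be kept out of the test environment.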
Architecture
  • Confidential data should be appropriately protected during transmission using such tools as encryption;
  • All confidential customer data traversing the internet should be encrypted to ensure data warehouse security; and
  • All confidential customer information should be encrypted when stored on the Web server for any length of time.
Network Data Warehouse Security Requirements
  • Computing and networking equipment should not be connected to networks or the computing environment unless it has appropriate authorization and adequate security controls incorporated and in use;
  • Connections by remote computer systems and applications should be authenticated. This is especially important if the connection is via an open network that is outside the control of the client;
  • Authentication for computer-to-computer connections can be carried out at the application, computer or network level; and
  • Dedicated private lines can also be used to provide assurance of the source of connections.
System Administration
  • Use of direct modem access to servers and of server-based functionality with dial-out lines should be approved by the Security Manager;
  • Password protection should be used for system utilities;
  • System utilities and programs should be restricted and tightly controlled;
  • All clocks in systems and communications devices should be set to the correct time and date and the appropriate time zone e.g. clocks should be automatically synchronized with a national standard Coordinated Universal Time (UTC) server, such as a National Institute of Standards and Technology (NIST) run atomic clock, within 6 leap seconds;
  • There should be a separation of duties e.g. system administrator accounts should not be used for system user tasks;
  • Default passwords should be changed following installation of software and patches;
  • Applications that run under a privileged ID or in a privileged mode should be configured and monitored so as to prevent misuse and manage security risk;
  • All default system and application passwords should be changed during the installation process;
  • Information Protection Technical Security Standards should be applied to ensure adequate security in system and application installation and configuration;
  • Authorization levels for system utilities should be defined and documented;
  • Software, software development tools and programs, and system utilities that are unnecessary should be removed;
  • Unnecessary operating systems or kernels should be removed;
  • Production systems should not be implemented with source code, programming libraries, development tools, or utilities not explicitly required to perform production-related functions;
  • Program source libraries should be configured and maintained with appropriate access control;
  • Formal procedures should be in place to ensure the appropriate level of control over changes to production environments; and
  • Administration traffic for computing devices on the DMZ should be transmitted over channels that are secured end-to-end.
Backups, Logs and Audit (Logging)
  • Privileged account usage should be logged and monitored to help maintain data warehouse security;
  • All use of system utilities should be logged;
  • To ensure a complete record of events, logs should be produced that include: user IDs; dates and times for log-on and log-off; source IP addresses; terminal identity or location; records of successful and rejected attempts to access system, data and other computing resources;
  • Audit trails should be linked to the user identity responsible for the security relevant event;
  • Log files should be made available by system and application administrators in a readable format for periodic review by an independent observer who is trained and authorized in identifying security violations, malicious behaviors, and misuse of privileges;
  • There should be audit trails of security-related events to help ensure data warehouse security;
  • Logging and data collection should be in place for systems that have access to or process nonpublic personally identifiable customer information; and
  • Logging should be performed in a secure manner and provide analytical capabilities to identify authorized successful access, identify unauthorized access attempts and security violations, provide audit trails for user action as appropriate, and aid in reconstructing compromised systems.
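
The log contents listed above (user ID, date and time, source IP address, terminal identity, and the outcome of each access attempt) could be captured in a structured record such as the one sketched below. The class, field names, and JSON layout are assumptions for illustration only.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    user_id: str     # links the event to the responsible user identity
    event: str       # e.g. "logon", "logoff", "access"
    timestamp: str   # ISO 8601 date and time
    source_ip: str
    terminal: str    # terminal identity or location
    resource: str    # system, data, or other resource involved
    success: bool    # True for a successful attempt, False for a rejected one


def make_event(user_id, event, source_ip, terminal, resource, success, now=None):
    """Build an event, stamping it with the current UTC time by default."""
    now = now or datetime.now(timezone.utc)
    return AuditEvent(user_id, event, now.isoformat(), source_ip,
                      terminal, resource, success)


def to_log_line(audit_event):
    """Serialize an event as one JSON line so reviewers can read and filter it."""
    return json.dumps(asdict(audit_event), sort_keys=True)


line = to_log_line(make_event("jdoe", "access", "10.0.0.5", "term-1",
                              "sales_mart", success=False))
```

Writing one self-describing line per event keeps the log readable for an independent reviewer and makes rejected attempts easy to filter.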
Backups, Logs and Audit (Legal Requirements for Logging)
  • Browsing of disclosure statements and any other legal notices related to opening or changing the status of an account should be logged and archived so that it can be proved what the customer viewed and/or agreed to;
  • Customer actions on websites that impact or change their account, e.g. opening an account, changing the account's mailing address, paying a bill or transferring funds, and changing a password, need to be archived until 7 years past the closing of the customer's account for data warehouse security purposes; and
  • When an account becomes internet-only, e.g. the customer no longer receives a paper bill, additional archiving should be required to demonstrate that the customer saw statement information that would otherwise be available in the paper bill or other paper delivery of information. This allows the client to prove whether a customer saw a statement, fee, or policy.
Log Retention Requirements for Data Warehouse Security
  • Log file entries that are less than 60 days old are considered short-term logs and should be maintained in a manner that makes them accessible to an investigator within 2 hours of request;
  • Log file entries that are older than 60 days are considered long-term or archived logs and should be maintained in a manner that makes them accessible to an investigator within 24 hours of request;
  • All event logs should be retained and appropriately protected for 13 months and after this period, log files should be destroyed; and
  • Destruction of logs from online systems and backup media should ensure that the data is not recoverable by unauthorized persons.
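
The retention tiers above can be expressed as a small classification function, sketched below. The function names and tier labels are hypothetical. Two points the guidelines leave open are treated as assumptions here: an entry that is exactly 60 days old is classed as long-term, and the 13 months are counted as calendar months.

```python
import calendar
from datetime import date

SHORT_TERM_DAYS = 60   # entries younger than this are short-term logs
RETENTION_MONTHS = 13  # entries are destroyed once this period has passed


def add_months(d, months):
    """Add calendar months, clamping the day to the end of the target month."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def retention_tier(entry_date, today):
    """Classify a log entry as 'short-term', 'long-term', or 'destroy'."""
    if today >= add_months(entry_date, RETENTION_MONTHS):
        return "destroy"
    if (today - entry_date).days < SHORT_TERM_DAYS:
        return "short-term"   # accessible within 2 hours of request
    return "long-term"        # accessible within 24 hours of request
```

A scheduled job could run this over log archives and hand anything marked "destroy" to a secure deletion procedure.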
Log Monitoring (Intrusion Detection)
  • Employees have the responsibility to report security-related incidents or suspicious activities immediately to help ensure data warehouse security;
  • There should be timely review of critical audit trails such as application system utilities and privileged user activities including failed log-in to root and all failed log-in attempts; and
  • Logging and data collection should be in place for systems that have access to or process nonpublic personally identifiable customer information to provide audit trails for user action as appropriate.
Availability and Reliability
  • Critical services should be protected against denial of service conditions;
  • There should be controls to ensure the ability to recover from failures; and
  • Business continuity plans should be established and validated.
Third Party Requirements for Data Warehouse Security
  • All confidential or proprietary information or materials should be safely disposed of when no longer needed e.g. by shredding paper and CDs or by fully deleting electronic files;
  • A non-disclosure agreement (NDA) should be signed by the third parties prior to any exchange of data or confidential information; and
  • Media containing data, e.g. tapes and CDs, should be secured while at a third party facility.
Outsourced Development Data Warehouse Security
  • Outsourced projects should have acceptance criteria and test plans to validate the source code;
  • Prior to implementation, a review and test of source code for security vulnerabilities, including covert channels or backdoors that might obscure unauthorized access into the system or application, should be conducted and documented. This should not be performed by the third party that has been contracted to do the systems development;
  • Security controls for outsourced development include restricting third party access to production source code and systems, and monitoring their access to development systems; and
  • Outsourced application development should be tested to validate that information protection requirements are met before implementing the system or application in production.
Summary…

A data warehouse has many moving parts, and each stage of the data movement process, from the source system extract through the staging area and data warehouse to the data marts, carries its own data warehouse security risk.

This site provided information management guidelines for data warehouse security.