Sterling Analysis
Unified Business Intelligence
 
 

Glossary

This glossary contains my personal definitions, in order to provide keys to the conclusions on this web site. The definitions should be close to those generally accepted in the industry. There may be some unique perspectives based on my experience.
The audience is:
  • Corporate people new to Business Intelligence, to give them a general framework of the possibilities, and
  • Experienced BI consultants, who may well differ with my terminology and hold their own views; that is expected and fosters welcome dialog.

Business Intelligence (BI) - Processes for reorganizing and optimizing operational data for instant reporting, to permit executives to steer an enterprise. Such data is hierarchical, multidimensional, and browsable.

A longer definition that I admire is from the book Business Intelligence for the Enterprise by my friend Mike Biere (ISBN 0-13-141303-1, first edition, page 18). Part of it is reproduced here:

"Business Intelligence is the conscious, methodical transformation of data from any and all data sources into new forms to provide information that is business driven and results oriented ... The purpose of investing in BI is to transform from an environment that is reactive to one that is proactive, ... (and) ... to provide for analytics that are as tool independent as possible."

Corporate Bus or Enterprise Bus – An abstraction describing the entire set of communications pathways, such as networks; all means of information delivery, such as reports, dashboards, and ad hoc analytical tools; and all their potentials. "Potentials" means possible information delivery mechanisms that could be provided. The primary query language on the enterprise bus, with the longest pedigree and hence the most widely known, is SQL (Structured Query Language) for relational data.

Data Cleansing - Processes to ensure that transaction data is accurate, internally consistent, and complete before it is loaded into a warehouse.
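
As a rough illustration of the three checks named above (accurate, internally consistent, complete), here is a minimal Python sketch. The field names, the sample rows, and the rule that total must equal quantity times unit price are all assumptions made up for this example; real cleansing rules depend on the source systems.

```python
from datetime import date

# Hypothetical transaction rows; every field name here is an assumption.
transactions = [
    {"id": 1, "date": date(2024, 1, 5), "quantity": 2, "unit_price": 10.0, "total": 20.0},
    {"id": 2, "date": date(2024, 1, 6), "quantity": 3, "unit_price": 5.0, "total": 14.0},
    {"id": 3, "date": None, "quantity": 1, "unit_price": 7.5, "total": 7.5},
]

REQUIRED_FIELDS = ("id", "date", "quantity", "unit_price", "total")


def find_problems(row):
    """Return a list of human-readable problems found in one row."""
    # Completeness: every required field must be present and not None.
    problems = [f"missing {f}" for f in REQUIRED_FIELDS if row.get(f) is None]
    if problems:
        return problems
    # Accuracy (a simple plausibility rule): quantities must be positive.
    if row["quantity"] <= 0:
        problems.append("non-positive quantity")
    # Internal consistency: the stored total must agree with its parts.
    if abs(row["quantity"] * row["unit_price"] - row["total"]) > 0.005:
        problems.append("total does not match quantity * unit_price")
    return problems


def cleanse(rows):
    """Split rows into those fit to load and those rejected, with reasons."""
    clean, rejected = [], []
    for row in rows:
        problems = find_problems(row)
        if problems:
            rejected.append((row.get("id"), problems))
        else:
            clean.append(row)
    return clean, rejected


clean_rows, rejected_rows = cleanse(transactions)
```

In this sketch only the first row would be loaded; the other two are set aside with the reason for rejection so that they can be corrected at the source.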

Data Warehouse - A time series repository of enterprise data that is required to be internally consistent, fully trustworthy, and compliant with regulations, and that serves as a corporate-wide shared resource for reporting and guiding the company. It is often expected to be able to answer any question about the entire enterprise.

Data Mart - A database -- often called a "cube" -- about a single business subject, for example Sales. The structure is usually multidimensional, and normally confined to one Line of Business. It is designed to answer a specific but not entirely predetermined set of business questions about that single aspect of the business. Multidimensional structures inherently enable answering exceedingly complicated analytical questions and designing richly complex metrics about this bounded set of concerns. They are inherently hierarchical in nature and allow executives to browse and drill at will among critical business metrics across all possible sets of relationships in the mart.

Data Marthouse - A personal term I use to describe the intersection for sharing of data from data marts and data warehouses.

Dimension – A dimension is a homogeneous set of variables that therefore all go together. It could be referred to as a category. Time is a dimension. Product is a dimension. Geography is a dimension. Businesses are composed of natural hierarchies, and dimensions are usually hierarchical. For example, the time dimension is composed of days, weeks, months, quarters and years. Products usually have hierarchies of product families. At each level of every hierarchy, quantities such as total sales are aggregated or, in modern OLAP databases, potentially aggregated very quickly, often also with complex calculations, simultaneously across all dimensions at all levels of each hierarchy.
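
To make the idea of aggregating at each level of a hierarchy concrete, the following Python sketch rolls daily sales up the time dimension to months, quarters, and years. The sample figures and function names are hypothetical, and a real OLAP engine would do this across many dimensions at once rather than one.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily sales facts: (day, amount).
daily_sales = [
    (date(2024, 1, 15), 100.0),
    (date(2024, 2, 10), 150.0),
    (date(2024, 4, 3), 200.0),
    (date(2025, 1, 20), 50.0),
]


def time_levels(day):
    """Map one day to its key at each higher level of the time hierarchy."""
    quarter = (day.month - 1) // 3 + 1
    return {
        "year": (day.year,),
        "quarter": (day.year, quarter),
        "month": (day.year, day.month),
    }


def roll_up(facts):
    """Aggregate total sales at every level of the hierarchy in one pass."""
    totals = {level: defaultdict(float) for level in ("year", "quarter", "month")}
    for day, amount in facts:
        for level, key in time_levels(day).items():
            totals[level][key] += amount
    return totals


totals = roll_up(daily_sales)
# totals["quarter"][(2024, 1)] holds January plus February 2024 sales.
```

Drilling down from a year to its quarters and months is then just a matter of reading the totals stored at the next level of the hierarchy.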

Multi Dimensionality – A way of thinking about business and of storing data to permit multidimensional analysis, usually provided within a multidimensional data mart. (See also "Multidimensional Thinking" on this web site.) A single data mart usually has roughly 10 dimensions. There is great variation and much professional disagreement on how many dimensions should be in a single data mart. However, 25 dimensions is defined as a logical maximum by the "father" of relational multidimensionality, Ralph Kimball. For an explanation of this perceived maximum, see The Data Warehouse Toolkit, 2nd Edition, by Ralph Kimball, page 58.

Normalization - Potentially complex transformations of data that can perhaps be briefly described as the most efficient provably correct way to store relational data. It provides maximum speed of updating a set of tables by limiting data redundancy, among other techniques. Normalized data is usually far less efficient at reporting than at updating.
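
The trade-off described above can be sketched with Python's built-in sqlite3 module. The table and column names are invented for this illustration. Because the customer's city is stored exactly once, an update touches one row; the price is that a report by city must join the tables back together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical normalized tables: each fact about a customer is stored once.
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name TEXT,
        city TEXT
    );
    CREATE TABLE sales_order (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(customer_id),
        amount REAL
    );
    INSERT INTO customer VALUES (1, 'Acme', 'Boston');
    INSERT INTO sales_order VALUES (10, 1, 250.0), (11, 1, 75.0);
""")

# Updating is cheap: the city lives in one row, however many orders exist.
cursor = conn.execute("UPDATE customer SET city = 'Chicago' WHERE customer_id = 1")
rows_updated = cursor.rowcount

# Reporting is costlier: sales by city needs a join across both tables.
report = conn.execute("""
    SELECT c.city, SUM(o.amount)
    FROM sales_order AS o
    JOIN customer AS c ON c.customer_id = o.customer_id
    GROUP BY c.city
""").fetchall()
```

With only two tables the join is trivial, but a fully normalized operational schema may require many such joins for a single report.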

OLAP - On Line Analytic Processing. This term was coined by Dr. E.F. Codd for Arbor Software, the company that invented the original modern multidimensional database called Essbase. The term has stuck, and Codd's paper defines 12 rules to denote multidimensional databases. The original paper may currently be retrieved only from the Codd foundation, but many early BI MD practitioners have copies.

Operational Data Store (ODS) - transaction data that is minutes or hours old used for immediate reporting, and may or may not be lightly cleansed. (See Data Cleansing). ODS data updates the warehouse on a regular operational schedule. An ODS is an optional component of an IT infrastructure.

Star Schema - A relational representation of a multidimensional database. It is widely agreed to be the best way to organize relational data for reporting purposes, meaning the fastest, the most flexible, and the most understandable by nontechnical people.
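
A minimal star schema might look like the following sketch, again using Python's sqlite3 module. One central fact table of sales amounts points to two dimension tables, time and product. All table names, column names, and figures are assumptions chosen for illustration, and a real schema would carry more dimensions and measures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: one row per member, with its hierarchy levels as columns.
    CREATE TABLE dim_time (time_id INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, family TEXT, name TEXT);

    -- Fact table: one row per sale, keyed by the dimensions around it.
    CREATE TABLE fact_sales (
        time_id INTEGER REFERENCES dim_time(time_id),
        product_id INTEGER REFERENCES dim_product(product_id),
        amount REAL
    );

    INSERT INTO dim_time VALUES (1, 2024, 1), (2, 2024, 2);
    INSERT INTO dim_product VALUES
        (1, 'Widgets', 'Small widget'),
        (2, 'Widgets', 'Large widget'),
        (3, 'Gadgets', 'Gadget');
    INSERT INTO fact_sales VALUES
        (1, 1, 100.0), (1, 2, 50.0), (2, 1, 25.0), (2, 3, 200.0);
""")

# A typical report: total sales by quarter and product family.
sales_by_quarter_and_family = conn.execute("""
    SELECT t.year, t.quarter, p.family, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_time AS t ON t.time_id = f.time_id
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY t.year, t.quarter, p.family
    ORDER BY t.year, t.quarter, p.family
""").fetchall()
```

Every report follows the same shape: join the fact table to whichever dimensions the question mentions, then group by the hierarchy levels of interest. That regularity is a large part of why the layout is easy to explain.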

Unified Business Intelligence - A term I am creating, although I am interested in hearing if it is in current use elsewhere in the BI discipline. It is the process of identifying and associating data elements and computed measures across data warehouses and data marts to derive new knowledge that could not be known before. It is for financial and non financial data. There are many non financial warehouse columns and mart cubes containing vital information for running companies that can help Call Centers, logistical decisions, shop floor work optimization, and so on. It is presently an untapped area of IT to find these synergies and use them for enhanced competitive profitability. A document on this site defines a process for capturing and using together data elements presently confined to warehouses or marts.

  Copyright © William J. Sterling