FAQ Guide > Data Warehouse
7. What are the methodologies of Data Warehousing?
They are mainly 2 methods.
1. Ralph Kimbell Model (Top - Down approach :: Data Warehouse --> Data Mart)
Kimball model always structured as Denormalized structure.
2. Inmon Model. (Bottom - Up approach :: Data Mart --> Data Warehouse)
Inmon model structured as Normalized structure.
8. What are data validation strategies for data mart validation after loading process?
Data validation is to make sure that the loaded data is accurate and meets the business requirements. Strategies are different methods followed to meet the validation requirements.
9. What is surrogate key?
Surrogate key is the primary key for the Dimensional table. Surrogate key is a substitution for the natural primary key.
Data warehouses typically use a surrogate, (also known as artificial or identity key), key for the dimension tables primary keys. They can use Infa sequence generator, or Oracle sequence, or SQL Server Identity values for the surrogate key.
It is useful because the natural primary key (i.e. Customer Number in Customer table) can change and this makes updates more difficult and also used in SCDs to preserve historical data.
10. What is meant by metadata in context of a Data warehouse and how it is important?
Metadata or Meta data is data about data. Examples of metadata include data element descriptions, data type descriptions, attribute/property descriptions, range/domain descriptions, and process/method descriptions. The repository environment encompasses all corporate metadata resources: database catalogs, data dictionaries, and navigation services. Metadata includes things like the name, length, valid values, and description of a data element. Metadata is stored in a data dictionary and repository. It insulates the data warehouse from changes in the schema of operational systems. Metadata Synchronization The process of consolidating, relating and synchronizing data elements with the same or similar meaning from different systems. Metadata synchronization joins these differing elements together in the data warehouse to allow for easier access.
In context of a Data warehouse metadata is meant the information about the data. This information is stored in the designer repository. Meta data is the data about data; Business Analyst or data modeler usually capture information about data - the source (where and how the data is originated), nature of data (char, varchar, nullable, existance, valid values etc) and behavior of data (how it is modified / derived and the life cycle) in data dictionary a.k.a metadata. Metadata is also presented at the Data mart level, subsets, fact and dimensions, ODS etc. For a DW user, metadata provides vital information for analysis / DSS.
11. What are the possible data marts in Retail sales?
Product information, sales information
12. What is the main difference between schema in RDBMS and schemas in Data Warehouse?
RDBMS Schema
* Used for OLTP systems
* Traditional and old schema
* Normalized
* Difficult to understand and navigate
* Cannot solve extract and complex problems
* Poorly modelled |
DWH Schema
* Used for OLAP systems
* New generation schema
* De Normalized
* Easy to understand and navigate
* Extract and complex problems can be easily solved
* Very good model |
13.What is Dimensional Modeling?
In Dimensional Modeling, Data is stored in two kinds of tables: Fact Tables and Dimension tables.
Fact Table contains fact data e.g. sales, revenue, profit etc.....
Dimension table contains dimensional data such as Product Id, product name, product description etc.....
Dimensional Modeling is a design concept used by many data warehouse designers to build their data warehouse. In this design model all the data is stored in two types of tables - Facts table and Dimension table. Fact table contains the facts/measurements of the business and the dimension table contains the context of measurements i.e., the dimensions on which the facts are calculated.
|