FAQ Guide > Data Warehouse
1.Define Data Warehouse ?
“A subject-oriented , integrated , time-variant and non-volatile collection of data in support of management's decision making process”
2. What is junk dimension? What is the difference between junk dimension and degenerated dimension?
A "junk" dimension is a collection of random transactional codes, flags and/or text attributes that are unrelated to any particular dimension. The junk dimension is simply a structure that provides a convenient place to store the junk attributes.where as A degenerate dimension is data that is dimensional in nature but stored in a fact table.
Junk dimension: the column which we are using rarely or not used, these columns are formed a dimension is called junk dimension
Degenerative dimension: the column which we use in dimension are degenerative dimension
Ex.Emp table has empno, ename, sal, job, deptno
But We are talking only the column empno, ename from the EMP table and forming a dimension this is called degenerative dimension
3.Differnce between Normalization and Denormalization?
Normalization is the process of removing redundancies.
OLTP uses the Normalization process
Denormalization is the process of allowing redundancies.
OLAP/DW uses the denormalized process to capture greater level of detailed data (each and every transaction)
4. Why fact table is in normal form?
A fact table consists of measurements of business requirements and foreign keys of dimensions tables as per business rules.
A fact table consists of measurements of business requirements and foreign keys of dimensions tables as per business rules.
There can just be SKs within a Star schema, which itself is de-Normalized. Now, if there were then FKs on the dimensions as well, I would agree. Being in normal form, more granularity is achieved with less coding i.e. less number of joins while retrieving the fact.
5. What is Difference between E-R Modeling and Dimensional Modeling?
Basic difference is E-R modeling will have logical and physical model. Dimensional model will have only physical model. E-R modeling is used for normalizing the OLTP database design.
Dimensional modeling is used for de-normalizing the ROLAP/MOLAP design. Adding to the point:
E-R modeling revolves around the Entities and their relationships to capture the overall process of the system.
Dimensional model / Multidimensional Modeling revolves around Dimensions (point of analysis) for decision-making and not to capture the process.
In ER modeling the data is in normalized form. So more number of Joins, which may adversely affect the system performance. Whereas in Dimensional Modeling the data is denormalized, so less number of joins, by which system performance will improve.
6. What is conformed fact?
Conformed dimensions are the dimensions, which can be used across multiple Data Marts in combination with multiple facts tables accordingly
Conformed facts are allowed to have the same name in separate tables and can be combined and compared mathematically. Conformed dimensions are those tables that have a fixed structure. There will b no need to change the metadata of these tables and they can go along with any number of facts in that application without any changes
Dimension table, which is used, by more than one fact table is known as a conformed dimension.
|