Monday 5 August 2013

Integrated Data

In the data warehouse, data is not stored by operational applications, but by business subjects. 

Figure 2-1 The data warehouse is subject oriented.

In a data warehouse, there is no application flavor. The data in a data warehouse cuts across applications.

Integrated Data

For proper decision making, you need to pull together all the relevant data from the various applications. The data in the data warehouse comes from several operational systems. Source data are in different databases, files, and data segments. These are disparate applications, so the operational platforms and operating systems could be different. The file layouts, character code representations, and field naming conventions could all be different.

In addition to data from internal operational systems, for many enterprises, data from outside sources is likely to be very important. Companies such as Metro Mail, A. C. Nielsen, and IRI specialize in providing vital data on a regular basis. Your data warehouse may need data from such sources. This is one more variation in the mix of source data for a data warehouse.

Figure 2-2 illustrates a simple process of data integration for a banking institution. Here the data fed into the subject area of account in the data warehouse comes from three different operational applications. Even within just three applications, there could be several variations. Naming conventions could be different; attributes or data items could be different. The account number in the Savings Account application could be eight bytes long, but only six bytes in the Checking Account application.

Before the data from various disparate sources can be usefully stored in a data warehouse, you have to remove the inconsistencies. You have to standardize the various data elements and make sure of the meanings of data names in each source application. Before moving the data into the data warehouse, you have to go through a process of transformation, consolidation, and integration of the source data.
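The standardization described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the method shown in Figure 2-2: the field names in each source, the third application ("loans"), the sample rows, and the choice to left-pad account numbers with zeros to eight characters are all assumptions made for the example. A real integration would need business rules to guarantee that padded numbers from different applications do not collide, which is why the sketch keeps the originating system on every record.

```python
from dataclasses import dataclass

# Hypothetical source-to-warehouse field mappings. The source names and
# field names are assumptions for illustration only.
SOURCE_FIELD_MAPS = {
    "savings": {"acct_no": "account_number", "cust_nm": "customer_name", "bal": "balance"},
    "checking": {"account": "account_number", "name": "customer_name", "balance": "balance"},
    "loans": {"loan_acct": "account_number", "borrower": "customer_name", "amt_due": "balance"},
}

# Assumed warehouse standard: eight characters, matching the longer
# (Savings Account) format mentioned in the text.
ACCOUNT_NUMBER_WIDTH = 8


@dataclass(frozen=True)
class AccountRecord:
    """One row in the warehouse's account subject area."""
    account_number: str
    customer_name: str
    balance: float
    source_system: str


def transform(source: str, raw: dict) -> AccountRecord:
    """Rename source fields to warehouse names and standardize their formats."""
    field_map = SOURCE_FIELD_MAPS[source]
    std = {field_map[key]: value for key, value in raw.items() if key in field_map}
    return AccountRecord(
        # Left-pad shorter account numbers to the common width (an assumption).
        account_number=str(std["account_number"]).strip().zfill(ACCOUNT_NUMBER_WIDTH),
        customer_name=str(std["customer_name"]).strip().title(),
        balance=float(std["balance"]),
        source_system=source,
    )


def integrate(feeds: dict) -> list:
    """Consolidate rows from every source application into one list."""
    return [transform(source, raw) for source, rows in feeds.items() for raw in rows]


# Made-up sample rows, one per source application.
feeds = {
    "savings": [{"acct_no": "00123456", "cust_nm": "JANE DOE", "bal": "1500.00"}],
    "checking": [{"account": "654321", "name": "jane doe ", "balance": 250.75}],
    "loans": [{"loan_acct": "9001", "borrower": "John Roe", "amt_due": "12000"}],
}

account_subject_area = integrate(feeds)
for record in account_subject_area:
    print(record)
```

Each source keeps its own naming convention and field length; only the mapping table and the transformation step know about those differences, so the warehouse side sees a single consistent record layout.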

CHAPTER 2 DATA WAREHOUSE: THE BUILDING BLOCKS

CHAPTER OBJECTIVES
  • Review formal definitions of a data warehouse
  • Discuss the defining features
  • Distinguish between data warehouses and data marts
  • Study each component or building block that makes up a data warehouse
  • Introduce metadata and highlight its significance

As we have seen in the last chapter, the data warehouse is an information delivery system. In this system, you integrate and transform enterprise data into information suitable for strategic decision making. You take all the historic data from the various operational systems, combine this internal data with any relevant data from outside sources, and pull them together. You resolve any conflicts in the way data resides in different systems and transform the integrated data content into formats suitable for providing information to the various classes of users. Finally, you implement the information delivery methods.

In order to set up this information delivery system, you need different components or building blocks. These building blocks are arranged together in the most optimal way to serve the intended purpose; they are arranged in a suitable architecture. Before we get into the individual components and their arrangement in the overall architecture, let us first look at some fundamental features of the data warehouse.

Bill Inmon, considered to be the father of data warehousing, provides the following definition: "A Data Warehouse is a subject oriented, integrated, nonvolatile, and time variant collection of data in support of management's decisions." Sean Kelly, another leading data warehousing practitioner, defines the data warehouse in the following way.

The data in the data warehouse is:
  • Separate
  • Available
  • Integrated
  • Time Stamped
  • Subject oriented
  • Nonvolatile
  • Accessible

DEFINING FEATURES

Let us examine some of the key defining features of the data warehouse based on these definitions. What about the nature of the data in the data warehouse? How is this data different from the data in any operational system? Why does it have to be different? How is the data content in the data warehouse used?