Friday, August 13, 2010

data warehouse best practices

Data Warehousing is the innovation of the 90s that promised to change the data landscape for good. How far have we come? Many vendors have entered the market because it makes sense to bring together data from throughout the organization, and this will continue to make sense in the future.

How large the data warehouse market will grow, no one knows yet. But it is still growing rapidly, and is currently estimated at 4.5 billion dollars per year (IDC).

1. Why Do Data Warehouse Projects Run Into Scope Creep?

To quote Bill Inmon (teacher and author of several books on Data Warehousing): "Traditional projects start with requirements and end with data. Data Warehousing projects start with data and end with requirements." Once the project is under way, users will find new applications, and with them will come new demands for data. Interestingly, these projects are often justified by moving reporting work away from "the data person". What we have seen is that the first thing that happens immediately after the project is an increase in ad hoc requests submitted to this same "data person". This may appear to undermine the initial business case, but it actually signals the beginning of value creation from the DWH project.

2. Star Schema Versus Entity Relationship Model?

There has been a major debate in the community about the merits of the different data models. At the risk of oversimplifying: Star models tend to give better performance (turnaround time) for end users, and are often regarded as "easier" for end users to understand. The drawback is that the Star model requires more disk space and, because of the intrinsic redundancy in the data, has consistency problems from a maintenance perspective. Having said this, it seems that in practice some combination of the two often cannot be avoided, regardless of the chief architect's preference (ER or Star). Overall, the Star model seems to have gained the most ground.
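
To make the Star model concrete, here is a minimal sketch using Python's built-in sqlite3 module. All table and column names (fact_sales, dim_customer, dim_date) and the sample rows are hypothetical and chosen only for illustration. Note how the region name is stored once per customer in the dimension table: that repetition is the kind of redundancy that costs disk space and maintenance effort, in exchange for simple, single-join queries.

```python
import sqlite3

def build_star_schema(conn):
    """Create one fact table and two dimension tables (hypothetical names)."""
    conn.executescript("""
        CREATE TABLE dim_customer (
            customer_key  INTEGER PRIMARY KEY,
            customer_name TEXT,
            region        TEXT      -- denormalized: repeated for every customer
        );
        CREATE TABLE dim_date (
            date_key      INTEGER PRIMARY KEY,
            calendar_date TEXT,
            year          INTEGER
        );
        CREATE TABLE fact_sales (
            customer_key  INTEGER REFERENCES dim_customer(customer_key),
            date_key      INTEGER REFERENCES dim_date(date_key),
            amount        REAL
        );
    """)

def load_sample_rows(conn):
    """Insert a handful of made-up rows so the query below has data."""
    conn.executemany(
        "INSERT INTO dim_customer VALUES (?, ?, ?)",
        [(1, "Acme", "North"), (2, "Globex", "South"), (3, "Initech", "North")],
    )
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?)",
        [(20100101, "2010-01-01", 2010), (20100102, "2010-01-02", 2010)],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?)",
        [(1, 20100101, 100.0), (2, 20100101, 50.0),
         (3, 20100102, 25.0), (1, 20100102, 75.0)],
    )

def sales_by_region(conn):
    """A typical end-user question: one join from the fact table to a dimension."""
    return conn.execute("""
        SELECT c.region, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_customer AS c ON c.customer_key = f.customer_key
        GROUP BY c.region
        ORDER BY c.region
    """).fetchall()

conn = sqlite3.connect(":memory:")
build_star_schema(conn)
load_sample_rows(conn)
print(sales_by_region(conn))
```

In a fully normalized ER design, region would typically live in its own table and the same question would need an extra join.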

3. Importance of the Data Warehouse Business Case

Much has been written about the business case for the Data Warehouse. What makes a good business case? IT savings appear in every DWH business case. The important thing is not to restrict yourself to 'hard savings', but to connect to the main business processes as much as possible. For example, a faster turnaround for selection lists is fine (if calculated as a cost per hour), but it is better if the revenue from the additional customers acquired through these selections can be tied in. Not only does revenue growth, rather than savings, make for a more balanced business case; more important is the intrinsic business buy-in that results from a direct connection to the company's bottom line. These days, changes in legislation (especially Sarbanes-Oxley) play a major role in justifying the business case. The justification may come either through a higher company valuation for providing transparent information, or through fewer sleepless nights for the CEO, which of course is priceless ...

4. Why Do Data Warehouse Projects 'Never' Go Wrong?

Actually, Data Warehouse projects do sometimes fail. But they fail so rarely that it is actually very hard to believe ... especially after talking to so many dissatisfied end users. And there are many ways a Data Warehouse project can go wrong: late delivery, data management problems, and the unavoidable data quality issues of the feeder systems. Corporate politics (see Tip 7) may be the best explanation for this phenomenon of a close to 100% success rate for DWH projects. In my experience, the reason failures or 'semi-failures' can go unnoticed is that senior management either do not realize what happened, or are not, say, "motivated" to talk about misspent company funds. As a result, failures are not studied enough. Maybe we as consultants have a stake in this as well, because it keeps plenty of business in the industry going ... :-)

5. What Is Different About Web Data Warehousing?

Kimball & Merz (2000): "Although this clickstream data is in many cases raw and real, it has the potential to provide unprecedented detail about every move made by every human being using the Web medium." The subatomic nature of clickstream data raises unique challenges. There are few built-in feedback mechanisms to ensure data quality, compared with other data streams. The relationship between a user's mouse clicks and the server log records is not as tight as in "traditional" transaction processing, because of technical issues such as proxy servers and caching. Because of these differences, IT people need to adjust to the web process flow, rather than the process adapting to the needs of IT, as is common for the interfaces of most other DWHs.
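
One common way to impose some structure on raw clickstream records is to group hits into sessions. The sketch below is only an illustration under stated assumptions: hits are already parsed into (visitor_id, timestamp) pairs, the visitor is identified by IP address, and a 30-minute inactivity cut-off ends a session. None of these choices come from Kimball & Merz. Because of proxies and caching, several people may share one address and some clicks never reach the server log, so the resulting sessions are approximations at best.

```python
from datetime import datetime, timedelta

# Assumed inactivity cut-off; a convention chosen for this sketch only.
SESSION_TIMEOUT = timedelta(minutes=30)

def sessionize(hits):
    """Group (visitor_id, timestamp) hits into sessions per visitor.

    A new session starts whenever the gap since the visitor's previous
    hit is longer than SESSION_TIMEOUT.
    """
    sessions = {}  # visitor_id -> list of sessions, each a list of timestamps
    for visitor_id, ts in sorted(hits, key=lambda hit: (hit[0], hit[1])):
        visitor_sessions = sessions.setdefault(visitor_id, [])
        if visitor_sessions and ts - visitor_sessions[-1][-1] <= SESSION_TIMEOUT:
            visitor_sessions[-1].append(ts)   # continue the current session
        else:
            visitor_sessions.append([ts])     # start a new session
    return sessions

# Made-up hits; the IP addresses stand in for visitor identifiers.
hits = [
    ("10.0.0.1", datetime(2010, 8, 13, 9, 0)),
    ("10.0.0.1", datetime(2010, 8, 13, 9, 10)),
    ("10.0.0.1", datetime(2010, 8, 13, 11, 0)),
    ("10.0.0.2", datetime(2010, 8, 13, 9, 5)),
]
sessions = sessionize(hits)
for visitor_id, visitor_sessions in sessions.items():
    print(visitor_id, [len(session) for session in visitor_sessions])
```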

6. What Data Should Be Contained in the Data Warehouse?

The data coming into the DWH ultimately determines its place in the organization. A "let's load all the data, to be safe" attitude is a sure way to derail your DWH project. Choices about what should and should not be included need to be made from the beginning, so that the project stays manageable. After the proven success of a delivered, deployed, and profitably exploited DWH, there will always be room to fund previously neglected interfaces. Given the anticipated life cycle of the DWH, it makes sense to consciously exclude certain sources. Choices about what data to include need to be driven by business considerations, with particular reference to the company's bottom line. If you cannot show how the data will be used profitably, it stays out! See also Tip 3.

7. Data Warehousing & Company Politics

Data warehouses have an impact on corporate bottom lines. Therefore, they may become the subject of turf battles, and are also at risk of being treated as "small change" when budget allocations are negotiated. Neither takes the benefits for long-term corporate goals into consideration. Managing DWH projects is difficult enough as it is, and budget issues should not make it any harder. Because the DWH investment is made now and the returns lie in the future, it is even more important to secure funds through a sound business case and buy-in from the appropriate (high) level of management. See also Tip 3. Access to data means power, and talking about power is one of the greatest taboos in management that still exist. Sensitive as they are, even budgets are more easily discussed ...

8. Data Warehouse Project Pitfalls

Some 'frequently recurring roadblocks' on the route to on-time delivery of a data warehouse project:

    * The ETL process has eaten up so much time (and still needs "babysitting") that little if any time is left to develop the applications required to exploit the DWH
    * Some of the data needed turns out not to be available, or not in time
    * The maintenance required for tuning, indexing, and backup and recovery is badly underestimated
    * Various ways of calculating the same phenomenon lead to different results, and no one can convincingly explain the difference(s)
    * Data that is loaded (and recombined) turns out to contain previously unknown inconsistencies in the source systems, the 'classic' data quality problems that trip up DWH projects
    * Metadata is lacking, and developers spend a lot of time figuring out what the fields 'really mean'
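
The data quality pitfall in particular can be caught earlier by profiling each batch before it is loaded. The following is a minimal sketch, assuming rows arrive as dictionaries; the function name profile_rows, the field names, and the sample batch are hypothetical. It reports duplicate business keys and counts of missing values, two of the most common source-system inconsistencies.

```python
from collections import Counter

def profile_rows(rows, key_field, required_fields):
    """Report duplicate keys and per-field missing-value counts for a batch."""
    key_counts = Counter(row.get(key_field) for row in rows)
    duplicate_keys = sorted(
        key for key, count in key_counts.items()
        if count > 1 and key is not None
    )
    # Treat None and the empty string as "missing".
    missing = {
        field: sum(1 for row in rows if row.get(field) in (None, ""))
        for field in required_fields
    }
    return {"duplicate_keys": duplicate_keys, "missing": missing}

# Made-up batch with one duplicate key and two missing values.
batch = [
    {"customer_id": 1, "name": "Acme", "country": "NL"},
    {"customer_id": 2, "name": "", "country": "NL"},
    {"customer_id": 2, "name": "Globex", "country": None},
]
report = profile_rows(batch, "customer_id", ["name", "country"])
print(report)
```

A check like this does not fix the source system, but it turns a surprise after loading into a known issue before loading.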


9. DWH Hardware and Software Go Hand in Hand

Data Warehousing is not about the hardware, and not about the software: it is about the perfect integration of the two. Those who start their projects from either end will pay dearly for this mistake. The reasons are:

    * In terms of price/performance, new, pre-integrated hardware-software combinations take the lead
    * From a project management perspective, you do not want to be caught between vendors when a proposed solution does not work as expected
    * Database tuning and indexing is very important and very complex work that should be left to (in-house trained) specialists
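
To give a feel for why indexing matters, the sketch below uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN statement to compare the plan for one query before and after an index is created. The table fact_sales, the index name, and the generated rows are hypothetical. A real DWH platform has its own tuning tools and far more options, so treat this as a toy illustration rather than tuning advice.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return SQLite's query plan for a statement as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable description is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_sales (customer_key INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(10_000)],
)

sql = "SELECT SUM(amount) FROM fact_sales WHERE customer_key = ?"

# Without an index, SQLite has to scan the whole table.
plan_before = query_plan(conn, sql, (42,))

# With an index on the filter column, SQLite can search instead of scan.
conn.execute("CREATE INDEX idx_fact_sales_customer ON fact_sales (customer_key)")
plan_after = query_plan(conn, sql, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan text differs between SQLite versions, but the first plan should describe a scan and the second should mention the index.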

10. Performance Is Key

Although I do not often find technology to be a decisive factor in the acceptance of a Data Warehouse, no other technical factor will be as important as performance. As the size of the DWH increases over time, this factor becomes more important. There are three reasons for this:

   1. performance has a major impact on development speed (the initial load is always very time consuming), and therefore on the overall maturity of the DWH at the time of delivery
   2. performance can make or break end-user acceptance, particularly the predictability of performance
   3. performance has a tremendous impact on end-user productivity, the main driver of business pay-off

By Tom Breur, Article Source:
http://EzineArticles.com/?expert=Tom_Breur
