I was recently asked by one of our clients whether they should start building a data warehouse at one of the companies they acquired or not, and if yes, what are the arguments in favor of building it. This conversation inspired me to write this article.
(Note: I used to work as an economics and business professional before switching to IT and Business Intelligence. In this article I try to build a bridge between economics and IT, so apologies to all the IT professionals if, for the sake of non-IT readers, I simplify or even over-simplify some of the things mentioned here.)
When is NOT the right time to start building a data warehouse?
If you just started creating reports, building a reporting system
If you are just starting to put together your first real reports, it is very likely that even you do not know yet from which data and exactly how you should come up with the numbers you need to present to management. In this case, it is not worthwhile to start dealing with data warehousing or Business Intelligence yet. Most likely, in the process of reporting and analyzing the data, you will already get to a point, when:
- You will unify and simplify, for example by developing a common data dictionary
- You will do some OLTP-based development, for example unify the data collection (eg by implementing or expanding an ERP system) and move it to a common platform
- The content of the reports may not be accurate yet, so the purpose of the reports themselves will change constantly.
If you have already started creating reports, but for now a few (let’s say 3-4 complex or 5-6 simple) reports is enough to cover all the tasks required by your company.
If the size of the data is not very large (and at least one of the previous two scenarios are true)
If the frequency of data requests and the size of data related to them is NOT overloading the source systems (eg an ERP system) during the reporting
When SHOULD YOU start thinking about building data warehouses?
If you are drowning in the sea of Excel and / or Power BI reports
If you need a historical aspect to your corporate data
If you need a stable, fast connection for data of different quality, source and depth
The size of the data, the time allowed for report creation, or the frequency of the reports requires it
You need to run queries frequently and / or they are so big in size that they put a heavy load on the source system engine (e.g. a database engine)
In such cases, it is logical to start looking for a suitable data warehouse solution. Also, if the frequency of the reports is a pain point, you might check out our “Near Real-time data warehouse” blog article, that describes in detail how Abylon Rapid Analytics clears this hurdle.
Author of the post:
Zsolt Kreisz - BI Developer at Abylon Consulting Linkedin Profile