When I google around and read about data warehousing, I am amazed by how many aspects the subject has, and by how many discussions each of them generates. Do we still know why we have all these discussions? What is so important about architecture, agile, data lakes, Data Vault 2.0, DevOps, and so on?
In my opinion, we discuss and implement all these theories, methodologies and frameworks in order to design and build 'standardized' data warehouse solutions that are maintainable and sustainable in the long run.
As an exercise over Christmas 2017, I created a mind-map containing many “keywords” buzzing around on fora, LinkedIn and articles related to data warehousing.
It is not my intention to discuss the details of this mind-map, nor to be complete or correct. But I would like to point something out. For a long time, I have had a nagging feeling that some aspects are ignored or hardly talked about, aspects that are at least as important as the ones generating the buzz right now.
For unknown reasons, we do not have firm theories about control management. How do we make systems that are easy to maintain? Where are the execution frameworks? Which methodology should we follow to implement an 'abstraction' for the control of process-execution dependencies, restartability, crash recovery, and so on?
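To make the question a bit more concrete, here is a minimal sketch of what such an 'abstraction' might look like. It is not an established framework, only an illustration: the function `run_pipeline`, the step names and the in-memory `done` set are all hypothetical, and a real solution would at least persist the checkpoint somewhere durable. Steps run in dependency order, a step is recorded as done only after it succeeds, and a rerun skips everything already recorded.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_pipeline(steps, deps, done):
    """Run steps in dependency order, skipping steps already in `done`.

    steps: dict mapping a step name to a callable (hypothetical names)
    deps:  dict mapping a step name to the set of steps it depends on
    done:  set of completed step names; this is the checkpoint, and it
           would have to be persisted to survive a real crash
    """
    executed = []
    for name in TopologicalSorter(deps).static_order():
        if name in done:
            continue            # finished in an earlier run, do not repeat
        steps[name]()           # an exception aborts the run at this step
        done.add(name)          # record success only after the step completes
        executed.append(name)
    return executed


# Demo: 'load_vault' crashes on its first attempt and succeeds on the second.
attempts = {"load_vault": 0}


def load_vault():
    attempts["load_vault"] += 1
    if attempts["load_vault"] == 1:
        raise RuntimeError("simulated crash")


steps = {
    "load_staging": lambda: None,
    "load_vault": load_vault,
    "build_mart": lambda: None,
}
deps = {
    "load_staging": set(),
    "load_vault": {"load_staging"},
    "build_mart": {"load_vault"},
}

done = set()
try:
    run_pipeline(steps, deps, done)      # first run stops at 'load_vault'
except RuntimeError:
    pass

second = run_pipeline(steps, deps, done)  # restart resumes after the checkpoint
```

The point of the sketch is that the steps themselves know nothing about ordering or restarts; that knowledge lives in one place. Whether that is the right place, and how it should be standardized, is exactly the open question.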
And we talk a lot about data quality and control, but where are the frameworks? Where are the theories and methodologies that we can actually implement in a solution?
I am curious: am I the only one who sees this? Do you also recognize these 'neglected' aspects, and what are you doing about them?
I know there are a lot of question marks in this article. My intention is to publish more on these 'neglected' aspects and fuel the discussion.