A self-service library of predefined data quality rules is needed to jump-start a data quality initiative. Data quality has become more important as the volume of data we work with grows. Creating new data quality rules from real business logic is a very time-consuming activity, because that logic is usually collected manually from SMEs or from documentation. A general set of DQ rules by domain (customer, product, finance, etc.) that can be leveraged, similar to the out-of-the-box rules provided by "data classifications" but based on business data rules, would address this use case and allow data to be managed accurately and efficiently. The ability to embed machine learning that learns typical DQ rules and generates new ones is also necessary to scale and keep up with the massive amounts of data ingested each day.
Why is it useful?
|Who would benefit from this IDEA?|As the Data Governance Product Owner within the Chief Data Office, I would like to migrate from Information Analyzer to Data Refinery and from IGC to WKC so that I can manage all my business and technical Data Quality / Data Preparation activities within our Cognitive Enterprise Data Platform, execute our CDO DG services using Data Refinery and WKC, and provide the necessary functionality for our data producers and data consumers.|
How should it work?
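One possible shape for such a predefined rule library, sketched in Python. All names, domains, and rules here are hypothetical illustrations of the idea (a domain-keyed catalog of out-of-the-box rules applied to column data), not an actual product API.

```python
# Hypothetical sketch of a domain-keyed library of predefined data quality
# rules; every name and rule definition below is illustrative only.
import re

# Predefined rules grouped by business domain, analogous to out-of-the-box
# "data classifications" but driven by business data rules.
RULE_LIBRARY = {
    "customer": {
        # Simple illustrative e-mail shape check, not a full RFC validator.
        "email_format": lambda v: bool(
            re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", str(v))
        ),
        "not_null": lambda v: v is not None and str(v).strip() != "",
    },
    "finance": {
        "non_negative_amount": lambda v: float(v) >= 0,
    },
}

def evaluate(domain, column_values, rule_name):
    """Apply one predefined rule to a column and return its pass rate."""
    rule = RULE_LIBRARY[domain][rule_name]
    results = [rule(v) for v in column_values]
    return sum(results) / len(results)

# Example: score a customer e-mail column against the predefined rule.
emails = ["a@example.com", "bad-address", "b@example.org"]
print(round(evaluate("customer", emails, "email_format"), 2))
```

A machine-learning layer could sit on top of such a catalog, profiling ingested data to propose new candidate rules per domain that a steward then promotes into the library.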