IBM Cloud Databases - Structured Ideas

This Ideas portal is being closed. Please enter new ideas at

Run custom scripts in Python and R to define data rules for profiling / Data Quality management

Add the ability to write custom scripts in Python and R that execute more complex data rules for data analysis and data preparation activities. Currently in IIS we use the DataStage tool to run complex data operations that transform the data in preparation for executing Data Quality rules with the Information Analyzer tool, producing a set of results that either comply with a rule or do not. There is also a 5,000-record limitation in Data Refinery; for Data Quality there are specific use cases where you need to deep-dive on the full data set for columns that have not passed the initial level of compliance.

We would like to migrate from IIS to the Data Refinery and WKC tools to support our Chief Data Office Data Quality program.  
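To illustrate the kind of custom rule the request describes, here is a minimal sketch in plain Python (all names are hypothetical, not an existing Data Refinery or WKC API): a rule is a predicate applied to a column across every record, and the result separates compliant from non-compliant rows so the failing rows can be deep-dived on the full data set rather than a sample.

```python
# Hypothetical sketch of a custom column-level data rule.
# A "rule" is any predicate applied to one column of every record;
# the summary keeps the full set of failing rows for deep-dive analysis.

def run_data_rule(records, column, predicate):
    """Apply `predicate` to `column` of each record; return a compliance summary."""
    passed, failed = [], []
    for row in records:
        (passed if predicate(row.get(column)) else failed).append(row)
    total = len(records)
    return {
        "compliance_rate": len(passed) / total if total else 1.0,
        "failed_rows": failed,  # every non-compliant row, not a 5K sample
    }

# Example rule: account IDs must be 8-digit strings.
rule = lambda v: isinstance(v, str) and v.isdigit() and len(v) == 8

data = [
    {"account_id": "12345678"},
    {"account_id": "ABC123"},
    {"account_id": "87654321"},
]
result = run_data_rule(data, "account_id", rule)
```

In this sketch `result["compliance_rate"]` is 2/3 and `result["failed_rows"]` holds the single non-compliant record; an R equivalent would be a vectorized predicate over a data frame column.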

  • Sonia Mezzetta
  • Dec 14 2018
  • Under Consideration
Why is it useful?
Who would benefit from this IDEA? As the Data Governance Product Owner within the Chief Data Office, I would like to migrate from Information Analyzer to Data Refinery and from IGC to WKC so that I can manage all business and technical Data Quality and data preparation activities within our Cognitive Enterprise Data Platform, execute our CDO DG services using Data Refinery and WKC, and provide the necessary functionality for our data producers and data consumers.
How should it work?
Idea Priority High
Priority Justification
Customer Name
Submitting Organization Other
Submitter Tags
  • Attach files
  • Susanna Tai commented
    December 14, 2018 17:58

    There are multiple requirements in here. I'd break that down into:

    1. Support custom scripts in Python and R to execute more complex data rules to prepare data (Data Refinery)

    2. Allow profiling on the full data set - here's the Aha epic -->

    Need to confirm with Raj Rikhy or Sonia