IBM Watson Data & AI - Structured Ideas

Run custom scripts in Python and R to define data rules for profiling / Data Quality management

Add the ability to write or add custom scripts in Python and R that execute more complex data rules for data analysis and data preparation. Currently in IIS we use DataStage to run the complex operations that transform the data before Data Quality rules are executed in Information Analyzer, which then produces a set of results showing which records comply with each rule and which do not. There is also a 5K record limitation in Data Refinery. For DQ, there are specific use cases where you need to deep dive into the full data set for specific columns that have not passed the initial level of compliance.

We would like to migrate from IIS to the Data Refinery and WKC tools to support our Chief Data Office Data Quality program.  
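
To make the request more concrete, here is a minimal sketch of what a scripted data rule could look like in plain Python. It is only an illustration under stated assumptions: the `Rule` type, the `evaluate` helper, the two rule functions, and the column names (`email`, `total`, `line_items`) are all hypothetical and are not part of Data Refinery, WKC, or Information Analyzer. The point is that each rule is an ordinary function, and that the evaluation walks every record rather than a sample.

```python
import re
from typing import Callable, Dict, Iterable, List

# Hypothetical rule type: takes one record (a dict) and returns True if it complies.
Rule = Callable[[dict], bool]

# Deliberately simple pattern for illustration, not a full email validator.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def email_is_valid(record: dict) -> bool:
    """Single-column rule: the email column must look like an address."""
    value = record.get("email")
    return isinstance(value, str) and EMAIL_PATTERN.match(value) is not None


def total_matches_line_items(record: dict) -> bool:
    """Cross-column rule: the total must equal the sum of the line items."""
    return record.get("total") == sum(record.get("line_items", []))


def evaluate(records: Iterable[dict], rules: Dict[str, Rule]) -> Dict[str, dict]:
    """Apply every rule to every record and collect pass/fail results.

    Records are consumed one at a time, so the input can be a generator
    over the full data set instead of an in-memory sample.
    """
    results: Dict[str, dict] = {
        name: {"passed": 0, "failed": 0, "failed_rows": []} for name in rules
    }
    for row_number, record in enumerate(records):
        for name, rule in rules.items():
            if rule(record):
                results[name]["passed"] += 1
            else:
                results[name]["failed"] += 1
                results[name]["failed_rows"].append(row_number)
    return results


# Made-up sample data, for illustration only.
records: List[dict] = [
    {"email": "a@example.com", "total": 30, "line_items": [10, 20]},
    {"email": "not-an-email", "total": 30, "line_items": [10, 20]},
    {"email": "b@example.com", "total": 25, "line_items": [10, 20]},
]

rules: Dict[str, Rule] = {
    "email_is_valid": email_is_valid,
    "total_matches_line_items": total_matches_line_items,
}

report = evaluate(records, rules)
for name, outcome in report.items():
    print(name, outcome)
```

The `failed_rows` list is what would support the deep dive described above: once a column fails the initial level of compliance, the non-compliant rows can be pulled from the full data set for closer inspection.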

  • Sonia Mezzetta
  • Dec 14 2018
  • Under Consideration
  • Admin
    SUSANNA TAI commented
    December 14, 2018 17:58

    There are multiple requirements in here.  I'd break them down into:

    1. Support custom scripts in Python and R to execute more complex data rules to prepare data (Data Refinery)

    2. Allow profiling on the full data set - here's the Aha epic --> https://bigblue.aha.io/features/CLASS-55

    Need to confirm with Raj Rikhy or Sonia