'Unleashing the Value of Information'







Data Investigator Product Description


Data Investigator is a tool for profiling, mapping and analysing data sources, detecting faults in data values, and correcting them. Data Investigator differs from other profiling tools in that it:

•   Is extremely easy to use
•   Is targeted predominantly at data owners (typically business users) whilst satisfying the functional needs of the IT department
•   Provides both profiling/mapping and cleansing capabilities in a single tool, empowering authorised users to take immediate action
•   Accesses data directly on the source systems rather than extracting it for subsequent analysis, resulting in excellent performance and ensuring currency of information
•   Requires no additional hardware or operations intervention

  Dirty Data Costs Money

Industry analysis shows that inconsistent and inaccurate data is costing organisations millions of pounds each year. This manifests itself in a number of ways:

•   Customers suffering poor service
•   Decisions being made using inappropriate information
•   Sub-optimal use of marketing resources
•   Inefficiencies in operational processes
•   Highly qualified IT resources being diverted from other high value activities

Given that data drives business, it is imperative that companies have confidence in their data.

Now, there are no excuses for Dirty Data




Data Investigator is architected as an Active Server Pages (ASP) application that runs under the control of Microsoft’s Internet Information Services (IIS). Data is accessed via a combination of OLE DB providers and ODBC drivers, which enables connectivity to virtually any data source, for example flat files, Excel, Oracle and UDB. User interaction is via a web browser. The following diagram depicts the overall architecture.

As shown above, Data Investigator comprises a basic component and a Business Rules Engine, both surrounded by a comprehensive security and operational layer. Authorised users can access, interrogate and modify data via a simple point & click user interface. Security is provided at two distinct levels: first by the underlying database, and second by Data Investigator’s own user authorisation privileges. Batch processes are managed by the internal scheduler.

  Functional Overview

The following paragraphs highlight some of the key functional aspects of Data Investigator.

Access to data sources is controlled via a simple parameter file that specifies the data source name, description and location.
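As an illustration only, the sketch below shows how a parameter file of this kind might be read. The INI-style layout, the section and key names, and the `DataSource` and `load_data_sources` names are all assumptions made for the example; the actual Data Investigator file format is not described here.

```python
import configparser
from dataclasses import dataclass

# Hypothetical parameter file: one section per data source, holding the
# description and location mentioned in the text. The layout is assumed.
SAMPLE_PARAMETERS = """
[customers]
description = Customer master file
location = DSN=CustomerDB

[orders]
description = Daily order extract
location = data/orders.csv
"""


@dataclass(frozen=True)
class DataSource:
    name: str
    description: str
    location: str


def load_data_sources(text: str) -> dict[str, DataSource]:
    """Parse the parameter text into DataSource entries keyed by name."""
    # Interpolation is disabled so that '%' in a location is taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    sources = {}
    for name in parser.sections():
        section = parser[name]
        sources[name] = DataSource(
            name=name,
            description=section["description"],
            location=section["location"],
        )
    return sources


sources = load_data_sources(SAMPLE_PARAMETERS)
for source in sources.values():
    print(f"{source.name}: {source.description} ({source.location})")
```

Only the sources listed in the file are visible to the rest of the program, which mirrors the idea of the parameter file acting as the gate to each data source.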


Once a user has been granted access to a data source via the parameter file, all subsequent information is derived automatically from the system. By default, each field is summarised by its 10 most commonly used values, together with its top 10 and bottom 10 values. The user can easily change this default and re-display the results. The product allows users to visually inspect and update data in situ, and to drill down to complete individual records to provide insight and context for those values.
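The default view described above can be approximated in a few lines. This is a minimal sketch, not the product's implementation: `profile_column` is a hypothetical name, and "top" and "bottom" are assumed to mean the highest and lowest distinct values in sort order.

```python
from collections import Counter


def profile_column(values, limit=10):
    """Summarise a column: most common values, plus lowest and highest values."""
    counts = Counter(values)
    distinct = sorted(counts)  # distinct values in ascending order
    return {
        "most_common": counts.most_common(limit),  # (value, frequency) pairs
        "bottom": distinct[:limit],                # lowest distinct values
        "top": distinct[-limit:][::-1],            # highest distinct values, largest first
    }


# Illustrative data: 999 and -1 stand out as suspect ages.
ages = [34, 34, 27, 999, 41, 27, 34, -1, 58]
profile = profile_column(ages, limit=3)
print(profile["most_common"])
print(profile["bottom"])
print(profile["top"])
```

Even on this small sample the extremes of the sorted values expose the blatant errors, which is why a top and bottom listing is a useful first look at any field.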



Authorised users can easily correct erroneous data, and an audit trail of all updates is maintained, including before and after images of the data. Where a user does not have the required privilege, Data Investigator can automatically generate the necessary SQL statement, which can be e-mailed to the Database Administrator to run.
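The idea of pairing a generated SQL statement with a before and after image can be sketched as follows. All names here (`build_correction`, `AuditEntry`, `sql_literal`, the table and column names) are hypothetical, and the sketch assumes that table and column identifiers come from a trusted source, since they are not validated.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


def sql_literal(value):
    """Render a Python value as a SQL literal (strings quoted and escaped)."""
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


@dataclass
class AuditEntry:
    user: str
    table: str
    column: str
    key_value: object
    before: object   # before image of the corrected value
    after: object    # after image of the corrected value
    changed_at: str


def build_correction(user, table, column, key_column, key_value, before, after):
    """Return the UPDATE statement for a correction and its audit entry."""
    # Guarding on the before image stops the update if the value has changed.
    if before is None:
        guard = f"{column} IS NULL"
    else:
        guard = f"{column} = {sql_literal(before)}"
    statement = (
        f"UPDATE {table} SET {column} = {sql_literal(after)} "
        f"WHERE {key_column} = {sql_literal(key_value)} AND {guard};"
    )
    entry = AuditEntry(
        user=user, table=table, column=column, key_value=key_value,
        before=before, after=after,
        changed_at=datetime.now(timezone.utc).isoformat(),
    )
    return statement, entry


statement, entry = build_correction(
    "jsmith", "customers", "town", "customer_id", 1042, "Mancester", "Manchester"
)
print(statement)
print(entry)
```

The statement text is what would be passed on for someone with the required privilege to run, while the audit entry records who requested the change and what the value was before and after.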

In addition to the base product there is a comprehensive business rules engine. The product comes with a number of pre-defined validation rules, such as Value Distribution, Patterns, Postcodes and VAT Numbers. Users can also generate their own validation rules, including comprehensive pattern matching on any data element and field value comparisons across multiple files/databases. In addition, Data Investigator supports the definition of company-specific business rules, which can be developed on an as-required basis.
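One common way to implement pattern validation is to reduce each value to a shape, with letters shown as A and digits as 9, and then compare that shape against the shapes expected for the field. The sketch below takes this approach; it is an assumption about how such a rule could work, not a description of the product's rules engine, and the function names and the set of postcode shapes are chosen for illustration.

```python
from collections import Counter


def value_pattern(value: str) -> str:
    """Reduce a value to its shape: letters become A, digits become 9."""
    shape = []
    for ch in value:
        if ch.isalpha():
            shape.append("A")
        elif ch.isdigit():
            shape.append("9")
        else:
            shape.append(ch)  # spaces and punctuation are kept as-is
    return "".join(shape)


def non_conforming(values, expected_patterns):
    """Return the values whose shape is not one of the expected patterns."""
    return [v for v in values if value_pattern(v) not in expected_patterns]


# Shapes assumed for UK postcodes in this example.
POSTCODE_PATTERNS = {
    "A9 9AA", "A99 9AA", "AA9 9AA", "AA99 9AA", "A9A 9AA", "AA9A 9AA",
}

postcodes = ["CW6 9SA", "M1 1AE", "CW69SA", "12345", "SW1A 1AA"]
pattern_counts = Counter(value_pattern(p) for p in postcodes)
invalid = non_conforming(postcodes, POSTCODE_PATTERNS)
print(pattern_counts)
print(invalid)
```

A shape check only confirms the layout of a value; it does not confirm that a postcode or VAT number actually exists.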

Shown below are example value distributions for alphanumeric and numeric fields. Date/time values are also supported.

  Typical Data Errors

Data errors typically fall into the following five categories:

•   Blatant
    -   Highlighted by displaying the most common, top and bottom values (e.g. invalid times, dates, ages, etc.)
•   Spelling / Duplicates
    -   Identified by displaying adjacent fields with similar values (e.g. names, descriptions, addresses, titles, activities, etc.)
•   Outside of Expected Range
    -   Data content analysed to show data value distribution. Values outside of the expected range are easily identified (e.g. age, amounts, dates, weights, distance, etc.)
•   Invalid Patterns
    -   Data patterns (alpha, numeric, etc.) automatically analysed and non-conforming values identified (e.g. structured fields, Postcodes, Product codes, Barcodes, VAT numbers, etc.)
•   Business Rules
    -   Records which conform to, or fail, predefined business rules are highlighted for further investigation (e.g. Account records without a corresponding Customer record, etc.)
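Two of these categories lend themselves to a short worked example: values outside an expected range, and the business rule that every Account record must have a corresponding Customer record. The sketch below uses an in-memory SQLite database purely for illustration; the table names, column names, sample rows and the 0 to 120 age range are all assumptions.

```python
import sqlite3


def outside_range(values, low, high):
    """Return the values that fall outside the inclusive range low..high."""
    return [v for v in values if not (low <= v <= high)]


ages = [34, 27, 999, 41, -1, 58]
suspect_ages = outside_range(ages, 0, 120)
print(suspect_ages)

# Business rule: find Account records without a corresponding Customer record.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE account (account_id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customer VALUES (1, 'Alpha Ltd'), (2, 'Beta plc');
    INSERT INTO account VALUES (100, 1), (101, 2), (102, 7);
    """
)

# A LEFT JOIN keeps every account; rows with no matching customer come back
# with NULL in the customer columns, which is what the WHERE clause selects.
orphans = conn.execute(
    """
    SELECT a.account_id, a.customer_id
    FROM account AS a
    LEFT JOIN customer AS c ON c.customer_id = a.customer_id
    WHERE c.customer_id IS NULL
    ORDER BY a.account_id
    """
).fetchall()
conn.close()
print(orphans)
```

Here account 102 refers to customer 7, which does not exist, so it is the record that would be highlighted for further investigation.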


Data Investigator delivers benefits to both business users and IT/IS departments by:

•   Reducing time and cost of audits, data investigations and corrections;
    -   quick and easy online analysis, profiling and correction of data content
•   Increasing confidence in data quality;
    -   proactive assessment and re-instatement of data accuracy
•   Increasing staff productivity;
    -   simple, point & click interface enables easy access to disparate data sources without the need for specialist technical expertise
•   Reducing data integration project risks and costs;
    -   Standish Group claim 88% of all data integration projects will fail or overrun
    -   errors detected during testing can cost up to 100 times more to correct than the same error found during design
    -   easy, dynamic analysis and verification of test results
•   Reducing time and cost of system recovery;
    -   rapid elimination of data issues during problem determination


Data Investigator has a myriad of practical applications, including:

•   Business function support
    -   Investigating content & accuracy of functional data resources
    -   Analysing data patterns, content ranges, etc.
    -   Identifying & fixing inconsistencies in data across related systems
•   Internal / External Audit
    -   Gain easy access to data sources and investigate content & accuracy
    -   Test security and access control to production data sources
    -   Reduce the reliance on IT to provide information
•   Enterprise-wide data quality management
    -   Conducting comprehensive or targeted data audits/assessments
    -   Identifying & repairing major data errors and establishing the source of corruption
    -   Identifying & fixing inconsistencies & anomalies in critical data elements
•   Due diligence managers
    -   Quickly analyse the suitability of the systems of potential acquisitions
    -   Maintain a metadata repository for further analysis
•   Production system support
    -   Investigating & resolving data-related production system failures
•   Application development support
    -   Test data creation, modification and verification
    -   Comparing and/or verifying data values during unit, system & integration testing
    -   Establishing cause of data-related program failures
    -   Identifying & fixing data quality issues prior to major testing phases & implementation

  How to Order

For more information, telephone Steve Kibble on +44 (0)7919 322540 or download DataInvestigator Overview.doc.

KDI Applications Limited
The Dairy, Wrexham Road, Ridley, CW6 9SA
Tel: +44 (0)7919 322540