Data warehouse – Test Data Mangement

Posted on Updated on

Test Data Management

Hi All,

Hope all are doing great….it’s very long time since I posted through my blog… my sincere apologies  😦

Happy Reading……:-)


Test Data availability is the one of and most significant issue that will lead to Schedule slippage in DWH projects.

So the Testers and Test Managers should know what all are requirements for the test data and they need to define the test data strategy.

The Test Data Management should cover the below points:

  1. Subset of PROD data catered for Testing Requirements

 Suppose if you are working in an Enhancement projects then you can avail the existing data that is already loaded into warehouse. In this instance you can’t use the data that’s pulled directly from Production as it’s sensitive to the Bank and its Customers.

  1. Mask the data as per regulatory and data privacy requirements 

All the customer information related information should be masked. It is difficult to test the Masked data, suppose if the Name column should accept 20 (CHAR) and if the data is masked then we cannot do the BVA for the masked column.

  1. Fabricate Test Data in case of un availability 

Your source system cannot provide you the data for all your Testing Scenarios. Lack of Test data is the major issue in the Data warehouse projects. A tester should analyze the system and need to create the test data in the form of Insert Scripts. The test data should be extracted correctly by the query before its loaded into the Target tables.

Typical Challenges involved in Test Data Management:

  1. Delay in Schedule

The test data from Upstream should be their UAT else our testing will not effective, Suppose if we are in SIT and our Up Stream have not Completed UAT then our Schedule will be postponed as unavailable of UAT tested data from Up Stream.

  1. Issues to cover Business Cases

Test Data for few Business Scenarios cannot be produced in Up Stream or through Data fabricating. Take this example, reopening of an Account, a Customer was having an account and he closed the same few years before and his Records are Kept Active in ACCT_STATUS Closed Records. When the Customer comes to Bank and the Bank will somehow Identify his existing account and will try to Reopen across all the tables to stop not having one more record in the warehouse, for this kind of scenarios it’s difficult to manufacture the data in a given time.

  1. Data Privacy & Compliance

Banks are curious about the Customer Data because of the Data theft in any means. So when we will get the data from PROD for our Regression Testing, most of the Customer related data will be masked as per Compliance policy. So we cannot produce the data or test those scenarios in our Environment.

  1. Dependency of SME’s

SME’s availability is a biggest issue in Data warehouse world.

System Architect – He is accountable for the System Design based on the Business Requirements. He will be closely working with Data Modeler (Data Scientist) to design the Source to Target. System Architect needs to fit the current requirement into the Existing or he should create an Application Landscape for the new requirement.

Data Modeler – He/She is accountable for designing the S2T. Data Modeler should know the Physical and Logical Design of the System and Database. For any Data warehouse project, the S2T is the important document.

ETL Developers – He/She should be accountable for High Level and Low Level design of the Requirements into Code. An ETL developer should be capable enough to design the ETL Code without compromising the performance of the Extraction and Lodging mechanism. He/She should know what kind of Transformation mechanism should be designed for the particular requirement.

Test Analyst / Test Managers – Test Managers should foresee all the technical and business requirements can be tested in given time frame, because the Test Data and System Behaviors might be changing which might cause the Schedule slippage.

  1. Availability of Adequate Data

We should start our System Testing with the Source System’s UAT Tested Data. If in case due to some issue if the Source System is not completed their System Testing, and they can provide only their System Tested data, then these data is not sufficient to continue our System Testing, so we need to fabricate the Test Data in order to kick off out Testing.

  1. Volume of Test Data

The Data threshold requirement will be declared by the System Owners. Whereas we cannot create that declared volume of data in our environment and test the performance beyond the threshold limit and below the threshold limit, so there might be Spool Space issues when we go to Production.

  1. Multiple Source & format of data

In case of Multiple Source Systems are participating in ETL, then it’s difficult to get the Data from different source systems on the same time to being our Testing, in this case again we need to create the Mock Test files to being our testing.

 8.     Version Control of Data

Versioning of Data is very difficult in the Data warehouse world whereas we can see the history of the data using few housekeeping columns (EFFT_D, EXPY_D). But the data loads and extracts should be versioned to avoid the confusions.




Chapter 10 – Developer and Tester Healthy Environment by Tom and Jerry Style

Posted on

Hi All,

I hope you guys are tired of reading stories behind the Business Intelligence Testing ! to make us relax Tom and Jerry gonna play as Developer and Tester role !!

Please have a look and enjoy yourself !!!

Project Starts Here 



Project Discussion 



Development phase – Coding and Unit testing




Testing Phase


The code deployment !


Defect reported by Tester 



Development Lead talks to Developer about Defect



Developer is telling it is not a defect


Tester is explaining it is a Defect



Developer fixed the Defect :



The defect is fixed and retested 



Developer and Tester have gone for Holiday after the deployment 


After they come back ! they came to know the project is running successfully in the Production 



Hope you enjoyed the Show – Catch you all Jerry’s in my next blog.

Chapter 9 – Understanding SQL by Selecting Indian Cricket team for South Africa tour using SQL

Posted on Updated on

Hi All,

Hope you have understood concepts what I discussed so far. Thanks for reading my blog !!!

Today I wanted to explain how we can write SQL – When I was new to Data warehouse , I asked Google about SQL. She gave so many links about SQLs. Hope you guys also have done the same. Now I will explain in my way about all the techniques in SQL using our Indian Cricket team.

Anyone wants to know What is Structured Query Language (SQL) then please ask Google, she will explain better than me 🙂

I will explain how we can easily understand and use SQL effectively

In SQL you will have 3 Segments

Select Clause –

1.You can mention the columns from the tables

2.You can set Conditional statements , Case When, IF and Else, Counters etc.,

3.You can use SUB STRING, CASTE Functions, TRIM functions etc.,

From Clause –

1. Here you need to mention your Database.Table Name (alias name can be given for the tables)

Where Clause – 

1. Here you need to mention what all records you need from your tables , using the relational operators (=,<>,>=,<= ,and, or)

Now you got the result using Select From Where clause – do you want to join this result with you neighbor table then use the Joins (Inner Join, Left Join, Right Join, Cross Joins)

Indian board of directors wants to see the eligible players from different states:


In India we have 28 States and 7 Union Territories,each state has their own Cricket Club and connected to Indian Cricket board.

Each State team details will be kept in their own Tables and Director of Indian Cricket team will join all the tables and pull the final list of the Players.

To make it simple I will use 3 States – Tamil Nadu, Bangalore, Delhi.

Team Tamil Nadu:

In Tamil Nadu team Ian and Arya are retried so the director should not see them in his final team list


In this team director wants players are

1. Aged below 40

2. Minimum 10 Matches

3.All rounders

4. Only active players

The SQL :

SEL Team,Team_ID,Player_ID,Player_Name, 


END AS SELECTION,AGE,No_of_Matches, Hundreds, Fifty’s, Bating, Bowling, All_Rounder,Effective_from,Expired_On,




AGE<=’40’ AND All_Rounder=’Y’ AND and Expired_On=Cast(’31-12-999′ as Date Format ‘YYYYMMDD’)

Just think using the rules and query what all are the records you will get from Tamil Nadu team!!!

Team Bangalore:

In Bangalore team Mickey and Raghav are retried so the director should not see them in his final team list


In this team director wants players are

1. Aged below 30

2. Minimum 20 Matches

3.All Bates man

4. Only active players

The SQL :

SEL Team,Team_ID,Player_ID,Player_Name, 


END AS SELECTION,AGE,No_of_Matches, Hundreds, Fifty’s, Bowling, All_Rounder,Effective_from,Expired_On,


TEAM.TABLE_Bangalore as Bangalore


AGE<=’30’ AND All_Rounder=’Y’ AND and Expired_On=Cast(’31-12-999′ as Date Format ‘YYYYMMDD’) and Matches <=’20’

Team Delhi:

In Tamil Nadu team Sachin and Agarker are retried so the director should not see them in his final team list


In this team director wants players are

1. Aged below 25

2. Minimum 30 Matches

3.All bowlers

4. Only active players

The SQL :

SEL Team,Team_ID,Player_ID,Player_Name, 


END AS SELECTION,AGE,No_of_Matches, Hundreds, Fifty’s, Bating, All_Rounder,Effective_from,Expired_On,


TEAM.TABLE_Delhi as Delhi


AGE<=’25’ AND All_Rounder=’Y’ AND and Expired_On=Cast(’31-12-999′ as Date Format ‘YYYYMMDD’) and Matches <=’30’

Now the director got the results from each of the Tables. Now he wants to join all them and load it into a Selection table.

Its very simple , all alias names used to get the columns from their tables, and the column Team is key for all the three tables and used for joining the tables.

Select Final_List*



(Copy the Tamil Nadu Query) as team_1

Left join

(Copy the Bangalore Query) as team_2

on team_1.Team=team_2. Team

Left join 

(Copy the Delhi Query) as team_3

on team_2.Team=team_3. Team

) Final_List

Now the Director got the team he wants !!! India team is now ready for South Africa trip 🙂

Hope you guys understood the scenario, use this as base scenarios and play your cases with the help of our friend Miss Google.