Test Data Management
Hope you are all doing great… it's been a very long time since I posted on my blog. My sincere apologies 😦
Test data availability is one of the most significant issues leading to schedule slippage in DWH projects.
So Testers and Test Managers should understand the test data requirements and define a test data strategy.
Test Data Management should cover the following points:
- Subset of PROD data catered for Testing Requirements
If you are working on an enhancement project, you can reuse existing data that is already loaded into the warehouse. However, you cannot use data pulled directly from Production as-is, because it is sensitive to the bank and its customers.
- Mask the data as per regulatory and data privacy requirements
All customer-related information should be masked. Masked data can be difficult to test: if the Name column should accept CHAR(20), and the data is masked without preserving its length, we cannot do boundary value analysis (BVA) on that column.
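One way to keep BVA possible is to mask while preserving length and character class. A minimal sketch of the idea (the masking rule here is an illustrative assumption, not a compliance-approved algorithm):

```python
import random
import string

def mask_name(value: str, width: int = 20) -> str:
    """Replace each letter with a random letter, keeping length within the CHAR width."""
    if len(value) > width:
        raise ValueError(f"value exceeds CHAR({width})")
    return "".join(
        random.choice(string.ascii_uppercase) if ch.isalpha() else ch
        for ch in value
    )

original = "JOHNATHAN K SMITH"
masked = mask_name(original)
assert len(masked) == len(original)  # boundary value analysis on length still holds
```

Because the masked value has the same length and structure as the original, tests like "does the column reject 21 characters?" remain meaningful.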
- Fabricate test data in case of unavailability
Your source system cannot provide data for every testing scenario, and lack of test data is a major issue in data warehouse projects. A tester should analyze the system and create the test data in the form of INSERT scripts. The test data should be extracted correctly by the query before it is loaded into the target tables.
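Fabrication usually ends up as a small script that turns scenario rows into INSERT statements. A rough sketch (the table and column names are made up for illustration):

```python
def to_insert(table: str, row: dict) -> str:
    """Render one fabricated row as an SQL INSERT statement."""
    cols = ", ".join(row)
    vals = ", ".join(
        f"'{v}'" if isinstance(v, str) else str(v) for v in row.values()
    )
    return f"INSERT INTO {table} ({cols}) VALUES ({vals});"

# Fabricated scenario rows for a hypothetical staging table
rows = [
    {"ACCT_ID": 1001, "ACCT_STATUS": "CLOSED", "BAL_AMT": 0},
    {"ACCT_ID": 1002, "ACCT_STATUS": "OPEN", "BAL_AMT": 250},
]
script = "\n".join(to_insert("STG_ACCOUNT", r) for r in rows)
print(script)
```

Generating the script from a list of scenario rows keeps the fabricated data reviewable and repeatable across test cycles.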
Typical Challenges involved in Test Data Management:
- Delay in Schedule
The test data from upstream should be UAT-tested, else our testing will not be effective. If we are in SIT and our upstream has not completed UAT, our schedule will be postponed because UAT-tested data is unavailable from upstream.
- Issues to cover Business Cases
Test data for some business scenarios cannot be produced upstream or through fabrication. Take account reopening as an example: a customer closed an account a few years ago, and the record is kept with ACCT_STATUS as Closed. When the customer comes back, the bank identifies the existing account and reopens it across all the tables, so that one more record is not created in the warehouse. For scenarios like this, it is difficult to manufacture the data in the given time.
- Data Privacy & Compliance
Banks are cautious about customer data because of the risk of data theft. So when we get data from PROD for our regression testing, most of the customer-related data will be masked as per compliance policy, and we cannot produce the data or test those scenarios in our environment.
- Dependency on SMEs
SME availability is one of the biggest issues in the data warehouse world.
System Architect – He/She is accountable for the system design based on the business requirements, and works closely with the Data Modeler to design the source-to-target (S2T) mapping. The System Architect needs to fit the current requirement into the existing application landscape, or create a new landscape for the new requirement.
Data Modeler – He/She is accountable for designing the S2T. The Data Modeler should know the physical and logical design of the system and the database. For any data warehouse project, the S2T is the most important document.
ETL Developers – He/She is accountable for translating the requirements into high-level and low-level design and then into code. An ETL developer should be capable of designing the ETL code without compromising the performance of the extraction and loading mechanism, and should know what kind of transformation mechanism suits a particular requirement.
Test Analysts / Test Managers – Test Managers should foresee whether all the technical and business requirements can be tested in the given time frame, because the test data and system behaviour may change, which can cause schedule slippage.
- Availability of Adequate Data
We should start our system testing with the source system's UAT-tested data. If, due to some issue, the source system has completed only its own system testing and can provide only system-tested data, that data is not sufficient to continue our system testing, so we need to fabricate test data in order to kick off our testing.
- Volume of Test Data
The data threshold requirement will be declared by the system owners. However, we often cannot create that declared volume of data in our environment and test performance above and below the threshold limit, so there might be spool space issues when we go to Production.
- Multiple Source & format of data
When multiple source systems participate in the ETL, it is difficult to get data from the different source systems at the same time to begin our testing; in this case we again need to create mock test files to begin testing.
- Version Control of Data
Versioning of data is very difficult in the data warehouse world, although we can see the history of the data using a few housekeeping columns (EFFT_D, EXPY_D). The data loads and extracts should still be versioned to avoid confusion.
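The housekeeping columns do give you point-in-time history even without true versioning: the row whose EFFT_D/EXPY_D window covers a given date is the version in effect on that date. A sketch using sqlite3 (the column names follow the post; the table and data are invented):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE CUST (CUST_ID INT, NAME TEXT, EFFT_D TEXT, EXPY_D TEXT)")
con.executemany(
    "INSERT INTO CUST VALUES (?, ?, ?, ?)",
    [
        (1, "OLD NAME", "2012-01-01", "2012-06-30"),
        (1, "NEW NAME", "2012-07-01", "9999-12-31"),  # open-ended current row
    ],
)

# The version in effect on a given day: as-of date between effective and expiry dates
row = con.execute(
    "SELECT NAME FROM CUST WHERE CUST_ID = 1 AND ? BETWEEN EFFT_D AND EXPY_D",
    ("2012-03-15",),
).fetchone()
print(row[0])
```

Testers can use the same BETWEEN predicate to verify that history rows were expired correctly when a new version was loaded.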
Quality Assurance and Quality Control
As test professionals we should know the difference between QA and QC. What we do day to day is actually Quality Control, while the people involved in bringing in CMMI and ISO standards do Quality Assurance. Quality Assurance is about maintaining the process in a given project or programme; Quality Control is about delivering the right product by verifying that the requirements are covered and working as expected.
We know there are multiple levels of testing in the testing world, such as System Test and Integration Test, and methodologies by which we execute them, such as Waterfall and Iterative.
Let me explain what I know about Quality Assurance:
There are three different roles responsible for assuring the process in any project:
- PQA – Project Quality Analyst
- CC – Configuration Controller
- DPA – Defect Prevention Analyst
Project Quality Analyst – PQA role
A person in this role needs to know the organization's pre-defined industry standards.
PQA’s Roles and Responsibilities
- Document naming conventions are met as per the industry standard
- Names of who prepared, reviewed, and approved the deliverables are recorded
- Reviews have happened across all the customer-facing documents
- Review defects are found, fixed, verified, and captured for metrics
- Checking whether all the deliverables are kept in a common place where the stakeholders can access them
- Checking that all the necessary deliverables are prepared by the project team
- Checking that the actual project delivery date and the date mentioned in the documents are the same
- Checking that stakeholder names and document owner names are mentioned correctly in all customer deliverables
- Checking that customer-facing deliverables and internal-audit-specific deliverables are differentiated as per the industry standards
- Verifying that the entry and exit criteria of every level of the SDLC are met, and collecting proof of the same
- PQAs will be involved in all levels of the SDLC
- Business Analyst teams will have separate sets of deliverables like Business Requirement documents, Business Understanding documents, Requirement Traceability documents, etc.
- Development teams will have separate sets of deliverables like High Level Design, Low Level Design, Functional Specifications, etc.
- Testing teams will have separate sets of documents like Test Plans and Test Conditions
The PQA should validate all the documents that are supposed to be delivered to the clients, and maintain them for internal audits.
CC – Configuration Controller
The Configuration Controller controls the versions and the placement of documents in tools like VSS (Microsoft Visual SourceSafe) or Documentum.
Configuration Controller Roles and Responsibilities
- CCs are responsible for creating the folder structures in VSS or Documentum
For example, in any project the following folders will be created to maintain the project deliverables:
- Project Kick-off
- Minutes of Meeting
- Review Log
- High Level Design
- Low Level Design
- Issue Log
- Unit Testing
- System Testing
- System Integration Testing
- User Acceptance Testing
CCs will have admin rights to grant and revoke access to folders.
Developers should not have access to the folders related to testing, and vice versa.
- CCs will maintain the check-in and check-out of the documents that go into VSS
- CCs will ensure that the relevant documents are kept in the corresponding folders in VSS
DPA – Defect Prevention Analyst
Defect Prevention Analysts maintain the defects across the SDLC. For any raised defect the workflow should be maintained, and proper comments should be given when the defect is created. All high-severity defects should be fixed before moving from one phase to the next.
As testers, when we raise defects we need to concentrate on setting the Defect Cause and Defect Type in the defect management tool. This helps the DPAs classify the defects and come up with prevention tips.
Defect Cause – the root cause of the defect, that is:
- Is the defect caused by the upstream source data or the test data
- Is the defect caused by incomplete or missing requirements
- Is the defect caused by an inconsistent requirement
- Is the defect caused by a code discrepancy
- If you find any anomalies in any documents, raise defects against the artefacts
- If any of your defects leads to a change in the requirement, please raise it as a Change Request – a CR can be on the actual business requirement or on design changes.
Defect Type – the classification of the defect, that is:
- Is the defect related to Data Error
- Is the defect related to Application Code
- Is the defect related to Work Request
- Is the defect related to Change Request
- Is the defect related to Deployment
- Is the defect related to Specification
- Is the defect related to Artefact
- Is the defect related to Production
- Is the defect related to Process
- Is the defect related to Environment
- Is the defect related to Requirements
- Is the defect related to Reviews
The DPA's most prominent work is to prepare the CAPA – "Corrective and Preventive Actions".
DPA roles and Responsibilities
- DPAs will collect defect-related metrics periodically – weekly, monthly, or ad hoc
- DPAs will group the defects by classification – in a given period, how many defects were raised in reviews, code, and requirement changes – and collect the causes of the defects
- Then, using the metrics retrieved from the defect management tool, they build a fishbone diagram by filling in the necessary details
- Using statistical tools like Minitab, they calculate the defect density for each phase
- Then they create prevention actions for the areas where the defect density is above the threshold limit.
Suppose your organization has a defect density threshold of 0.5, and the density of your Review-type defects is more than 0.5; the reviewers will then be asked to tighten their reviews to minimize review defects at all levels of the SDLC.
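Defect density itself is a simple ratio – defects found divided by a size measure (KLOC, documents reviewed, etc.) – compared against the organization's threshold. A toy illustration of the 0.5 threshold check (the numbers are invented):

```python
def defect_density(defects: int, size: float) -> float:
    """Defects per unit of size (e.g. per KLOC or per document reviewed)."""
    return defects / size

THRESHOLD = 0.5  # organization-specific limit from the example above

# 12 review defects found across 20 reviewed documents
review_density = defect_density(defects=12, size=20)
if review_density > THRESHOLD:
    print("Review defect density above threshold - trigger preventive action")
```

The DPA's value-add is not the arithmetic but the preventive action chosen when the ratio crosses the limit.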
Cheers – Asik
HP Quality Center is quality management software offered by the HP Software Division of Hewlett-Packard, with many capabilities acquired from Mercury Interactive Corporation. HP Quality Center offers software quality assurance, including requirements management, test management, and business process testing for IT and application environments. It is a component of the HP Application Lifecycle Management software solution set. (Taken from Wikipedia)
I have been working in QC from version 9.0 to 11.0. Many key functionalities have been added in the newer versions.
I assume most of you have used Quality Center as a test management tool. Here I will explain it for those who want to learn QC and its modules.
QC is a web tool; a user can access it from anywhere, but only on the company INTRANET – you cannot access it from outside your company. Projects are created by QC Admins, and access to a project is granted by the QC Admins as per your role. Quality Center has many functionalities; a user cannot access all of them, only those permitted by the access given.
- If you are a Test Analyst, you will not have access to delete any entity created at folder level.
- If you are a Test Analyst, you will not have access to move defects from Analysis status to Retest status.
- If you are a Developer, you will only have access to move defects from Analysis status to Retest status.
- If you are a Test Lead / Dev Lead, you will have most of the access.
- If you are a Test Manager you will have access to all the functionalities existing in Dashboard module
Logging into Quality Center:
- Use the URL (web link)
- Login Name – enter your user ID (usually your Windows login name)
- Password – enter your password (you are given a default password at the time of user ID creation, and you can change it via Tools → Customize → Change Password)
- After entering the user name and password, check their authentication by clicking the Authenticate button
- After clicking, it will show the Domain and Project assigned to you; if you are assigned to more than one project, use the drop-down and select the project you want to work on
- You can change the project later from the QC home screen (top left corner) via Change Project
Module 1 > Dashboard Module:
In the Dashboard module a user can see two types of view: Analysis and Dashboard. The Analysis view can be used to get reports for particular test sets, defects from particular test sets, and reports for a particular project. Dashboard views are used by Test Managers, who can create reports at programme level. In each view you will have two folders, Public and Private. If you create a report under Private, it is visible only to its creator; if you create it under Public, it is visible to all users with access to the domain and the project.
1. Analysis View
In this view a user can create a graphical or a standard report. There is a (+) button; if you click it you can see Graph Wizard, New Graph, New Project Report, New Excel Report, and New Standard Report. Click the type of report you want and choose where to put it, either Private or Public.
Now I am using New Graph functionality.
- Click on New Graph (under the Private section)
- Upon clicking you will get a new window where you need to select:
- Entity (Defects, Tests, etc.)
- Graph Type (Progress, Age, Summary, Trend)
- Graph Name (project name – user specific)
- After creating the report, you can see the graph you have created under the Private section (left side of the tool)
- Click on it, then check the right-hand side; you can see 3 tabs: Details, Configuration, and View
- Details tab – the user can't edit anything; it auto-populates the values given at creation time
- Configuration tab – this is where we set up what we need in the report
- X- Axis – Choose from the drop down what you need to see in the X- Axis
- Y- Axis – Choose from the drop down what you need to see in the Y- Axis
- Grouped by – Choose the entity by which the X-Axis and Y-Axis needs to be grouped by
- You can also see a Filter button (funnel symbol); click on it and you will get a pop-up window where you set the target folder from which the graph gets its data
- In that window you will have Filter and Cross Filter tabs; choose the Cross Filter tab and you can see different sections like Defects, Test Sets, Requirements, etc.
- Choose the entity that you need; for example, to show my lead how many test cases I have executed over a period of time, I choose the Test Set section
- Under the Test Set section you will have 3 radio buttons; click the one next to the None radio button
- At the end of the text box you will see three dots (…); click on them and you will be taken to the Test Lab, where you can see all the test sets created under the given domain
- Select the test set that you want from the test set tree
- Click OK, then click on View, and you can see the graphical representation of the data in your Test Lab
- By clicking Data Grid on the View tab you can see the data in numbers.
Module 2 > Management Module:
The Management module allows Test Managers or Test Leads to set the cycle start and end dates for a given test set in the Test Lab. A user can't access a test set after the cycle end date. This module provides the data needed to drive effort-related test metrics.
Module 3 > Requirements Module:
Business requirements are captured in this module. You can add the actual business scenarios, or add the test conditions as requirements, either by simply clicking the Add New Requirement button or by uploading via the Excel Add-in.
By capturing testable requirements we can achieve requirement traceability. Suppose you are testing a login screen; you would set up a test condition such as:
'Verify the login screen works as expected when given a valid user name and password'
This test condition can then be elaborated into detailed steps in the Test Plan module.
The key points that we need to maintain when setting up the Test Conditions are
- We need to set the priority of each test condition
- We need to write a meaningful test condition name
- We need to choose the correct requirement type from the drop-down box (top right corner)
- The author name is auto-populated from your login credentials (if I am logged in, my name is populated as the author)
- You can't delete a created test requirement, but you can cut and paste it into the Recycle Bin folder (the Requirements, Test Plan, and Test Lab modules each have a Recycle Bin for discarded content)
- The created test conditions will be mapped to test cases from the Test Plan module. If you have not mapped a condition, its Direct Cover Status shows Not Covered; if you have mapped it but not executed it, it shows Not Completed; once the test case is executed, the Direct Cover Status shows Passed or Failed.
Module 4 > Testing
Under Testing module we will have below mentioned tabs
Test Resources – this tab is used by QTP to keep the automation scripts
Test Plan – maintains the detailed steps involved in each test condition
Test Lab – we pull the test cases into the Test Lab for execution
We can simply add a test case using the New Test button, or write the test cases in a spreadsheet and upload them into QC using the QC Excel Add-in.
The key points that we need to maintain when setting up the Test Plan are:
- Click on the New Test button; a window will open
- Enter a valid test name; ideally it should match the name given to the test condition
- Select the type of the test that we are performing (top right corner)
- If any section is marked in red, it is a mandatory field and you need to enter a value.
- Select the SDLC phase from drop down box like System Test, E2E Test, Regression Test etc
- Select the Priority from the drop down box, this priority should match with the priority that we had given for Test Conditions.
- Select the Test Type from drop down , what kind of testing that you are doing, like Functional or Non Functional etc.,
- Capability mostly Do Not Know
- Select the Application that you are going to work in and this will be pre-defined by the QC Admin team
- When you are creating the Test Case please keep the Reviewed question as Not Reviewed and assign the Reviewer name.
- Once you save the Test Case the Auto Email will be sent to the Reviewer.
- Now reviewer will be reviewing the test case and will set the status as Reviewed.
- Under the Description section we need to enter the following details; these are pre-defined by the QC Admin team and populated for all resources associated with the domain.
- We have set up the test case; now we need to map it to the test conditions. To do that, click on the test case name in the left-hand pane, and you will see the below-mentioned tabs on the right side
- Summary – all the details of the test case (entered at creation time) are populated here
- Design Steps – holds the detailed steps to be performed for the test condition
- Parameters – the data to be passed to automation frameworks
- Test Configuration – auto-populates its values
- Attachments – if you would like to add any documents related to this particular scenario, add them under this tab
- Requirement Coverage – from this tab we can search for the test requirements (test conditions) related to the test case and map them using the Select Requirement button
- Linked Defects – defects can be attached to a test case or to test steps; the attached defects are shown under this tab
- Dependency – this tab is used for automation tests, where you need to create dependencies between different modules.
PS: After mapping the test conditions to test cases, please go to the Requirements module and check the Direct Cover Status; if it is not yet Covered, refresh it so that it reflects the mapping.
We have set up the test conditions and test cases and linked them; now we need to pull them into the Test Lab and make them available for execution.
Before pulling the test cases into the Test Lab, create a test set folder using the New Test Set button. Clicking it opens a window where we enter a logical test set name.
Now that the test set is created, pull the test cases into the Test Lab by going to the Execution Grid tab in the Test Lab module and clicking Select Tests; this takes you to the Test Plan module, where you browse for the test cases you want to move.
Pull all the test cases that you want. In every QC module you will have the Select Columns button to customize which details are displayed on the screen, so play around with the columns and make the ones you need available.
The key points that we need to maintain when setting up the Test Lab are:
- Running the Test Case
- You have a Run button to execute the test case. Under Run we have Run with Manual Runner and Continue Manual Run.
- Simply clicking Run leads you to the Run with Manual Runner option; now you can see the test steps written in the Test Plan module.
- To pass a test step press CTRL+P, and to fail it press CTRL+F. Please attach test evidence for each test case that you execute.
- Do not execute a test case more than once; if you do, an instance is created for each execution, which will distort your test reports.
- If you left a test step as Not Completed for any reason during the first execution and want to execute that particular step, choose Continue Manual Run under the Run button.
- If you are failing a step, you can create a defect right there.
- After failing the step, go to the Linked Defects tab and click the Add (+) button; it takes you to the Defects module, from where you can create a defect that is automatically linked to that particular step.
- If you are marking a test case as Not Applicable, you need to attach evidence explaining why it was chosen as Not Applicable.
- If you are marking a test case as Deferred, you need to attach evidence explaining why it was chosen as Deferred.
Module 5 > Defects
Defects can be added from the Test Plan module or from the Defects module. If a defect relates to a test case, please do link the test case to the defect; you can also raise an orphan defect without linking it to any test case.
Go to the Defects module and click the Add Defect button; you will get a window where we need to input the defect details.
- Summary – Brief description about defects
- A few columns will be auto-populated as per the QC configuration
- When we raise a defect it will be in New status, as per the defect life cycle
- Defect Type – we need to select the defect type from the drop-down, like Application Code, Requirement Defect, etc.
- Discovery Phase – we need to select the phase in which the defect was discovered, like System Test, UAT, E2E, etc.
And there will be more than 10 mandatory fields that we need to fill in as per the project-specific details.
- Description – testers need to give a detailed description of the defect in the Description section:
Test User Name/ID
Steps to Replicate
Test Data Reference
Test Case Reference [Test Case Name (Step number)]
Defect Life Cycle in QC
- New – (When the defect is created)
- Analysis – (When the defect is moved to Developers)
- Fix – (When the fix is given by the developers)
- Deploy – (when the developers deploy the defect-fixed code into the test environment)
- Retest – (when the code is ready for retesting)
- Closed – (when the defect is retested and closed by the testers; once you close the defect, it can't be modified)
Note: a user can't jump from one status to another, bypassing a status in between.
Setting up the Priority and Severity to the Defects:
Do you remember that we set a priority on each test requirement when we created it in the Requirements module? If I raise a defect related to a test case whose priority was set to Low, then I keep the defect priority as Low as well.
Cheers – Asik
In this post I would like to share my knowledge of non-functional testing in the data warehouse world.
There are different types of non-functional testing in the testing world; some of them are:
- Baseline testing
- Compatibility testing
- Compliance testing
- Documentation testing
- Endurance testing
- Load testing
- Localization testing and Internationalization testing
- Performance testing
- Recovery testing
- Resilience testing
- Security testing
- Scalability testing
- Stress testing
- Usability testing
- Volume testing
To me, non-functional testing is something that does not directly deliver business value; it deals with the environment. When we extract data from heterogeneous source systems, we might need to think of handling:
Verifying the volume of the data
No business can guarantee the volume of data they will send; they can only give an approximation. Our code should be capable of pulling the maximum number of records the source system can send at any point in time. To manage the volume of data, Teradata has utilities called MultiLoad (M-Load) and TPump (T-Pump). When developers design the system, they fix a limit by which data will be loaded into the warehouse.
- M-Load – if we get a data file with 100 records (or more), the records are loaded via MultiLoad
- T-Pump – if we get a data file with fewer than 100 records, the records are loaded via TPump
What we need to test here is: send a file with 100 records and check that the records are loaded by MultiLoad. This can be verified using the load job names.
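The routing rule itself is a simple threshold check, which makes the test design easy: one file on each side of the boundary, plus the boundary itself. A sketch of the decision (the 100-record threshold is the illustrative value from the example; real limits are set by the designers):

```python
MLOAD_THRESHOLD = 100  # records; illustrative value from the example

def pick_load_utility(record_count: int) -> str:
    """Choose the Teradata load utility based on file volume."""
    return "MLOAD" if record_count >= MLOAD_THRESHOLD else "TPUMP"

# Boundary value analysis on the threshold
assert pick_load_utility(100) == "MLOAD"   # at the limit -> MultiLoad
assert pick_load_utility(99) == "TPUMP"    # just below -> TPump
assert pick_load_utility(5000) == "MLOAD"  # bulk file -> MultiLoad
```

In the real system the assertion is made against the load job name that actually picked up the file, not against a function.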
Verifying Date and Time of the Data file arrival to the Unix Landing directory
Most companies do not operate on weekends and public holidays, so our source systems will not send any transactional data on those days. Because of this, developers design their jobs to archive any files arriving on those days.
Normally, Monday's transactional data comes to us for loading early on Tuesday morning, and so on until Friday's transactional data, which hits us early on Saturday morning.
We as testers need to verify that these schedules work as per the specification. This can be achieved by:
- Sending a file on a weekend and checking that the file is archived
- Sending a file on a public holiday and checking that the file is archived
- Verifying that Monday's transactional data is received on Tuesday morning, and so on through Saturday morning
Verifying Purging and Truncate Loads
I have already mentioned about Purging and Truncate loads in my earlier blogs.
Purging – the AutoSys jobs purge the data, leaving only the required data in the staging table. Suppose I have loaded the 10th, 11th, and 12th of January data into the staging table; when I load the 13th of January data, the 10th of January data is purged.
Truncate – simply load day 01 data; when you load day 02 data, the day 01 data is deleted.
We as testers need to verify that truncation and purging happen as per the design requirement.
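The purge check boils down to a retention window: after loading day N, only the last K days should remain in staging. A sketch of the expected-state calculation (a 3-day window, matching the 10th–13th January example):

```python
from datetime import date, timedelta

RETENTION_DAYS = 3  # from the example: loading the 13th purges the 10th

def expected_staging_dates(load_date: date) -> set:
    """Dates that should remain in staging after loading load_date."""
    return {load_date - timedelta(days=i) for i in range(RETENTION_DAYS)}

kept = expected_staging_dates(date(2013, 1, 13))
assert date(2013, 1, 10) not in kept   # purged
assert date(2013, 1, 11) in kept       # retained
```

The actual test then compares `SELECT DISTINCT load_date FROM staging` against this expected set.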
Verifying File Watcher Script
There will be a file watcher script that looks for files until they arrive in the Unix landing directory. Say the source system promises to send the day 01 file on 10-01-2013, so we set that date in the file watcher script. The source system sends the records on 10-01-2013; the file watcher script checks the date in the file header, and if both match, it processes the file into the staging table. If the source system fails to send the data on 11-01-2013, the file watcher job looks for the file for a given time interval, and if it has not arrived, an automated email is sent to the concerned source system saying the file has not arrived.
So we as testers need to verify that the file watcher job works as expected.
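Both paths the watcher takes – file arrives with the expected header date, or the file never shows up – can be simulated. A simplified sketch (the directory layout, header format, and alert wording are assumptions for illustration; the real script would also loop and email):

```python
import os

def watch_for_file(landing_dir: str, filename: str, expected_date: str) -> str:
    """Process the file if its header date matches; otherwise raise an alert."""
    path = os.path.join(landing_dir, filename)
    if not os.path.exists(path):
        return "ALERT: file not arrived - email source system"
    with open(path) as f:
        header_date = f.readline().strip()  # first line assumed to be the header date
    if header_date == expected_date:
        return "OK: load file into staging"
    return "ALERT: header date mismatch"
```

Testing then means dropping (or withholding) a file in the landing directory and checking which branch fires.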
Cheers – Asik.
Hope you guys had a wonderful X-mas holiday 🙂
As we all know, inventions are made by asking more and more questions.
If Newton had not asked 'Why did the apple fall down from the tree?', we would not have the theory of gravity.
If Archimedes had not asked 'Why did water spill out of the tub when I was bathing?', we would not have Archimedes' principle.
Likewise, when a tester asks questions about an application – Why? or How? – and finds an answer that does not match what it is supposed to be, that IS a discovery. So we are scientists 🙂
As far as I am concerned, testing an application by asking questions about it is the perfect way of finding software anomalies.
Each functional defect that we stop through our testing just saves production fixes, but if we close a business gap, it saves the whole business need. So before kicking off the testing, make sure we know the business needs.
You will find a functional defect if you ask HOW things work: the behaviour is already described in the functional specifications, so any functionality missed in the developed code can be found.
You will find a requirement / specification / design defect if you ask WHY things work: you will need to check the business requirements, and if you feel something is wrong, you may uncover anomalies in any of the specifications related to that particular work request.
In this post I wanted to explain how Important the Domain knowledge is required for a Testers.
Domains like Banking, Health Care, Manufacturing, Insurance, etc. – all these domains are closely related to our daily lives.
To modernize these functional areas:
> Business people write specs to cater to the business needs, as Business Requirement Documents.
> Based on the business specifications, solution designers prepare Functional Specification Documents.
> We testers and developers refer to the above documents to develop and test the application.
How can you learn the business easily?
If you are working in the banking domain and you have loan functionality to be tested on your client's website, then create a real loan application with your bank, or through some other bank's online application.
If you are working in the health care domain and you have inventory functionality to be tested, then go to a nearby chemist (medical shop), pick some medicines here and there, go for billing, and watch how the shopkeeper handles your goods.
As in the two examples above, whatever business you are testing, imagine that YOU ARE ALSO GOING TO USE THIS PRODUCT. Would you accept a defective product from your manufacturer? 'No', right? Then your testing will be perfect.
I hope all the readers know about Validation and Verification?
Let me tell you what I think about it.
Validation means we check that all the documents related to the given functionality are acceptable and valid.
Verification means that the code written against those validated specifications is verified by us.
In the data warehouse world, the specification documents need to be validated, because even a simple mistake will create a huge problem at the end.
For an Example,
In a warehouse we keep amount columns both as negative numbers (the bank owes us) and as positive numbers (we owe the bank).
Business Need: all the transactions of the day are to be extracted.
If the extract specification asks us to pull only the records where Balance > 0, then you will get only the customers who owe money to the bank, and the negative balances are silently dropped.
So even a single symbol matters a lot!!! Before we start the Verification, we need to Validate first!!!
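To see how that one symbol changes the extract, here is a minimal SQL sketch; the table TXN_DAILY and its columns are assumed purely for illustration, not taken from any real spec:

```sql
-- Assumed illustrative table: TXN_DAILY (ACCT_ID, TXN_DT, BALANCE)

-- As written in the faulty spec: silently drops negative balances
-- (the customers the bank owes money to)
SELECT ACCT_ID, BALANCE
FROM   TXN_DAILY
WHERE  TXN_DT  = CURRENT_DATE
AND    BALANCE > 0;

-- What the business need ("all the transactions of the day") actually requires
SELECT ACCT_ID, BALANCE
FROM   TXN_DAILY
WHERE  TXN_DT = CURRENT_DATE;
```

Comparing the counts of the two queries during validation would have surfaced the specification defect before a single line of ETL code was written.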
Lets Discover along with Finding defects 🙂
Cheers – Asik
Have you ever created, updated, and deleted a Facebook account to learn DWH concepts? Today in this post let me explain the necessity and importance of data types in ETL testing.
We will start with our known examples :
Can you fill 10 liters of water into a 5-liter container?
"No, the container can hold only 5 liters of water; if you fill more than its capacity, it will burst 😦"
Can you use salt instead of sugar to make tea?
"No, then everyone will stop drinking tea 😦"
Can we name a kid using numbers?
"No. If we kept numbers, how would we tell apart the duplicate persons in this world? Just imagine if I was named 10215!!!!"
Can anyone have their bank balance as a whole number all the time?
"No, because the money you spend is a fractional amount! You can't have $5 all the time; it would be $5.28 most of the time."
Can your mobile number have more than 10 digits? Can your mobile number contain alphabets?
"No, because the mobile number length is pre-defined and the number can't be alphabets."
Like the above examples, our source files, source tables, and target tables are constructed with limitations. You can't keep whatever data you want; you can keep only the data that the system can accept.
Every programming language has data types, and most readers of this post know the basics:
INTEGER, CHAR, VARCHAR, DECIMAL, FLOAT etc.
The problems developers and testers most often encounter because of data types in the data warehouse world are:
1. The correct data type is not chosen in the source tables
2. The correct length of data is not received from the source system in the source file
3. The source is not sending the values as per the data types mentioned
4. Data gets truncated when loading it into the target tables
5. The amount column's precision is not populated correctly, as Teradata rounds the value off
6. Regarding dates, the source sends them as VARCHAR, but when we load them into the target tables we keep them as DATE, in the target's date format
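Problem 6 can be caught before the target load. A sketch in Teradata-style SQL, assuming a hypothetical staging table STG_TABLE whose VARCHAR column SRC_DT should hold dates in DD/MM/YYYY format (TRYCAST is available only on newer Teradata releases; on older ones a different validation technique is needed):

```sql
-- Find staging rows whose VARCHAR date would fail the DATE conversion
-- and abort or corrupt the target load.
SELECT SRC_DT
FROM   STG_TABLE
WHERE  SRC_DT IS NOT NULL
AND    TRYCAST(SRC_DT AS DATE FORMAT 'DD/MM/YYYY') IS NULL;
```

Zero rows returned means every incoming date string is convertible to the target's DATE type.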
The data type and its length are defined in the table's DDL (Data Definition Language). If you want to know a table's properties, then please use the below queries:
a) "SHOW TABLE Database.Table_Name" - this gives you the full DDL: data types, data lengths, NOT NULL / NULL and primary key definitions
b) "HELP TABLE Database.Table_Name" - this gives you column-level details about the table
As a tester, what do we need to verify?
Again, as I said:
Check that the data matches the data type mentioned in the spec.
Check whether any data truncation happened when the source data was loaded into the staging tables.
Check that the data in the staging tables is as per the staging tables' DDL.
Check that the target table columns are loaded as per the target tables' DDL.
If it is a VARCHAR column from source, then please take care of spaces, invalid characters etc. right from the source through the staging tables, because DataStage will not accept special characters.
If it is a DECIMAL column, then make sure the precision is carried through to the target load.
If it is an INTEGER column, then make sure you do not get blanks and spaces from the source.
If it is a CHAR column, then check that the length of the value fits into the column.
Like the above, we can add many more scenarios for data type verification.
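Some of these checks can be scripted once and reused. A sketch, again using the assumed CUSTOMER_STG names rather than any real project tables:

```sql
-- CHAR/VARCHAR length: source values that would be truncated by a VARCHAR(20) target
SELECT CUST_NAME
FROM   DB_STG.CUSTOMER_STG
WHERE  CHARACTER_LENGTH(TRIM(CUST_NAME)) > 20;

-- INTEGER column: blanks or non-numeric values arriving from source
-- (assuming the raw value lands in staging as VARCHAR column CUST_ID_TXT)
SELECT CUST_ID_TXT
FROM   DB_STG.CUSTOMER_STG
WHERE  TRIM(CUST_ID_TXT) = ''
   OR  TRYCAST(TRIM(CUST_ID_TXT) AS INTEGER) IS NULL;

-- DECIMAL precision: rows where the staging and target amounts drift after rounding
SELECT s.CUST_ID, s.BALANCE AS STG_BAL, t.BALANCE AS TGT_BAL
FROM   DB_STG.CUSTOMER_STG s
JOIN   DB_TGT.CUSTOMER     t ON t.CUST_ID = s.CUST_ID
WHERE  s.BALANCE <> t.BALANCE;
```

Each query should return zero rows on a clean load, which makes them easy to automate as pass/fail checks.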
Hope this blog helps you understand the importance of data types in ETL testing.
See you in my next post
Cheers – Asik
In this blog I am going to explain how to write Test Conditions (What are you going to do?) and Test Cases (How are you going to do it?).
I am using an Excel macro that generates the TCN and TCA easily; the only thing is that you need to configure it.
I am not able to attach it here. If you need it, please drop me an email at email@example.com (please find the screenshots at the end of this blog).
Who likes it when he is asked to write test cases? 😦 Let me explain how to set up Test Conditions (TCN) and Test Cases (TCA).
As I said already in my earlier blogs, we have the below types of DWH testing:
1. Loads from file to staging tables to IM tables
2. Loads from table to staging tables to IM tables
3. DIMN/FACT projects
4. Extract projects
You need to know how to reuse your TCN and TCA! Write them for one project and use FIND and REPLACE for the other projects!! Confusing?
For loads from file to staging tables to IM tables, we set up TCN and TCA to validate the source file, verify the data load into the staging tables, and verify the data transformation of the load into the target tables. Suppose you have prepared them for PROJECT_A, where the source file is File_1.DAT, the staging table is Staging_Table1, and the target table is Target_Table1; then reuse the same test cases for PROJECT_B by replacing these with File_2.DAT, Staging_Table2, and Target_Table2.
Don't understand??
For PROJECT_A I designed the steps, and for PROJECT_B I replaced them with the attributes related to PROJECT_B.
Source File Validations (Steps) :
Staging Loads :
Verification of staging is very easy: it is a one-to-one mapping, loading all the data from the source file or source system into the intermediate tables.
Staging Data into Target Tables:
Data from staging (possibly more than one STG table) will be loaded into one or more target tables, and we need to write test cases for each target table.
Please cover below scenarios:
1. Reconciliation check: the record counts between the STG tables and the target tables match after applying the filter rules
2. Inserting a record that is not yet loaded into the target table for a given key combination
3. Copy records: sending the same records (same key) that are already loaded into the target tables; they should not be loaded again
4. Updating a record for a key when its value columns change in the Day 02 load
5. Logically deleting records in the target tables
6. Values loaded by the ETL code
7. Values loaded by process tables
8. Values loaded by reference tables
(If you find any special scenarios, please add them.)
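Scenarios 1 and 3 lend themselves to reusable queries. A sketch, with the table names, the ACCT_ID key, and the REC_TYPE filter rule all assumed for illustration:

```sql
-- 1. Reconciliation: staging and target counts should match
--    after applying the (assumed) filter rule REC_TYPE <> 'X'
SELECT (SELECT COUNT(*) FROM DB_STG.STG_TABLE WHERE REC_TYPE <> 'X') AS STG_CNT,
       (SELECT COUNT(*) FROM DB_TGT.TARGET_TABLE)                    AS TGT_CNT;

-- 3. Copy records: the target should hold only one record
--    per key combination, so this query should return no rows
SELECT ACCT_ID, COUNT(*) AS REC_CNT
FROM   DB_TGT.TARGET_TABLE
GROUP  BY ACCT_ID
HAVING COUNT(*) > 1;
```

Because only the table and column names change between projects, these are exactly the kind of steps that survive the FIND and REPLACE reuse described above.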
Write the test cases for Target_Table_1 and reuse them for the other tables.
In the Excel sheet, just configure the rules as you require, feed in the source and target columns, and click the Create button; it will give you the Test Cases or Test Conditions as you configured.
Click on Create then 🙂
Hope you guys understood how to write test cases and test conditions for BI projects.
See you guys in my Next blog.