
ETL Testing Rollback post – Ask yourself and get ready to move to the ETL Testing world


Hello Folks

It has been a long time since I posted something through my blog, and I'm really sorry for that 😦


I met a person by chance in a public place; we were discussing ETL testing and other things, and I told him I write blogs about ETL testing. He was surprised and said that he is a visitor to my blog. I am happy to have met a stranger who reads and benefits from my blog.

I have received a few requests: ‘Can you please share some interview Q&A?’

I replied to them:

“If you have hands-on experience, and you have mentioned your skills and responsibilities in your CV, then why would you want interview Q&A for ETL testing?”

Folks who are trying to get into the ETL testing domain often think that if they know how to write SQL, they can clear the interview and get the job. Folks, that’s not possible. I request you to remember that ETL testing is not a different entity; your job is testing, whatever the technology may be.

If you are attending an interview, please go through your CV and be strong on what you have mentioned. If you are asked about something other than what you have mentioned, then please raise your hand: ‘I’m sorry, I haven’t worked on it.’

OK, let me come to the subject I wanted to discuss here!!!

Here I’m going to ask myself questions to test my technical capability; if I can win against myself, then I can win anywhere in the globe, right guys?

So try to ask yourself as many questions as possible!!!!

Q1

INTRV: Asik, Can you please let us know your roles and responsibilities in your previous project?

CAND: Hello, I’m from here, and blah, blah, blah………

Q2

INTRV: Asik, how would you rate yourself in writing SQL on a scale of 1 to 5?

CAND:  [—-Fill your Best Answer here——]

Q3

INTRV: Asik, what are the tools that you have used in your ETL projects?

CAND: [—-Fill your Best Answer here——]

Q4

INTRV: Asik, can you please explain the ETL architecture?

CAND: [—-Fill your Best Answer here——]

Q5

INTRV: What are the joins that you have worked with so far? And can you please write down the joins using two tables, T1 and T2?

CAND: [—-Fill your Best Answer here——]

Q6

INTRV: Asik, what is the difference between Inner join and Outer Join?

CAND: [—-Fill your Best Answer here——]

Q7

INTRV: Asik, can you please explain the significance of data types in ETL testing?

CAND: [—-Fill your Best Answer here——]

Q8

INTRV: Asik, I have a column as VARCHAR in the source, but the corresponding target column is INTEGER. How would you write a query to check the target column?

CAND: [—-Fill your Best Answer here——]
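Hint: one common way to sketch this check (not the only answer) is to cast the source VARCHAR and compare it against the target INTEGER column. The table and column names below (src, tgt, amount) are invented for illustration, shown with Python’s sqlite3:

```python
import sqlite3

# Invented tables for illustration: src.amount arrives as VARCHAR,
# tgt.amount is stored as INTEGER.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE src (id INTEGER, amount VARCHAR(10))")
cur.execute("CREATE TABLE tgt (id INTEGER, amount INTEGER)")
cur.executemany("INSERT INTO src VALUES (?, ?)", [(1, "100"), (2, "250"), (3, "7")])
cur.executemany("INSERT INTO tgt VALUES (?, ?)", [(1, 100), (2, 250), (3, 70)])

# Cast the source VARCHAR to INTEGER and report rows where the target differs.
mismatches = cur.execute("""
    SELECT s.id, s.amount, t.amount
    FROM src s JOIN tgt t ON s.id = t.id
    WHERE CAST(s.amount AS INTEGER) <> t.amount
""").fetchall()
print(mismatches)  # [(3, '7', 70)] -> id 3 was loaded wrongly
```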

Q9

INTRV: Asik, what is the significance of the Staging tables in the ETL loads?

CAND: [—-Fill your Best Answer here——]

Q10

INTRV: Asik, what is the difference between copy records and duplicate records?

CAND: [—-Fill your Best Answer here——]

Q11

INTRV: Asik, what is the significance of Oracle Control tables in ETL Testing?

CAND: [—-Fill your Best Answer here——]

Q12

INTRV: Asik, what does it mean if you get a NULL, blank, space or 0 from the source?

CAND: [—-Fill your Best Answer here——]

Q13

INTRV: Asik, I have a VARCHAR column in the source, and this column can contain 0. Suppose the source sends 0000 instead of 0, and the rule is that we should not allow this record into the target table. Will our SQL stop this record if we write the condition as [Column <> 0]?

CAND: [—-Fill your Best Answer here——]

Q14

INTRV: Asik, what is the difference between the Primary Index and Primary Key?

CAND: [—-Fill your Best Answer here——]

Q15

INTRV: Asik, what could be the consequence if the reconciliation failed in the Day_01 load or the Day_02 load?

CAND: [—-Fill your Best Answer here——]

Q16

INTRV: Asik, why would you want to test the DDL of a table?

CAND: [—-Fill your Best Answer here——]

Q17

INTRV: Asik, what is the difference between Data Definition Language and View Definition?

CAND: [—-Fill your Best Answer here——]

Q18

INTRV: Asik, as a tester, what are the access rights you require? And do you know the SQL syntax to grant that access?

CAND: [—-Fill your Best Answer here——]

 Q19

INTRV: Asik, what is the difference between DB level access, Table level access and view level access?

CAND: [—-Fill your Best Answer here——]

Q20

INTRV: Asik, what are the steps you take to validate the source files? What is the significance of the source file names, header, trailer and detail records?

CAND: [—-Fill your Best Answer here——]

Q21

INTRV: Asik, if you are not able to open a file manually by double-clicking due to the file size, how will you access the file and look at the records?

CAND: [—-Fill your Best Answer here——]

Q22

INTRV: Asik, what is the highest-severity defect you have raised so far in any ETL project, and why did you classify it as Sev-01?

CAND: [—-Fill your Best Answer here——]

Q23

INTRV: Asik, what is the highest-priority defect you have raised so far in any ETL project, and why did you classify it as Priority-01?

CAND: [—-Fill your Best Answer here——]

Q24

INTRV: Asik, explain inserts, updates and logical deletes in a given ETL project.

CAND: [—-Fill your Best Answer here——]

Q25

INTRV: Asik, what is the necessity of reference tables in an ETL project?

CAND: [—-Fill your Best Answer here——]

Q26

INTRV: Asik, can you modify reference table data if you have UPDATE access to the reference tables?

CAND: [—-Fill your Best Answer here——]

Q27

INTRV: Asik, what are the steps by which you can prepare the test data for a given scenario?

CAND: [—-Fill your Best Answer here——]

Q28

INTRV: Asik, what is the difference between time-variant and non-time-variant tables in ETL testing projects, and how would you test them?

CAND: [—-Fill your Best Answer here——]

Q29

INTRV: Asik, can you delete a table or drop a table? What happens if we drop a table?

CAND: [—-Fill your Best Answer here——]

Q30

INTRV: Asik, what is the significance of views in ETL testing projects?

CAND: [—-Fill your Best Answer here——]

Q31

INTRV: Asik, can a table have more than one primary key? And can a table have more than one primary index?

CAND: [—-Fill your Best Answer here——]

Q32

INTRV: Asik, the DBA has granted SELECT access to a DB; does it mean that you have SELECT access to all the tables that sit in that DB?

CAND: [—-Fill your Best Answer here——]

Q33

INTRV: Asik, if the target column is INTEGER(30), can you load a VARCHAR(30) into it?

CAND: [—-Fill your Best Answer here——]

Q34

INTRV: Asik, the target date column is constrained as DDMMYYYY, but from the source we are getting it as MMDDYYYY. What will happen to the load job?

CAND: [—-Fill your Best Answer here——]

Q35

INTRV: Asik, what is meant by full-snapshot and historical records?

CAND: [—-Fill your Best Answer here——]

Q36

INTRV: Asik, what is the catch-up logic involved in loads, and how would you test it?

CAND: [—-Fill your Best Answer here——]

Q37

INTRV: Asik, what is the difference between load projects and extract projects?

CAND: [—-Fill your Best Answer here——]

Q38

INTRV: Asik, if the target column is defined as DECIMAL(4, 3), can you load values as DECIMAL(4, 5)? Or can you load values as DECIMAL(4, 3)?

CAND: [—-Fill your Best Answer here——]

Q39

INTRV: Asik, what is the difference between UNION and UNION ALL? And where would we use UNION, and where would we use UNION ALL?

CAND: [—-Fill your Best Answer here——]
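Hint: a quick demonstration of the difference, using Python’s sqlite3 with two throwaway tables (t1 and t2 are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t1 (val INTEGER)")
cur.execute("CREATE TABLE t2 (val INTEGER)")
cur.executemany("INSERT INTO t1 VALUES (?)", [(1,), (2,)])
cur.executemany("INSERT INTO t2 VALUES (?)", [(2,), (3,)])

# UNION removes duplicate rows across the two result sets...
union_rows = cur.execute(
    "SELECT val FROM t1 UNION SELECT val FROM t2 ORDER BY val").fetchall()
# ...while UNION ALL keeps every row, duplicates included (and is cheaper,
# since no de-duplication step is needed).
union_all_rows = cur.execute(
    "SELECT val FROM t1 UNION ALL SELECT val FROM t2 ORDER BY val").fetchall()

print(union_rows)      # [(1,), (2,), (3,)]
print(union_all_rows)  # [(1,), (2,), (2,), (3,)]
```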

Q40

INTRV: Asik, what is the use of a cross join? And under what circumstances would you use a cross join?

CAND: [—-Fill your Best Answer here——]

Q41

INTRV: Asik, what are the AutoSys commands to trigger a job, abort a job and hold a job?

CAND: [—-Fill your Best Answer here——]

Q42

INTRV: Asik, what is the significance of the GROUP BY clause in SQL, and where would we use it?

CAND: [—-Fill your Best Answer here——]

Q43

INTRV: Asik, how can you find duplicates in the staging tables?

CAND: [—-Fill your Best Answer here——]
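Hint: grouping on the key combination with GROUP BY … HAVING COUNT(*) > 1 is one common approach. The staging table and its key columns below are invented for illustration, shown with Python’s sqlite3:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Invented staging table keyed on (user_id, dept_id).
cur.execute("CREATE TABLE stg (user_id INTEGER, dept_id INTEGER, name TEXT)")
cur.executemany("INSERT INTO stg VALUES (?, ?, ?)", [
    (1, 10, "A"), (2, 10, "B"), (1, 10, "A dup"), (3, 20, "C"),
])

# Group on the key combination and keep only groups with more than one row.
dupes = cur.execute("""
    SELECT user_id, dept_id, COUNT(*) AS cnt
    FROM stg
    GROUP BY user_id, dept_id
    HAVING COUNT(*) > 1
""").fetchall()
print(dupes)  # [(1, 10, 2)] -> key (1, 10) occurs twice
```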

Q44

INTRV: Asik, what is the difference between SET tables and MULTISET tables? And from where can we get these details?

CAND: [—-Fill your Best Answer here——]

Q45

INTRV: Asik, can one view refer to another view to create a new view?

CAND: [—-Fill your Best Answer here——]


Q46

INTRV: Asik, what are the common SQL syntax errors you have encountered, and how did you resolve them?

CAND: [—-Fill your Best Answer here——]

Q47

INTRV: Asik, do you have any idea what Data Modeling is?

CAND: [—-Fill your Best Answer here——]

Q48

INTRV: Asik, what is the difference between the Physical Design and the Logical Design?

CAND: [—-Fill your Best Answer here——]

Q49

INTRV: Asik, what are the items available in a generic S2T (Source-to-Target) document?

CAND: [—-Fill your Best Answer here——]

Q50

INTRV: Asik, what does it mean if a column is mentioned as a Value column in the S2T?

CAND: [—-Fill your Best Answer here——]

Q51

INTRV: Asik, have I answered all of my basic questions, so that I can move on to advanced questions?

CAND: I am still thinking and trying to answer a few questions 🙂


Chapter 7 – Data Warehouse Test Condition and Test Case Design – Reuse or Automate using EXCEL


Hi All,

In this blog I am going to explain how to write Test Conditions (what are you going to do?) and Test Cases (how are you going to do it?).


I am using an Excel macro that will generate the TCN and TCA easily; the only thing is that you need to configure it.

I am not able to attach it here – if you need it, please drop me an email at aashiqamrita@gmail.com (please find the screenshots at the end of this blog).

Nobody likes being asked to write test cases 😦 Let me explain how to set up Test Conditions (TCN) and Test Cases (TCA).

As I said in my earlier blogs, we have the below types of DWH testing:

1. Loads from file to staging tables to IM tables
2. Loads from table to staging tables to IM tables
3. DIMN/FACT projects
4. Extract projects
5. Reports

You need to know how to reuse your TCN and TCA! Write them for one project and use FIND and REPLACE for other projects!! Confusing?

For loads from file to staging tables to IM tables, we will set up TCN and TCA to validate the source file, verify the data load into the staging tables, and verify the data transformation of the load into the target tables. Suppose you have prepared test cases for PROJECT_A, where the source file is File_1.DAT, the staging table is Staging_Table1 and the target table is Target_Table1; then reuse the same test cases for Project_B by replacing these with File_2.DAT, Staging_Table2 and Target_Table2.

Don’t understand??

For Project A I designed the steps, and for Project B I simply replaced them with the attributes related to Project B.
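The find-and-replace reuse described above can even be scripted. A minimal Python sketch (the test-case wording and file/table names are just the illustrative ones from this example):

```python
# Invented test-case template for PROJECT_A.
template = [
    "Verify File_1.DAT has arrived in the landing directory",
    "Verify record count of File_1.DAT matches Staging_Table1",
    "Verify Staging_Table1 data is transformed correctly into Target_Table1",
]

# FIND and REPLACE the Project A attributes with Project B's.
replacements = {
    "File_1.DAT": "File_2.DAT",
    "Staging_Table1": "Staging_Table2",
    "Target_Table1": "Target_Table2",
}

def reuse(cases, mapping):
    """Return a copy of the test cases with every old name swapped for the new one."""
    out = []
    for case in cases:
        for old, new in mapping.items():
            case = case.replace(old, new)
        out.append(case)
    return out

project_b_cases = reuse(template, replacements)
print(project_b_cases[0])  # Verify File_2.DAT has arrived in the landing directory
```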

Source File Validations (Steps) :

Image

Staging Loads :

Verification of staging is very easy; it is a one-to-one mapping, loading all the data from the source file or system into the intermediate tables.

Image

Staging Data into Target Tables:

Staging data (from one or more STG tables) will be loaded into one or more target tables, but we need to write test cases for each table.

Please cover the below scenarios:

1. Reconciliation check – the record count between the STG tables and the target tables is the same after applying the filter rules

2. Inserting a record that is not yet loaded into the target table for a given key combination

3. Copy records – sending the same records (same key) that are already loaded into the target tables; these should not be loaded again

4. Updating a record for a key when the value columns change in the Day_02 load

5. Logically deleting records in the target tables

6. Values loaded by ETL code

7. Values loaded by process tables

8. Values loaded by reference tables

(If you come across any special scenarios, please add them.)
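As an illustration of scenario 1, the reconciliation check can be sketched in SQL. The tables, the status column and the filter rule below are invented for the example, shown with Python’s sqlite3:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Invented STG and target tables; assume the filter rule drops status 'I' rows.
cur.execute("CREATE TABLE stg (id INTEGER, status TEXT)")
cur.execute("CREATE TABLE tgt (id INTEGER, status TEXT)")
cur.executemany("INSERT INTO stg VALUES (?, ?)",
                [(1, "A"), (2, "A"), (3, "I"), (4, "A")])
cur.executemany("INSERT INTO tgt VALUES (?, ?)",
                [(1, "A"), (2, "A"), (4, "A")])

# Scenario 1: counts must match after applying the same filter rule to STG.
stg_count = cur.execute("SELECT COUNT(*) FROM stg WHERE status = 'A'").fetchone()[0]
tgt_count = cur.execute("SELECT COUNT(*) FROM tgt").fetchone()[0]
print(stg_count == tgt_count)  # True -> reconciliation passed
```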

Write test cases for Target Table_1 and reuse them for the other tables.

Image

In the Excel sheet, just configure the rules as you require, feed in the source and target columns, and click the Create button; it will give you the Test Cases or Test Conditions as you configured.

Configuring :

Image

Click on Create then 🙂

Image

Hope you guys understood how to write test cases and test conditions for BI projects.

See you guys in my Next blog.

Cheers

Asik

Chapter 4 – Source Data – Static Verification


Hi Guys ,

So we are ready with test environments, test plans and test cases to begin static testing.

When we design our test cases, we need to write test cases and test conditions right from source validation through to reports testing.


1. Test cases to validate the source data.

What do you want to do with the source data? Why is it important? Why do we need to know the importance of the source data?

Let me explain with an everyday scenario!!!

You want to purchase a mobile, so what will you do? Start analyzing the market through different channels: [suggestions from friends], [surfing the Internet], [checking newspaper advertisements], [checking @ outlets]. You will not get the same details from all of these channels, so you note down all the data retrieved from the different channels and choose what you need!!!

Like the above scenario, a customer who approaches an organization will go through multiple channels and share information multiple times, but at the end, what you need for your reports will be designed, developed, tested and extracted by us 🙂

As far as I know, if you are 90% OK with the source data, then there is a high possibility of not finding even a single defect in any of the test phases, apart from careless mistakes or code defects introduced by developers. Defects introduced by developers can be fixed easily because the ETL code is under our control. If a defect is leaked because of the data, then the impact to the project is huge, because we need to go back to the source system and ask them to fix it!!!!

What do you need to verify in the source data?

Just think of yourself as an intermediary between a seller (source) and a buyer (target). Do you buy a defective product from your seller? Does your buyer buy defective products from you? No 😦 So please ensure you are buying a non-defective product and selling a 90% defect-free product to your customer.

Coming to Business Intelligence 😉

1. The data we need will be sent to our address (the Unix landing directory); the tester should verify the data has arrived at the correct address (landing directory).

2. Our source system will tell us in what format they are sending the files, so the tester should verify that the name of the file is as the source specified.

3. The date component in the file name is critical for us; this is the date when the file was extracted for us.

4. The source data will be sent to us in different formats; mostly it will be .DAT (data) or .DLY (daily) files.

5. Along with the DATA files, they will send a CTL file (an empty control file) to confirm all the files have arrived – as an acknowledgement copy 🙂

6. Now you have validated the address (landing directory) and the file names.

7. Now open the file; you will see numeric, alpha-numeric and alpha characters separated by [,] or [|] – you need to verify how they are separated.

8. Most files will have three sections – header, detail and trailer records. Testers should verify these sections are received as per the specification.

9. As a tester you should be interested in how the data is organized as per the specification. Does the order of the columns match the order given in the specification? Do the data types of the columns match the data types mentioned in the specification?

Order of the columns:

In the spec you have the order |Name|Place|Number|DOB|, but in the file you are getting |Name|DOB|Number|Place| – woohoo, it’s a defect 🙂

Number of columns:

In the spec you have the order |Name|Place|Number|DOB|, but in the file you are getting |Name|DOB|Place| – woohoo, it’s a defect 🙂

Data type of the columns:

In the spec you have |Name as VARCHAR|Place as VARCHAR|Number as INTEGER|DOB as DATE|, but in the file you are getting |Name as VARCHAR|Place as VARCHAR|Number as CHAR|DOB as DATE| – woohoo, it’s a defect 🙂
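Checks like the three above are easy to script. A minimal Python sketch comparing the file layout against the spec (using the illustrative columns from this example):

```python
# Spec and file layouts, using the illustrative columns from the examples above.
spec_columns = ["Name", "Place", "Number", "DOB"]
file_columns = "Name|DOB|Number|Place".split("|")

defects = []
# Column count check: a missing or extra column is a defect.
if len(file_columns) != len(spec_columns):
    defects.append("column count mismatch")
# Column order check: same columns, wrong order, is also a defect.
if file_columns != spec_columns:
    defects.append("column order mismatch")
print(defects)  # ['column order mismatch']
```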

File name specification:

What is the file name, how should it be named, what is the extension of the file (.DAT), and what is the business timestamp?
The file name should be as per the spec, because developers mention the exact file name in their DataStage file watcher job. If the file name is not as per the spec, the job fails due to the mismatch. The timestamp field (Date + Time) should also be validated; for example, if the spec says the date should be DDMMYYYY and the source sends the date as MMDDYYYY, then our DataStage job fails due to an invalid date.

File landing location:

Each project has a specific location to X-Com the files from the source system. Testers need to verify the file has arrived at the correct location.

(Note: if a tester is not able to view the file due to its size, he can view the records using Unix commands on the PuTTY server.)

File header specification:

In general, every file must have a header, and the header record will contain the date when the file was processed. The tester should be aware of both the date in the external file name and the date in the header details.
i) Date in external file name: the date when the file was X-Com’d.
ii) Date in header record: the date when the file was processed.
Suppose the SAP system extracts the file on 3/12/2013 and X-Coms it on 4/12/2013; then the external file name carries 4/12/2013 and the header date is 3/12/2013. The header is very important because the developer will use the header date as the EFFECTIVE date for data loads.

File detail records specification:

The detail records section of the file contains the actual data that needs to be loaded into the warehouse. Detail records are generally separated by a pipe delimiter, though this may change from source to source. Testers should be very conscious of the detail records.

File Position | Column Name    | Key Column | Null / Not Null | Data Type | Field Length
1             | User ID        | Y          | N               | INTEGER   | 255
2             | Dept ID        | Y          | N               | INTEGER   | 255
3             | Name           | N          | Y               | VARCHAR   | 255
4             | Contact Number | N          | Y               | INTEGER   | 18
5             | Address        | N          | Y               | DATE      | 255
6             | DOB            | N          | Y               | DATE      | 8

The testers should verify:
1. The columns are in the correct order as per the spec.

2. If a column is mentioned as a Key, then the source should send us only one record for that key combination; if we find more than one record after loading into the staging tables, raise a defect.

3. If a column is mentioned as Not Null in the spec, then the source should not send us NULLs (||); if we do get NULLs, our DataStage job fails due to NULLs present in Not Null columns, so raise a defect against the source system.

4. Data type verification is critical: if a column is defined as INT, then the source should send INT only; if not, please raise a defect against the source system.

5. Field length should be as per the spec. If the source sends less than what the spec says, it is OK; if it exceeds it, please raise a defect against the source system, because it will fail our DataStage load jobs.

File trailer record specification:

Trailer records tell you how many records are in the file, and the count may include the header and trailer records. If the file is related to transactional data, then we might also have a hash total count column.
The tester should use the trailer record count to verify the reconciliation between the source file and the staging table data.
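A minimal sketch of the trailer reconciliation in Python; the file content, the H/D/T record markers and the counting rule (trailer count includes header and trailer) are invented for illustration:

```python
# A tiny pipe-delimited file with header (H), detail (D) and trailer (T)
# records; assume the trailer count includes the header and trailer rows.
lines = [
    "H|EMP_FILE|03122013",
    "D|1|10|Asik",
    "D|2|10|John",
    "D|3|20|Mary",
    "T|5",
]

details = [line for line in lines if line.startswith("D|")]
trailer_count = int(lines[-1].split("|")[1])

# Reconcile: detail rows + header + trailer should equal the trailer count.
print(len(details) + 2 == trailer_count)  # True -> counts reconcile
```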

So now we have validated our source 🙂 In my next blog we are going to see how to deal with the requirement documents for execution.

Thanks – Asik

Chapter 2 – Business Intelligence Test Planning – [Estimation, Test Plan, Test Case Design]


The Test Planning phase consists of test estimation, test plan preparation and test case design.


Test Estimation

Test Estimation is an ART. Estimation is a variable component of the SDLC; however, it can be done well based on experience. The person who estimates the project timelines should be technically sound.

Test estimates are typically required to be produced at three stages:

  1. Ball Park Estimate (BPE) should be produced in the Plan & Analyse phase of the SDLC.
  2. High Level Estimate (HLE) should be produced towards the end of High Level Solution Design in the Design phase.
  3. Detailed Estimate (DE) should be produced towards the end of Detailed Design in the Build phase.

In the data warehouse world the estimates can differ for each of the components mentioned below:

  • Loads from file to staging tables to Integrated Model testing
  • Loads from table to staging tables to IM tables testing
  • DIMN /FACT projects testing
  • Extract projects testing
  • Reports testing
  • Non-functional tests (T-Pump, M-Load, file watcher, catch-up logic, purging loads, truncate loads)

Test Plan :

The Test Plan document identifies the risks to the program/project. The test analyst needs to ask as many questions as possible about the project; the document that details them is called the Test Plan. Please find below the common questions you need to ask.

What are you testing? When are you starting and finishing testing? Who is doing the testing? Are there any risks? What is in scope and out of scope? Who is giving data to you? To whom are you giving your outputs? Who all depend on you? How do you depend on others? What files/tables/extracts are you testing? How are you going to publish your reports? Are you maintaining test data in Quality Center?

  • The Test Strategy identifies the risks – this is ‘WHAT’ will be done. The Test Plan deals with the actions that will be taken to mitigate those risks – this is the ‘HOW’ and ‘WHO’.
  • The Test Plan may deal with a single test phase or multiple phases, depending on the number of authors and the size/complexity of each phase.

The typical test phases where an organization either authors or reviews Test Plans are System, Application Acceptance, System Integration, End-to-End and Production Assurance Testing.

Test Case Design

The difference between a Test Condition and a Test Case is that a Test Condition is “what” we need to test and a Test Case is
“how” we will test the Test Condition.

  • Test Conditions are the ‘What’ we need to test and are derived from the identified requirements.
  • Test Conditions individually identify the individual functional behaviors that make up a requirement (Note: This differs from what is currently done, namely attempting to cover more than one function in the one Test Condition).

For data warehouse projects we will have the below test scenarios:

1. Verify the reconciliation – are the counts between source and target the same after the ETL loads?

2. Verify inserting a new record into the warehouse – a brand-new record which is not yet loaded into the warehouse.

3. Verify how the delta record is processed – an update to an existing record in the warehouse.

4. Values that are loaded by ETL code.

5. Values that are loaded by DataStage jobs.

6. Deleting a record (expiring without updating).
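Scenarios 2 and 3 can be sketched as simple SQL checks; the table and column names below are invented for illustration (shown via Python’s sqlite3):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Invented warehouse table keyed on id; the Day_02 feed changes id 2's value.
cur.execute("CREATE TABLE wh (id INTEGER, city TEXT)")
cur.execute("CREATE TABLE day02 (id INTEGER, city TEXT)")
cur.executemany("INSERT INTO wh VALUES (?, ?)", [(1, "Chennai"), (2, "Pune")])
cur.executemany("INSERT INTO day02 VALUES (?, ?)",
                [(2, "Mumbai"), (3, "Delhi")])

# Scenario 2: brand-new keys that should be inserted into the warehouse.
inserts = cur.execute("""
    SELECT d.id FROM day02 d LEFT JOIN wh w ON d.id = w.id WHERE w.id IS NULL
""").fetchall()
# Scenario 3: existing keys whose value columns changed (delta/updates).
updates = cur.execute("""
    SELECT d.id FROM day02 d JOIN wh w ON d.id = w.id WHERE d.city <> w.city
""").fetchall()
print(inserts, updates)  # [(3,)] [(2,)]
```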

Folks, I will explain each of the above in detail in upcoming blogs. Please provide feedback to improve my blogs 🙂

Many Thanks – Asik