
Importance of Non Functional Testing in Data warehouse


Hi All

In this post I would like to share my knowledge of non-functional testing in data warehouse testing.


There are different types of non-functional testing that we do in the testing world; some of them are:

  1. Baseline testing
  2. Compatibility testing
  3. Compliance testing
  4. Documentation testing
  5. Endurance testing
  6. Load testing
  7. Localization testing and Internationalization testing
  8. Performance testing
  9. Recovery testing
  10. Resilience testing
  11. Security testing
  12. Scalability testing
  13. Stress testing
  14. Usability testing
  15. Volume testing

To me, non-functional testing is testing that does not directly deliver business value; it is more about dealing with the environment. When we extract data from heterogeneous source systems, we might need to think about handling the following.

Verifying the volume of the data

No business can guarantee the exact volume of data they will send; they can only give an approximate figure. Our code should be capable of pulling the maximum number of records that the source system can send at any point in time. To manage the volume of data, Teradata has features called M-Load (MultiLoad) and T-Pump (TPump). When developers design the system, they fix a threshold that decides how the data is loaded into the warehouse.

Example:

  • M-Load – if we get a data file with 100 or more records, the records are loaded using the M-Load (MultiLoad) utility
  • T-Pump – if we get a data file with fewer than 100 records, the records are loaded using T-Pump (TPump)

What we need to test here is: send a file with 100 records and check that the records are loaded by M-Load. This can be verified using the load job names.
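The exact verification depends on how your project logs its load jobs, but as a minimal sketch (the staging table STG_CUSTOMER and the job-log table ETL_JOB_LOG below are hypothetical names, not from the original post), the check could look like this:

```sql
-- Hedged sketch: STG_CUSTOMER and ETL_JOB_LOG are hypothetical table names.

-- 1) The staging table should hold the same number of records as the file trailer says.
SELECT COUNT(*) AS loaded_count
FROM   STG_CUSTOMER
WHERE  load_date = DATE '2013-01-10';          -- expect 100 for a 100-record file

-- 2) The load job name should show that the M-Load job (not the T-Pump job) ran.
SELECT job_name, record_count
FROM   ETL_JOB_LOG
WHERE  load_date = DATE '2013-01-10'
  AND  job_name LIKE '%MLOAD%';                -- expect one row for the MultiLoad job
```

Repeating the same test with a file of fewer than 100 records should show the T-Pump job name instead.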

Verifying Date and Time of the Data file arrival to the Unix Landing directory

Most companies do not operate on weekends and public holidays, so our source systems will not send any transactional data on those days. Because of this, developers design their jobs to archive any files that arrive on those days.

Normally, Monday's transactional data comes to us for loading early on Tuesday morning, and so on through the week, ending with Friday's transactional data hitting us early on Saturday morning.

We as testers need to verify that these schedules work as per the specification. This can be achieved by the following (a query sketch follows the list):

  • Sending a file on a weekend and checking that the file is archived
  • Sending a file on a public holiday and checking that the file is archived
  • Verifying that Monday's transactional data is received on Tuesday morning, and so on up to Friday's data arriving on Saturday morning
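Part of this can also be confirmed from the database side, assuming the load jobs stamp each batch into an audit table; BATCH_AUDIT and its columns below are hypothetical names used only for illustration:

```sql
-- Hedged sketch: BATCH_AUDIT is a hypothetical audit table written by the load jobs.
-- Weekend files are archived rather than loaded, so this query should return no rows.
SELECT business_date, file_name, load_timestamp
FROM   BATCH_AUDIT
WHERE  business_date BETWEEN DATE '2013-01-07' AND DATE '2013-01-13'
  AND  TD_DAY_OF_WEEK(business_date) IN (1, 7);   -- 1 = Sunday, 7 = Saturday in Teradata
```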

Verifying Purging and Truncate Loads

I have already written about purging and truncate loads in my earlier blogs.

Purging – the AutoSys jobs purge the data, leaving only the required data in the staging table. Suppose I have loaded the 10th, 11th and 12th of January data into the staging table; when I load the 13th of January data, the 10th of January data is purged.

Truncate – simply load day_01 data; when you load day_02 data, the day_01 data is deleted.

We as testers need to verify that the truncate and purging happen as per the design requirement.
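As a rough sketch of those checks (STG_TXN is a hypothetical staging table with a load_date column, not a name from the original design):

```sql
-- Hedged sketch: STG_TXN is a hypothetical staging table.

-- Purge check: after the 13-Jan load, the 10-Jan data should be gone,
-- so the oldest remaining load_date must be 11-Jan or later.
SELECT MIN(load_date) AS oldest_day,
       MAX(load_date) AS newest_day
FROM   STG_TXN;

-- Truncate check: after the Day_02 load, no Day_01 rows should survive.
SELECT COUNT(*) AS leftover_day01_rows
FROM   STG_TXN
WHERE  load_date = DATE '2013-01-10';   -- expect 0
```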

Verifying File Watcher Script

There will be a file watcher script that looks for files until they arrive in the Unix landing directory. Say the source system promises to send the Day_01 file on 10-01-2013, so we set that date in the file watcher script. When the source system sends the records on 10-01-2013, the file watcher script reads the date from the file header; if the two match, it processes the file into the staging table. If the source system fails to send the data on 11-01-2013, the file watcher job looks for the file for a given time interval and, if it has not arrived, an automated email is sent to the concerned source system saying that the file has not arrived.

So we as testers need to verify that the file watcher job works as expected.
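The missed-file email itself has to be checked in the mailbox or job log, but the database side of the file watcher behaviour can be sketched roughly like this (STG_TXN is again a hypothetical staging table):

```sql
-- Hedged sketch: verify the 10-01-2013 file was picked up and processed.
SELECT COUNT(*) AS rows_loaded
FROM   STG_TXN
WHERE  business_date = DATE '2013-01-10';   -- expect the file's record count

-- Verify nothing was processed for 11-01-2013, the day the file never arrived.
SELECT COUNT(*) AS rows_loaded
FROM   STG_TXN
WHERE  business_date = DATE '2013-01-11';   -- expect 0
```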

Cheers – Asik.


Chapter 8 – Business Intelligence – Do you know how to make Coffee? Then you know (E)xtract, (T)ransform & (L)oad


Hi All,

[I am using civil engineering and coffee-making scenarios for better understanding for beginners]

The most widely discussed concept in the Business Intelligence world is ETL – Extract, Transform and Load. People who are already in the IT field call it the ETL Architecture 🙂 .

ETL Architecture :

Who uses the word "architecture" most commonly? Civil engineers. They talk about building architecture, so when you are constructing something you use the word architecture. You can't construct anything without prior planning, so a civil engineer will draw a diagram that shows how the building is going to look.

Likewise, in the Business Intelligence world we have solution designers (our civil engineers) who design the ETL system, and they are also called system architects!!!

So now you understand why we call it the ETL system architecture 🙂

Now let me explain how Extract Transform and Load process works.

How to make Coffee?

In the image below, Asik is preparing coffee for his wife 😦 .

Image

In the first step, he prepares the decoction (extracting the essence of the coffee from the beans).

In the second step, he transforms it into coffee by adding the correct proportion of milk and sugar.

In the third step, he loads the coffee into a container for drinking.

After drinking the coffee (testing), he found that it needed more sugar (a defect); he added it, and the coffee was ready to serve (reporting).

Now, how does ETL work?

Image

Developers are involved in designing the Extract, Transform and Load jobs. What is a job? To make it easy to understand, we can say it is a function.

These jobs are written using Unix scripting.

Image

The Extract job (decoction ready) brings all the data into the staging tables – testers need to verify that all the data has moved from the file to the staging table.

The Transform job (milk and sugar added) applies rules, filters the records and alters them as per the documents – testers need to verify that all the data is transformed as expected.

The Load job (coffee ready) – once the transform jobs have run successfully, the load jobs load the data into the intermediate tables (which will be used for the delta check on Day_02) – testers need to verify that the transformed data is loaded into the target tables.

In the load job we have inserts as well as deltas.
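As a minimal sketch of how these three steps are usually verified (the table names STG_CUSTOMER and IM_CUSTOMER and the simple transformation rule below are hypothetical, chosen only to illustrate the count check and the MINUS check):

```sql
-- Hedged sketch: STG_CUSTOMER (staging) and IM_CUSTOMER (intermediate/target) are hypothetical.

-- Extract check: staging row count should match the file trailer count.
SELECT COUNT(*) AS staging_count
FROM   STG_CUSTOMER
WHERE  load_date = DATE '2013-01-10';

-- Transform + load check: every transformed staging row should exist in the target,
-- so this MINUS query should return zero rows.
SELECT cust_id, TRIM(UPPER(cust_name)) AS cust_name   -- example transformation rule
FROM   STG_CUSTOMER
WHERE  load_date = DATE '2013-01-10'
MINUS
SELECT cust_id, cust_name
FROM   IM_CUSTOMER;
```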

Hope you guys understood the ETL Logic!!!

In my next blog, I will explain INSERTs, UPDATEs and logical DELETEs.

Cheers

Asik

Chapter 6 – Test Data Fabrication in Business Intelligence


Hi All,

Thanks for viewing my blog so far! I want to keep sharing my knowledge as much as I can – happy reading 🙂

In this blog I am going to explain about test data. In other testing practices you don't need predefined test data, because we are the users who enter the values, either in the front-end application or in some other application. But in data warehouse testing we are completely dependent on our upstream (front-end) applications. We test whatever data the upstream guys send us.

Image

We testers are the intermediate guys who buy the data from one team, validate and transform it, and sell it to the other guys 🙂

Image

Let me explain in everyday terms: you are running a business, and you need to buy goods from your supplier and sell them to your customer!

What do you need to do?

– You need to buy quality products

– You need to check the count against what you ordered

– You need to check the specification against what you asked for

– You need their support in fixing issues

– If they can't fix them, you need to repair the goods yourself and sell them to your customer

Now, coming to our Business Intelligence scenario:

As a tester, what do you need to do with the data?

– Check the quality of the data (is it as per the specifications?)

– Check that the count is correct using the trailer records (refer to my Source Data Validation blog)

– If the data is not as per the requirement, ask the upstream guys (the people who gave you the data) to fix the data issues

– Check that you have got all the data you require; if not, ask your upstream to provide more data, and if they refuse, please manufacture your own data 😦

As I have already said in my earlier blogs, we have the below types of DWH testing:

1. Loads from file to staging tables to IM tables
2. Loads from table to staging tables to IM tables
3. DIMN/FACT projects
4. Extract projects
5. Reports
Test data fabrication for staging table to target table loads is quite easy, but for extract testing and report testing it is a bit difficult 😦

For most of the positive scenarios the upstream will provide the data for us!! But we testers want to break the code, so we will have a few negative scenarios for which we will not get data from upstream. And for the scenarios that happen only once in a while we will not get the data either, so we need to create the data ourselves.

For extracts, the Teradata SQL code is written by joining more than 10 tables, and each table has its own WHERE clauses. After extracting the records using the SQL, if you find that a few scenarios are missing, you need to set up data that flows across all of those tables. (This will be explained in my Know about Extracts blog.)

As I said in my Test Environment blog about access, testers should have CREATE TABLE access in the test environment and SELECT access to the view database.

So, using the CREATE TABLE access, we can create our own test data as follows (a SQL sketch follows the steps):

Please read the image very carefully; if you don't understand, please send me an email: aashiqamrita@gmail.com

Image

1. Create a TEMP table in your test environment using a CREATE TABLE query.

2. Insert sample records into your TEMP table using an INSERT INTO query.

3. Modify the data as per your requirement.

4. Insert the modified data back into your actual table.

5. Your test data is ready to use 🙂
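As a minimal SQL sketch of those five steps (TEST_DB, STG_CUSTOMER, TEMP_CUSTOMER and the columns used below are hypothetical names, not from any real project):

```sql
-- Hedged sketch of the five steps above; all object names are hypothetical.

-- 1. Create a TEMP table with the same structure as the actual table.
CREATE TABLE TEST_DB.TEMP_CUSTOMER AS TEST_DB.STG_CUSTOMER WITH NO DATA;

-- 2. Insert sample records copied from the actual table.
INSERT INTO TEST_DB.TEMP_CUSTOMER
SELECT * FROM TEST_DB.STG_CUSTOMER SAMPLE 10;

-- 3. Modify the data to build the scenario you are missing.
UPDATE TEST_DB.TEMP_CUSTOMER
SET    account_status = 'CLOSED'        -- hypothetical column and value
WHERE  cust_id = 1001;

-- 4. Insert the modified data back into your actual table.
INSERT INTO TEST_DB.STG_CUSTOMER
SELECT * FROM TEST_DB.TEMP_CUSTOMER;

-- 5. Your test data is ready to use.
```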

Hope you guys understood the test data manufacturing technique. If you need more support, please comment on this blog 🙂

Thanks –

Happy Reading

Asik