Chapter 3 – Setting up Test Environment for BI Projects

Test Environment:

In general, BI projects have three environments:

Production – where our tested code runs in real time

Development – where developers develop and unit test the code

Test Environment (System Test / SIT / E2E / UAT) – where the developed code is deployed for testing

Setting up a test environment for a Business Intelligence project is critical because the data needed for each level of testing is different. Developers develop the ETL code in the DEV (development) environment, and when the testing phase kicks off they point their code to the test environment.

Why do we need a separate environment?

1. ACCESS and Test Environment

Developers are the ones who design the code, and they keep changing it until it works. We don’t have version control in DataStage or Teradata, the development environment is an open space for all the developers involved, and there is a high possibility of irregular updates to the code, so the DEV environment is loosely controlled. Developers have INSERT / UPDATE / DELETE access to all the objects in the DEV and Test environments, but testers have only read (SELECT) access, and only on views (I will explain what a table and a view are in my next blog). Testers can only verify that the data is as per the ETL code; they cannot update any record to make it correct as per the requirement. Developers should also agree that whatever defects are found in our environment are to be addressed in the DEV environment.

2. Data and Test Environments 

The data used for developing the code should be different from the data used for testing. Developers create their own test data for unit testing; they are happy when the functionality works as expected, so they are not interested in source data quality. If any data causes trouble for them, they simply delete it and load the rest of the data. Testers, however, should be very conscious about the data, because data is the key for us to proceed with testing. So testers should have their own data in their own environment; they should not depend on the source data that lives in the development environment.

What components does a tester need access to?

1. File Landing Directory – If your project is a file-to-table load, the source system will send the files to a specified location. This location is different for Development and Testing. Testers should get access to this directory.

2. SELECT access to Source Views – Testers can easily check whether they have SELECT access to the tables by simply querying them. If you do not have access, create a GRANT statement and send it to your DBA.

[GRANT SELECT ON DB_Name TO User_ID;]
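
A quick way to confirm the access before test execution starts is to run a lightweight query against each source view. A minimal sketch – Test_DB and Source_View are placeholder names for your own source database and view:

SELECT TOP 1 * FROM Test_DB.Source_View;   -- placeholder names; succeeds only if SELECT access is in place

If this query runs without an access error, the grant is in place.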

Good practices while availing the access:

  1. A tester should run his queries against views, not tables, so leads should make sure that testers do not have SELECT access to tables before they start execution.
  2. Developers create views over the tables, and when they deploy the code into PRODUCTION their code points to the VIEWs, not the TABLEs.
  3. Developers might have introduced filter statements in the views in order to stop duplicates, and these can cause errors. If testers validate the target data using tables, errors related to the views cannot be identified.
  4. Testers should not have INSERT, UPDATE or DELETE access to source and target tables.
  5. Testers should have CREATE TABLE access on the test database for manufacturing data (this will be explained in the Test Data Management blog). A sketch of the corresponding GRANT statements follows this list.
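
A rough sketch of those grants – Target_Views_DB, Tester_Work_DB and Tester_ID are placeholder names, and your DBA team will have its own standards:

GRANT SELECT ON Target_Views_DB TO Tester_ID;        -- read-only access, and only on the view database
GRANT CREATE TABLE ON Tester_Work_DB TO Tester_ID;   -- a separate work database for manufacturing test data
-- no INSERT / UPDATE / DELETE is granted on the source or target table databases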

Availability of the Source Data in Test Environment:

If the data load is into existing warehouse tables, check that historical data is available in the test environment by simply selecting from the table. If there are no records in the test environment, request the developer to copy sample records from PRODUCTION into the test environment.

[INSERT INTO Test_DB.Table_Name SELECT * FROM Prod_DB.Table_Name SAMPLE 1000;]

Q: Why should we have existing data in the test environment if the load is into an existing table?

A: Because the load into an existing table should only add delta records; it should not disturb the records that already exist. This can be verified using the record count before and after the load – the existing count should not be disturbed (a minimal check is sketched below).
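
A minimal sketch of that check, assuming a snapshot copy of the target is taken before the run (all object names here are placeholders):

-- before the ETL run: keep a copy of the existing target data
CREATE TABLE Test_DB.Target_Table_Before AS (SELECT * FROM Test_DB.Target_Table) WITH DATA;

-- after the load: every pre-existing row should still be present and unchanged
SELECT * FROM Test_DB.Target_Table_Before
MINUS
SELECT * FROM Test_DB.Target_Table;

The MINUS should return no rows, and a simple COUNT(*) before and after the load gives the same comfort at a higher level.
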
If the data load is into new warehouse tables, then the tables should not contain any data in the test environment.

Q: Why should the new target tables not contain any data?

A: Because these tables do not exist in PRODUCTION yet, and we are the ones who are going to load data into them, so they should not contain any data.
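
A one-line sanity check before the first test run (the table name is a placeholder):

SELECT COUNT(*) FROM Test_DB.New_Target_Table;   -- placeholder name; the expected result is 0

Anything other than zero means leftover data from an earlier deployment, and it should be questioned before testing starts.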

Reference Tables in Test Environment 

MAP and TYPE tables are used for referential integrity. A project can use the existing MAP or TYPE tables, or it can create its own based on the project requirements.

  1. Testers should verify that all the MAP and TYPE tables mentioned in the S2T (even those that are not used in an S2T transformation) are created in the Test Environment.
  2. If a MAP or TYPE table already exists, testers should verify that its data in the Test Environment and in the PROD environment is the same (a comparison query is sketched after this list).
  3. If the MAP or TYPE tables are created project specific, testers should verify that the data is inserted correctly as per the INSERT statements provided in the S2T.
  4. Testers should cross-check that the values inserted into the MAP and TYPE tables match the DDS (refer to the Appendix section for the reference data). If you find any data missing from or added to a table, please raise a defect to the data designer.
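
For the TEST-versus-PROD comparison in point 2, a two-way MINUS is usually enough. A minimal sketch (the database and table names are placeholders):

-- rows present in PROD but missing in TEST
SELECT * FROM Prod_DB.Map_Table
MINUS
SELECT * FROM Test_DB.Map_Table;

-- rows present in TEST but not in PROD
SELECT * FROM Test_DB.Map_Table
MINUS
SELECT * FROM Prod_DB.Map_Table;

Both queries should return zero rows when the reference data is in sync.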

Now we have everything in our environment 🙂  In the next blog we will move on to validation of the specification.

See you @ my next Blog.

Regards – Asik

Chapter 2 – Business Intelligence Test Planning – [Estimation, Test Plan, Test Case Design]

The Test Planning phase consists of Test Estimation, Test Plan preparation and Test Case design.

Test Estimation

Test estimation is an art. Estimation is a variable component of the SDLC; however, it can be done reliably based on experience. The person who estimates the project timelines should be technically sound.

Test Estimates are typically required to be produced at three stages:

  1. Ball Park Estimate (BPE) should be produced in the Plan & Analyse phase of the SDLC.
  2. High Level Estimate (HLE) should be produced towards the end of High Level Solution Design in the Design phase.
  3. Detailed Estimate (DE) should be produced towards the end of Detailed Design in the Build phase.

In the data warehouse world, the estimates can differ for each of the components mentioned below:

  • Testing loads from files to staging tables to Integrated Model (IM) tables
  • Testing loads from source tables to staging tables to IM tables
  • DIMN / FACT project testing
  • Extract project testing
  • Report testing
  • Non-functional tests (TPump, MLoad, file watcher, catch-up logic, purging loads, truncate loads)

Test Plan:

The Test Plan document identifies the risks to the program / project. The test analyst needs to ask as many questions as possible about the project and detail them in a document, which is called the Test Plan. Here are common questions that you need to ask.

What are you testing? When are you starting and finishing testing? Who is doing the testing? Are there any risks? What is in scope and out of scope? Who is giving data to you? To whom are you giving your outputs? Who depends on you? How do you depend on others? What files / tables / extracts are you testing? How are you going to publish your reports? Are you maintaining testing data in Quality Center?

  • This is the ‘WHAT’ that will be done: the Test Plan deals with the actions that will be taken to mitigate those risks.
  • This is the ‘HOW’ and ‘WHO’: the Test Plan may deal with a single test phase or with multiple phases, depending on the number of authors and the size / complexity of each phase.

– This will be identified in the Test Strategy. The typical test phases where an organization either authors or reviews Test Plans are System, Application Acceptance, System Integration, End-to-End and Production Assurance Testing.

Test Case Design

The difference between a Test Condition and a Test Case is that a Test Condition is “what” we need to test and a Test Case is
“how” we will test the Test Condition.

  • Test Conditions are the ‘What’ we need to test and are derived from the identified requirements.
  • Test Conditions individually identify the functional behaviors that make up a requirement (note: this differs from what is currently done, namely attempting to cover more than one function in a single Test Condition).

For data warehouse projects we will have the test scenarios below:

1. Verify the reconciliation – are the counts between source and target the same after the ETL load? (A count query is sketched after this list.)

2. Verify inserting a new record into the warehouse – a brand new record that has not been loaded into the warehouse before.

3. Verify how a delta record is processed – an update to an existing record in the warehouse.

4. Values that are loaded by the ETL code.

5. Values that are loaded by DataStage jobs.

6. Deleting a record (expiring it without updating).
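
As a small illustration of scenario 1, the reconciliation count can be taken in a single query. A minimal sketch – the source and target names are placeholders:

SELECT 'SOURCE' AS side, COUNT(*) AS row_count FROM Source_DB.Source_View   -- placeholder names
UNION ALL
SELECT 'TARGET' AS side, COUNT(*) AS row_count FROM Target_DB.Target_View;

The two counts should match once any records rejected or filtered by the ETL are accounted for.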

Folks, I will explain each of the above in detail in upcoming blogs. Please provide feedback to improve my blogs 🙂

Many Thanks – Asik

Chapter 1 – Business Intelligence and Testing

Business Intelligence and Testing!!!!

Intelligence is a very general mental capability that involves the ability to reason, plan, solve problems, comprehend complex ideas, learn quickly and learn from experience.

Put more simply – it is “catching on,” “making sense” of things, or “figuring out” what to do.

Business Intelligence is collecting the data from the various sources of an organization to aid in planning, solving business problems and learning from experience, thereby taking proper decisions to expand the business and maximize profits.

Let’s understand what a warehouse is and how we can connect it to Information Technology!!

Consider a person who wants to set up a provision store and wants to know how to stock the goods to maintain the supply and demand life cycle. He’ll start a shop in a small room and get goods in limited quantities – 100 kg rice, 50 kg wheat, 100 kg dal, 3-4 varieties of chocolates, 3-4 varieties of soaps – and start his business.

People start coming to his store and buying things. In 15 days his stock is almost sold out, and then he analyses how the sales went. To do this he maintains all the data on paper and calculates (adds, subtracts, multiplies, divides).

He’ll continue this for some time and forecast from the data he has collected.

  • If the analysis says that the goods are selling as per his calculations and he is making a profit, then he’ll start getting goods in large quantities. Since the store is very small he cannot accommodate goods in bulk, so he will set up a warehouse where he can store all the goods, and the small quantity of goods required for a month he’ll move to the provision store for selling.
  • If the analysis says that a few goods in his shop are not selling and he is not making the profit he expected, but he is not making a loss either, then he can change the goods he is selling and repeat his analysis to earn profits.
  • If the analysis says that the goods in his shop are not selling at all and he is making a huge loss, then he might have to stop his business.

As a shopkeeper he needs to analyse the goods available and the sales in the warehouse to decide whether to continue or stop buying things (forecasting). He creates different racks to keep different goods so that he can access them easily and quickly, and he should know where he stored the goods and how they are related. For example, he would store all the groceries in one rack, home maintenance goods in another rack, canned foods in another, and so on. If he stores the goods logically and physically in this way, he will be able to know his stock easily.

Here is the basic analysis to be done by the shopkeeper before he decides to set up a warehouse.
How are demand and supply in the warehouse? To answer this he needs to answer the following questions.

  • Find out the existing stock of goods
  • Does he need new goods? – Inserting new records in the data warehouse
  • Does he need more of the existing goods? – Updating records in the data warehouse
  • Does he need to remove goods which are not in demand? – Deleting records in the data warehouse

If a small provision store needs to decide its progress based on the available data, we can imagine the power of data in any big organization. Just think – 10 years ago a 500 MB USB drive was sold for INR 2500.00, but now we can get a 500 GB flash drive for INR 2500.00. What does that mean? The data has grown, so we need more space for a lesser price!!

As of today we have a population of 7,195,883,823 in this world. Most of them are connected with one or more public or private organizations – it could be banks, schools, colleges, employers and so on. Like the shopkeeper above, these organizations need to maintain the data relating to all these people, so experts in the Information Technology industry are developing robust techniques to maintain and validate that data.

In today’s world, all of us are connected through the internet, be it social networking, e-commerce, online banking, etc. So the experts in the Information Technology industry are developing robust techniques to maintain and validate such huge data and to assist businesses in completely changing the way they have been working.

Just as the analysis of warehouse data and customers’ buying trends helps a shopkeeper plan his strategy for buying goods to maximize profits, any organization needs to analyze its own data (internal data about the various departments and people in the organization) and its customer data to forecast its business and strategize to maximize profits. So organizations are seeking Information Technology specialists’ expertise for designing a data warehouse. Technology specialists are proposing solutions at various levels to capture and maintain the data for analysis and forecasting.

This gave birth to a closed group called Business Intelligence (Group Information Systems). There are many roles involved in capturing the customer data. Let’s discuss the responsibilities of a few of them.

  1. Solution Architects, who design the whole system:
    a. How is the data retrieved from the source?
    b. How is the retrieved data loaded into the warehouse?
    c. How is the data used for analytic purposes?
  2. Business Analysts, who:
    a. Analyze the business needs
    b. Create the rules for capturing the right data from the source systems
  3. Data Analysts, who design how the source system and the target (warehouse) interact with each other:
    a. What data can be loaded?
    b. How should it be sent into the data warehouse?
  4. Developers, who convert the business needs into code that will:
    a. Extract the data from the source system
    b. Load the data into the warehouse based on the rules provided by the data modelers and business analysts
  5. Testers, who validate:
    a. Whether the source is sending the data it agreed to send
    b. The data received, based on the business needs
    c. The transformation rules defined by the business analysts in the Source-to-Target (S2T) document
    d. That the code developed by the developers follows the business / transformation rules provided

Folks – please comment on how useful this write-up is!!! I will be posting about Test Planning (Estimation, Test Plan, Test Cases and Test Conditions) in my next blog.

Many Thanks – Asik

This post has been updated with the valuable inputs of Harika Dommeti (Thanks :-))