INFORMATICA INTERVIEW QUESTIONS: 2014

What changes do we observe when we promote a non-reusable Sequence Generator to a reusable one? And what happens if we set the Number of Cached Values to 0 for a reusable transformation?

Ans. When we convert a non-reusable Sequence Generator to a reusable one, the Number of Cached Values is set to 1000 by default and the Reset property is disabled.
When we try to set the Number of Cached Values property of a Reusable Sequence Generator to 0 in the Transformation Developer we encounter the following error message:
The number of cached values must be greater than zero for reusable sequence transformation.

Suppose we have 100 records coming from the source. Now for a target column population we used a Sequence generator. Suppose the Current Value is 0 and End Value of Sequence generator is set to 80. What will happen?

Ans. End Value is the maximum value the Sequence Generator will generate. After it reaches the End value the session fails with the following error message:
TT_11009 Sequence Generator Transformation: Overflow error.
The session failure can be avoided by configuring the Sequence Generator to Cycle through the sequence: whenever the Integration Service reaches the configured End Value, it wraps around and starts the cycle again, beginning with the configured Start Value.
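
The End Value and Cycle behaviour described above can be imitated with a small Python sketch. This is only a toy model and not Informatica code: the function name generate, its parameters and the SequenceOverflow exception are all made up for illustration, and the sketch assumes the first value emitted is the Current Value.

```python
class SequenceOverflow(Exception):
    """Stands in for the session failure reported as TT_11009 (hypothetical name)."""


def generate(rows, start_value, current_value, end_value, increment=1, cycle=False):
    """Return one NEXTVAL per incoming row, honouring End Value and Cycle."""
    values = []
    nextval = current_value
    for _ in range(rows):
        if nextval > end_value:
            if not cycle:
                # Without Cycle there is nothing left to hand out.
                raise SequenceOverflow("end value %d exceeded" % end_value)
            # With Cycle, wrap around to the configured Start Value.
            nextval = start_value
        values.append(nextval)
        nextval += increment
    return values
```

Under these assumptions, generate(100, 0, 0, 80) raises SequenceOverflow before all 100 rows are served, while generate(100, 0, 0, 80, cycle=True) returns 100 values that restart from the Start Value once 80 has been used.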

Suppose we have a source table populating two target tables. We connect the NEXTVAL port of the Sequence Generator to the surrogate keys of both the target tables. Will the surrogate keys in both the target tables be the same? If not, how can we flow the same sequence values into both of them?

Ans. When we connect the NEXTVAL output port of the Sequence Generator directly to the surrogate key columns of the target tables, the sequence numbers will not be the same.
A block of sequence numbers is sent to one target table's surrogate key column. The second target receives a block of sequence numbers from the Sequence Generator transformation only after the first target table has received its block.
Suppose we have 5 rows coming from the source, so the targets will have the sequence values as TGT1 (1,2,3,4,5) and TGT2 (6,7,8,9,10) [assuming Start Value 0, Current Value 1 and Increment By 1].
Now suppose the requirement is to have the same surrogate keys in both the targets.
Then the easiest way to handle the situation is to put an Expression Transformation in between the Sequence Generator and the Target tables. The SeqGen will pass unique values to the Expression transformation, and then the rows are routed from the Expression transformation to the targets, so both targets receive the same value for each row.
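
The difference between the two wirings can be sketched in Python. This is a simplified model of the behaviour described in the answer, not of the Integration Service itself; the function names load_direct and load_via_expression are hypothetical.

```python
from itertools import count


def load_direct(rows, first_value=1):
    """NEXTVAL wired straight to both targets: each target draws its own block."""
    seq = count(first_value)
    tgt1 = [next(seq) for _ in rows]  # first target consumes its block
    tgt2 = [next(seq) for _ in rows]  # second target continues from there
    return tgt1, tgt2


def load_via_expression(rows, first_value=1):
    """NEXTVAL passes through one Expression port, then fans out to both targets."""
    seq = count(first_value)
    tgt1, tgt2 = [], []
    for _ in rows:
        key = next(seq)  # one value generated per row
        tgt1.append(key)
        tgt2.append(key)  # the same value reaches both targets
    return tgt1, tgt2
```

With 5 source rows, load_direct gives TGT1 the keys 1 to 5 and TGT2 the keys 6 to 10, while load_via_expression gives both targets 1 to 5.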

Define the Properties available in Sequence Generator transformation in brief.

Ans.

Sequence Generator Properties	Description
Start Value	Start value of the generated sequence that we want the Integration Service to use if we use the Cycle option. If we select Cycle, the Integration Service cycles back to this value when it reaches the end value. Default is 0.
Increment By	Difference between two consecutive values from the NEXTVAL port. Default is 1.
End Value	Maximum value generated by SeqGen. After reaching this value the session will fail if the sequence generator is not configured to cycle. Default is 2147483647.
Current Value	Current value of the sequence. Enter the value we want the Integration Service to use as the first value in the sequence. Default is 1.
Cycle	If selected, when the Integration Service reaches the configured end value for the sequence, it wraps around and starts the cycle again, beginning with the configured Start Value.
Number of Cached Values	Number of sequential values the Integration Service caches at a time. Default value for a standard Sequence Generator is 0. Default value for a reusable Sequence Generator is 1,000.
Reset	Restarts the sequence at the current value each time a session runs. This option is disabled for reusable Sequence Generator transformations.

What is a Sequence Generator Transformation?

Ans. A Sequence Generator transformation is a Passive and Connected transformation that generates numeric values. It is used to create unique primary key values, replace missing primary keys, or cycle through a sequential range of numbers. By default this transformation contains only two output ports, namely CURRVAL and NEXTVAL. We can neither edit nor delete these ports, nor can we add ports to this transformation. We can create approximately two billion unique numeric values, with the widest range being 1 to 2147483647.

Suppose we have the EMP table as our source. In the target we want to view those employees whose salary is greater than or equal to the average salary for their departments. Describe your mapping approach.

Our Mapping will look like this:

[Figure: Mapping using Joiner, http://png.dwbiconcepts.com/images/tutorial/info_interview/info_interview10.png]

To start with the mapping we need the following transformations:
After the Source Qualifier of the EMP table place a Sorter Transformation. Sort based on the DEPTNO port.

Next we place a Sorted Aggregator Transformation. Here we will find out the AVERAGE SALARY for each (GROUP BY) DEPTNO.
When we perform this aggregation, we lose the data for individual employees.
To maintain employee data, we must pass a branch of the pipeline to the Aggregator Transformation and pass a branch with the same sorted source data to the Joiner transformation to maintain the original data.
When we join both branches of the pipeline, we join the aggregated data with the original data.

So next we need a Sorted Joiner Transformation to join the sorted aggregated data with the original data, based on DEPTNO. Here we take the aggregated pipeline as the Master and the original data flow as the Detail pipeline.

After that we need a Filter Transformation to filter out the employees having salary less than average salary for their department.
Filter Condition: SAL>=AVG_SAL

Lastly we have the Target table instance.
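
The same Sorter, Aggregator, Joiner and Filter steps can be traced in a short Python sketch. It is only an analogy for the mapping logic: the function name is made up, and rows are assumed to be dictionaries carrying DEPTNO and SAL (any other column, such as ENAME in the usage note, is illustrative).

```python
from itertools import groupby
from operator import itemgetter


def at_or_above_dept_average(emp_rows):
    # Sorter: order the rows by DEPTNO.
    sorted_rows = sorted(emp_rows, key=itemgetter("DEPTNO"))

    # Sorted Aggregator branch: AVG(SAL) per DEPTNO (individual rows are lost here).
    avg_sal = {}
    for deptno, group in groupby(sorted_rows, key=itemgetter("DEPTNO")):
        salaries = [row["SAL"] for row in group]
        avg_sal[deptno] = sum(salaries) / len(salaries)

    # Joiner: bring the aggregate back onto the original sorted rows on DEPTNO.
    joined = [dict(row, AVG_SAL=avg_sal[row["DEPTNO"]]) for row in sorted_rows]

    # Filter: SAL >= AVG_SAL.
    return [row for row in joined if row["SAL"] >= row["AVG_SAL"]]
```

For example, with salaries 1000 and 3000 in department 10 and two salaries of 2000 in department 20, only the 3000 row survives from department 10, and both rows survive from department 20.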

Which transformations must not be placed between the sort origin and the Joiner transformation, so that we do not lose the input sort order?

Ans. The best option is to place the Joiner transformation directly after the sort origin to maintain sorted data. However do not place any of the following transformations between the sort origin and the Joiner transformation:

Custom
Unsorted Aggregator
Normalizer
Rank
Union transformation
XML Parser transformation
XML Generator transformation
Mapplet [if it contains any one of the above mentioned transformations]

Suppose we configure Sorter transformations in the master and detail pipelines with the following sorted ports in order: ITEM_NO, ITEM_NAME, PRICE. When we configure the join condition, what are the guidelines we need to follow to maintain the sort order?

Ans. If we have sorted both the master and detail pipelines in the port order ITEM_NO, ITEM_NAME and PRICE, we must follow these guidelines:

Use ITEM_NO in the First Join Condition.
If we add a Second Join Condition, we must use ITEM_NAME.
If we want to use PRICE as a Join Condition apart from ITEM_NO, we must also use ITEM_NAME in the Second Join Condition.
If we skip ITEM_NAME and join on ITEM_NO and PRICE, we will lose the input sort order and the Integration Service fails the session.
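
These guidelines amount to saying that the join ports must be a leading prefix of the sorted ports, in the same order. A minimal check, with a hypothetical function name, could look like this:

```python
def preserves_sort_order(sorted_ports, join_ports):
    """True when join_ports is a leading prefix of sorted_ports, in the same order."""
    sorted_ports = list(sorted_ports)
    join_ports = list(join_ports)
    return join_ports == sorted_ports[:len(join_ports)]
```

With the sort order ITEM_NO, ITEM_NAME, PRICE, joining on ITEM_NO alone or on ITEM_NO and ITEM_NAME passes the check, while joining on ITEM_NO and PRICE does not.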

How does the Joiner transformation treat NULL value matching?

Ans. The Joiner transformation does not match null values.
For example, if both EMP_ID1 and EMP_ID2 contain a row with a null value, the Integration Service does not consider them a match and does not join the two rows.
To join rows with null values, replace null input with default values in the Ports tab of the joiner, and then join on the default values.
Note: If a result set includes fields that do not contain data in either of the sources, the Joiner transformation populates the empty fields with null values. If we know that a field will return a NULL and we do not want to insert NULLs in the target, set a default value on the Ports tab for the corresponding port.
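
The null handling can be pictured with a small Python sketch, where None plays the role of NULL. The function name and the null_default parameter are assumptions made for the example; null_default stands in for a default value set on the Ports tab.

```python
def normal_join(master, detail, master_key, detail_key, null_default=None):
    """Equality join in which a null key never matches, unless a default replaces it."""
    def key_of(row, port):
        value = row[port]
        # Replace a null input with the default, if one was configured.
        return null_default if value is None else value

    result = []
    for d in detail:
        dk = key_of(d, detail_key)
        for m in master:
            mk = key_of(m, master_key)
            # A key that is still null is never considered a match.
            if dk is not None and dk == mk:
                result.append((m, d))
    return result
```

If both EMP_ID1 and EMP_ID2 hold a null, the call without null_default returns no joined rows, while passing a default such as -1 makes the two rows join on that default.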

Describe the impact of number of join conditions and join order in a Joiner Transformation.

We can define one or more conditions based on equality between the specified master and detail sources. Both ports in a condition must have the same datatype. If we need to use two ports in the join condition with non-matching datatypes we must convert the datatypes so that they match. The Designer validates datatypes in a join condition.
Additional ports in the join condition increase the time necessary to join two sources.
The order of the ports in the join condition can impact the performance of the Joiner transformation. If we use multiple ports in the join condition, the Integration Service compares the ports in the order we specified.
NOTE: Only equality operator is available in joiner join condition.

Define the various Join Types of Joiner Transformation.

In a normal join, the Integration Service discards all rows of data from the master and detail source that do not match, based on the join condition.
A master outer join keeps all rows of data from the detail source and the matching rows from the master source. It discards the unmatched rows from the master source.
A detail outer join keeps all rows of data from the master source and the matching rows from the detail source. It discards the unmatched rows from the detail source.
A full outer join keeps all rows of data from both the master and detail sources.
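
The four join types can be compared side by side with a toy Python joiner. This is a sketch under simplifying assumptions (a single equality port, rows as dictionaries, None for a missing side); the function name joiner and the join_type strings are chosen for the example.

```python
def joiner(master, detail, key, join_type="normal"):
    """Return (master_row, detail_row) pairs; None marks the missing side."""
    out = []
    matched_master = set()
    for d in detail:
        hits = [i for i, m in enumerate(master)
                if m[key] is not None and m[key] == d[key]]
        matched_master.update(hits)
        for i in hits:
            out.append((master[i], d))
        # Master outer and full outer keep every detail row.
        if not hits and join_type in ("master outer", "full outer"):
            out.append((None, d))
    # Detail outer and full outer keep every master row.
    if join_type in ("detail outer", "full outer"):
        for i, m in enumerate(master):
            if i not in matched_master:
                out.append((m, None))
    return out
```

With master keys 1 and 2 and detail keys 2 and 3, the normal join returns one pair, the master outer and detail outer joins return two each, and the full outer join returns three.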

What are the different types of Joins available in Joiner Transformation?

In SQL, a join is a relational operator that combines data from multiple tables into a single result set. The Joiner transformation is similar to an SQL join except that data can originate from different types of sources. The Joiner transformation supports the following types of joins:

Normal
Master Outer
Detail Outer
Full Outer

[Figure: Join Type property of Joiner Transformation]

Note: A normal or master outer join performs faster than a full outer or detail outer join.

Out of the two input pipelines of a joiner, which one will you set as the master pipeline?

During a session run, the Integration Service compares each row of the master source against the detail source. The master and detail sources need to be configured for optimal performance. To improve performance for an Unsorted Joiner transformation, use the source with fewer rows as the master source. The fewer unique rows in the master, the fewer iterations of the join comparison occur, which speeds the join process.
When the Integration Service processes an unsorted Joiner transformation, it reads all master rows before it reads the detail rows. The Integration Service blocks the detail source while it caches rows from the master source. Once the Integration Service reads and caches all master rows, it unblocks the detail source and reads the detail rows.
To improve performance for a Sorted Joiner transformation, use the source with fewer duplicate key values as the master source.
When the Integration Service processes a sorted Joiner transformation, it blocks data based on the mapping configuration and it stores fewer rows in the cache, increasing performance.
Blocking logic is possible if master and detail input to the Joiner transformation originate from different sources. Otherwise, it does not use blocking logic. Instead, it stores more rows in the cache.
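
Why the smaller source is the better master for an unsorted Joiner can be seen from a hash-join style sketch in Python. It only imitates the "cache all master rows, then read detail rows" order described above; the function name and the returned cache count are illustrative.

```python
from collections import defaultdict


def unsorted_join(master_rows, detail_rows, key):
    """Cache every master row first, then probe the cache with each detail row."""
    cache = defaultdict(list)
    for m in master_rows:
        cache[m[key]].append(m)
    cached_rows = sum(len(rows) for rows in cache.values())

    # Detail rows are read only after the master cache is complete.
    pairs = [(m, d) for d in detail_rows for m in cache.get(d[key], ())]
    return pairs, cached_rows
```

Swapping master and detail yields the same number of joined pairs, but the number of cached rows equals the size of whichever source is chosen as master.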

State the limitations where we cannot use Joiner in the mapping pipeline.

The Joiner transformation accepts input from most transformations. However, following are the limitations:

Joiner transformation cannot be used when either input pipeline contains an Update Strategy transformation.
Joiner transformation cannot be used if we connect a Sequence Generator transformation directly before the Joiner transformation.

What is a Joiner Transformation and why is it an Active one?

A Joiner is an Active and Connected transformation used to join source data from the same source system or from two related heterogeneous sources residing in different locations or file systems.
The Joiner transformation joins sources with at least one matching column. The Joiner transformation uses a condition that matches one or more pairs of columns between the two sources.
The two input pipelines include a master pipeline and a detail pipeline or a master and a detail branch. The master pipeline ends at the Joiner transformation, while the detail pipeline continues to the target.
In the Joiner transformation, we must configure the transformation properties namely Join Condition, Join Type and Sorted Input option to improve Integration Service performance.
The join condition contains ports from both input sources that must match for the Integration Service to join two rows. Depending on the type of join selected, the Integration Service either adds the row to the result set or discards the row.
The Joiner transformation produces result sets based on the join type, condition, and input data sources. Hence it is an Active transformation.

Suppose we have two Source Qualifier transformations SQ1 and SQ2 connected to Target tables TGT1 and TGT2 respectively. How do you ensure TGT2 is loaded after TGT1?

Ans. If we have multiple Source Qualifier transformations connected to multiple targets, we can designate the order in which the Integration Service loads data into the targets.
In the Mapping Designer, we need to configure the Target Load Plan based on the Source Qualifier transformations in a mapping to specify the required loading order.

Suppose we have a Source Qualifier transformation that populates two target tables. How do you ensure TGT2 is loaded after TGT1?

Ans. In the Workflow Manager, we can configure Constraint based load ordering for a session. The Integration Service orders the target load on a row-by-row basis. For every row generated by an active source, the Integration Service loads the corresponding transformed row first to the primary key table, then to the foreign key table.
Hence if we have one Source Qualifier transformation that provides data for multiple target tables having primary and foreign key relationships, we will go for Constraint based load ordering.

What is the maximum number we can use in Number Of Sorted Ports for a Sybase source system?

Ans. Sybase supports a maximum of 16 columns in an ORDER BY clause. So if the source is Sybase, do not sort more than 16 columns.

Describe the scenarios where we go for Joiner transformation instead of Source Qualifier transformation.

Ans. We use the Joiner transformation to join source data from heterogeneous sources and to join flat files. Use the Joiner transformation when we need to join the following types of sources:

Join data from different Relational Databases.
Join data from different Flat Files.
Join relational sources and flat files.

What happens if we include the keyword WHERE in the Source Filter property of the SQ transformation, say, WHERE CUSTOMERS.CUSTOMER_ID > 1000?

Ans. We use source filter to reduce the number of source records. If we include the string WHERE in the source filter, the Integration Service fails the session.

What will happen if the SELECT list COLUMNS in the Custom override SQL Query and the OUTPUT PORTS order in SQ transformation do not match?

Ans. A mismatch between the order of the selected columns in the query and the order of the connected output ports of the transformation may result in session failure.

Describe the situations where we will use the Source Filter, Select Distinct and Number Of Sorted Ports properties of Source Qualifier transformation.

Ans. Source Filter option is used basically to reduce the number of rows the Integration Service queries so as to improve performance.
Select Distinct option is used when we want the Integration Service to select unique values from a source, filtering out unnecessary data earlier in the data flow, which might improve performance.
Number Of Sorted Ports option is used when we want the source data to arrive sorted, so that downstream transformations such as Aggregator or Joiner can be configured for sorted input, which improves performance.

Suppose we have used the Select Distinct and the Number Of Sorted Ports property in the SQ and then we add Custom SQL Query. Explain what will happen.

Ans. Whenever we add a Custom SQL or SQL override query, it overrides the User-Defined Join, Source Filter, Number of Sorted Ports, and Select Distinct settings in the Source Qualifier transformation. Hence only the user-defined SQL query will be fired in the database and all the other options will be ignored.
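
How these Source Qualifier properties interact can be modelled with a small Python function that assembles a query string. This is a toy model and not how the Integration Service is implemented: build_query and its parameters are made up, and the sketch assumes the ORDER BY clause uses the first N ports in the list.

```python
def build_query(table, ports, source_filter="", select_distinct=False,
                sorted_ports=0, sql_override=""):
    """Compose a default query, or return the override untouched."""
    if sql_override:
        # The override wins; filter, distinct and sorted-port settings are ignored.
        return sql_override
    select = "SELECT DISTINCT" if select_distinct else "SELECT"
    query = "%s %s FROM %s" % (select, ", ".join(ports), table)
    if source_filter:
        # The WHERE keyword is added here, so the filter text must not repeat it.
        query += " WHERE " + source_filter
    if sorted_ports:
        query += " ORDER BY " + ", ".join(ports[:sorted_ports])
    return query
```

In this model a source filter written as WHERE CUSTOMERS.CUSTOMER_ID > 1000 produces a doubled WHERE keyword, which gives a feel for why the keyword must be left out of the Source Filter property.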

What happens to a mapping if we alter the datatypes between Source and its corresponding Source Qualifier?

Ans. The Source Qualifier transformation displays the transformation datatypes. The transformation datatypes determine how the source database binds data when the Integration Service reads it.
Now if we alter the datatypes in the Source Qualifier transformation or the datatypes in the source definition and Source Qualifier transformation do not match, the Designer marks the mapping as invalid when we save it.

What is a Source Qualifier? What are the tasks we can perform using a SQ and why is it an ACTIVE transformation?

Ans. A Source Qualifier is an Active and Connected Informatica transformation that reads the rows from a relational database or flat file source.

We can configure the SQ to join [Both INNER as well as OUTER JOIN] data originating from the same source database.
We can use a source filter to reduce the number of rows the Integration Service queries.
We can specify a number for sorted ports and the Integration Service adds an ORDER BY clause to the default SQL query.
We can choose the Select Distinct option for relational databases and the Integration Service adds a SELECT DISTINCT clause to the default SQL query.
Also we can write a Custom/User Defined SQL query which will override the default query in the SQ by changing the default settings of the transformation properties.
Also we have the option to write Pre as well as Post SQL statements to be executed before and after the SQ query in the source database.

The transformation provides the Select Distinct property. When it is enabled, the Integration Service adds a SELECT DISTINCT clause to the default SQL query, which affects the number of rows returned by the database to the Integration Service. Hence it is an Active transformation.

How is the Union transformation an Active transformation?

Active Transformation: a transformation that changes the number of rows reaching the Target.
Source (100 rows) ---> Active Transformation ---> Target (< or > 100 rows)
Passive Transformation: a transformation that does not change the number of rows reaching the Target.
Source (100 rows) ---> Passive Transformation ---> Target (100 rows)
Union Transformation: in a Union Transformation we may combine the data from two or more sources. Assume Table-1 contains 10 rows and Table-2 contains 20 rows. If we combine the rows of Table-1 and Table-2 we will get a total of 30 rows in the Target. So it is definitely an Active Transformation.

What is a Union Transformation?

Ans. The Union transformation is an Active, Connected, non-blocking, multiple input group transformation used to merge data from multiple pipelines or sources into one pipeline branch. Similar to the UNION ALL SQL statement, the Union transformation does not remove duplicate rows.
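
The UNION ALL behaviour is easy to picture in Python, where merging the input groups is plain concatenation. The function name union_all is chosen for the example only.

```python
from itertools import chain


def union_all(*input_groups):
    """Merge several input groups into one output group, keeping duplicates."""
    return list(chain.from_iterable(input_groups))
```

Merging a group of 10 rows with a group of 20 rows gives 30 output rows, and a row present in both groups appears twice in the output.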

What are the restrictions of Union Transformation?

All input groups and the output group must have matching ports. The precision, datatype, and scale must be identical across all groups.
We can create multiple input groups, but only one default output group.
The Union transformation does not remove duplicate rows.
We cannot use a Sequence Generator or Update Strategy transformation upstream from a Union transformation.
The Union transformation does not generate transactions.

How does a Sorter Cache work?

Ans. The Integration Service passes all incoming data into the Sorter Cache before Sorter transformation performs the sort operation.
The Integration Service uses the Sorter Cache Size property to determine the maximum amount of memory it can allocate to perform the sort operation. If it cannot allocate enough memory, the Integration Service fails the session. For best performance, configure Sorter cache size with a value less than or equal to the amount of available physical RAM on the Integration Service machine.
If the amount of incoming data is greater than the amount of Sorter cache size, the Integration Service temporarily stores data in the Sorter transformation work directory. The Integration Service requires disk space of at least twice the amount of incoming data when storing data in the work directory.

How does Sorter handle NULL values?

Ans. We can configure the way the Sorter transformation treats null values. Enable the property Null Treated Low if we want to treat null values as lower than any other value when it performs the sort operation. Disable this option if we want the Integration Service to treat null values as higher than any other value.
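
The effect of the Null Treated Low property on an ascending sort can be sketched as follows, with None standing in for NULL. The function name and the null_treated_low parameter are illustrative.

```python
def sort_with_nulls(values, null_treated_low=True):
    """Ascending sort in which nulls go first (low) or last (high)."""
    non_null = sorted(v for v in values if v is not None)
    nulls = [v for v in values if v is None]
    return nulls + non_null if null_treated_low else non_null + nulls
```

With the property enabled the nulls come out ahead of every other value; with it disabled they come out after every other value.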

How does Sorter handle Case Sensitive sorting?

Ans. The Case Sensitive property determines whether the Integration Service considers case when sorting data. When we enable the Case Sensitive property, the Integration Service sorts uppercase characters higher than lowercase characters.

Why is Sorter an Active Transformation?

Ans. When the Sorter transformation is configured to treat output rows as distinct, it assigns all ports as part of the sort key. The Integration Service discards duplicate rows compared during the sort operation. The number of Input Rows will vary as compared with the Output rows and hence it is an Active transformation.
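
A small Python sketch shows how the distinct option changes the row count. It is an analogy only: rows are modelled as tuples so that every port takes part in the sort key, and the function name sorter is made up.

```python
def sorter(rows, distinct=False):
    """Sort rows on all ports; with distinct, drop duplicates found during the sort."""
    ordered = sorted(rows)
    if not distinct:
        return ordered
    output = []
    for row in ordered:
        # After sorting, duplicates are adjacent, so compare with the last row kept.
        if not output or output[-1] != row:
            output.append(row)
    return output
```

Without distinct the output has as many rows as the input; with distinct the output can have fewer rows, which is what makes the transformation Active.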

What is a Sorter Transformation?

Ans. Sorter Transformation is an Active, Connected Informatica transformation used to sort data in ascending or descending order according to specified sort keys. The Sorter transformation contains only input/output ports.

What is a RANK port and RANKINDEX?

Ans. Rank port is an input/output port used to specify the column for which we want to rank the source values. By default Informatica creates an output port RANKINDEX for each Rank transformation. It stores the ranking position for each row in a group.

How does a Rank Transform differ from Aggregator Transform functions MAX and MIN?

Like the Aggregator transformation, the Rank transformation lets us group information. The Rank Transform allows us to select a group of top or bottom values, not just one value as in case of Aggregator MAX, MIN functions.
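
The contrast with MAX can be sketched in Python: a top-N selection per group returns several rows, each tagged with its position, whereas MAX returns a single value. The function name rank_top and its parameters are hypothetical, and tie handling in the real transformation may differ from this simple sort.

```python
from collections import defaultdict


def rank_top(rows, group_port, rank_port, top=3):
    """Keep the top N rows per group and attach a RANKINDEX to each."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_port]].append(row)

    output = []
    for members in groups.values():
        ordered = sorted(members, key=lambda r: r[rank_port], reverse=True)
        for position, row in enumerate(ordered[:top], start=1):
            output.append(dict(row, RANKINDEX=position))
    return output
```

For a department with salaries 100, 300 and 200, rank_top with top=2 returns the 300 and 200 rows with RANKINDEX 1 and 2, while a MAX over the same group would return only 300.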

What is a Rank Transform?

Rank is an Active Connected Informatica transformation used to select a set of top or bottom values of data.