Top TCS Informatica ETL Interview Questions and Answers (2024)

As one of the leading IT companies in India, TCS (Tata Consultancy Services) frequently hires Informatica ETL developers to work on various data integration projects. If you’re preparing for an Informatica ETL interview at TCS, it’s crucial to be well-versed in both theoretical and practical aspects of this powerful ETL tool.

In this comprehensive article, we’ll cover some of the most commonly asked Informatica interview questions by TCS, along with their detailed answers. We’ll explore topics ranging from basic Informatica concepts to advanced scenarios, ensuring you’re fully prepared to tackle any challenge that comes your way.

Informatica Basic Interview Questions

  1. What do you mean by Enterprise Data Warehouse?

An Enterprise Data Warehouse (EDW) is a centralized repository that stores and manages enterprise data collected from multiple sources. It’s designed to facilitate data analysis, business intelligence, decision-making, and deriving valuable insights across the organization. EDWs allow users with proper privileges to access and utilize data from a single point of entry, streamlining data delivery through a single source.

  2. What is ETL (Extract, Transform, Load), and can you name some ETL tools?

ETL stands for Extract, Transform, and Load. It's a process that extracts data from various sources, transforms it into the desired format, and loads it into a target database or data warehouse; a small Python sketch of this flow appears after the lists below. Some popular ETL tools include:

  • IBM DataStage
  • Informatica PowerCenter
  • Ab Initio
  • Talend Studio

The primary functions of an ETL tool are:

  • Extracting data from sources
  • Analyzing, transforming, and cleaning data
  • Indexing and summarizing data
  • Loading data into the target data warehouse
  • Monitoring changes in source data
  • Restructuring keys
  • Tracking metadata
  • Updating data in the data warehouse
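
To make the Extract, Transform, Load flow concrete, here is a minimal, illustrative Python sketch. It is not Informatica code; PowerCenter builds this visually as a mapping, and the file names, column names, and data-quality rule below are hypothetical.

# Minimal ETL sketch (illustrative only; file and column names are hypothetical).
import csv

def extract(path):
    """Extract: read raw rows from a CSV source."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def transform(rows):
    """Transform: clean and reshape rows to match the target schema."""
    out = []
    for row in rows:
        if not row.get("customer_id"):        # drop rows failing a basic data-quality rule
            continue
        out.append({
            "customer_id": int(row["customer_id"]),
            "full_name": f'{row["first_name"].strip()} {row["last_name"].strip()}',
            "revenue": round(float(row.get("revenue") or 0), 2),
        })
    return out

def load(rows, path):
    """Load: write transformed rows to the target (a CSV standing in for a warehouse table)."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["customer_id", "full_name", "revenue"])
        writer.writeheader()
        writer.writerows(rows)

if __name__ == "__main__":
    load(transform(extract("customers_source.csv")), "dim_customer.csv")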

  3. What is Informatica PowerCenter?

Informatica PowerCenter is a powerful data integration tool that enables organizations to extract, transform, and load data from heterogeneous OLTP (Online Transaction Processing) systems into an enterprise data warehouse. It is widely used by organizations such as the US Air Force, Allianz, Fannie Mae, ING, and Samsung for building robust data warehouses.

PowerCenter is made up of the following key components:

  • PowerCenter Service
  • PowerCenter Clients
  • PowerCenter Repository
  • PowerCenter Domain
  • Repository Service
  • Integration Service
  • PowerCenter Administration Console
  • Web Service Hub

  4. Name the different types of transformations that are important in Informatica.

Informatica provides various transformations to perform specific functionalities and transform source data according to the target system’s requirements while maintaining data quality. Some important transformations include:

  • Aggregator Transformation: An active transformation used to compute averages, sums, and other aggregations across multiple rows or groups.
  • Expression Transformation: A passive transformation suitable for calculating values in a single row and testing conditional statements.
  • Filter Transformation: An active transformation used for filtering rows that don’t meet a given condition.
  • Joiner Transformation: An active transformation that joins data from two sources, whether heterogeneous or from the same source system.
  • Lookup Transformation: Used to retrieve relevant data by looking up a source, source qualifier, or target.
  • Normalizer Transformation: An active transformation used for normalizing de-normalized Cobol source data into multiple rows.
  • Rank Transformation: An active transformation used to select top or bottom rankings.
  • Router Transformation: An active transformation that provides multiple conditions for testing source data.
  • Sorter Transformation: An active transformation that sorts data based on a field in ascending or descending order.
  • Source Qualifier Transformation: An active transformation that reads rows from a flat file or relational source and transforms them into Informatica native data types.

  5. What is the difference between a connected lookup and an unconnected lookup?

  • Input: A connected lookup receives input values directly from the mapping pipeline; an unconnected lookup receives input only from the result of a :LKP lookup expression.
  • Invocation: A connected lookup is wired into the data flow and cannot be invoked from an expression; an unconnected lookup can be called from any transformation that supports expressions, even though it does not take input directly from other transformations.
  • Reuse: A connected lookup cannot be called more than once in a mapping; an unconnected lookup can be called multiple times.
  • Default values: A connected lookup supports user-defined default values; an unconnected lookup does not.
  • Cache: A connected lookup supports both static and dynamic caches, so it can keep its cache synchronized with the target; an unconnected lookup supports only a static cache.
  • Return values: A connected lookup can return more than one column value (multiple output ports); an unconnected lookup returns only one column value.
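
As a rough analogy only (plain Python, not Informatica syntax), the sketch below shows the practical difference: a connected lookup sits in the row pipeline and can add several columns to each row, while an unconnected lookup is invoked like a function from an expression and returns a single value. All names and data are made up.

# Connected vs unconnected lookup as a rough Python analogy (illustrative only).
CUSTOMER_LOOKUP = {                      # the lookup "table": customer_id -> (name, city)
    101: ("Asha", "Mumbai"),
    102: ("Ravi", "Chennai"),
}

def connected_lookup(row):
    """Connected: part of the pipeline, can return multiple columns into the row."""
    name, city = CUSTOMER_LOOKUP.get(row["customer_id"], (None, None))
    row["customer_name"], row["customer_city"] = name, city
    return row

def unconnected_lookup(customer_id):
    """Unconnected: called like a :LKP expression, returns exactly one value."""
    return CUSTOMER_LOOKUP.get(customer_id, (None, None))[0]

rows = [{"customer_id": 101, "amount": 250.0}, {"customer_id": 999, "amount": 80.0}]
piped = [connected_lookup(dict(r)) for r in rows]   # flows with every row
one_value = unconnected_lookup(102)                 # invoked only where an expression needs it
print(piped, one_value)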

Informatica Scenario-Based Interview Questions

  1. What are the different ways of parallel processing in Informatica?

Informatica supports parallel processing to improve session performance by processing data across multiple partitions at the same time. The partitioning types used for parallel processing are listed below (a rough Python sketch of how rows are assigned to partitions follows the list):

  • Database Partitioning: Queries the database for table partition information and reads partitioned data from corresponding nodes.
  • Round-Robin Partitioning: Distributes rows evenly across all partitions; useful when there is no need to group rows by a key.
  • Hash Auto-keys Partitioning: Groups data rows across partitions using a hash auto-key.
  • Hash User-Keys Partitioning: Groups data rows based on a user-defined partition key.
  • Key Range Partitioning: Passes data based on a specified range for each partition using one or more ports as compound partition keys.
  • Pass-through Partitioning: Passes all rows without redistributing them from one partition point to another.
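
The sketch below mimics, in plain Python, how round-robin, hash user-keys, and key-range partitioning decide which partition a row lands in. It is illustrative only; the key names, ranges, and hash function are made up and do not reflect PowerCenter internals.

# Illustrative partition-assignment logic (plain Python; keys and ranges are made up).
from itertools import cycle

rows = [{"order_id": i, "region": r} for i, r in enumerate(["N", "S", "E", "W", "N", "S"])]
NUM_PARTITIONS = 3

# Round-robin: distribute rows evenly regardless of their content.
rr = cycle(range(NUM_PARTITIONS))
round_robin = [(next(rr), row) for row in rows]

# Hash user-keys: rows with the same key value always land in the same partition.
hashed = [(sum(ord(c) for c in row["region"]) % NUM_PARTITIONS, row) for row in rows]

# Key range: route rows by a range defined on a chosen port (here order_id).
def key_range_partition(row):
    if row["order_id"] < 2:
        return 0
    if row["order_id"] < 4:
        return 1
    return 2

ranged = [(key_range_partition(row), row) for row in rows]
print(round_robin, hashed, ranged, sep="\n")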

  2. State the difference between mapping parameters and mapping variables.

  • Mapping parameters are constant values set in a parameter file before a session runs and remain the same until the session ends; to change a parameter value, the parameter file must be updated between session runs.
  • Mapping variables are values that can change during a session; the Integration Service saves the latest value to the repository for the next session run, and the value can be changed with functions such as SetMaxVariable, SetMinVariable, SetVariable, and SetCountVariable.

  3. What is OLAP, and what are its types?

OLAP (Online Analytical Processing) is a method used to perform multidimensional analyses on large volumes of data from multiple database systems simultaneously. In addition to managing large amounts of historical data, it provides aggregation and summation capabilities, as well as storing information at different levels of granularity to assist in decision-making.

The types of OLAP include:

  • DOLAP (Desktop OLAP)
  • ROLAP (Relational OLAP)
  • MOLAP (Multidimensional OLAP)
  • HOLAP (Hybrid OLAP)

  4. What is the scenario in which the Informatica server rejects files?

The Informatica server rejects records when it encounters rejections flagged in the Update Strategy transformation; these rejected rows can disrupt the data already loaded in the target database. This is a relatively rare scenario.

  5. What do you mean by a surrogate key?

A surrogate key, also known as an artificial key or identity key, is a system-generated identifier used to uniquely identify each record in a dimension table. Surrogate keys replace natural primary keys, making it easier to update the table and preserve historical information in Slowly Changing Dimensions (SCDs).
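
A hedged Python sketch of the idea: every new natural key gets its own system-generated surrogate key, independent of the source system's key. The column names are hypothetical; in PowerCenter this is typically done with a Sequence Generator transformation or a database sequence.

# Assigning surrogate keys to dimension rows (illustrative; column names are hypothetical).
from itertools import count

surrogate_seq = count(start=1)   # stands in for a database sequence / Sequence Generator
dim_customer = {}                # natural key -> dimension record

def upsert_dimension(natural_key, attributes):
    """Give every new natural key its own system-generated surrogate key."""
    if natural_key not in dim_customer:
        dim_customer[natural_key] = {"customer_sk": next(surrogate_seq),
                                     "customer_id": natural_key, **attributes}
    return dim_customer[natural_key]["customer_sk"]

print(upsert_dimension("CUST-001", {"name": "Asha"}))   # 1
print(upsert_dimension("CUST-002", {"name": "Ravi"}))   # 2
print(upsert_dimension("CUST-001", {"name": "Asha"}))   # 1 again: same natural key, same surrogate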

  6. Give a few mapping design tips for Informatica.

Here are some mapping design tips for Informatica:

  • Standards: Follow consistent naming conventions, environmental settings, documentation, and parameter file standards for long-term project benefits.
  • Reusability: Use reusable components like mapplets, worklets, and transformations to react quickly to potential changes.
  • Scalability: Design mappings with expected data volumes in mind so they continue to perform as volumes grow.
  • Simplicity: Opt for multiple simple mappings instead of one complex mapping for better clarity and maintainability.
  • Modularity: Utilize modular techniques in the design process.

  7. How can we improve the performance of the Informatica Aggregator Transformation?

To improve the performance of the Aggregator Transformation, consider the following (a short Python sketch of the sorted-input idea appears after the list):

  • Use sorted input to reduce the amount of data cached, improving session performance.
  • Filter unnecessary data before aggregation to reduce the size of the data cache.
  • Connect only the required inputs/outputs to subsequent transformations to reduce the size of the data cache.
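
The sorted-input tip can be illustrated outside Informatica: when rows arrive pre-sorted by the group key, an aggregator can emit each group as soon as the key changes instead of caching every group at once. The Python sketch below is illustrative only; the column names are made up.

# Why sorted input helps aggregation: only one group is held in memory at a time
# (illustrative Python, not PowerCenter internals).
from itertools import groupby
from operator import itemgetter

rows = [
    {"region": "EAST", "sales": 100.0},
    {"region": "EAST", "sales": 250.0},
    {"region": "WEST", "sales": 80.0},
]

rows.sort(key=itemgetter("region"))                 # sorted by the group key (the "sorted input")
for region, group in groupby(rows, key=itemgetter("region")):
    print(region, sum(r["sales"] for r in group))   # each group is emitted as soon as it ends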

Informatica Interview Questions for Experienced Candidates

  1. What are the different lookup caches in Informatica?

The different types of Informatica lookup caches are:

  • Static Cache
  • Dynamic Cache
  • Persistent Cache
  • Shared Cache
  • Recache from Database

  2. What is the difference between static and dynamic caches?

  • A static cache is built once when the first row is processed and is not inserted into or updated during the session; a dynamic cache is continuously updated as rows pass through, with new or changed rows added to the cache and passed on to the target.
  • A static cache can handle multiple matches for a lookup condition; a dynamic cache cannot.
  • A static cache can be used with both flat-file and relational lookups; a dynamic cache can be used only with relational lookups.
  • A static cache supports relational operators such as =, <>, and so on; a dynamic cache supports only the = operator.
  • A static cache can be used in both connected and unconnected lookup transformations; a dynamic cache can be used only in connected lookups.
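
As a rough analogy in plain Python (not PowerCenter internals): a static cache is built once and only read, while a dynamic cache is updated as rows flow through, so the same run can "see" rows it has just inserted. Table and column names are made up.

# Static vs dynamic lookup cache as a rough Python analogy (illustrative only).
target_table = {"P1": "Widget"}          # existing target rows: product_id -> name

static_cache = dict(target_table)        # built once, read-only for the whole session

dynamic_cache = dict(target_table)       # starts the same, but is updated as rows arrive
incoming = [("P2", "Gadget"), ("P1", "Widget"), ("P2", "Gadget")]

for product_id, name in incoming:
    if product_id not in dynamic_cache:  # roughly what the NewLookupRow logic decides
        dynamic_cache[product_id] = name # insert into the cache *and* the target
        target_table[product_id] = name
    # with a static cache, the second "P2" row could not know the first one was already inserted

print(static_cache)    # unchanged: {'P1': 'Widget'}
print(dynamic_cache)   # {'P1': 'Widget', 'P2': 'Gadget'}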

  3. What is the pmcmd command, and how is it used?

The pmcmd command in Informatica allows you to perform various tasks, such as:

  • Starting workflows
  • Starting a workflow from a specific task
  • Stopping or aborting workflows and sessions
  • Scheduling workflows

Example usages:

Start workflow:
pmcmd startworkflow -service informatica-integration-Service -d domain-name -u user-name -p password -f folder-name -w workflow-name

Start workflow from a specific task:
pmcmd starttask -service informatica-integration-Service -d domain-name -u user-name -p password -f folder-name -w workflow-name -startfrom task-name

Stop workflow and task:
pmcmd stopworkflow -service informatica-integration-Service -d domain-name -u user-name -p password -f folder-name -w workflow-name
pmcmd stoptask -service informatica-integration-Service -d domain-name -u user-name -p password -f folder-name -w workflow-name task-name

Schedule workflow:
pmcmd scheduleworkflow -service informatica-integration-Service -d domain-name -u user-name -p password -f folder-name -w workflow-name

Abort workflow and task:
pmcmd abortworkflow -service informatica-integration-Service -d domain-name -u user-name -p password -f folder-name -w workflow-name
pmcmd aborttask -service informatica-integration-Service -d domain-name -u user-name -p password -f folder-name -w workflow-name task-name
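
If you need to script these commands, one common approach is to call pmcmd from Python with subprocess. The sketch below simply wraps the startworkflow example above; it assumes pmcmd is on the PATH, and the service, domain, and credential values are the same placeholders used in the article.

# Wrapping the pmcmd startworkflow example in Python (sketch; values are placeholders).
import subprocess

def start_workflow(folder, workflow):
    cmd = [
        "pmcmd", "startworkflow",
        "-service", "informatica-integration-Service",
        "-d", "domain-name", "-u", "user-name", "-p", "password",
        "-f", folder, "-w", workflow,
    ]
    # check=True raises CalledProcessError if pmcmd returns a non-zero exit code.
    return subprocess.run(cmd, check=True)

start_workflow("folder-name", "workflow-name")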

  4. What do you mean by a mapplet in Informatica?

A mapplet is a reusable object in Informatica that contains a set of transformations, typically created using the Mapplet Designer. It allows you to reuse transformation logic across multiple mappings. There are two types of mapplets:

  • Active Mapplet: Contains at least one active transformation.
  • Passive Mapplet: Contains only passive transformations.

  5. What is the difference between a Router and a Filter transformation?

  • In a Router, rows that do not meet any group condition are captured in a default output group; in a Filter, rows that do not meet the condition are dropped.
  • A Router can divide incoming records into multiple groups based on specified conditions; a Filter does not divide records into groups.
  • A Router has a single input group and multiple output groups; a Filter has a single input group and a single output group.
  • A Router can have more than one condition specified; a Filter can have only a single filter condition.
  • A Router does not block any input rows; a Filter may block (drop) records.
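
A quick Python analogy of the difference (illustrative; the conditions and amounts are made up): a filter keeps or drops rows against a single condition, while a router sends each row to the first matching group and everything else to a default group.

# Filter vs Router as a Python analogy (illustrative only).
rows = [{"amount": 50}, {"amount": 500}, {"amount": 5000}]

# Filter: one condition; non-matching rows are simply dropped.
filtered = [r for r in rows if r["amount"] > 100]

# Router: multiple condition groups plus a default group for everything else.
groups = {"high": [], "medium": [], "DEFAULT": []}
for r in rows:
    if r["amount"] > 1000:
        groups["high"].append(r)
    elif r["amount"] > 100:
        groups["medium"].append(r)
    else:
        groups["DEFAULT"].append(r)

print(filtered)
print(groups)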

  6. Explain tracing levels in Informatica.

Tracing levels in Informatica determine how much data you want to write to the session log as you execute a workflow. Each transformation property window contains an option for setting the tracing level, which aids in error analysis and locating bugs in the process. The different tracing levels are:

  • None
  • Terse
  • Normal
  • Verbose Initialization
  • Verbose Data

  7. What is the difference between SQL Override and Lookup Override?

  • A Lookup Override limits the number of rows fetched into the lookup cache, avoiding a scan of the entire lookup table and saving time and cache space; a SQL Override limits how many rows come into the mapping pipeline.
  • A Lookup Override applies an ORDER BY clause by default; with a SQL Override it must be added to the query manually.
  • A Lookup Override supports only one type of join (non-equi join); a SQL Override can perform any kind of join by writing the query.
  • A Lookup Override returns only one record even if multiple records match the condition; a SQL Override does not have this limitation.

  8. What is the difference between the Stop and Abort options in the Workflow Monitor?

  • Stop halts the Integration Service from reading further data from the source, but lets it finish processing and writing the data it has already read; Abort terminates the currently running task outright.
  • With Stop, data that has already been read can still be written and committed to the targets; Abort allows a 60-second timeout for processing to finish, after which there is no guarantee that pending data is committed.
  • Stop does not kill any processes; Abort ends the DTM (Data Transformation Manager) process and terminates the active session.

  9. Explain what the DTM (Data Transformation Manager) process is.

The PowerCenter Integration Service (PCIS) starts an operating system process called the DTM (Data Transformation Manager) or pmdtm process to run sessions. Its primary role is to create and manage threads responsible for carrying out session tasks, including:

  • Reading session information
  • Forming dynamic partitions
  • Creating partition groups
  • Validating code pages
  • Running processing threads
  • Running post-session operations
  • Sending post-session emails

  10. Describe a workflow and name the components of the Workflow Manager.

A workflow in Informatica is a set of interconnected tasks that must execute in a specific order or sequence. It tells the Integration Service how to run those tasks, reflects a business's routine internal practices, generates output data, and performs routine management operations.

The Workflow Manager in Informatica provides the following tools to develop workflows:

  • Task Developer: Creates workflow tasks.
  • Worklet Designer: Groups multiple tasks together to form a worklet (an object that combines multiple tasks without scheduling information).
  • Workflow Designer: Creates workflows by connecting tasks and links and can also create tasks within the designer.

  11. What are the different types of tasks in Informatica?

The Workflow Manager allows you to create the following types of tasks to design a workflow:

  • Assignment Task
  • Command Task
  • Control Task
  • Decision Task
  • Email Task
  • Event-Raise Task
  • Event-Wait Task
  • Session Task
  • Timer Task

  12. What do you mean by incremental loading in Informatica?

Unlike full loading, where all data is processed on every run, incremental loading moves only the data that has been updated or newly created since the last load from the source to the target system (a minimal watermark-style sketch follows the list below). This approach provides the following benefits:

  • Reduces ETL process overhead and overall runtime by selectively loading data.
  • Reduces the risk involved, as failed or erroneous ETL load processes are less likely to occur with selective data processing.
  • Preserves data accuracy in the historical record, making it easier to determine the amount of data processed over time.
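
A minimal watermark-style sketch of the idea in Python (illustrative; the rows, column names, and the way the watermark is persisted are assumptions, not Informatica features):

# Incremental loading with a "last extracted" watermark (illustrative; names are assumptions).
from datetime import datetime

def extract_incremental(source_rows, last_loaded_at):
    """Pull only rows created or updated after the previous successful load."""
    return [r for r in source_rows if r["updated_at"] > last_loaded_at]

source_rows = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 15)},
    {"id": 3, "updated_at": datetime(2024, 2, 1)},
]

last_loaded_at = datetime(2024, 1, 10)               # persisted from the previous run
delta = extract_incremental(source_rows, last_loaded_at)
new_watermark = max(r["updated_at"] for r in delta)  # saved for the next run
print(delta, new_watermark)

In PowerCenter, a comparable effect is usually achieved with a mapping variable holding the last processed timestamp (updated via SetMaxVariable) and a source filter that compares against it.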

  13. Explain complex mapping and list its features.

Complex mapping refers to mappings that contain many requirements based on numerous dependencies. Even with just a few transformations, a mapping can be complex if the requirements have many business rules and constraints, including slowly changing dimensions.

Features of complex mapping:

  • Large and complex requirements
  • Complex business logic
  • Several transformations

  14. What is the importance of partitioning a session in Informatica?

Partitioning a session enables parallel data processing through the Informatica PowerCenter Partitioning Option. By dividing a large data set into smaller parts that are processed in parallel, it improves overall session performance and makes better use of the server's resources, and it is one of the main ways to optimize a session.

  15. What do you mean by the star schema?

The star schema is the simplest data warehouse schema comprising one or more dimensions and one fact table. It is called a “star” schema because of its star-like shape, with radial points (dimension tables) radiating from a center (fact table). The star schema is commonly used to build data warehouses and dimensional data marts.

  16. Explain dimensions in the context of data warehousing.

In a star schema data warehouse, dimension tables contain keys, values, and attributes of dimensions. A dimension table generally contains the descriptive or textual information about the facts contained within a fact table.
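
To make the fact/dimension relationship concrete, here is a tiny star schema joined in plain Python (illustrative; the tables, surrogate keys, and the "sales by city" question are made up):

# A tiny star schema: one fact table joined to dimensions via surrogate keys (illustrative).
dim_customer = {1: {"name": "Asha", "city": "Mumbai"},
                2: {"name": "Ravi", "city": "Chennai"}}
dim_date = {20240101: {"year": 2024, "quarter": "Q1"}}   # second dimension, shown for structure

fact_sales = [
    {"customer_sk": 1, "date_sk": 20240101, "amount": 250.0},
    {"customer_sk": 2, "date_sk": 20240101, "amount": 120.0},
]

# "Join" each fact row to its dimensions to answer a question such as: sales by city.
sales_by_city = {}
for f in fact_sales:
    city = dim_customer[f["customer_sk"]]["city"]
    sales_by_city[city] = sales_by_city.get(city, 0.0) + f["amount"]

print(sales_by_city)   # {'Mumbai': 250.0, 'Chennai': 120.0}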

Tips to Prepare for Informatica Interviews

  1. Research the Company: Familiarize yourself with Informatica's evolution, mission, values, goals, culture, products, and areas of focus. Talking about the company's background during the interview shows genuine interest and helps you tailor your answers.

FAQ

How to prepare for Informatica interview?

Work on real-world projects to get hands-on experience with Informatica tools like PowerCenter, Data Quality, and MDM, and brush up on SQL and database concepts, which are crucial for Informatica interviews.

What is the salary of TCS Informatica developer?

The average TCS Informatica ETL Developer salary in India is ₹5.9 lakhs for 2 to 7 years of experience, with Informatica ETL Developer salaries at TCS India ranging between ₹2.7 lakhs and ₹9.0 lakhs. According to these estimates, this is about 12% less than the average Informatica ETL Developer salary in India.

What is the salary of Informatica fresher in ETL?

  • Informatica ETL Developer (24 salaries reported): ₹5,15,000/yr
  • Informatica ETL Developer (21 salaries reported): ₹5,30,000/yr
  • Informatica ETL Developer (10 salaries reported): ₹3,48,500/yr

What is Informatica ETL tool?

Informatica is a data integration tool based on ETL architecture. It provides data integration software and services to businesses, industries, and government organizations, including the telecommunications, healthcare, and financial and insurance sectors.
