Top SAS DI Interview Questions And Answers
  • Q1) Explain Data Dimension? …
  • Q2) Explain Data Access? …
  • Q3) Define Data Governance? …
  • Q4) What is Exception reporting in DI? …
  • Q5) Explain multi-dimensional reporting? …
  • Q6) What is meant by dimension tables in DI? …
  • Q7) Explain the data staging area in DI?

SAS DI is SAS Data Integration Studio, a tool, like the Base SAS editor, for coding and reporting purposes. You can call it a drag-and-drop tool that comes with many built-in functions, macros, transformations, loaders and so on.

1) What is Data Integration?

Ans:

  • The process of combining data from different resources.
  • The combined data is presented to users in a unified view.
  • Integrating information from different enterprise domains is known as Enterprise Information Integration.
  • It is useful for merging information from different technologies across enterprises.

    2) What is transformation in SAS Data Integration?

    Ans:

    It is a metadata object which determines how to extract data, transform data and load data into data stores.

    3) What is the difference between unique key and primary key?

    Ans:

    A unique key is one or more columns that can be used to uniquely identify a row in a table. A table can have one or more unique keys, and unique keys can contain null values. A table, on the other hand, can have only one primary key, and the columns in a primary key cannot contain null values.
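
    As a quick hand-written illustration, both kinds of key can be declared as integrity constraints in PROC SQL; the table and column names below are invented for the example.

      proc sql;
         /* id is the primary key: values must be unique and non-missing */
         /* email is a unique key: values must be unique                 */
         create table work.customers
            (id    num      primary key,
             email char(50) unique,
             name  char(40));
      quit;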

    4) Explain Pivot – Columns to Rows?

    Ans:

  • Data Integrator produces a row in the output data set for every value in the designated pivot column (a Base SAS sketch of the same idea follows this list).
  • More than one pivot column can be set, as needed by the application’s data integration.
  • Pivot Sequence Column – Data Integrator increments a sequence number for every row created from a pivot column.
  • Non-Pivot Column – The columns that need to appear in the target.
  • Pivot Set – A group of pivot columns, a unique data field and a header column.
  • Data Field Column – Contains the pivot data along with the pivot column values.
  • Header Column – Lists the names of the columns.
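
    The Pivot transform above is a Data Integrator feature; purely as an illustration of the columns-to-rows idea, a comparable reshaping can be sketched in Base SAS with PROC TRANSPOSE (the data set and variable names are invented).

      /* Wide input: one row per id, with quarterly sales in separate columns */
      data work.sales_wide;
         input id q1 q2 q3 q4;
         datalines;
      1 100 120 90 140
      2 80 85 95 110
      ;
      run;

      /* Columns to rows: each quarter column becomes its own output row */
      proc transpose data=work.sales_wide
                     out=work.sales_long(rename=(_name_=quarter col1=amount));
         by id;
         var q1-q4;
      run;
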
    5) What are the benefits of data integration?

    Ans:

    Following are the benefits of data integration:

  • Makes reporting, monitoring and placing customer information across the enterprise flexible and convenient.
  • Supports risk-adjusted profitability management, as it allows accurate data extraction.
  • Allows timely and reliable reporting, as data quality is critical to meeting business challenges.

    6) Describe how to adjust the performance of Data Integrator?

    Ans:

    The following can be done to tune performance:

  • Setting target-based options to optimize the performance.

    7) What do you mean by data staging area?

    Ans:

    The staging area of the data warehouse is both a storage area and a set of processes, commonly referred to as extract, transform and load (ETL). The data staging area is everything between the operational source systems and the data presentation area.

    8) What is data governance?

    Ans:

    It is a robust, reliable, repeatable and controlled process, both at the point of input and through subsequent downstream control checks. This process exists to manage updates to business rules in order to maintain a level of consistency.

    9) What is data access?

    Ans:

    It is the access by selected business users to raw (untransformed) data loads.

    10) What is slowly changing dimension?

    Ans:

    This is a technique for tracking changes to dimension table values in order to analyze trends. For example, a dimension table named customers might have columns for customer id, home address and income. Each time the address or income changes for a customer, a new row could be created for that customer in the dimension table and the old row could be retained.
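
    A minimal sketch of what such a dimension table could look like, with invented values and hypothetical valid_from/valid_to columns marking which row is current:

      /* Two rows are kept for customer 101: the old row is retained, and */
      /* the new row reflects the changed city and income.                */
      data work.customers_dim;
         input customer_id city :$12. income valid_from :date9. valid_to :date9.;
         format valid_from valid_to date9.;
         datalines;
      101 Springfield 52000 01JAN2020 30JUN2022
      101 Shelbyville 55000 01JUL2022 31DEC9999
      ;
      run;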

    11) What is snow flake schema?

    Ans:

    A snowflake schema is one in which a single fact table is connected to multiple dimension tables, and the dimension tables are further normalized into related tables. The dimensions are structured to minimize update anomalies and to address single themes.

    12) How can we minimize the space requirement of a huge data set in SAS for window?

    Ans:

    When we are working with large data sets, we can take the following steps to reduce space requirements (a short sketch follows the list):

  • Split the huge data set into smaller data sets
  • Clean up our working space as much as possible at each step
  • Use data set options (keep= or drop=) or statements (keep or drop) to limit to only the variables needed
  • Use IF statement or OBS= to limit the number of observations
  • Use the WHERE= data set option, the WHERE statement or an index to optimize the WHERE expression and limit the number of observations in a PROC step or a DATA step
  • Use length to limit the bytes of variables
  • Use a _null_ data set name when we don’t need to create a data set
  • Compress the data set using system options or data set options (COMPRESS=yes or COMPRESS=binary)
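
    A short Base SAS sketch combining several of these options (the data set and variable names are placeholders):

      /* keep= and where= limit the variables and observations that are read, */
      /* obs= caps the number of observations, and compress= shrinks storage. */
      data work.small(keep=id region sales_flag compress=yes);
         set work.bigdata(keep=id region sales obs=100000 where=(sales > 0));
         length sales_flag $1;                    /* length limits variable size */
         sales_flag = ifc(sales > 1000, 'Y', 'N');
      run;
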
    13) What is star schema?

    Ans:

    A star schema is a design in which a single fact table is connected directly to multiple dimension tables; drawn as a diagram, the layout resembles a star.

    14) What are the SAS application server, database server, SAS OLAP server and SAS metadata server?

    Ans:

    A SAS application server provides SAS services to a client, while a database server provides relational database services to a client; Oracle, DB2 and Teradata are examples of relational databases. A SAS OLAP server provides access to multidimensional data, and a SAS metadata server provides metadata management services to one or more client applications.

    15) What is operational data and operational system?

    Ans:

    Operational data is data that is used as source data for a data warehouse, while an operational system is one or more programs that provide that source data.

    16) What is change analysis in SAS DI?

    Ans:

    Change analysis is the process of comparing one set of metadata to another set of metadata and identifying the differences between the two sets of metadata.

    17) What is the use of SAS Management Console?

    Ans:

    The SAS Management Console application provides a single user interface for performing SAS administrative tasks.

    18) Name some data transformations used in SAS DI?

    Ans:

    The data transformations include Append, Apply Lookup Standardization, Create Match Code, Data Transfer, Data Validation, Extract, Fact Table Lookup, Key Effective Date, Lookup, SAS Rank, SAS Sort, SAS Splitter, SCD Type 2 Loader, SQL Join, Standardize, Surrogate Key Generator, Transpose and User Written Code.

    19) Describe a metadata object?

    Ans:

    It is a set of attributes that describe a table, a server, a user or another resource on a network.

    20) Name the scheduler used for scheduling jobs and explain it?

    Ans:

    The scheduler used for scheduling jobs is Control-M. Control-M is also used to view process flows and dependencies so that business processes can be optimized easily and efficiently, even in a data center that includes multiple platform types (for example, UNIX, Microsoft Windows and MVS).

    22) Describe the intersection table in SAS DI?

    Ans:

    An intersection table is a table that describes the relationships between two or more tables. For example, an intersection table could describe the many-to-many relationships between a table of users and a table of groups.

    23) What are the prime responsibilities of a data integration administrator?

    Ans:

  • Scheduling and executing the batch jobs.
  • Configuring, starting and stopping the real-time services
  • Configuring and managing adapters.
  • Repository usage, Job Server configuration.
  • Access Server configuration.
  • Batch job publishing.
  • Real-time services publishing through web services.

    24) Explain the difference between alternate key, business key, foreign key, generated key, primary key, retained key and surrogate key?

    Ans:

  • Alternate key is another term for a unique key.
  • Business key is one or more columns in a dimension table that comprise the primary key in a source table in an operational system.
  • Foreign key is one or more columns that are associated with a primary key or unique key in another table. A table can have one or more foreign keys. A foreign key is dependent upon its associated primary or unique key. In other words, a foreign key cannot exist without that primary or unique key.
  • Generated key is used to implement surrogate keys and retained keys.
  • Primary key is one or more columns that are used to uniquely identify a row in a table. A table can have only one primary key, and the columns in a primary key cannot contain null values.
  • Retained key is a numeric column in a dimension table that is the primary key of that table.
  • Surrogate key is a column that contains unique integer values generated sequentially when rows are added and updated. In the associated fact table, the surrogate key is included as a foreign key in order to connect to specific dimensions (a small sketch follows this list).
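
    As an illustration of the surrogate key and foreign key ideas, a hand-coded sketch might look like the following; the table and column names are invented, and in SAS DI Studio jobs the Surrogate Key Generator transformation handles this step.

      /* Assign a sequential surrogate key to each dimension row */
      data work.product_dim;
         set work.product_source;        /* product_code is the business key  */
         product_sk = _n_;               /* surrogate key: sequential integer */
      run;

      /* The fact table carries the surrogate key as a foreign key */
      proc sql;
         create table work.sales_fact as
         select f.sale_date, f.amount, d.product_sk
         from work.sales_source as f
              inner join work.product_dim as d
              on f.product_code = d.product_code;
      quit;
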
    25) Explain Data Integrator metadata reports?

    Ans:

  • Metadata Reports provide browser-based analysis and reporting capabilities.
  • The DI Metadata Reports are generated from metadata associated with Data Integration jobs and with other BO applications that are associated with Data Integration.
  • Metadata Reports provide three modules:

  • Operational Dashboards.
  • Auto Documentation.
  • Impact and Lineage analysis.

    26) Explain the various caches available in Data Integrator?

    Ans:

  • NO_CACHE – values are not cached.
  • PRE_LOAD_CACHE – the result and compare columns are preloaded into memory before the lookup executes. It is used when the table can fit entirely in memory.
  • DEMAND_LOAD_CACHE – the result and compare columns are loaded into memory as the function executes. It is suitable when looking up highly repetitive values in a small subset of data.

    27) What is hierarchy flattening?

    Ans:

  • The construction of a parent/child relationship hierarchy is known as hierarchy flattening.
  • It produces a description of the hierarchy in a vertical or horizontal format.
  • The hierarchy pattern includes a parent column, child column, parent attributes and child attributes.
  • Hierarchy flattening makes it possible to understand the basic hierarchy of BI in a lucid manner.
  • Because the flattening is done in a horizontal or vertical format, the sub-elements are easily identified.

    28) Are data integration and ETL programming the same?

    Ans:

  • No, Data Integration and ETL programming are different.
  • Passing of data to different systems from other systems is known as data integration.
  • It may integrate data within the same application.
  • ETL, on the other hand, extracts data from different sources.
  • The primary job of an ETL tool is to transform the data and load it into other objects or tables.

    29) Describe physical data integration?

    Ans:

    Physical data integration is all about creating a new system that replicates data from the source systems. This is done to manage the data independently of the original systems. A data warehouse is an example of physical data integration. The benefits of PDI include data version management and the combination of data from various sources, such as mainframes, flat files and databases.

    30) Why is SAS Data Integration Studio important?

    Ans:

    Companies are realizing that in order to succeed they need an integrated view of their data and SAS Data Integration Studio is the single tool that provides the flexibility, reliability and agility needed to respond to new data integration challenges. Regardless of the project, SAS Data Integration Studio users can respond with speed and efficiency, reducing the overall cost of data integration.

    A job can be deployed as a SAS stored process in SAS Data Integration Studio. Code is generated for the stored process and saved to a file in a source repository, and metadata about the stored process is saved to the current metadata server. The stored process can then be executed as required by requesting applications. Stored processes can be used for Web reporting, analytics, building Web applications, delivering result packages to clients or the middle tier, and publishing results to channels or repositories. Stored processes can also access any SAS data source or external file and create new data sets, files, or other data targets supported by the SAS System.
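
    As a rough idea of what stored process code looks like when written by hand, a minimal example wraps its body in the %STPBEGIN and %STPEND macros; this is a simplified sketch, not the exact code that SAS Data Integration Studio generates.

      *ProcessBody;
      %stpbegin;

      /* Body of the stored process: any SAS code, here a simple report */
      proc print data=sashelp.class;
      run;

      %stpend;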

    Data integration is the process of combining data from different resources and presenting it to users in a unified view. The sub-areas of data integration are data warehousing, data migration and master data management.

    It is the definition of the customer, product and organization data to be held in a central location and accessed by all, with governance around change. A common set of dimensions is required to support all business views of the data and self-service reporting. The business view has the capability to segment customer, product and organization data across any dimension.

    Dimension tables are integral companions to a fact table. They contain the textual descriptions of the business. In a well-designed dimensional model, dimension tables have many columns or attributes, and these attributes describe the rows of the dimension table. Dimension attributes serve as the primary source of query constraints, groupings and report labels.

    With uniform data access integration, the consolidated data does not require separate storage space. Data history and version management is limited and applies only to similar types of data. Access to the user data can place a load on the source systems, because UDAI leaves the data in the source systems; a set of views is defined to provide the unified view to clients and customers. Data can be propagated from the source systems with zero latency.

    40) What is the difference between the CLASS statement and the BY statement in PROC MEANS?

    Ans:

  • Unlike CLASS processing, BY processing requires that your data already be sorted or indexed in the order of the BY variables.
  • BY group results have a layout that is different from the layout of CLASS group results (see the sketch after this list).
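
    A small sketch of the difference, using the SASHELP.CLASS sample table:

      /* CLASS: no sorting required; all groups appear in a single table */
      proc means data=sashelp.class mean;
         class sex;
         var height;
      run;

      /* BY: the input must already be sorted (or indexed) by the BY variable, */
      /* and each BY group is reported separately.                             */
      proc sort data=sashelp.class out=work.class_sorted;
         by sex;
      run;

      proc means data=work.class_sorted mean;
         by sex;
         var height;
      run;
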
    FAQ

    How do I prepare for a SAS interview?

    SAS Interview Questions
    1. List down the reasons for choosing SAS over other data analytics tools. …
    2. What is SAS? …
    3. What are the features of SAS? …
    4. Mention few capabilities of SAS Framework. …
    5. What is the function of output statement in a SAS Program? …
    6. What is the function of Stop statement in a SAS Program?

    What is SAS DI Studio?

    SAS Data Integration Studio is a visual design tool for building, implementing, and managing data integration processes regardless of data sources, applications, or platforms.
