Looking for ETL interview questions? Here is the list of most important questions that can be asked in interviews. Make yourself ready with this comprehensive list of interview questions.
ETL Interview Questions:
- The models such as budgeting, financial reporting, allocations, etc. are heavy on calculations and do they not translate well into SQL. Among Multi-dimensional, Real Time and Relational OLAP, Which type of OLAP is not suitable when computations have to be carried out on these models and why?
- What do you know about Operational data store in data warehousing?
- How is a distributed data warehouse different from a virtual data warehouse?
- Do you know about a data cube? What is the significance of data cubes?
- You have a product dimension and a sales fact table in your data warehouse. A new product is created in the OLTP system and sales transactions happen for that product and then you extract the OLTP system. Due to this you get only the sales transaction into the staging environment and not the products. Why did this happen?
- Can you explain me the difference between parametric models and nonparametric models of predicting? Also given one example of each.
- Shown is a Graph Database model skeleton. Consider N be the nodes (representing entity), E be the edges (represents relationships among entities). How do you represent entity attributes, plus metadata in such data model?
- What can you say about any column if a database has high cardinality value?
- Among fact table and a dimension table which one remains in denormalized form?
- A business model has employee details, company details, and salary details. The model (schema) has only one foreign key i.e. Employee_ID for all the dimensions which are necessary. What type of model is being referred to here?
- Do you know about association rule mining? Explain with one of its practical use case.
- Suppose i1, i2, i3, i4, i5, i6, i7, i8 are the items sold at a grocery store. Customers can buy any combination of the above items as per their requirements. After collecting the data from all transaction it was found that i5, i6, i7 is the most frequently bought itemset. While i2, i4, i5 is bought least frequently. What can you say about the purchase behavior of i5i6 itemset and i2i4i5i7 itemset?
- What do you think is the primary role of a data mining engine in any data mining system?
- I give you the below ways that explain working of any typical data mining process?
i) Perform ETL onto the data warehouse system. ii) Provide data access to analysts iii) Store and manage the data in a multidimensional DB system. iv)Analysis and Presentation of data
Can you rearrange these steps for a true data mining process?
- Consider data mining techniques like Clustering, Associations, Link analysis, and Deviation detection, which one would you use for finding a relationship between two entities?
- Suppose you're asked to reduce an existing data model using IBM DataStage ETL (Assume you are not using Parallel Extender). Can you do this as percentage wise reductions on the rows in the data source? If yes, how? If no, what is the workaround?
- Among Hash file and Sequential file, which file is faster to search data for and why?
- Why is the sparse lookup in DataStage faster than Normal lookup?
- You have two-time data type field values in a DataStage transformer. In one record the value is 00:59:59 while in the other record the value is 00:10:10. If you want to add these two values so as to get the output as 01:09:09, how would you do that?
- Can you explain me about merge stage in DataStage in the context of jobs? Also, explain how is merge performed and what is mandatory when the merge is performed on a key field?
Use this list of ETL Interview Questions for interview preparation.