What is a data warehouse?
A centralized repository used for reporting and analysis.
Define data ingestion.
The process of collecting data from various sources for storage or processing.
What is big data?
Large and complex data sets that require advanced analytics and storage solutions.
Define batch processing.
Processing large volumes of data at scheduled intervals.
Define stream processing.
Processing data continuously as it arrives, in real time or near real time.
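The contrast between batch and stream processing can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical list of sensor readings; the function names are made up for the example.

```python
from typing import Iterable, Iterator


def batch_total(readings: list[float]) -> float:
    """Batch: wait until the whole set is collected, then process it once."""
    return sum(readings)


def stream_totals(readings: Iterable[float]) -> Iterator[float]:
    """Stream: emit an updated running total as each reading arrives."""
    total = 0.0
    for value in readings:
        total += value
        yield total


readings = [2.0, 3.5, 1.5]  # hypothetical input data
batch_result = batch_total(readings)
stream_results = list(stream_totals(readings))
```

The batch function produces one answer at the end, while the stream function produces an answer after every new value.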
What is structured data?
Data organized in fixed fields such as tables or spreadsheets.
What is unstructured data?
Data without a predefined format such as images or videos.
What is semi-structured data?
Data with some structure but flexible schema, such as JSON or XML.
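As a small illustration of a flexible schema, the sketch below parses two JSON records that do not share the same fields. The records and field names are invented for the example.

```python
import json

# Two hypothetical records: the second has no "email" but adds "tags".
raw = (
    '[{"id": 1, "name": "Ana", "email": "ana@example.com"},'
    ' {"id": 2, "name": "Ben", "tags": ["vip"]}]'
)
records = json.loads(raw)

# Each record carries its own field names, so the set of fields can differ.
fields_per_record = [sorted(record.keys()) for record in records]

# Readers must tolerate missing fields; .get() returns None when one is absent.
emails = [record.get("email") for record in records]
```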
What is data visualization?
The graphical representation of data to find insights and patterns.
Define metadata.
Data that describes other data.
What is data redundancy?
Storing the same data in more than one place, either deliberately for reliability or as an unintended side effect of design.
What is a data model?
A design that defines data structure and relationships.
Define data governance.
The management of data availability, usability, and security.
What is a data pipeline?
A set of processes to move and transform data between systems.
What is the purpose of a data catalog?
To organize and manage metadata about data assets.
Define ETL.
Extract, Transform, Load — the process of moving and preparing data.
Define ELT.
Extract, Load, Transform — transformation happens after loading data into storage.
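The difference between ETL and ELT is mainly where the transformation runs. The sketch below is an illustration only, using Python's built-in sqlite3 module as a stand-in data store; the table names and sample rows are hypothetical.

```python
import sqlite3

source_rows = [("  Alice ", "42"), (" Bob", "17")]  # hypothetical raw extract

conn = sqlite3.connect(":memory:")

# ETL: transform in application code first, then load the cleaned rows.
conn.execute("CREATE TABLE etl_target (name TEXT, score INTEGER)")
cleaned = [(name.strip(), int(score)) for name, score in source_rows]
conn.executemany("INSERT INTO etl_target VALUES (?, ?)", cleaned)

# ELT: load the raw rows as-is, then transform inside the data store with SQL.
conn.execute("CREATE TABLE raw_stage (name TEXT, score TEXT)")
conn.executemany("INSERT INTO raw_stage VALUES (?, ?)", source_rows)
conn.execute(
    "CREATE TABLE elt_target AS "
    "SELECT TRIM(name) AS name, CAST(score AS INTEGER) AS score FROM raw_stage"
)

etl_rows = conn.execute("SELECT name, score FROM etl_target ORDER BY score").fetchall()
elt_rows = conn.execute("SELECT name, score FROM elt_target ORDER BY score").fetchall()
```

Both paths end with the same cleaned rows; ELT simply keeps the raw copy in storage and uses the store's own engine for the transformation.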
What is the benefit of using cloud data services?
Scalability, flexibility, and cost efficiency.
What is data latency?
The time delay between data creation and availability for use.
Define data analytics.
The process of examining datasets to draw conclusions.
What is a relational database?
A database that stores data in tables with predefined relationships.
What is SQL?
Structured Query Language used to manage and query relational databases.
What is Azure SQL Database?
A fully managed relational database service in Azure.
Define primary key.
A unique identifier for each record in a table.
Define foreign key.
A field in one table that references the primary key of another table.
What is normalization?
The process of reducing data redundancy and improving data integrity.
What is denormalization?
Deliberately combining tables or duplicating data to improve read performance.
What is a view?
A virtual table created from a query on one or more tables.
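A view stores the query, not the data, so it always reflects the current contents of the underlying tables. A minimal sketch, using SQLite as a stand-in engine and a hypothetical orders table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL);
    INSERT INTO orders VALUES (1, 'Ana', 25.0), (2, 'Ben', 40.0), (3, 'Ana', 15.0);
    CREATE VIEW customer_totals AS
        SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer;
    """
)

view_query = "SELECT customer, total FROM customer_totals ORDER BY customer"
before = conn.execute(view_query).fetchall()

# New rows in the base table show up in the view without any refresh step.
conn.execute("INSERT INTO orders VALUES (4, 'Ben', 10.0)")
after = conn.execute(view_query).fetchall()
```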
What is a stored procedure?
A saved set of SQL statements that can be executed repeatedly, often with parameters.
What is an index in SQL?
A structure that improves data retrieval speed.
What is Azure SQL Managed Instance?
A managed SQL Server instance with near 100% compatibility with the on-premises SQL Server engine.
What is Azure Database for PostgreSQL?
A managed PostgreSQL service on Azure.
What is Azure Database for MySQL?
A managed MySQL database service on Azure.
Define ACID properties.
Atomicity, Consistency, Isolation, Durability — ensure reliable transactions.
What is a transaction?
A unit of work performed against a database.
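Atomicity, the "A" in ACID, means a transaction either applies all of its changes or none of them. The sketch below illustrates this with SQLite and a hypothetical accounts table; the transfer function is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(amount: int) -> bool:
    """Move money from A to B as one unit of work."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = 'B'", (amount,))
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = 'A'", (amount,))
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint failed, so both updates were undone


ok_small = transfer(30)
ok_large = transfer(500)  # would overdraw A, so the whole transfer is rolled back
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

The failed transfer leaves no trace: the credit to B is undone together with the rejected debit from A.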
Define data integrity.
Maintaining accuracy and consistency of data.
What is a join in SQL?
A clause that combines rows from two or more tables based on a related column.
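A short inner join example may help. It uses SQLite as a stand-in engine, and the customers and orders tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 15.0), (12, 2, 40.0);
    """
)

# Match each order to its customer through the shared customer id.
joined = conn.execute(
    "SELECT c.name, o.amount "
    "FROM customers AS c "
    "INNER JOIN orders AS o ON o.customer_id = c.id "
    "ORDER BY o.id"
).fetchall()
```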
What is a clustered index?
An index that determines the physical order of data in a table.
What is a non-clustered index?
An index stored separately from the table data; it holds the indexed key values plus pointers to the matching rows.
Define referential integrity.
Ensures relationships between tables remain consistent.
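Primary keys, foreign keys, and referential integrity fit together as shown in this sketch. It assumes SQLite as a stand-in engine (where foreign key enforcement must be switched on explicitly) and uses hypothetical customers and orders tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, "
    "customer_id INTEGER NOT NULL REFERENCES customers(id))"
)
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ana')")
conn.execute("INSERT INTO orders (id, customer_id) VALUES (10, 1)")  # valid reference

try:
    # Customer 99 does not exist, so this row would break referential integrity.
    conn.execute("INSERT INTO orders (id, customer_id) VALUES (11, 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```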
What is a constraint?
A rule that defines valid data values in a table.
Define unique constraint.
Ensures all values in a column are different.
What is Azure Synapse SQL pool?
A distributed data warehouse in Azure Synapse Analytics.
What is T-SQL?
Microsoft’s proprietary extension of SQL for Azure SQL and SQL Server.
Define sharding.
Splitting a database into smaller parts for scalability.
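One common way to decide which shard holds a record is to hash a key and take the remainder. The sketch below shows that idea only; the shard count, key format, and function name are assumptions, and real systems often use more elaborate schemes such as range-based or consistent hashing.

```python
import hashlib

SHARD_COUNT = 4  # hypothetical number of shards


def shard_for(key: str, shard_count: int = SHARD_COUNT) -> int:
    """Map a key to a shard number using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Route a few hypothetical customer ids to their shards.
shards = {number: [] for number in range(SHARD_COUNT)}
customer_ids = ["c-001", "c-002", "c-003", "c-004", "c-005"]
for customer_id in customer_ids:
    shards[shard_for(customer_id)].append(customer_id)
```

Because the hash is stable, the same key always lands on the same shard, which is what lets a reader find the record again later.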
Define partitioning.
Dividing data into segments to improve performance and manageability.
What is data replication?
Copying data between systems to ensure availability and durability.
What is a trigger?
A SQL object that executes automatically in response to events.
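A typical use of a trigger is keeping an audit trail. This sketch uses SQLite trigger syntax, which differs in detail from T-SQL; the table and trigger names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE products (id INTEGER PRIMARY KEY, price REAL);
    CREATE TABLE price_audit (product_id INTEGER, old_price REAL, new_price REAL);

    -- Runs automatically after every price update; no caller has to invoke it.
    CREATE TRIGGER log_price_change AFTER UPDATE OF price ON products
    BEGIN
        INSERT INTO price_audit VALUES (OLD.id, OLD.price, NEW.price);
    END;

    INSERT INTO products VALUES (1, 9.5);
    """
)

conn.execute("UPDATE products SET price = 12.0 WHERE id = 1")
audit_rows = conn.execute("SELECT * FROM price_audit").fetchall()
```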
What is a function in SQL?
A stored program that returns a value.
What is a scalar function?
A function that returns a single value.
What is a table-valued function?
A function that returns a table.
What is data retention?
How long data is stored before being deleted or archived.
What is SQL elasticity?
Automatic scaling of resources based on workload.
What is an execution plan?
A detailed breakdown of the steps the database engine takes to execute a query.
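To see how an index changes an execution plan, the sketch below asks SQLite for its plan before and after creating an index. SQLite's EXPLAIN QUERY PLAN output is much simpler than a SQL Server plan and its wording varies by version, so treat this as an illustration of the idea; the table and index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
query = "SELECT amount FROM orders WHERE customer_id = 7"


def plan(sql: str) -> str:
    """Return the engine's plan description for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the detail text


plan_before = plan(query)  # without an index the engine scans the whole table
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(query)   # with the index the engine can search by customer_id
```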
What is Azure SQL Edge?
A SQL database engine optimized for IoT and edge devices.
What is a failover group in Azure SQL?
A mechanism for automatic database failover across regions.
Define connection pooling.
Reusing existing database connections to improve performance.
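The idea behind connection pooling can be sketched with a queue of pre-opened connections. This is a simplified illustration, not how any particular driver implements it; the ConnectionPool class is hypothetical and SQLite stands in for a real server.

```python
import queue
import sqlite3


class ConnectionPool:
    """Hand out already-open connections instead of opening a new one per request."""

    def __init__(self, size: int, database: str = ":memory:") -> None:
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(sqlite3.connect(database, check_same_thread=False))

    def acquire(self) -> sqlite3.Connection:
        return self._idle.get()  # blocks if every connection is in use

    def release(self, conn: sqlite3.Connection) -> None:
        self._idle.put(conn)  # return the connection so others can reuse it


pool = ConnectionPool(size=2)
first = pool.acquire()
value = first.execute("SELECT 1 + 1").fetchone()[0]
pool.release(first)

second = pool.acquire()
third = pool.acquire()
reused = first is second or first is third  # the released connection comes back
```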
What is temporal data?
Data that tracks historical changes automatically.
What is PolyBase?
A technology in SQL Server and Azure Synapse that allows querying external data using T-SQL.
What is query optimization?
The process of improving query performance.
What is NoSQL?
A type of database designed for unstructured or semi-structured data.
What is Azure Cosmos DB?
A globally distributed multi-model NoSQL database service.
What are Cosmos DB APIs?
Interfaces such as NoSQL (formerly called SQL or Core), MongoDB, Cassandra, Gremlin, and Table.
What is partition key in Cosmos DB?
A field that determines data distribution across partitions.
Define throughput in Cosmos DB.
The capacity provisioned for database operations, measured in Request Units per second (RU/s).
What is consistency level?
Defines how up-to-date or synchronized replicas are in a database.
Name the five consistency levels in Cosmos DB.
Strong, Bounded Staleness, Session, Consistent Prefix, Eventual.
What is eventual consistency?
Updates are propagated gradually and may be temporarily inconsistent.
What is strong consistency?
Ensures all reads return the most recent committed version.
What is Azure Table Storage?
A NoSQL key-value store for semi-structured data.
What is a document database?
A NoSQL database that stores data in JSON-like documents.
What is a key-value store?
A database where data is stored as key and value pairs.
What is a column-family store?
A database that groups related columns into column families, so each row can hold a different set of columns.
What is a graph database?
A NoSQL database optimized for relationships and networks.
What is Azure Blob Storage used for?
Storing unstructured data like text, images, and video.
Define container in Blob Storage.
A logical grouping of blobs, like a folder.
What is object storage?
A storage model that manages data as objects.
What is Azure Data Lake Storage Gen2?
A scalable storage service built on Azure Blob Storage and optimized for big data analytics.
What is hierarchical namespace?
Organizes data in folders and directories for efficient access.
What is HDInsight?
A managed Apache Hadoop and Spark service for big data analytics.
What is Azure Databricks?
A fast, collaborative Apache Spark-based analytics platform.
What is an RDD?
Resilient Distributed Dataset, a core concept in Spark.
What is a dataframe in Databricks?
A distributed collection of data organized into named columns.
Define Delta Lake.
An open-source storage layer that brings ACID transactions to data lakes.
What is data sharding in NoSQL?
Splitting data across multiple servers for scalability.
What is data replication in Cosmos DB?
Copying data across regions for fault tolerance.
What is autoscale in Cosmos DB?
Automatically adjusts throughput based on workload.
Define change feed in Cosmos DB.
A persistent, ordered record of changes made to items in a container.
What is a throughput unit in Cosmos DB?
A measure of performance in Request Units per second (RU/s).
What is geo-replication?
Distributing data across regions for global availability.
What is TTL in Cosmos DB?
Time to Live — automatically deletes items after a set time.
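The expiry behaviour behind TTL can be sketched with a small in-memory store. Cosmos DB applies TTL on the server side; this hypothetical TtlStore class only illustrates the concept and is not the Cosmos DB API. Time is passed in explicitly so the example is deterministic.

```python
class TtlStore:
    """Toy key-value store whose items expire a fixed number of seconds after writing."""

    def __init__(self, ttl_seconds: int) -> None:
        self.ttl_seconds = ttl_seconds
        self._items = {}  # key -> (value, time of last write)

    def put(self, key: str, value: str, now: int) -> None:
        self._items[key] = (value, now)

    def get(self, key: str, now: int):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, written_at = entry
        if now - written_at >= self.ttl_seconds:
            del self._items[key]  # the item has outlived its TTL, so remove it
            return None
        return value


store = TtlStore(ttl_seconds=60)
store.put("session-1", "active", now=0)
fresh = store.get("session-1", now=30)    # still inside the 60-second window
expired = store.get("session-1", now=61)  # past the window, so the item is gone
```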
What is an analytical store in Cosmos DB?
A columnar store for running analytics without impacting OLTP workloads.
What is Azure Queue Storage?
A service for storing and retrieving messages asynchronously.
What is a blob tier?
Storage access levels: hot, cool, cold, and archive.
What is Azure File Storage?
A fully managed file share in the cloud.
What is Azure Table API in Cosmos DB?
Lets applications written for Azure Table Storage use Cosmos DB as the backend.
What is JSON used for?
A lightweight format for storing and exchanging data.
What is BSON?
A binary JSON format used by MongoDB.
What is eventual write consistency?
Writes become consistent after some time delay.
What is CAP theorem?
States that a distributed system can guarantee at most two of Consistency, Availability, and Partition tolerance at the same time.
What is Azure Synapse Analytics?
An integrated analytics service combining big data and data warehousing.
What is Azure Data Factory?
A cloud-based ETL service to orchestrate and automate data workflows.
What is Power BI?
A business analytics tool for visualizing and sharing insights.
What is Azure Machine Learning?
A cloud service to build, train, and deploy machine learning models.
Define data transformation.
The process of converting data into a usable format.
What is data wrangling?
Cleaning and preparing data for analysis.
What is Azure Analysis Services?
A fully managed platform for building semantic data models.
What is a data lake?
A centralized repository for storing raw data of all types.
What is the difference between data lake and data warehouse?
Data lake stores raw data; data warehouse stores structured, processed data.
What is OLTP?
Online Transaction Processing — optimized for managing transactions.
What is OLAP?
Online Analytical Processing — optimized for querying and reporting.
What is Azure Data Explorer?
A fast and highly scalable data exploration service.
Define data retention policy.
Rules that define how long data is stored.
What is a data pipeline trigger?
An event that starts a data pipeline in Azure Data Factory.
What is linked service in ADF?
Connection information for data sources.
What is a dataset in ADF?
A named reference to the data to be used (such as a table or file) within a linked service.
What is an activity in ADF?
A processing step within a data pipeline.
Define integration runtime.
A compute infrastructure used by ADF to move and transform data.
What is mapping data flow?
A visual way to design and run data transformations in ADF.
What is Azure Monitor?
A service to collect and analyze performance data from Azure resources.
What is Log Analytics?
A tool to query and analyze log data collected by Azure Monitor.
What is Azure Purview?
A unified data governance and catalog service, now part of Microsoft Purview.
What is Power BI Desktop?
A Windows application for creating Power BI reports.
What is Power BI Service?
An online SaaS platform to share and collaborate on Power BI reports.
What is Power BI Gateway?
A tool to connect on-premises data to Power BI cloud services.
What is DAX in Power BI?
Data Analysis Expressions, a formula language for calculations in Power BI models.
What is a measure in Power BI?
A calculation used for analysis in reports.
What is a data refresh in Power BI?
The process of updating data from source systems.
What is Azure Event Hubs?
A data streaming platform for ingesting and processing event data.
What is Azure Stream Analytics?
A service for analyzing streaming data in real-time.
What is Power Query?
A data connection technology for discovering, connecting, and combining data.
What is a workspace in Power BI?
A shared environment to collaborate on reports and dashboards.
What is row-level security in Power BI?
Restricting data access for users at the row level.
What is incremental refresh in Power BI?
Updating only changed data instead of reloading all data.
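The general idea of refreshing only changed data can be sketched with a "watermark" that remembers the latest change already loaded. Power BI itself handles this through date-partitioned tables, so the code below is a generic illustration; the row layout and function name are hypothetical.

```python
from datetime import datetime

# Hypothetical source rows, each stamped with its last modification time.
source_rows = [
    {"id": 1, "value": "a", "modified": datetime(2024, 1, 1)},
    {"id": 2, "value": "b", "modified": datetime(2024, 1, 5)},
    {"id": 3, "value": "c", "modified": datetime(2024, 1, 9)},
]


def incremental_refresh(rows, target, watermark):
    """Copy only rows changed after the watermark; return (count, new watermark)."""
    changed = [row for row in rows if row["modified"] > watermark]
    for row in changed:
        target[row["id"]] = row
    new_watermark = max((row["modified"] for row in changed), default=watermark)
    return len(changed), new_watermark


target = {1: source_rows[0]}  # row 1 was loaded by an earlier refresh
loaded, watermark = incremental_refresh(source_rows, target, datetime(2024, 1, 1))
loaded_again, watermark = incremental_refresh(source_rows, target, watermark)
```

The second call finds nothing newer than the watermark, so no rows are reloaded.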
Define composite model.
A Power BI model that combines data from multiple sources or storage modes, such as Import and DirectQuery.
What is Azure Data Share?
A service to securely share data between organizations.
What is Azure Storage Explorer?
A tool to manage Azure Storage data from desktop.
Define pipeline monitoring in ADF.
Tracking the execution and performance of data pipelines.
What is data lineage?
Tracking the flow of data from source to destination.
What is Azure Monitor Logs?
A centralized log collection and analysis platform.
What is serverless SQL pool?
An on-demand query service in Azure Synapse Analytics.
What is dedicated SQL pool?
A provisioned data warehouse for high-performance analytics.
What is Azure Logic Apps?
A service for automating workflows and integrations.
What is Power Automate?
A tool for automating repetitive business tasks.
What is Azure Advisor?
A service that provides best practice recommendations for Azure resources.
Define resource group.
A logical container for Azure resources.
What is Azure Key Vault?
A cloud service to securely store secrets, keys, and certificates.
What is Azure Data Catalog?
A service to register, discover, and manage data assets.
What is Azure Cost Management?
A tool to monitor and control Azure spending.
What is an Azure region?
A geographical area containing Azure datacenters.
What is Azure availability zone?
Physically separate datacenters within a region for resilience.
