Apache Spark
An open-source distributed computing engine for large-scale data processing and analytics.
Delta Lake
An open-source storage layer that brings ACID transactions to Apache Spark, improving data reliability in data lakes.
Native Execution Engine
A vectorized engine for Apache Spark workloads that runs Spark queries in native code rather than on the JVM, operating directly on lakehouse data to improve performance.
Fabric Runtime
The Azure-integrated runtime in Microsoft Fabric, based on Apache Spark, that underpins data engineering and data science experiences.
Vendor Lock-in
A situation where customers cannot easily switch suppliers due to compatibility or cost considerations.
ETL
Extract, Transform, Load; a process of moving data from one system to another after transforming it into a standard format.
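The extract, transform, load pattern can be sketched in plain Python (no Spark) with a hypothetical CSV source and JSON-lines target; the field names here are illustrative only:

```python
# Minimal ETL sketch: extract raw rows, normalize them into a standard
# format, then load (serialize) them for the target system.
import csv
import io
import json

def extract(csv_text):
    """Extract: parse raw CSV rows from the source system."""
    return list(csv.DictReader(io.StringIO(csv_text)))

def transform(rows):
    """Transform: normalize field names and types into a standard format."""
    return [
        {"name": r["Name"].strip().title(), "amount": float(r["Amount"])}
        for r in rows
    ]

def load(records):
    """Load: serialize to the target format (here, JSON lines)."""
    return "\n".join(json.dumps(r) for r in records)

raw = "Name,Amount\n alice ,10.5\n BOB ,2\n"
print(load(transform(extract(raw))))
```

Each stage is a separate function so the transform step can be tested independently of the source and target systems.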
ACID Transactions
A set of properties (Atomicity, Consistency, Isolation, Durability) that guarantee database transactions are processed reliably.
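Atomicity, the "all or nothing" property, can be demonstrated with Python's built-in `sqlite3` module: a two-step transfer either commits as a whole or leaves the data untouched when one step fails.

```python
# Atomicity demo with sqlite3: both account updates commit together,
# or a failed transfer leaves both balances unchanged.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()

def transfer(amount):
    try:
        with conn:  # transaction: commits on success, rolls back on error
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = 'alice'",
                (amount,),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = 'bob'",
                (amount,),
            )
    except sqlite3.IntegrityError:
        pass  # transfer rejected as a unit; no partial update survives

transfer(500)  # violates the CHECK constraint, so the transaction rolls back
print(dict(conn.execute("SELECT name, balance FROM accounts")))
```

The same guarantee is what Delta Lake adds on top of plain Parquet files: concurrent writers see either all of a commit or none of it.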
Parquet
A columnar storage file format optimized for use with big data processing frameworks.
TPC-DS Benchmark
A benchmark designed to measure the performance of decision support systems.
Data Lakehouse
A modern architectural pattern that combines the best features of data lakes and data warehouses.
Auto-scaling
The capability of a system to automatically adjust its resource allocation based on the current workload.
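A toy version of an auto-scaling decision rule can make the idea concrete. The thresholds and function below are hypothetical, not a real Spark or Fabric API: the pool is sized to current demand, clamped between a configured minimum and maximum.

```python
# Hypothetical auto-scaling rule: size the pool to pending work,
# clamped to the pool's configured min/max node counts.
def desired_nodes(pending_tasks, tasks_per_node=4, min_nodes=1, max_nodes=10):
    demand = -(-pending_tasks // tasks_per_node)  # ceiling division
    return max(min_nodes, min(max_nodes, demand))

print(desired_nodes(37))  # heavy load: scale up toward the maximum
print(desired_nodes(3))   # light load: scale back down to the minimum
```

Real auto-scalers add smoothing (cool-down periods, scale-up faster than scale-down) so the cluster does not thrash between sizes.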
Job Admission Logic
Rules that govern how jobs are queued and executed within a computing environment.
Interactive Queries
Queries that allow users to interact with data in real time, often used for data exploration.
Runtime Version
The specific version of software that executes a program, which may affect compatibility and performance.
High concurrency mode
A feature that lets multiple users or notebooks share the same Spark sessions and compute resources, reducing session start-up time and improving utilization.
Structured Streaming
A stream processing engine built on the Spark SQL engine that enables scalable and fault-tolerant stream processing.
Custom Spark Pools
User-defined configurations for clusters in Apache Spark environments to manage specific workloads.
Migration Scenarios
Procedures and considerations for transitioning from one version of software to another.
Lakehouse Architecture
Combines data lakes and data warehouses into a unified architecture that supports both storage and analytics.
Open-source
Software with source code that is made available to the public for use, modification, and distribution.
Job Definition
A set of configurations and scripts that specify how a Spark job should be executed.
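A job definition typically bundles the entry-point script, its arguments, and the compute it targets. The sketch below is illustrative only; the field names are hypothetical and do not reproduce any exact platform schema:

```json
{
  "mainDefinitionFile": "jobs/daily_etl.py",
  "commandLineArguments": "--run-date 2024-01-01",
  "referenceFiles": ["libs/helpers.py"],
  "sparkPool": "default-pool",
  "sparkConfiguration": {
    "spark.sql.shuffle.partitions": "200"
  }
}
```

Keeping the definition declarative like this lets the same job be scheduled, retried, and versioned without changing the script itself.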
Library Management
The process of handling software libraries within a computing environment to ensure compatibility and availability.
Workspace Settings
Configuration options that control the parameters and resources available to a user within a software platform.
Experimental Preview
A phase of a software release where new features are tested by users before general availability.