Microsoft Fabric Runtime: An Azure-integrated platform based on Apache Spark.
Combines internal and open-source components to enable data engineering and data science experiences.
Often referred to simply as Fabric Runtime.
Apache Spark: An open-source distributed computing library designed for large-scale data processing and analytics.
Delta Lake: An open-source storage layer that adds ACID transactions and data reliability features to Apache Spark.
Native Execution Engine: A vectorized engine that runs Spark queries directly on lakehouse infrastructure, with query speeds up to 4x faster than traditional open-source (OSS) Spark.
Works with both Parquet and Delta formats and is compatible with Runtime 1.3.
Built on two open-source components: Velox, introduced by Meta, and Apache Gluten, introduced by Intel.
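As a rough illustration, the engine is typically switched on through a Spark property at session start. The sketch below assumes the property name spark.native.enabled and builds the payload a notebook %%configure cell could carry. The helper name is hypothetical, and the property name should be checked against the current Fabric documentation.

```python
import json

# Assumed property name for the Native Execution Engine; verify it against
# the Fabric documentation for your runtime before relying on it.
NATIVE_ENGINE_PROPERTY = "spark.native.enabled"


def native_engine_conf(enabled: bool = True) -> dict:
    """Build a session configuration payload (hypothetical helper)."""
    return {"conf": {NATIVE_ENGINE_PROPERTY: "true" if enabled else "false"}}


if __name__ == "__main__":
    # Print the JSON body that could be pasted into a %%configure cell.
    print(json.dumps(native_engine_conf(), indent=2))
```

Keeping the property name in one constant makes it easy to correct if the documented name differs.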
Default-level Packages: Includes Java/Scala, Python, and R packages that are pre-installed for ease of use.
Always use the most recent General Availability (GA) runtime version.
| Version | Apache Spark | Operating System | Java | Scala | Python | Delta Lake | R |
|---|---|---|---|---|---|---|---|
| Runtime 1.1 | 3.3.1 | Ubuntu 18.04 | 8 | 2.12.15 | 3.10 | 2.2.0 | 4.2.2 |
| Runtime 1.2 | 3.4.1 | Mariner 2.0 | 11 | 2.12.17 | 3.10 | 2.4.0 | 4.2.2 |
| Runtime 1.3 | 3.5.0 | Mariner 2.0 | 11 | 2.12.17 | 3.11 | 3.2 | 4.4.1 |
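The version table can be encoded as data so that a notebook can work out which runtime it is on and which components change during a migration. This is a minimal sketch: the names RuntimeSpec, runtime_for_spark, and changed_components are hypothetical, and the values are copied from the table above.

```python
from typing import NamedTuple, Optional


class RuntimeSpec(NamedTuple):
    spark: str
    os: str
    java: str
    scala: str
    python: str
    delta: str
    r: str


# Values copied from the version table; update them if the table changes.
RUNTIMES = {
    "1.1": RuntimeSpec("3.3.1", "Ubuntu 18.04", "8", "2.12.15", "3.10", "2.2.0", "4.2.2"),
    "1.2": RuntimeSpec("3.4.1", "Mariner 2.0", "11", "2.12.17", "3.10", "2.4.0", "4.2.2"),
    "1.3": RuntimeSpec("3.5.0", "Mariner 2.0", "11", "2.12.17", "3.11", "3.2", "4.4.1"),
}


def runtime_for_spark(spark_version: str) -> Optional[str]:
    """Return the runtime whose Spark major.minor matches, or None."""
    wanted = spark_version.split(".")[:2]
    for name, spec in RUNTIMES.items():
        if spec.spark.split(".")[:2] == wanted:
            return name
    return None


def changed_components(old: str, new: str) -> dict:
    """Map each component that differs to its (old, new) version pair."""
    a, b = RUNTIMES[old], RUNTIMES[new]
    return {
        field: (getattr(a, field), getattr(b, field))
        for field in RuntimeSpec._fields
        if getattr(a, field) != getattr(b, field)
    }
```

In a Spark session, runtime_for_spark(spark.version) would identify the runtime. changed_components("1.2", "1.3") shows that Python and R both change between those two runtimes, which matters when judging whether existing libraries will keep working.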
See the documentation page for each runtime for detailed features and migration scenarios.
Incorporates optimizations specifically for Spark and Delta Lake, designed for native integration within Fabric.
Nearly 100 built-in query performance enhancements.
Optimized writing processes for better performance.
Default V-Order optimization for Delta Parquet files enhances read performance.
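Fabric supports multiple runtimes, and users can switch between them while minimizing the risk of incompatibilities or disruptions.
Changing the workspace runtime version applies to all system-created items in the workspace, such as lakehouses, Spark job definitions, and notebooks, starting from the next Spark session.
When the runtime changes, Fabric aims to migrate all Spark settings; a setting that is incompatible with the target runtime triggers a warning and is not applied.
Spark configuration settings fall into two groups, mutable and immutable, and the two groups are handled differently.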
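The migration behavior for Spark settings can be pictured with a small sketch: compatible settings carry over, and incompatible ones are skipped with a warning. This only illustrates the behavior described above and is not Fabric's implementation. The function name and the set of incompatible keys are hypothetical.

```python
import warnings


def migrate_settings(settings: dict, incompatible: set) -> dict:
    """Copy settings to the target runtime, skipping incompatible keys.

    `incompatible` is a hypothetical set of keys the target runtime rejects.
    """
    migrated = {}
    for key, value in settings.items():
        if key in incompatible:
            # Warn and leave the setting out instead of applying it.
            warnings.warn(f"Spark setting not compatible with target runtime, skipped: {key}")
            continue
        migrated[key] = value
    return migrated
```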
Python and R libraries generally keep working after a runtime change as long as the Python and R versions are unchanged.
JAR files may face compatibility issues because their dependencies change between runtimes; users are responsible for resolving conflicts with their own libraries.
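Delta Lake features are backward compatible: tables created with a lower Delta Lake version work with higher versions. Forward compatibility with lower versions may be compromised once certain features are enabled.
Use the delta.upgradeTableProtocol(minReaderVersion, minWriterVersion) method with caution: after a table's protocol is upgraded, lower Delta Lake versions may no longer be able to read or write it.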
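One cautious pattern is to check a table's current protocol first and upgrade only when the requested versions are actually higher. The sketch below assumes the delta-spark Python package (DeltaTable.forPath, detail, upgradeTableProtocol) and an existing Spark session. The helper names and the table path argument are hypothetical.

```python
def needs_upgrade(current: tuple, target: tuple) -> bool:
    """True if the target (reader, writer) raises either protocol version."""
    return target[0] > current[0] or target[1] > current[1]


def upgrade_protocol(spark, table_path: str, min_reader: int, min_writer: int) -> bool:
    """Upgrade a Delta table's protocol only when required.

    Returns True if an upgrade was performed.
    """
    # Imported lazily so the pure helper above works without delta-spark.
    from delta.tables import DeltaTable

    table = DeltaTable.forPath(spark, table_path)
    row = table.detail().select("minReaderVersion", "minWriterVersion").first()
    current = (row["minReaderVersion"], row["minWriterVersion"])
    if not needs_upgrade(current, (min_reader, min_writer)):
        return False
    # Never request a version lower than the one already in place.
    reader = max(current[0], min_reader)
    writer = max(current[1], min_writer)
    # Irreversible: older Delta Lake clients may lose access afterwards.
    table.upgradeTableProtocol(reader, writer)
    return True
```

Separating the version comparison from the Spark call keeps the decision easy to test and makes the irreversible step explicit.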
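In Runtime 1.3, the default table format (spark.sql.sources.default) is Delta rather than Parquet, which affects Spark commands that create tables without naming a format.
Scripts that assume Parquet as the default should be revised, for example by specifying the format explicitly.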
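Naming the format explicitly removes the dependence on the runtime's default. The following is a minimal sketch of a helper that always emits a USING clause. The function name is hypothetical, and identifiers are not escaped, so it suits trusted input only.

```python
def create_table_sql(name: str, columns: dict, fmt: str = "delta") -> str:
    """Build a CREATE TABLE statement with an explicit USING clause."""
    allowed = {"delta", "parquet"}
    if fmt.lower() not in allowed:
        raise ValueError(f"unsupported format: {fmt!r}")
    cols = ", ".join(f"{col} {dtype}" for col, dtype in columns.items())
    return f"CREATE TABLE {name} ({cols}) USING {fmt.upper()}"


if __name__ == "__main__":
    # A script that relied on the old Parquet default can state it directly.
    print(create_table_sql("events", {"id": "INT", "ts": "TIMESTAMP"}, "parquet"))
```

The resulting string can be passed to spark.sql(...). For DataFrame writes, the equivalent is to call .format("parquet") or .format("delta") before saveAsTable instead of relying on the default.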
Provide product feedback and engage with community for further questions.
Apache Spark typically releases minor versions every 6 to 9 months.
The Microsoft Fabric Spark team aims to deliver new runtime versions rapidly.
Each runtime has distinct support phases including Experimental, Public Preview, GA, LTS, and End of Support.
Once a runtime's end-of-support date arrives, the runtime is removed and will not receive any updates.
The runtime major version corresponds to the Apache Spark major version; for example, Runtime 1.x is built on Spark 3.x.
The latest GA version, Runtime 1.3, introduces Apache Spark 3.5 with numerous performance enhancements.
Spark 3.5 also brings compatibility upgrades, new features for Structured Streaming, and expanded functionality in PySpark.
Improvements for Delta Lake 3.2 focused on performance and ease of use.
Receives ongoing maintenance updates, including performance improvements, patches, and stability fixes.
New algorithms minimize data loss issues common with parallel insert operations.
Overview of Delta Lake capabilities reinforcing its use in lakehouse architecture.
Related documentation covers advanced settings, REST API usage, Git integration, Spark job definitions, and administration in Microsoft Fabric.