1.1.3 Environment-Level Spark Properties

Last updated 4:34 PM on 3/13/26
10 Terms

1. What Spark properties can be configured at the Environment level?

spark.executor.memory, spark.executor.cores, spark.driver.memory, spark.sql.shuffle.partitions, spark.sql.adaptive.enabled, and other Spark configuration keys.

2. How do you set spark.executor.memory in Fabric?

In the Environment's Spark Properties section, add the property key spark.executor.memory with the desired value (e.g., 8g), or use %%configure in a notebook.

3. What is the %%configure magic command?

A notebook magic command that sets Spark session configuration before the session starts. It overrides Environment-level Spark properties for that session only.
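As an illustration, %%configure typically takes a JSON body; this is a sketch of the common shape used in Fabric/Synapse notebooks, and the exact set of supported keys may vary by runtime, so check the current documentation before relying on any particular key:

```
%%configure -f
{
    "driverMemory": "8g",
    "executorMemory": "8g",
    "executorCores": 4,
    "conf": {
        "spark.sql.shuffle.partitions": "64"
    }
}
```

Running this in the first cell applies the settings when the Spark session starts; the -f flag forces a session restart if one is already running.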

4. What is the default number of shuffle partitions in Spark?

200 (spark.sql.shuffle.partitions). This can be tuned based on data volume: fewer partitions for small data, more for large data.
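A common rule of thumb (not an official Spark formula) is to size shuffle partitions so each holds roughly 128 MB. A minimal sketch of that heuristic:

```python
def suggest_shuffle_partitions(shuffle_bytes: int,
                               target_bytes: int = 128 * 1024 * 1024,
                               minimum: int = 1) -> int:
    """Suggest a spark.sql.shuffle.partitions value that keeps each
    partition near target_bytes of shuffle data (128 MB by default)."""
    return max(minimum, -(-shuffle_bytes // target_bytes))  # ceiling division

# A 10 GB shuffle suggests ~80 partitions instead of the default 200.
print(suggest_shuffle_partitions(10 * 1024**3))  # → 80
```

The suggested value would then be set via the Environment's Spark properties or %%configure; the function name and 128 MB target here are illustrative assumptions.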

5. How does spark.executor.cores affect Spark performance?

It determines how many tasks each executor runs in parallel. More cores per executor means more parallelism but requires more memory per executor.
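The memory trade-off above is simple arithmetic: concurrent tasks share the executor's memory, so more cores per executor leaves less memory per task slot. A back-of-envelope check (function name is illustrative):

```python
def memory_per_task_gb(executor_memory_gb: float, executor_cores: int) -> float:
    """Approximate memory available to each concurrent task, since tasks
    running in parallel on one executor share its memory."""
    return executor_memory_gb / executor_cores

print(memory_per_task_gb(8, 2))  # → 4.0 GB per concurrent task
print(memory_per_task_gb(8, 8))  # → 1.0 GB per concurrent task
```

In practice Spark reserves part of executor memory for overhead and caching, so the real per-task share is smaller than this naive division suggests.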

6. What is Adaptive Query Execution (AQE) in Spark?

A Spark feature (spark.sql.adaptive.enabled) that dynamically optimizes query plans at runtime, including coalescing shuffle partitions and handling data skew.
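For reference, AQE and its main sub-features are controlled by the following Spark SQL properties (real Spark 3.x config keys; availability of individual sub-features depends on the Spark runtime version). A fragment assuming an existing `spark` session:

```python
# Enable AQE and its common sub-features on an existing SparkSession.
spark.conf.set("spark.sql.adaptive.enabled", "true")
# Merge small shuffle partitions at runtime.
spark.conf.set("spark.sql.adaptive.coalescePartitions.enabled", "true")
# Split skewed partitions during joins.
spark.conf.set("spark.sql.adaptive.skewJoin.enabled", "true")
```

The same keys can be set once at the Environment level instead of per session.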

7. How do Environment Spark properties interact with workspace-level settings?

Environment-level settings override workspace defaults. The precedence is: session (%%configure) > Environment > workspace defaults.
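The precedence chain behaves like layered dictionary merges, where later layers win. A toy model of the resolution (the dict names are illustrative, not a Fabric API):

```python
# Lowest to highest precedence: workspace defaults < Environment < session.
workspace_defaults = {"spark.sql.shuffle.partitions": "200",
                      "spark.executor.memory": "4g"}
environment_props  = {"spark.executor.memory": "8g"}
session_overrides  = {"spark.sql.shuffle.partitions": "64"}  # %%configure

# Later dicts override earlier ones for any shared key.
effective = {**workspace_defaults, **environment_props, **session_overrides}
print(effective)
# → {'spark.sql.shuffle.partitions': '64', 'spark.executor.memory': '8g'}
```

Each key resolves independently: the session wins on shuffle partitions, the Environment wins on executor memory, and any key set nowhere else falls back to the workspace default.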

8. What is spark.driver.memory and why does it matter?

The memory allocated to the Spark driver process, which coordinates job execution. Too little can cause OutOfMemoryError failures when collecting large results to the driver or broadcasting data.

9. Can you set Spark properties dynamically based on the workload?

Yes: use %%configure in notebooks, or parameterize pipeline notebook activities to pass different Spark configuration values per run.

10. What does spark.sql.files.maxPartitionBytes control?

The maximum size of a partition when reading files; the default is 128 MB. Tuning it affects the number of tasks created when reading large files.
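A rough estimate of how many input partitions (and hence tasks) a splittable file produces is ceil(file_size / maxPartitionBytes). Spark's actual planner also factors in spark.sql.files.openCostInBytes and the default parallelism, so treat this sketch as an approximation only:

```python
def approx_read_partitions(file_size_bytes: int,
                           max_partition_bytes: int = 128 * 1024 * 1024) -> int:
    """Approximate input partitions for one splittable file: roughly
    ceil(size / spark.sql.files.maxPartitionBytes), minimum one."""
    return max(1, -(-file_size_bytes // max_partition_bytes))  # ceiling division

print(approx_read_partitions(1 * 1024**3))  # 1 GB file → 8 partitions
```

Raising maxPartitionBytes yields fewer, larger tasks; lowering it yields more, smaller tasks, which can help parallelize reads of a few large files.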
