`docs/data-engineering/create-custom-spark-pools.md` (18 additions & 11 deletions)
ms.author: eur
author: eric-urban
ms.topic: how-to
ms.custom:
ms.date: 09/22/2025
---

# How to create custom Spark pools in Microsoft Fabric

This article shows you how to create custom Apache Spark pools in Microsoft Fabric for your analytics workloads. Apache Spark pools let you create tailored compute environments based on your requirements, so you get optimal performance and resource use.

Specify the minimum and maximum nodes for autoscaling. The system acquires and retires nodes as your job's compute needs change, so scaling is efficient and performance improves. Spark pools also adjust the number of executors automatically, so you don't need to set them manually. The system changes executor counts based on data volume and job compute needs, so you can focus on your workloads instead of performance tuning and resource management.
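
If you prefer to script pool creation instead of using the portal, the following Python sketch shows how the same autoscale and dynamic executor allocation settings could be expressed against the Fabric workspace custom pools REST endpoint. Treat the endpoint path, payload shape, pool name, and token handling as assumptions to verify against the current Fabric REST API reference.

```python
# Sketch: create a workspace custom Spark pool with autoscaling and dynamic
# executor allocation through the Fabric REST API. Verify the endpoint and
# body shape against the current REST reference before relying on this.
import requests

workspace_id = "<your-workspace-id>"        # placeholder
access_token = "<bearer-token-for-fabric>"  # placeholder, e.g., from azure.identity

url = f"https://api.fabric.microsoft.com/v1/workspaces/{workspace_id}/spark/pools"

pool_definition = {
    "name": "AnalyticsPool",          # hypothetical pool name
    "nodeFamily": "MemoryOptimized",
    "nodeSize": "Medium",             # 8 CU, 64 GB per node (see the node size table later in this article)
    "autoScale": {
        "enabled": True,
        "minNodeCount": 1,            # minimum nodes kept for the pool
        "maxNodeCount": 10,           # upper bound the system can scale out to
    },
    "dynamicExecutorAllocation": {
        "enabled": True,              # let the system adjust executor counts
        "minExecutors": 1,
        "maxExecutors": 9,
    },
}

response = requests.post(
    url,
    headers={"Authorization": f"Bearer {access_token}"},
    json=pool_definition,
    timeout=30,
)
response.raise_for_status()
print(response.status_code)
```

Here `minNodeCount` and `maxNodeCount` correspond to the autoscale range described above, and `dynamicExecutorAllocation` corresponds to the automatic executor adjustment.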

> [!TIP]
> When you configure Spark pools, node size is determined by **Capacity Units (CU)**, which represent the compute capacity assigned to each node. For more information about node sizes and CU, see the [Node size options](#node-size-options) section in this guide.
## Prerequisites

To create a custom Spark pool, make sure you have admin access to the workspace. The capacity admin must enable the **Customized workspace pools** option in the **Spark Compute** section of **Capacity Admin settings**. For more information, see [Spark Compute Settings for Fabric Capacities](capacity-settings-management.md).
## Create custom Spark pools

These custom pools have a default autopause duration of 2 minutes.
## Node size options

When you set up a custom Spark pool, you choose from the following node sizes:

| Node size | Capacity units (CU) | Memory (GB) | Description |
|-----------|---------------------|-------------|-------------|
| Small | 4 | 32 | For lightweight development and testing jobs. |
| Medium | 8 | 64 | For general workloads and typical operations. |
| Large | 16 | 128 | For memory-intensive tasks or large data processing jobs. |
| X-Large | 32 | 256 | For the most demanding Spark workloads that need significant resources. |

> [!NOTE]
> A capacity unit (CU) in Microsoft Fabric Spark pools represents the compute capacity assigned to each node, not the actual consumption. Capacity units differ from VCore (Virtual Core), which is used in SQL-based Azure resources. CU is the standard term for Spark pools in Fabric, while VCore is more common for SQL pools. When sizing nodes, use CU to determine the assigned capacity for your Spark workloads.
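
As a quick worked example (not from the article itself), the capacity assigned to a pool is the per-node CU multiplied by the number of nodes the pool can scale to, so a Medium pool with an autoscale maximum of 10 nodes is assigned up to 8 × 10 = 80 CU:

```python
# Illustration only: estimate the maximum CU a custom pool can be assigned,
# using the per-node CU values from the node size table above.
NODE_CU = {"Small": 4, "Medium": 8, "Large": 16, "X-Large": 32}

def max_assigned_cu(node_size: str, max_nodes: int) -> int:
    """Upper bound of assigned capacity: per-node CU times the autoscale maximum."""
    return NODE_CU[node_size] * max_nodes

# A Medium pool that can scale out to 10 nodes is assigned up to 80 CU.
print(max_assigned_cu("Medium", 10))  # 80
```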

Workspace roles define what users can do with Microsoft Fabric items. Roles can be assigned to individuals or security groups from the workspace view. For more information about how to manage workspace roles, see [Give users access to workspaces](../fundamentals/give-access-workspaces.md).

## Lakehouse workspace roles and item-specific functions

A user can be assigned to the following roles:

* Admin
* Member
* Contributor
* Viewer

In a lakehouse, users with the *Admin*, *Member*, and *Contributor* roles can perform all CRUD (create, read, update, and delete) operations on all data. A user with the *Viewer* role can only read data stored in tables using the [SQL analytics endpoint](lakehouse-sql-analytics-endpoint.md).
> [!IMPORTANT]
> When accessing data using the SQL analytics endpoint with the *Viewer* role, make sure the SQL access policy is granted to read the required tables.
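
The article doesn't show the grant itself. As a hypothetical sketch, read access on individual tables can be granted with a standard T-SQL `GRANT SELECT` statement run against the SQL analytics endpoint, here from Python with pyodbc; the server name, database, table, and user principal are placeholders.

```python
# Sketch: grant a Viewer read access to a specific table through the
# SQL analytics endpoint. Server, database, table, and user are placeholders.
import pyodbc

conn_str = (
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=<your-sql-analytics-endpoint-connection-string>;"
    "Database=<your-lakehouse>;"
    "Authentication=ActiveDirectoryInteractive;"
)

with pyodbc.connect(conn_str) as conn:
    cursor = conn.cursor()
    # Standard T-SQL object-level grant; run by a user allowed to grant permissions.
    cursor.execute("GRANT SELECT ON OBJECT::dbo.SalesOrders TO [viewer_user@contoso.com];")
    conn.commit()
```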

The following matrix shows which actions each workspace role can perform on lakehouse items:

| Role | Create | Read | Update | Delete |
|-------------|:------:|:----:|:------:|:------:|
| Admin | ✔ | ✔ | ✔ | ✔ |
| Member | ✔ | ✔ | ✔ | ✔ |
| Contributor | ✔ | ✔ | ✔ | ✔ |
| Viewer | | ✔<sup>1</sup> | | |

<sup>1</sup> A Viewer can only read data stored in tables using the SQL analytics endpoint, provided the SQL access policy is granted.