at-tracker-serverless.md (+4 -8)
@@ -43,6 +43,9 @@ The following table lists the actions that generate an event:
 |`ibmanalyticsengine.livybatch.create`| Create a Spark application through the Livy batch interface |
 |`ibmanalyticsengine.livybatch.read`| Retrieve the details of a Livy batch application, including the state |
 |`ibmanalyticsengine.livybatch.delete`| Delete a Livy application request or a running Livy batch application |
+|`ibmanalyticsengine.historyserver.start`| Start the Spark history server of an instance |
+|`ibmanalyticsengine.historyserver.stop`| Stop the Spark history server of an instance |
+|`ibmanalyticsengine.historyserver.read`| Retrieve the details of the Spark history server of an instance |
 {: caption="Table 1. Actions that generate management events" caption-side="top"}
@@ -51,13 +54,6 @@ The following table lists the actions that generate an event:
 Events that are generated by an instance of the {{site.data.keyword.iae_full_notm}} service are automatically forwarded to the {{site.data.keyword.at_full_notm}} service instance that is available in the same location.

-<!--
-Events are available in the following locations:
-* US-South
-* US-East
-* United Kingdom
-* Germany
-* Japan
-* Australia -->
+

 {{site.data.keyword.at_full_notm}} can have only one instance per location. To view events, you must access the web UI of the {{site.data.keyword.at_full_notm}} service in the same location where your service instance is available. For more information, see [Launching the web UI through the IBM Cloud UI](/docs/activity-tracker?topic=activity-tracker-launch){: new_window}.
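
The three new `ibmanalyticsengine.historyserver.*` actions correspond to the Spark history server lifecycle calls in the {{site.data.keyword.iae_full_notm}} v3 REST API. The following minimal Python sketch shows how each event would be triggered; it assumes the `spark_history_server` paths from the v3 API reference, and the region, instance GUID, and IAM token are placeholders.

```python
# Minimal sketch: start, inspect, and stop the Spark history server of a
# serverless instance. Assumes the v3 REST paths from the API reference;
# the instance GUID and IAM bearer token are placeholders.
import requests

REGION = "us-south"                                # assumption: Dallas instance
INSTANCE_ID = "<instance-guid>"                    # placeholder
BASE = f"https://api.{REGION}.ae.cloud.ibm.com/v3/analytics_engines/{INSTANCE_ID}"
HEADERS = {"Authorization": "Bearer <IAM-token>"}  # placeholder token

# Emits ibmanalyticsengine.historyserver.start
requests.post(f"{BASE}/spark_history_server", headers=HEADERS).raise_for_status()

# Emits ibmanalyticsengine.historyserver.read
state = requests.get(f"{BASE}/spark_history_server", headers=HEADERS).json()
print(state)

# Emits ibmanalyticsengine.historyserver.stop
requests.delete(f"{BASE}/spark_history_server", headers=HEADERS).raise_for_status()
```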
best-practices-serverless.md (+6 -6)
@@ -1,14 +1,14 @@
 ---

 copyright:
-  years: 2017, 2022
-lastupdated: "2022-11-28"
+  years: 2017, 2023
+lastupdated: "2023-09-05"

 subcollection: AnalyticsEngine

 ---

-<!-- Attribute definitions -->
+
 {:new_window: target="_blank"}
 {:shortdesc: .shortdesc}
 {:codeblock: .codeblock}
@@ -28,8 +28,8 @@ Use the following set of recommended guidelines when provisioning and managing y
 | Use separate {{site.data.keyword.iae_full_notm}} service instances for your development and production environments. | This is a general best practice. By creating separate {{site.data.keyword.iae_full_notm}} instances for different environments, you can test any configuration and code changes before applying them on the production instance. | NA |
 | Upgrade to the latest Spark version | As open source Spark versions are released, they are made available in {{site.data.keyword.iae_full_notm}} after a time interval required for internal testing. Watch for the announcement of new Spark versions in the Release Notes section and upgrade the runtime of your instance to move your applications to the latest Spark runtime. Older runtimes are deprecated and eventually removed as newer versions are released. Make sure you test your applications on the new runtime before making changes on the production instances. | - [Release notes for {{site.data.keyword.iae_full_notm}} serverless instances](/docs/AnalyticsEngine?topic=AnalyticsEngine-iae-serverless-relnotes)|
 | Grant role-based access | You should grant role-based access to all users on the {{site.data.keyword.iae_full_notm}} instances based on their requirements. For example, only your automation team should have permissions to submit applications because it has access to secrets, and your DevOps team should only be able to see the list of all applications and their states. | - [Granting permissions to users](/docs/AnalyticsEngine?topic=AnalyticsEngine-grant-permissions-serverless)|
-| Choose the right {{site.data.keyword.cos_full_notm}} configuration | - **Disaster Recovery (DR) Resiliency**: You should use the {{site.data.keyword.cos_full_notm}} Cross Regional resiliency option that backs up your data across several different cities in a region. In contrast, the Regional resiliency option back ups data in a single data center. \n- **Encryption**: {{site.data.keyword.cos_full_notm}} comes with default built-in encryption. You can also configure {{site.data.keyword.cos_short}} to work with the BYOK Key Protect service. \n- **Service credentials**: By default, {{site.data.keyword.cos_full_notm}} uses IAM-style credentials. If you want to work with AWS-style credentials, you need to use the "Include HMAC Credential" option as described in **Service credentials**. \n- **Direct endpoints for {{site.data.keyword.cos_full_notm}}**: Always use direct endpoints for connectivity to the {{site.data.keyword.cos_full_notm}} instance. This applies to the {{site.data.keyword.cos_full_notm}} home instance as well as endpoints used from your applications (either your code or what you pass as parameters in the configurations at instance level or application level). Direct endpoints provide better performance than public endpoints and do not incur charges for any outgoing or incoming bandwidth. | - **Disaster Recovery (DR) Resiliency**: [{{site.data.keyword.cos_full_notm}} documentation.](/docs/cloud-object-storage/info?topic=cloud-object-storage-endpoints#endpoints) \n- **Encryption**: [Getting started with encryption keys](/docs/key-protect?topic=key-protect-getting-started-tutorial#getting-started-tutorial) and [{{site.data.keyword.cos_short}} manage encryption](/docs/cloud-object-storage/basics?topic=cloud-object-storage-encryption#encryption) \n- **Service credentials**: [Service credentials](/docs/cloud-object-storage/iam?topic=cloud-object-storage-service-credentials#service-credentials) \n- **Direct endpoints for {{site.data.keyword.cos_full_notm}}**: [Endpoints and storage locations](/docs/cloud-object-storage?topic=cloud-object-storage-endpoints) |
-| Use private endpoints for the external Hive metastore | If you are using Spark SQL and want to use an external metastore such as use {{site.data.keyword.databases-for-postgresql_full_notm}} as your Hive metastore, you must use the private endpoint for the database connection for better performance and cost savings. | - [Working with Spark SQL and an external metastore](/docs/AnalyticsEngine?topic=AnalyticsEngine-external-metastore)|
+| Choose the right {{site.data.keyword.cos_full_notm}} configuration | - **Disaster Recovery (DR) Resiliency**: You should use the {{site.data.keyword.cos_full_notm}} Cross Regional resiliency option that backs up your data across several different cities in a region. In contrast, the Regional resiliency option backs up data in a single data center. \n- **Encryption**: {{site.data.keyword.cos_full_notm}} comes with default built-in encryption. You can also configure {{site.data.keyword.cos_short}} to work with the BYOK Key Protect service. \n- **Service credentials**: By default, {{site.data.keyword.cos_full_notm}} uses IAM-style credentials. If you want to work with AWS-style credentials, you need to use the "Include HMAC Credential" option as described in **Service credentials**. \n- **Direct endpoints for {{site.data.keyword.cos_full_notm}}**: Always use direct endpoints for connectivity to the {{site.data.keyword.cos_full_notm}} instance. This applies to the {{site.data.keyword.cos_full_notm}} home instance as well as endpoints used from your applications (either your code or what you pass as parameters in the configurations at instance level or application level). Direct endpoints provide better performance than public endpoints and do not incur charges for any outgoing or incoming bandwidth. | - **Disaster Recovery (DR) Resiliency**: [{{site.data.keyword.cos_full_notm}} documentation](/docs/cloud-object-storage/info?topic=cloud-object-storage-endpoints#endpoints) \n- **Encryption**: [Getting started with encryption keys](/docs/key-protect?topic=key-protect-getting-started-tutorial#getting-started-tutorial) and [{{site.data.keyword.cos_short}} manage encryption](/docs/cloud-object-storage/basics?topic=cloud-object-storage-encryption#encryption) \n- **Service credentials**: [Service credentials](/docs/cloud-object-storage/iam?topic=cloud-object-storage-service-credentials#service-credentials) \n- **Direct endpoints for {{site.data.keyword.cos_full_notm}}**: [Endpoints and storage locations](/docs/cloud-object-storage?topic=cloud-object-storage-endpoints) |
+| Use private endpoints for the external Hive metastore | If you are using Spark SQL and want to use an external metastore such as {{site.data.keyword.databases-for-postgresql_full_notm}} as your Hive metastore, you must use the private endpoint for the database connection for better performance and cost savings. | - [Working with Spark SQL and an external metastore](/docs/AnalyticsEngine?topic=AnalyticsEngine-external-metastore)|
 | Running applications with resource overcommitment | There is a quota associated with each Analytics Engine Serverless instance. When applications are submitted on an instance, they are allocated resources from the instance quota. If an application requests resources beyond the available quota, the application will either not start or will run with less than the requested resources, which might result in the application running slower than expected or, in some cases, in the application failing. You should always monitor the current resource consumption on an instance to ensure that your applications are running comfortably within the given limits. You can adjust the limits through a support ticket if required. | - [Default limits and quotas](/docs/AnalyticsEngine?topic=AnalyticsEngine-limits) \n- [Get current resource consumption](/apidocs/ibm-analytics-engine/ibm-analytics-engine-v3#get-current-resource-consumption)|
 | Static allocation of resources versus autoscaling | When you submit applications, you can specify the number of executors upfront (static allocation) or use the autoscaling option (dynamic allocation). Before you decide whether to use static allocation or autoscaling, you might want to run a few benchmarking tests by varying different data sets with both static and autoscaling to find the right configuration. General considerations: \n- If you know the resources (cores and memory) required by your application and they don't vary across different stages of the application run, it is recommended to allocate static resources for better performance. \n- If you want to optimize resource utilization, you can opt for autoscaling of executors, where executors are allotted based on the application's actual demand. Note that there might be a slight associated delay when using autoscaling; see the submission sketch after this table. | - [Enabling application autoscaling](/docs/AnalyticsEngine?topic=AnalyticsEngine-appl-auto-scaling) |
 | Enable and fine-tune forward logging | - Enable forward logging for your service instance to help troubleshoot, show progress, and print or show outputs of your applications. Note that log forwarding incurs a cost based on the quantity of logs forwarded or retained in the {{site.data.keyword.la_full_notm}} instance. Based on your use case, decide on the optimal settings. \n- When you enable log forwarding using the default API, only the driver logs are enabled. If you also need executor logs, for example, if there are errors that you would see only on executors, you need to customize logging to enable executor logging. Executor logs can become very large, so balance the amount of logs that get forwarded to your logging instance against the information you need for troubleshooting. \n- Follow the best practices of {{site.data.keyword.la_full_notm}} when choosing the right configuration and search techniques. For example, you might want to configure the {{site.data.keyword.la_full_notm}} instance plan for a 7-day search with archival of logs to {{site.data.keyword.cos_full_notm}} to save on costs. Also refer to the {{site.data.keyword.la_full_notm}} documentation for techniques on searching for logs of interest based on keywords, point in time, and so on. | - [Configuring and viewing logs](/docs/AnalyticsEngine?topic=AnalyticsEngine-viewing-logs) |
@@ -39,7 +39,7 @@ Use the following set of recommended guidelines when provisioning and managing y
 | Use instances in alternate regions for backup and disaster recovery | Currently, {{site.data.keyword.iae_full_notm}} Serverless instances can be created in two regions, namely Dallas (`us-south`) and Frankfurt (`eu-de`). Although it is advisable to create your instances in the same region where your data is located, it is always useful to create a backup instance in an alternate region with the same set of configurations as your primary instance, in case the primary instance becomes unavailable or unusable. Your automations should enable switching application submissions between the two regions if required. | NA |
 | Use separate buckets and service credentials for application files, data files, and home instance | Use the "separation of concerns" principle to distinguish the access between different resources. \n - Do not store data or application files in the home instance bucket. \n - Use separate buckets for data and application files. \n - Use separate access credentials (IAM key based) with restricted access to the bucket for application files and the bucket that contains your data. | - [Assigning access to an individual bucket](/docs/cloud-object-storage?topic=cloud-object-storage-iam-bucket-permissions&interface=ui)|
 | Applications must run within 72 hours | There is a limit on the number of hours an application or kernel can run. For security and compliance patching, all runtimes that run for more than 72 hours are stopped. If you have a large application, break it into smaller chunks that run within 72 hours. If you are running Spark streaming applications, make sure that you configure checkpoints and have monitoring in place to restart your applications if they are stopped; see the streaming sketch after this table. | - [Application limits](/docs/AnalyticsEngine?topic=AnalyticsEngine-limits#limits-application)|
+| Start and stop the Spark history server only when needed | Always stop the Spark history server when you no longer need it. Keep in mind that the Spark history server consumes CPU and memory resources continuously while it is in the started state. | - [Spark history server](/docs/AnalyticsEngine?topic=AnalyticsEngine-spark-history-serverless)|
 {: caption="Table 1. Best practices when using serverless instances including detailed descriptions and reference links" caption-side="top"}
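
To make the static-allocation-versus-autoscaling row concrete, here is a sketch of two submission payloads for the v3 `spark_applications` endpoint. This is an illustration only: the payload shape follows the v3 API reference, the bucket, application file, and `mycos` Stocator service name are placeholders, and the `spark.dynamicAllocation.*` keys are the standard open source Spark settings; on a serverless instance, use the autoscaling flags described in [Enabling application autoscaling](/docs/AnalyticsEngine?topic=AnalyticsEngine-appl-auto-scaling). The direct {{site.data.keyword.cos_full_notm}} endpoint from the table is shown as one of the configuration keys.

```python
# Hypothetical submission payloads (POST .../v3/analytics_engines/{guid}/spark_applications).
# All names are placeholders; only the Spark configuration keys are standard.

static_payload = {
    "application_details": {
        "application": "cos://apps-bucket.mycos/etl.py",  # placeholder application file
        "conf": {
            # Static allocation: resource needs are known up front, so request
            # them once for predictable performance.
            "spark.executor.instances": "4",
            "spark.executor.cores": "2",
            "spark.executor.memory": "4G",
            # Direct (not public) COS endpoint, per the table above; "mycos" is
            # the Stocator service name and is a placeholder.
            "spark.hadoop.fs.cos.mycos.endpoint":
                "https://s3.direct.us-south.cloud-object-storage.appdomain.cloud",
        },
    }
}

autoscaling_payload = {
    "application_details": {
        "application": "cos://apps-bucket.mycos/etl.py",
        "conf": {
            # Standard open source Spark dynamic-allocation keys, shown only to
            # illustrate the trade-off; executors scale with actual demand at
            # the cost of a slight allocation delay.
            "spark.dynamicAllocation.enabled": "true",
            "spark.dynamicAllocation.minExecutors": "1",
            "spark.dynamicAllocation.maxExecutors": "8",
        },
    }
}
```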
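
For the 72-hour limit row: a Spark streaming application survives a forced stop only if it checkpoints its progress, so a restarted run resumes instead of reprocessing. The following is a minimal PySpark sketch; it uses the built-in `rate` source so it runs without external dependencies, and the bucket paths are placeholders.

```python
# Minimal resumable structured-streaming job: with checkpointLocation set,
# restarting this application after the 72-hour stop resumes from the last
# committed offsets. Bucket paths are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("resumable-stream").getOrCreate()

# Built-in rate source (1 row/second) stands in for a real source such as Kafka.
events = (spark.readStream
          .format("rate")
          .option("rowsPerSecond", 1)
          .load())

query = (events.writeStream
         .format("parquet")
         .option("path", "cos://data-bucket.mycos/events/")             # placeholder
         .option("checkpointLocation", "cos://data-bucket.mycos/chk/")  # placeholder
         .start())

query.awaitTermination()
```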