
Commit 0bfd303

Add descriptions (#641)
1 parent de29735 commit 0bfd303

12 files changed: +91 -44 lines

docs/modules/trino/pages/concepts.adoc (+14 -4)

@@ -1,11 +1,18 @@
 = Concepts
+:description: Trino connects to diverse data sources via connectors and catalogs, enabling efficient distributed queries across multiple data stores.
+:what-trino-is: https://trino.io/docs/current/overview/use-cases.html#what-trino-is
+:trino-connector: https://trino.io/docs/current/connector.html

 == [[connectors]]Connectors

-https://trino.io/docs/current/overview/use-cases.html#what-trino-is[Trino] is a tool designed to efficiently query vast amounts of data using distributed queries. It is not a database with its own storage but rather interacts with many different data stores. Trino connects to these data stores - or data sources - via https://trino.io/docs/current/connector.html[connectors].
+{what-trino-is}[Trino] is a tool designed to efficiently query vast amounts of data using distributed queries.
+It is not a database with its own storage but rather interacts with many different data stores.
+Trino connects to these data stores - or data sources - via {trino-connector}[connectors].
 Each connector enables access to a specific underlying data source such as a Hive warehouse, a PostgreSQL database or a Druid instance.

-A Trino cluster comprises two roles: the Coordinator, responsible for managing and monitoring work loads, and the Worker, which is responsible for executing specific tasks that together make up a work load. The workers fetch data from the connectors, execute tasks and share intermediate results. The coordinator collects and consolidates these results for the end-user.
+A Trino cluster comprises two roles: the Coordinator, responsible for managing and monitoring work loads, and the Worker, which is responsible for executing specific tasks that together make up a work load.
+The workers fetch data from the connectors, execute tasks and share intermediate results.
+The coordinator collects and consolidates these results for the end-user.

 == [[catalogs]]Catalogs

@@ -24,9 +31,12 @@ Currently, the following connectors are supported:

 == Catalog references

-Within Stackable a `TrinoCatalog` consists of one or more (mandatory or optional) components which are specific to that catalog. A catalog should be re-usable within multiple Trino clusters. Catalogs are referenced by Trino clusters with labels and label selectors: this is consistent with the Kubernetes paradigm and keeps the definitions simple and flexible.
+Within Stackable a `TrinoCatalog` consists of one or more (mandatory or optional) components which are specific to that catalog.
+A catalog should be re-usable within multiple Trino clusters.
+Catalogs are referenced by Trino clusters with labels and label selectors: this is consistent with the Kubernetes paradigm and keeps the definitions simple and flexible.

-The following diagram illustrates this. Two Trino catalogs - each an instance of a particular connector - are declared with labels that used to match them to a Trino cluster:
+The following diagram illustrates this.
+Two Trino catalogs - each an instance of a particular connector - are declared with labels that used to match them to a Trino cluster:

 image::catalogs.drawio.svg[A TrinoCluster referencing two catalogs by label matching]
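
To make the label matching described in this page concrete, the pairing of a `TrinoCatalog` and a `TrinoCluster` roughly looks like the sketch below. The API version and field names (`connector`, `catalogLabelSelector`) are assumptions based on the Stackable CRDs, not something this commit introduces:

[source,yaml]
----
# Sketch only: field names and API version are assumed, not taken from this commit.
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCatalog
metadata:
  name: tpch-demo
  labels:
    trino: simple-trino            # the label the cluster matches on
spec:
  connector:
    tpch: {}                       # built-in demo connector, needs no external data source
---
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCluster
metadata:
  name: simple-trino
spec:
  clusterConfig:
    catalogLabelSelector:
      matchLabels:
        trino: simple-trino        # every TrinoCatalog carrying this label is attached
----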

docs/modules/trino/pages/getting_started/first_steps.adoc (+13 -5)

@@ -1,10 +1,13 @@
 = First steps
+:description: Deploy and verify a Trino cluster with Stackable Operator. Access via CLI or web interface, and clean up after testing.

-After going through the xref:getting_started/installation.adoc[] section and having installed all the operators, you will now deploy a Trino cluster and the required dependencies. Afterwards you can <<_verify_that_it_works, verify that it works>> by running some queries against Trino or visit the Trino web interface.
+After going through the xref:getting_started/installation.adoc[] section and having installed all the operators, you will now deploy a Trino cluster and the required dependencies.
+Afterwards you can <<_verify_that_it_works, verify that it works>> by running some queries against Trino or visit the Trino web interface.

 == Setup Trino

-A working Trino cluster and its web interface require only the commons, secret and listener operators to work. Simple tests are possible without an external data source (e.g. PostgreSQL, Hive or S3), as internal data can be used.
+A working Trino cluster and its web interface require only the commons, secret and listener operators to work.
+Simple tests are possible without an external data source (e.g. PostgreSQL, Hive or S3), as internal data can be used.

 Create a file named `trino.yaml` with the following content:

@@ -54,7 +57,9 @@ include::example$getting_started/code/getting_started.sh[tag=port-forwarding]

 === Access the Trino cluster via CLI tool

-We use the https://trino.io/download.html[Trino CLI tool] to access the Trino cluster. This link points to the latest Trino version. In this guide we keep Trino cluster and client versions in sync and download the CLI tool from the https://repo.stackable.tech/[Stackable repository]:
+We use the https://trino.io/download.html[Trino CLI tool] to access the Trino cluster.
+This link points to the latest Trino version.
+In this guide we keep Trino cluster and client versions in sync and download the CLI tool from the https://repo.stackable.tech/[Stackable repository]:

 [source,bash]
 ----
@@ -100,9 +105,12 @@ Congratulations, you set up your first Stackable Trino cluster successfully.

 === Access the Trino web interface

-With the port-forward still active, you can connect to the Trino web interface. Enter `https://localhost:8443/ui` in your browser and login with the username `admin`. Since no authentication is enabled you do not need to enter a password.
+With the port-forward still active, you can connect to the Trino web interface.
+Enter `https://localhost:8443/ui` in your browser and login with the username `admin`.
+Since no authentication is enabled you do not need to enter a password.

-WARNING: Your browser will probably show a security risk warning because it does not trust the self generated TLS certificates. Just ignore that and continue.
+WARNING: Your browser will probably show a security risk warning because it does not trust the self generated TLS certificates.
+Just ignore that and continue.

 After logging in you should see the Trino web interface:
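
The `trino.yaml` mentioned on this page comes from an example file that the diff only references, so its contents are not shown here. Purely as orientation, and explicitly not the contents of that file, a minimal TrinoCluster resource has roughly the following shape; the field names are assumed from the Stackable CRD and the version number is a placeholder:

[source,yaml]
----
# Rough sketch, not the getting_started example file; field names assumed, version is a placeholder.
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCluster
metadata:
  name: simple-trino
spec:
  image:
    productVersion: "414"          # placeholder, use a version supported by your operator release
  coordinators:
    roleGroups:
      default:
        replicas: 1
  workers:
    roleGroups:
      default:
        replicas: 1
----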

docs/modules/trino/pages/getting_started/index.adoc (+3 -1)

@@ -1,6 +1,8 @@
 = Getting started
+:description: Get started with Trino on Kubernetes using the Stackable Operator. Follow steps for installation, setup, and resource recommendations.

-This guide will get you started with Trino using the Stackable Operator. It will guide you through the installation of the operator and its dependencies and setting up your first Trino cluster.
+This guide will get you started with Trino using the Stackable Operator.
+It will guide you through the installation of the operator and its dependencies and setting up your first Trino cluster.

 == Prerequisites

docs/modules/trino/pages/getting_started/installation.adoc (+5 -3)

@@ -1,4 +1,5 @@
 = Installation
+:description: Install the Stackable Operator for Trino using stackablectl or Helm. Includes optional setup for Hive, S3, and OPA integration.

 On this page you will install the Stackable Operator for Trino as well as the commons, secret and listener operator which are
 required by all Stackable Operators.
@@ -50,8 +51,8 @@ include::example$getting_started/code/getting_started.sh[tag=helm-install-operat

 == Optional installation steps

-Some Trino connectors like `hive` or `iceberg` work together with the Apache Hive metastore and S3 buckets. For these
-components extra steps are required.
+Some Trino connectors like `hive` or `iceberg` work together with the Apache Hive metastore and S3 buckets.
+For these components extra steps are required.

 * a Stackable Hive metastore
 * an accessible S3 bucket
@@ -70,7 +71,8 @@ Please refer to the S3 provider.

 === Hive operator

-Please refer to the xref:hive:index.adoc[Hive Operator] docs. Both Hive and Trino need the same S3 authentication.
+Please refer to the xref:hive:index.adoc[Hive Operator] docs.
+Both Hive and Trino need the same S3 authentication.

 === OPA operator

docs/modules/trino/pages/index.adoc (+1 -1)

@@ -1,5 +1,5 @@
 = Stackable Operator for Trino
-:description: The Stackable operator for Trino is a Kubernetes operator that can manage Trino clusters. Learn about its features, resources, dependencies and demos, and see the list of supported Trino versions.
+:description: Manage Trino clusters on Kubernetes with the Stackable operator, featuring resource management, demos, and support for custom Trino versions.
 :keywords: Stackable operator, Trino, Kubernetes, k8s, operator, data science, data exploration
 :trino: https://trino.io/
 :github: https://github.com/stackabletech/trino-operator/

docs/modules/trino/pages/usage-guide/configuration.adoc (+17 -9)

@@ -1,4 +1,5 @@
 = Configuration
+:description: Configure Trino clusters with properties, environment variables, and resource requests. Customize settings for performance and storage efficiently.

 The cluster definition also supports overriding configuration properties and environment variables, either per role or per role group, where the more specific override (role group) has precedence over the less specific one (role).

@@ -8,11 +9,11 @@ IMPORTANT: Do not override port numbers. This will lead to faulty installations.

 For a role or role group, at the same level of `config`, you can specify `configOverrides` for:

-- `config.properties`
-- `node.properties`
-- `log.properties`
-- `password-authenticator.properties`
-- `security.properties`
+* `config.properties`
+* `node.properties`
+* `log.properties`
+* `password-authenticator.properties`
+* `security.properties`

 For a list of possible configuration properties consult the https://trino.io/docs/current/admin/properties.html[Trino Properties Reference].

@@ -46,9 +47,13 @@ All override property values must be strings. The properties will be passed on w

 === The security.properties file

-The `security.properties` file is used to configure JVM security properties. It is very seldom that users need to tweak any of these, but there is one use-case that stands out, and that users need to be aware of: the JVM DNS cache.
+The `security.properties` file is used to configure JVM security properties.
+It is very seldom that users need to tweak any of these, but there is one use-case that stands out, and that users need to be aware of: the JVM DNS cache.

-The JVM manages it's own cache of successfully resolved host names as well as a cache of host names that cannot be resolved. Some products of the Stackable platform are very sensible to the contents of these caches and their performance is heavily affected by them. As of version 414, Trino performs poorly if the positive cache is disabled. To cache resolved host names, and thus speeding up queries you can configure the TTL of entries in the positive cache like this:
+The JVM manages it's own cache of successfully resolved host names as well as a cache of host names that cannot be resolved.
+Some products of the Stackable platform are very sensible to the contents of these caches and their performance is heavily affected by them.
+As of version 414, Trino performs poorly if the positive cache is disabled.
+To cache resolved host names, and thus speeding up queries you can configure the TTL of entries in the positive cache like this:

 [source,yaml]
 ----
@@ -124,7 +129,9 @@ workers:
 capacity: 3Gi
 ----

-In the above example, all Trino workers in the default group will store data (the location of the property `--data-dir`) on a `3Gi` volume. Additional role groups not specifying any resources will inherit the config provided on the role level (`2Gi` volume). This works the same for memory or CPU requests.
+In the above example, all Trino workers in the default group will store data (the location of the property `--data-dir`) on a `3Gi` volume.
+Additional role groups not specifying any resources will inherit the config provided on the role level (`2Gi` volume).
+This works the same for memory or CPU requests.

 By default, in case nothing is configured in the custom resource for a certain role group, each Pod will have a `2Gi` large local volume mount for the data location containing mainly logs.

@@ -168,4 +175,5 @@ spec:
 capacity: '1Gi'
 ----

-WARNING: The default values are _most likely_ not sufficient to run a proper cluster in production. Please adapt according to your requirements.
+WARNING: The default values are _most likely_ not sufficient to run a proper cluster in production.
+Please adapt according to your requirements.
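
The DNS cache tuning mentioned in this file's diff is applied through the `configOverrides` mechanism documented above. As a hedged illustration only: the property names below are standard JVM security properties and the values are examples, not something this commit prescribes:

[source,yaml]
----
# Illustrative override; TTL values are examples, all values must be strings.
coordinators:
  configOverrides:
    security.properties:
      networkaddress.cache.ttl: "30"           # seconds to cache successful lookups
      networkaddress.cache.negative.ttl: "0"   # do not cache failed lookups
workers:
  configOverrides:
    security.properties:
      networkaddress.cache.ttl: "30"
      networkaddress.cache.negative.ttl: "0"
----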

docs/modules/trino/pages/usage-guide/connect_to_trino.adoc (+6 -2)

@@ -1,4 +1,5 @@
 = Connecting to Trino
+:description: Learn how to connect to Trino using trino-cli, DBeaver, or Python. Includes setup for SSL/TLS, OpenID Connect, and basic authentication.

 :trino-jdbc: https://trino.io/docs/current/client/jdbc.html
 :starburst-odbc: https://docs.starburst.io/data-consumer/clients/odbc.html
@@ -29,7 +30,9 @@ The `--insecure` flag ignores the server TLS certificate and is required in this
 $ java -jar ~/Downloads/trino-cli-403-executable.jar --server https://85.215.195.29:8443 --user admin --password --insecure
 ----

-TIP: In case you are using OpenID connect, use `--external-authentication` instead of `--password`. A browser window will be opened, which might require you to log in. Please note that you still need to pass the `--user` argument because of https://github.com/trinodb/trino/issues/11547[this Trino issue].
+TIP: In case you are using OpenID connect, use `--external-authentication` instead of `--password`.
+A browser window will be opened, which might require you to log in.
+Please note that you still need to pass the `--user` argument because of https://github.com/trinodb/trino/issues/11547[this Trino issue].

 == Connect with DBeaver

@@ -53,7 +56,8 @@ image::connect-with-dbeaver-3.png[]

 As the last step you can click on _Finish_ and start using the Trino connection.

-TIP: In case you are using OpenID connect, set the `externalAuthentication` property to `true` and don't provide and username or password. A browser window will be opened, which might require you to log in.
+TIP: In case you are using OpenID connect, set the `externalAuthentication` property to `true` and don't provide and username or password.
+A browser window will be opened, which might require you to log in.

 == Connect with Python

docs/modules/trino/pages/usage-guide/log_aggregation.adoc (+5 -5)

@@ -1,7 +1,7 @@
 = Log aggregation
+:description: The logs can be forwarded to a Vector log aggregator by providing a discovery ConfigMap for the aggregator and by enabling the log agent

-The logs can be forwarded to a Vector log aggregator by providing a discovery
-ConfigMap for the aggregator and by enabling the log agent:
+The logs can be forwarded to a Vector log aggregator by providing a discovery ConfigMap for the aggregator and by enabling the log agent:

 [source,yaml]
 ----
@@ -19,7 +19,7 @@ spec:
 level: INFO
 ----

-Currently, the logs are collected only for `server.log`. Logging for `http-request.log` is disabled by default.
+Currently, the logs are collected only for `server.log`.
+Logging for `http-request.log` is disabled by default.

-Further information on how to configure logging, can be found in
-xref:concepts:logging.adoc[].
+Further information on how to configure logging, can be found in xref:concepts:logging.adoc[].
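
The hunks above only show fragments of the logging example on that page. As a rough sketch of the two pieces involved (the aggregator discovery ConfigMap and the per-role Vector agent), with field names assumed from the usual Stackable conventions rather than taken from this commit:

[source,yaml]
----
# Sketch with assumed Stackable field names; not copied from this commit.
spec:
  clusterConfig:
    vectorAggregatorConfigMapName: vector-aggregator-discovery   # discovery ConfigMap of the aggregator
  workers:
    config:
      logging:
        enableVectorAgent: true          # ship container logs via the Vector agent
        containers:
          trino:
            loggers:
              ROOT:
                level: INFO
----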

docs/modules/trino/pages/usage-guide/monitoring.adoc

@@ -1,4 +1,5 @@
 = Monitoring
+:description: The managed Trino instances are automatically configured to export Prometheus metrics.

-The managed Trino instances are automatically configured to export Prometheus metrics. See
-xref:operators:monitoring.adoc[] for more details.
+The managed Trino instances are automatically configured to export Prometheus metrics.
+See xref:operators:monitoring.adoc[] for more details.

docs/modules/trino/pages/usage-guide/query.adoc (+3 -1)

@@ -1,6 +1,8 @@
 = Testing Trino with Hive and S3
+:description: Test Trino with Hive and S3 by creating a schema and table for Iris data in Parquet format, then querying the dataset.

-Create a schema and a table for the Iris data located in S3 and query data. This assumes to have the Iris data set in the `PARQUET` format available in the S3 bucket which can be downloaded https://www.kaggle.com/gpreda/iris-dataset/version/2?select=iris.parquet[here].
+Create a schema and a table for the Iris data located in S3 and query data.
+This assumes to have the Iris data set in the `PARQUET` format available in the S3 bucket which can be downloaded https://www.kaggle.com/gpreda/iris-dataset/version/2?select=iris.parquet[here].

 == Create schema
 [source,sql]

docs/modules/trino/pages/usage-guide/s3.adoc (+1)

@@ -1,4 +1,5 @@
 = Connecting Trino to S3
+:description: Configure S3 connections in Trino either inline within the TrinoCatalog or via an external S3Connection resource for centralized management.

 You can specify S3 connection details directly inside the TrinoCatalog specification or by referring to an external S3Connection custom resource.
 This mechanism used used across the whole Stackable Data Platform, read the xref:concepts:s3.adoc[S3 concepts page] to learn more.
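
To illustrate the two options this description refers to, a hive catalog could declare its S3 connection either inline or by reference, roughly as below; the field names follow the general Stackable S3 conventions and are assumptions, not part of this change:

[source,yaml]
----
# Option 1: inline S3 connection inside the TrinoCatalog (field names assumed).
spec:
  connector:
    hive:
      metastore:
        configMap: simple-hive
      s3:
        inline:
          host: minio.default.svc.cluster.local
          port: 9000
          accessStyle: Path
          credentials:
            secretClass: s3-credentials
---
# Option 2: reference a centrally managed S3Connection resource instead.
spec:
  connector:
    hive:
      metastore:
        configMap: simple-hive
      s3:
        reference: my-s3-connection
----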
