Update upgrade-guide-from-monitoring-3.x-to-monitoring-4.y.rst #1773

Open
wants to merge 11 commits into
base: master

doctorg-ml

The proposed changes attempt to correct the following:

  1. In the section Restart the monitoring stack, start-all.sh was misspelled.

  2. In the section Migrating – Backup, the word Prometheus was misspelled.

  3. In the section Backfilling Process - backup, the paragraph contains duplicate sentences and the idea is not clear.

  4. The section Create the data files is not clear. In addition, the word epoch was misspelled, and the note contained some errors, for example in the word interrupted.

  5. In the section copy the data files, the docker cp command was not specified.

@@ -99,8 +99,7 @@ We assume that you are using external volume to store the metrics data.
Backup
^^^^^^

We suggest to copy the Prometheus external directory first and use the copy as the data directory for the new monitoring stack.
Newer Monitoring stack uses newer Promethues versions, and keeping a backup of the prometheus dir would allow you to rollback.
A copy of the Prometheus external directory should be made first and used as the data directory for the new monitoring stack. The new monitoring stack uses newer versions of Prometheus and keeping a backup would enable you to rollback to a previous version of Prometheus.
Collaborator

why move to passive voice? I find the original version clearer.
You switch to singular, but we are talking about versions (i.e. 3.x and 4.y)

Author

To avoid the passive voice, we can use the following paragraph:

We suggest to make a copy of Prometheus' external directory and use it as the data directory for the new monitoring stack. The new monitoring stack uses newer versions of Prometheus and keeping a backup would enable you to rollback to a previous version of Prometheus.
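
As a rough illustration of the copy step described above, the following shell sketch copies a data directory with `cp -a`, which preserves ownership, permissions, and timestamps. The function name `backup_prometheus_dir` and the paths in the example call are placeholders chosen for this sketch; they do not come from the upgrade guide.

```shell
# Hypothetical helper: copy a Prometheus external data directory.
# Refuses to run if the source is missing or the destination exists,
# so an earlier backup is never overwritten by accident.
backup_prometheus_dir() {
    src=$1
    dest=$2
    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    [ ! -e "$dest" ] || { echo "refusing to overwrite: $dest" >&2; return 1; }
    # -a keeps ownership, permissions, and timestamps of the data files.
    cp -a "$src" "$dest"
}

# Example call (both paths are placeholders):
# backup_prometheus_dir /path/to/prometheus_data /path/to/prometheus_data.copy
```

The copy can then be passed to the new monitoring stack as its data directory, while the original stays untouched for a rollback.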

^^^^^^
If you have a long retention period you are using an external directory that holds the Prometheus data, back it up, in case
If you have a long retention period, you are using an external directory that holds the Prometheus data back it up; if something goes wrong in the process, you can revert the process.
Backup any external directory containing Prometheus data; if something goes wrong, you can revert the changes.
Collaborator

It's not any, it's the external data directory

Author

The modified sentence would be:

Backup the external directory containing Prometheus data; if something goes wrong, you can revert the changes.

If you have a long retention period, you are using an external directory that holds the Prometheus data back it up; if something goes wrong in the process, you can revert the process.
Backup any external directory containing Prometheus data; if something goes wrong, you can revert the changes.

The monitoring stack will need to be restarted at least once. This process cannot be completed without an external directory when using the -d command-line option.
Collaborator

This is confusing, see the original description (dropping the duplication).
To complete this process, you need to have the data in an external directory; the way you use an external directory is with the -d command line option.

@@ -226,13 +226,9 @@ You need to stop the monitoring stack and run the ``stat-all.sh`` command with a

Create the data files
^^^^^^^^^^^^^^^^^^^^^^^^^
We will use the Promtool utility; it's already installed for you if you are using the docker container.
You will need the start time and end time for the process, in our example the start time is 360 days ago and the end time is 90 days ago.
We will create the data files using the Promtool utility, which has been installed in the docker container. In order to run the utility, the start time and end times must be passed in epoch format. For example, suppose the start and end times are 360 and 90 days back, then the following commands can be used to transform those numbers to epoch: ``echo $((`date +%s` - 3600*24*90))`` and ``echo $((`date +%s` - 3600*24*360))``
Collaborator

This is true only if you are using the docker container.
I prefer the original description that breaks the explanation into stages: you need a start and end time,
you need them in epoch,
and there are 100 different ways of translating time to epoch; this is just one example of how to do it from the command line

Author

The new paragraph would be:

We will create the data files using the Promtool utility, which has been installed in the docker container. To run the utility, the start and end times must be passed in epoch format. For example, if the start and end times are 360 and 90 days back, the following commands can be used to convert them to epoch: ``echo $((`date +%s` - 3600*24*360))`` and ``echo $((`date +%s` - 3600*24*90))``. Note that this is just one of many ways to convert time to epoch.
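
The conversion discussed in this thread can be sketched as a small shell helper. This is only one possible approach, under the assumption that a day is counted as 86400 seconds; the function name `days_ago_epoch` is made up for this illustration. Capturing the current time once keeps the start and end values exactly 270 days apart.

```shell
# Hypothetical helper: epoch seconds for N days before a reference time.
# $1 = number of days, $2 = reference epoch (defaults to the current time).
days_ago_epoch() {
    echo $(( ${2:-$(date +%s)} - 86400 * $1 ))
}

# Take "now" once so both values share the same reference point.
now=$(date +%s)
start=$(days_ago_epoch 360 "$now")   # start time: 360 days ago
end=$(days_ago_epoch 90 "$now")      # end time: 90 days ago

echo "start=$start end=$end"
```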

@@ -244,24 +240,21 @@ Log in to your docker container and run the following (``start`` and ``end`` sho
--url http://localhost:9090 \
/etc/prometheus/prom_rules/back_fill/3.8/rules.1.yml

It will create a ``data`` directory in the directory where you run it.
The reason to run it under the ``/prometheus/data/`` is you can be sure Prometheus has write privileges there.
A ``data`` directory will be created in the directory where you run the previous commands. The reason to run it under the ``/prometheus/data/`` is you can be sure Prometheus has write privileges there.
Collaborator

why switch to passive voice? I find the longer explanation confusing.

Author

Ok

@@ -99,8 +99,7 @@ We assume that you are using external volume to store the metrics data.
Backup
^^^^^^

We suggest to copy the Prometheus external directory first and use the copy as the data directory for the new monitoring stack.
Newer Monitoring stack uses newer Promethues versions, and keeping a backup of the prometheus dir would allow you to rollback.
A copy of the Prometheus external directory should be made first and used as the data directory for the new monitoring stack. The new monitoring stack uses newer versions of Prometheus and keeping a backup would enable you to rollback to a previous version of Prometheus.
Contributor

Suggested change
A copy of the Prometheus external directory should be made first and used as the data directory for the new monitoring stack. The new monitoring stack uses newer versions of Prometheus and keeping a backup would enable you to rollback to a previous version of Prometheus.
We suggest making a copy of Prometheus's external directory to use as the data directory for the new version of Monitoring Stack. The new version of Monitoring Stack uses the new version of Prometheus. If you keep a backup of Prometheus's external directory, you can roll back to the previous Prometheus version.

@@ -244,24 +240,21 @@ Log in to your docker container and run the following (``start`` and ``end`` sho
--url http://localhost:9090 \
/etc/prometheus/prom_rules/back_fill/3.8/rules.1.yml

It will create a ``data`` directory in the directory where you run it.
The reason to run it under the ``/prometheus/data/`` is you can be sure Prometheus has write privileges there.
A ``data`` directory will be created in the directory where you run the previous commands. The reason to run it under the ``/prometheus/data/`` is you can be sure Prometheus has write privileges there.
Contributor

Suggested change
A ``data`` directory will be created in the directory where you run the previous commands. The reason to run it under the ``/prometheus/data/`` is you can be sure Prometheus has write privileges there.
It will create a ``data`` directory in the directory where you run it. The reason to run it under the ``/prometheus/data/`` is to ensure that Prometheus has write privileges to the directory.
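
For readers following the thread without the full guide, here is a hedged sketch of how the back-fill command might be assembled. The diff hunk only shows the tail of the command (the `--url` value and the rule file), so the leading part is an assumption: the sketch uses promtool's `tsdb create-blocks-from rules` subcommand with `--start` and `--end`. The helper name `backfill_cmd` is hypothetical, and the function prints the command instead of running it, so it works on a machine without promtool.

```shell
# Hypothetical helper: print (not run) a promtool back-fill command.
# Assumption: the guide uses "promtool tsdb create-blocks-from rules".
# $1 = start epoch, $2 = end epoch.
backfill_cmd() {
    printf 'promtool tsdb create-blocks-from rules --start %s --end %s --url %s %s\n' \
        "$1" "$2" \
        "http://localhost:9090" \
        "/etc/prometheus/prom_rules/back_fill/3.8/rules.1.yml"
}

now=$(date +%s)
# Example window: from 360 days ago to 90 days ago.
backfill_cmd "$(( now - 86400 * 360 ))" "$(( now - 86400 * 90 ))"
```

When the printed command is run from `/prometheus/data/`, the generated ``data`` directory lands in a location where Prometheus has write privileges, which is the point made in the suggestion above.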

@annastuchlik
Contributor

@lujogre Thanks for opening this PR to improve the docs!
I've addressed the comments added by @amnonh and rewritten some sections. I've added them as suggestions, so all you need to do now is to commit them with the "Commit suggestion" option and ask @amnonh to re-review.

In addition, could you fix the name of the script in the "Restart the monitoring stack" section?
Now is: stat-all.sh
Should be: start-all.sh

@annastuchlik annastuchlik added the documentation Documentation related label Aug 4, 2022
doctorg-ml and others added 9 commits August 4, 2022 09:23
…oring-4.y.rst


We recommend making a copy of the external data directory of Prometheus and using it as the new data directory. Since the newer Monitoring stack uses newer Prometheus versions, you can roll back to an earlier version if you have a backup of Prometheus' external directory.

Co-authored-by: Anna Stuchlik <[email protected]>
…oring-4.y.rst


Backup the external directory containing Prometheus data, in case anything goes wrong.

Co-authored-by: Anna Stuchlik <[email protected]>
…oring-4.y.rst


- To complete the process you will need to restart the monitoring stack at least once. If you are not using an external directory (The ``-d`` command-line option) you cannot complete the process.

Co-authored-by: Anna Stuchlik <[email protected]>
@doctorg-ml
Author

Hello @annastuchlik / @amnonh , I have completed my comments. Should you have any other observation, please let me know.

@annastuchlik
Contributor

@lujogre LGTM, thanks!
@amnonh Could you re-review this PR and merge it?

@@ -135,7 +134,7 @@ Rollback to version 3.x
To rollback during the testing mode, follow `Killing the new 4.y Monitoring stack in testing mode`_ as explained previously
and the system will continue to operate normally.

To rollback to version 3.x after you completed moving to version 4.y (as shown above), run:
To rollback to version 3.x, run the following commands after you have completed moving to version 4.y (as shown above):
Collaborator

There are two options for rollback, depending on where you are in the process; this one applies once you have completed the upgrade.
The new version doesn't say what it should say.

Contributor

@lujogre Here you need to restore the previous version because the meaning was changed (and I misunderstood the instructions after the change).

Author

Hello @annastuchlik , thanks for clarification.

Collaborator

This is still wrong, you can roll back at any time, but the way you roll back is different if you have completed moving to 4.0 or if you are still testing.

This is the case after you completed

show the data, the latency graphs have a missing period, in our example - from the entire year, the latency graph will only show the last three months.

That nine months gap (12 months minus 3) is what we want to fill with back-filling.
display the data, the latency graphs are missing a period of time. The latency graph will only show the last three months in our example from the entire year. That nine months gap (12 months minus 3) is what we want to fill with back-filling.
Collaborator

I find this confusing, all those durations are a real world example, it's not always going to be that.

Contributor

@amnonh I would leave it as is, because the entire "Determine the end time" section is an example - and it is emphasized in the first paragraph of that section ("for example, you have a year of retention data") and in the following ones ("in our example 1 year"). For sure, it's clearer than the previous version.
If we want to make it clearer, the content reorganization would help:

Typically, you need to back-fill the recording rules when using a long retention period.

Example:

You have a year of retention data, and you upgraded to ScyllaDB Monitoring 3.8 about three months ago. If you open the Overview dashboard and look at your entire retention time (one year), you will see that while most of the graphs do display the data, the latency graphs are missing a period of time. The latency graph will only show the last three months from the entire year. That nine months gap (12 months minus 3) is what we want to fill with back-filling.

If you have a long retention period, you are using an external directory that holds the Prometheus data back it up; if something goes wrong in the process, you can revert the process.
Backup the external directory containing Prometheus data; if something goes wrong, you can revert the changes.

To complete the process, you must restart Monitoring Stack at least once. You cannot complete the process without providing the path to the external directory with Prometheus data using the ``-d`` command line option.
Collaborator

you need to use an external data directory, the -d is the way to do it.

Contributor

@amnonh I think this is what the line says. How would you rephrase it?

Collaborator

You cannot complete the process without using Prometheus external directory. You can set Prometheus external directory with the -d command line option.


Create the data files
^^^^^^^^^^^^^^^^^^^^^^^^^
We will use the Promtool utility; it's already installed for you if you are using the docker container.
You will need the start time and end time for the process, in our example the start time is 360 days ago and the end time is 90 days ago.
We will create the data files using the Promtool utility, which has been installed in the Docker container. To run the utility, you must pass the start time and end time in the epoch format. The following example shows one of the ways to convert the times to epoch when the start time is 360 and the end time is 90 days ago:
Collaborator

I would make it clearer, it's installed in the Prometheus docker container

Contributor

"We will create the data files using the Promtool utility, which is installed in the Prometheus Docker container". - Is that what you mean?

Author

Hello @annastuchlik / @amnonh , the paragraph could be:

We will create the data files using the Promtool utility, which is installed in the Prometheus Docker container. To run the utility, you must pass the start time and end time in the epoch format. The following example shows one of the ways to convert the times to epoch when the start time is 360 and the end time is 90 days ago:

Contributor

Sounds good to me.

Author

Hello @annastuchlik , great, thanks for feedback.


Restart the monitoring stack
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
You need to stop the monitoring stack and run the ``stat-all.sh`` command with an additional flag:
You need to stop the monitoring stack and run the ``start-all.sh`` command with an additional flag (The 365d retention time used here is used only as an example):
Collaborator

The additional flag is "--storage.tsdb.allow-overlapping-blocks" it should be clear

Contributor

Let's add the flag name:

You need to stop the monitoring stack and run the start-all.sh command with an additional flag, --storage.tsdb.allow-overlapping-blocks. In the following command, the 365d retention time is used as an example:


Create the data files
^^^^^^^^^^^^^^^^^^^^^^^^^
We will use the Promtool utility; it's already installed for you if you are using the docker container.
You will need the start time and end time for the process, in our example the start time is 360 days ago and the end time is 90 days ago.
We will create the data files using the Promtool utility, which is installed in the Prometheus Docker container. To run the utility, you must pass the start time and end time in the epoch format. The following example shows one of the ways to convert the times to epoch when the start time is 360 and the end time is 90 days ago:
Collaborator

360 days ago

@@ -244,20 +240,15 @@ Log in to your docker container and run the following (``start`` and ``end`` sho
--url http://localhost:9090 \
/etc/prometheus/prom_rules/back_fill/3.8/rules.1.yml

It will create a ``data`` directory in the directory where you run it.
The reason to run it under the ``/prometheus/data/`` is you can be sure Prometheus has write privileges there.
The previous bash script will create a ``data`` directory in the directory where it is executed. The reason to run the bash script under the ``/prometheus/data/`` is to ensure Prometheus has write privileges to the directory.
Collaborator

this is not a bash script

Labels
documentation Documentation related
3 participants