[AMG] Migrate (#7802)
* feat: initial migration support using backup & restore commands.

* create backup_core to separate out backup logic and support migrate

* port over dashboards for migrate and backup.

* rename get_dashboards to get_all_dashboards

* move over library panels

* in backup_core, support skipping external snapshots; only applicable for migrate (non-AMG Grafana instances)

* refactor snapshot logic into backup_core & use it in migrate/backup

* feat: change backup_core to have get_folders and have migrate and backup use it.

* Refactor backup_core to include get_all_annotations and update migrate and backup to use it

* refactor backup_core to expose get functions whose return values migrate and backup use.

* refactor: restore of dashboards

* refactor: make create_folder a shared function used by migrate and restore.

* refactor: library panels

* refactor: create_snapshot function for migrate & restore.

* refactor: create_annotation for restore & migrate

* feat: datasources & mapping, fully port everything over to migrate.

* delete backup & restore from migrate.

* Add dry_run flag to migrate & only create new folders.

* add dry_run logic and skip the request to recreate a datasource when we already have one.

* add in summary

* refactor: sync library panels during migration

* Refactor migrate function to have dry run & summary

* refactor migrate into its own functions.

* feat: Add override flags to migrate_grafana

* add overwrite support for folders and library panels

* change override to overwrite to be more consistent.

* update the summary text so that it better reflects what happens

* update library panel summary text to be correct.

* add better summary texts for dashboards.

* create new migrate test by copying the backup and restore testing function

* e2e tests work except for folders

* fix folder issue

* move the migrate tests to a new file.

* start writing the unit tests

* migrate dry run flag test works

* fix dashboard version edge case & finish writing all the tests

* All recordings

* Add datasources warning

* fix edge case for library panels & folders & dashboards, depending on included / excluded.

* cleanup dry run summary texts

* implement PR comment changes

* add in support for --overwrite snapshots

* add in support for --overwrite for annotations

* Add in Jeremy's suggestions: refactor summary functions to be more concise and remove excess code

* Update test recordings

* Add header back to backup.py & change imports back to normal. Delete debug logging. Update backup_core imports to remove unused ones.

* Clean up migrate logic

* Make backup_core more consistent

* Use update_summary / update_summary_dict in migrate instead of duplicate code.

* run autopep8 to fix style issues & manually fix some

* fix most azdev style issues, dealing with too-many-locals

* disable too-many-locals for migrate function

* Update recordings

* add in accidentally removed prints

* change variable names, update HISTORY, make return more explicit

* fix remapping test

* Update test recordings

* Add print statements

* use a different method to check for library panels in dashboards

* Update tests to mock search_annotations.

* Update recordings

* Delete resources after tests complete.

* Update recordings

* add in copyright & get rid of useless imports and a useless f-string.

* Change help message

* do health check of source_url before calling the migrate function

* Add check for same folders in include & exclude. Update valid folders to include/exclude

* fix a few minor bugs

* re-record the tests

* redo recordings

* fix style issues in custom.py (converting f-strings to plain strings & elif to if)

* add 'meta' to fix a Grafana 8 migrate bug.

* azdev style line-too-long fix

* update history.rst

* rename migrate variables

* clean up the valid folder uids General edge case

* fix indentation flagged by azdev style amg
leozhang-msft authored Aug 15, 2024
1 parent 2c16b96 commit be21ab7
Showing 23 changed files with 117,115 additions and 36,435 deletions.
4 changes: 4 additions & 0 deletions src/amg/HISTORY.rst
@@ -84,3 +84,7 @@ Release History
* `az grafana list`: Migrate to AAZDev Tool
* `az grafana show`: Migrate to AAZDev Tool
* `az grafana delete`: Migrate to AAZDev Tool

2.1.0
++++++
* `az grafana migrate`: migrate data from a self-hosted Grafana instance to an Azure Managed Grafana instance
9 changes: 9 additions & 0 deletions src/amg/azext_amg/_help.py
@@ -29,6 +29,15 @@
az grafana restore -g MyResourceGroup -n MyGrafana --archive-file backup\\dashboards\\ServiceHealth-202307051036.tar.gz --components dashboards folders --remap-data-sources
"""

helps['grafana migrate'] = """
type: command
short-summary: Migrate an existing Grafana instance to an Azure Managed Grafana instance.
examples:
- name: Migrate dashboards and folders from a local Grafana instance to an Azure Managed Grafana instance.
text: |
az grafana migrate -g MyResourceGroup -n MyGrafana -s http://localhost:3000 -t YourServiceTokenOrAPIKey
"""


helps['grafana data-source'] = """
type: group
Expand Down
6 changes: 6 additions & 0 deletions src/amg/azext_amg/_params.py
@@ -68,6 +68,12 @@ def load_arguments(self, _):
c.argument("remap_data_sources", options_list=["-r", "--remap-data-sources"], arg_type=get_three_state_flag(),
help="during restoration, update dashboards to reference data sources defined at the destination workspace through name matching")

with self.argument_context("grafana migrate") as c:
c.argument("source_grafana_endpoint", options_list=["-s", "--src-endpoint"], help="Grafana instance endpoint to migrate from")
c.argument("source_grafana_token_or_api_key", options_list=["-t", "--src-token-or-key"], help="Grafana instance service token (or api key) to get access to migrate from")
c.argument("dry_run", options_list=["-d", "--dry-run"], arg_type=get_three_state_flag(), help="Preview changes without committing. Takes priority over --overwrite.")
c.argument("overwrite", options_list=["--overwrite"], arg_type=get_three_state_flag(), help="Overwrite previous dashboards, library panels, and folders with the same uid or title")

with self.argument_context("grafana dashboard") as c:
c.argument("uid", options_list=["--dashboard"], help="dashboard uid")
c.argument("title", help="title of a dashboard")
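
For reference, a preview run combining these new flags might look like the following (hypothetical resource names, mirroring the help example above):

az grafana migrate -g MyResourceGroup -n MyGrafana -s http://localhost:3000 -t MyServiceToken --dry-run --overwrite

Because --dry-run takes priority over --overwrite, this prints what would be overwritten without changing the destination workspace.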
287 changes: 59 additions & 228 deletions src/amg/azext_amg/backup.py

Large diffs are not rendered by default.

237 changes: 237 additions & 0 deletions src/amg/azext_amg/backup_core.py
@@ -0,0 +1,237 @@
# --------------------------------------------------------------------------------------------
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License. See License.txt in the project root for license information.
# --------------------------------------------------------------------------------------------

import time

from knack.log import get_logger

from .utils import search_dashboard, get_dashboard
from .utils import search_library_panels
from .utils import search_snapshot, get_snapshot
from .utils import search_folders, get_folder, get_folder_permissions
from .utils import search_datasource
from .utils import search_annotations

logger = get_logger(__name__)


def get_all_dashboards(grafana_url, http_headers, **kwargs):
limit = 5000  # the search API page limit is 5000 in Grafana v6.2+
current_page = 1

all_dashboards = []

# Go through all the pages; we don't know in advance how many there are
while True:
dashboards = _get_all_dashboards_in_grafana(current_page, limit, grafana_url, http_headers)

# only include what users want
folders_to_include = kwargs.get('folders_to_include')
folders_to_exclude = kwargs.get('folders_to_exclude')
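# dashboards with an empty folderTitle belong to Grafana's built-in "General"
# folder, hence the extra 'general' checks in the filters below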
if folders_to_include:
folders_to_include = [f.lower() for f in folders_to_include]
dashboards = [d for d in dashboards if (d.get('folderTitle', '').lower() in folders_to_include or
not d.get('folderTitle', '') and 'general' in folders_to_include)]
if folders_to_exclude:
folders_to_exclude = [f.lower() for f in folders_to_exclude]
dashboards = [d for d in dashboards if ((d.get('folderTitle', '')
and d.get('folderTitle', '').lower() not in folders_to_exclude)
or
(not d.get('folderTitle', '')
and 'general' not in folders_to_exclude))]

print_an_empty_line()
if len(dashboards) == 0:
break
current_page += 1
current_run_dashboards = _get_individual_dashboard_setting(dashboards, grafana_url, http_headers)
# append this page's dashboards to the cumulative list
all_dashboards += current_run_dashboards
print_an_empty_line()

return all_dashboards


def _get_all_dashboards_in_grafana(page, limit, grafana_url, http_headers):
(status, content) = search_dashboard(page,
limit,
grafana_url,
http_headers)
if status == 200:
dashboards = content
logger.info("There are %s dashboards:", len(dashboards))
for board in dashboards:
logger.info('name: %s', board['title'])
return dashboards
logger.warning("Get dashboards FAILED, status: %s, msg: %s", status, content)
return []


def _get_individual_dashboard_setting(dashboards, grafana_url, http_headers):
if not dashboards:
return []

all_individual_dashboards = []
for board in dashboards:
board_uri = "uid/" + board['uid']

(status, content) = get_dashboard(board_uri, grafana_url, http_headers)
if status == 200:
# do not back up provisioned dashboards
if content['meta']['provisioned']:
logger.warning("Dashboard: \"%s\" is provisioned, skipping...", board['title'])
continue

all_individual_dashboards.append(content)

return all_individual_dashboards


def get_all_library_panels(grafana_url, http_headers):
all_panels = []
current_page = 1
while True:
panels = _get_all_library_panels_in_grafana(current_page, grafana_url, http_headers)

print_an_empty_line()
if len(panels) == 0:
break
current_page += 1

# Nothing is excluded for library panels, so we can just add them all
# to the list; this is all the data we need.
all_panels += panels
print_an_empty_line()

return all_panels


def _get_all_library_panels_in_grafana(page, grafana_url, http_headers):
(status, content) = search_library_panels(page, grafana_url, http_headers)
if status == 200:
library_panels = content
logger.info("There are %s library panels:", len(library_panels))
for panel in library_panels:
logger.info('name: %s', panel['name'])
return library_panels
logger.warning("Get library panel FAILED, status: %s, msg: %s", status, content)
return []


def get_all_snapshots(grafana_url, http_headers):
(status, content) = search_snapshot(grafana_url, http_headers)

if status != 200:
logger.warning("Query snapshot failed, status: %s, msg: %s", status, content)
return []

all_snapshots_metadata = []
for snapshot in content:
if not snapshot['external']:
all_snapshots_metadata.append(snapshot)
else:
logger.warning("External snapshot: %s is skipped", snapshot['name'])

logger.info("There are %s snapshots:", len(all_snapshots_metadata))

all_snapshots = []
for snapshot in all_snapshots_metadata:
logger.info(snapshot)

(individual_status, individual_content) = get_snapshot(snapshot['key'], grafana_url, http_headers)
if individual_status == 200:
all_snapshots.append((snapshot['key'], individual_content))
else:
logger.warning("Getting snapshot %s FAILED, status: %s, msg: %s",
snapshot['name'], individual_status, individual_content)

return all_snapshots


def get_all_folders(grafana_url, http_headers, **kwargs):
folders = _get_all_folders_in_grafana(grafana_url, http_get_headers=http_headers)

# only include what users want
folders_to_include = kwargs.get('folders_to_include')
folders_to_exclude = kwargs.get('folders_to_exclude')
if folders_to_include:
folders_to_include = [f.lower() for f in folders_to_include]
folders = [f for f in folders if f.get('title', '').lower() in folders_to_include]
if folders_to_exclude:
folders_to_exclude = [f.lower() for f in folders_to_exclude]
folders = [f for f in folders if f.get('title', '').lower() not in folders_to_exclude]

individual_folders = []
for folder in folders:
(status_folder_settings, content_folder_settings) = get_folder(folder['uid'], grafana_url, http_headers)
# TODO: get_folder_permissions usually requires admin permission but we
# don't save the permissions in backup or migrate. Figure out what to do.
(status_folder_permissions, content_folder_permissions) = get_folder_permissions(folder['uid'],
grafana_url,
http_headers)
if status_folder_settings == 200 and status_folder_permissions == 200:
individual_folders.append((content_folder_settings, content_folder_permissions))
else:
logger.warning("Getting folder %s FAILED", folder['title'])
logger.info("settings status: %s, settings content: %s, permissions status: %s, permissions content: %s",
status_folder_settings,
content_folder_settings,
status_folder_permissions,
content_folder_permissions)

return individual_folders


def _get_all_folders_in_grafana(grafana_url, http_get_headers):
(status, content) = search_folders(grafana_url, http_get_headers)
if status == 200:
folders = content
logger.info("There are %s folders:", len(content))
for folder in folders:
logger.info("name: %s", folder['title'])
return folders
logger.warning("Get folders FAILED, status: %s, msg: %s", status, content)
return []


def get_all_annotations(grafana_url, http_headers):
all_annotations = []
now = int(round(time.time() * 1000))
one_month_in_ms = 31 * 24 * 60 * 60 * 1000

ts_to = now
ts_from = now - one_month_in_ms
thirteen_months_retention = now - (13 * one_month_in_ms)
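# annotations are fetched in one-month windows, walking back from now to the
# 13-month retention horizon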

while ts_from > thirteen_months_retention:
(status, content) = search_annotations(grafana_url, ts_from, ts_to, http_headers)
if status == 200:
annotations_batch = content
logger.info("There are %s annotations:", len(annotations_batch))
all_annotations += annotations_batch
else:
logger.warning("Query annotation FAILED, status: %s, msg: %s", status, content)

ts_to = ts_from
ts_from = ts_from - one_month_in_ms

return all_annotations


def get_all_datasources(grafana_url, http_headers):
(status, content) = search_datasource(grafana_url, http_headers)
if status == 200:
datasources = content
logger.info("There are %s datasources:", len(datasources))
return datasources

logger.info("Query datasource FAILED, status: %s, msg: %s", status, content)
return None


def print_an_empty_line():
logger.info('')
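
Taken together, migrate and backup drive these getters against a source instance. A minimal sketch of calling them directly (hypothetical endpoint and token, and assuming the azext_amg package is importable) could look like:

from azext_amg.backup_core import (get_all_dashboards, get_all_datasources,
                                   get_all_folders)

grafana_url = "http://localhost:3000"  # hypothetical source instance
headers = {"content-type": "application/json",
           "authorization": "Bearer <service-token>"}  # hypothetical token

# the same folders_to_include/folders_to_exclude kwargs that migrate passes
folders = get_all_folders(grafana_url, headers, folders_to_include=["Prod"])
dashboards = get_all_dashboards(grafana_url, headers, folders_to_include=["Prod"])
datasources = get_all_datasources(grafana_url, headers) or []
print(f"fetched {len(folders)} folders, {len(dashboards)} dashboards, "
      f"{len(datasources)} data sources")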
1 change: 1 addition & 0 deletions src/amg/azext_amg/commands.py
@@ -15,6 +15,7 @@ def load_command_table(self, _):
self.command_table['grafana update'] = GrafanaUpdate(loader=self)
g.custom_command('backup', 'backup_grafana', is_preview=True)
g.custom_command('restore', 'restore_grafana', is_preview=True)
g.custom_command('migrate', 'migrate_grafana', is_preview=True)

with self.command_group('grafana dashboard') as g:
g.custom_command('create', 'create_dashboard')
39 changes: 39 additions & 0 deletions src/amg/azext_amg/custom.py
@@ -265,6 +265,45 @@ def restore_grafana(cmd, grafana_name, archive_file, components=None, remap_data
destination_datasources=data_sources)


def migrate_grafana(cmd, grafana_name, source_grafana_endpoint, source_grafana_token_or_api_key, dry_run=False,
overwrite=False, folders_to_include=None, folders_to_exclude=None, resource_group_name=None):
from .migrate import migrate
from .utils import get_health_endpoint, send_grafana_get

# for source instance (backing up from)
headers_src = {
"content-type": "application/json",
"authorization": "Bearer " + source_grafana_token_or_api_key
}
(status, _) = get_health_endpoint(source_grafana_endpoint, headers_src)
if status == 400:
# https://github.com/grafana/grafana/pull/27536
# Some Grafana instances might block/not support "/api/health" endpoint
(status, _) = send_grafana_get(f"{source_grafana_endpoint}/healthz", headers_src)

if status == 401:
raise ArgumentUsageError("Access to source grafana endpoint was denied")
if status >= 400:
raise ArgumentUsageError("Source grafana endpoint is not reachable")

# for destination instance (restoring to)
_health_endpoint_reachable(cmd, grafana_name, resource_group_name=resource_group_name)
creds_dest = _get_data_plane_creds(cmd, api_key_or_token=None, subscription=None)
headers_dest = {
"content-type": "application/json",
"authorization": "Bearer " + creds_dest[1]
}

migrate(backup_url=source_grafana_endpoint,
backup_headers=headers_src,
restore_url=_get_grafana_endpoint(cmd, resource_group_name, grafana_name, subscription=None),
restore_headers=headers_dest,
dry_run=dry_run,
overwrite=overwrite,
folders_to_include=folders_to_include,
folders_to_exclude=folders_to_exclude)


def sync_dashboard(cmd, source, destination, folders_to_include=None, folders_to_exclude=None,
dashboards_to_include=None, dashboards_to_exclude=None, dry_run=None):
from .sync import sync
