44 changes: 18 additions & 26 deletions docs/deployment/logs.md
Original file line number Diff line number Diff line change
@@ -4,15 +4,9 @@ title: Logs

# Logs

See the research on [Logs aggregation](https://github.com/packit/research/tree/main/logs-aggregation).
See the research on [Logs aggregation](https://packit.dev/research/monitoring/logs-aggregation).

Each worker pod has a sidecar container running [Fluentd](https://docs.fluentd.org),
which is a data collector allowing us to get the logs from a worker via
[syslog](https://docs.fluentd.org/input/syslog) and send them to Splunk.

We use [our fluentd-splunk-hec image](https://quay.io/repository/packit/fluentd-splunk-hec),
built via [a workflow](https://github.com/packit/fluent-plugin-splunk-hec/blob/main/.github/workflows/rebuild-and-push-image.yml)
because we don't want to use [docker.io/splunk/fluentd-hec image](https://hub.docker.com/r/splunk/fluentd-hec).
We are following the first solution described in this [document](https://source.redhat.com/departments/it/devit/it-infrastructure/itcloudservices/itocp/it_paas_kb/logging_to_splunk_on_managed_platform), _logging to stdout_, with no need for a forwarder sidecar pod.

## Where do I find the logs?

Expand All @@ -21,33 +15,31 @@ First, you have to [get access to Splunk](https://source.redhat.com/departments/

Then go to https://rhcorporate.splunkcloud.com → `Search & Reporting`

You should be able to see some logs using [this query](https://rhcorporate.splunkcloud.com/en-US/app/search/search?q=search%20index%3D%22rh_paas%22%20source%3D%22%2Fvar%2Flog%2Fcontainers%2Fpackit-worker*.log"):

index="rh_paas" source="/var/log/containers/packit-worker*.log"

If the above query doesn't return any results, [request access](https://source.redhat.com/departments/it/splunk/splunk_wiki/faq#jive_content_id_How_do_I_request_access_to_additional_data_sets_in_Splunk) to `rh_paas` index.

:::caution

If you cannot see _Access to Additional Datasets_ (as suggested by the instructions), use _Update Permissions_ as the _Request Type_ and ask to access the `rh_paas` index in the additional details.

:::

[The more specific search, the faster it'll be](https://source.redhat.com/departments/it/splunk/splunk_wiki/splunk_training_search_best_practices#jive_content_id_Be_more_specific).
At least, specify `index`, `source` and `msgid`.
You can start with [this search](https://rhcorporate.splunkcloud.com/en-US/app/search/search?q=search%20index%3Drh_linux%20source%3Dsyslog%20msgid%3Dpackit-prod)
At a minimum, specify `index` and `source`.
You can start with [this search](https://rhcorporate.splunkcloud.com/en-US/app/search/search?q=search%20index%3D%22rh_paas%22%20source%3D%22%2Fvar%2Flog%2Fcontainers%2Fpackit-worker*.log%22%20NOT%20pidbox)
and tune it from there.
For example:

- change `msgid=packit-prod` to service instance you want to see logs from, e.g. to `msgid=packit-stg` or `msgid=stream-prod`
- add `| search message!="pidbox*"` to remove the ["pidbox received method" message which Celery pollutes the log with](https://stackoverflow.com/questions/43633914/pidbox-received-method-enable-events-reply-tonone-ticketnone-in-django-cel)
- add `| reverse` if you want to see the results from oldest to newest
- add `| fields _time, message | fields - _raw` to keep only the time and message fields
- add `| fields _raw | fields - _time` to keep only the message field, without duplicating the timestamp

All in one URL [here](https://rhcorporate.splunkcloud.com/en-US/app/search/search?q=search%20index%3Drh_linux%20source%3Dsyslog%20msgid%3Dpackit-prod%20%7C%20search%20message!%3D%22pidbox*%22%20%7C%20reverse%20%7C%20fields%20_time%2C%20message%20%7C%20fields%20-%20_raw) -
now just export it to CSV and you have almost the same log file
All in one URL [here](https://rhcorporate.splunkcloud.com/en-US/app/search/search?q=search%20index%3D%22rh_paas%22%20source%3D%22%2Fvar%2Flog%2Fcontainers%2Fpackit-worker-short-running-0_packit--stg_packit-worker-*.log%22%20%7C%20fields%20_raw%20%7C%20fields%20-%20_time%20%7C%20reverse) - now just export it to CSV and you have almost the same log file
as you'd get by exporting logs from a worker pod.

For more info, see (Red Hat internal):

- [demo](https://drive.google.com/file/d/15BIsRl7fP9bPdyLBQvoljF2yHy52ZqHm)
- [Splunk wiki @ Source](https://source.redhat.com/departments/it/splunk)

## Debugging

To see the sidecar container logs, select a worker pod → `Logs` → `fluentd-sidecar`.

To [manually send some event to Splunk](https://docs.splunk.com/Documentation/SplunkCloud/8.2.2203/Data/UsetheHTTPEventCollector#Send_data_to_HTTP_Event_Collector)
try this (get the host & token from Bitwarden):

$ curl -v "https://${SPLUNK_HEC_HOST}:443/services/collector/event" \
-H "Authorization: Splunk ${SPLUNK_HEC_TOKEN}" \
-d '{"event": "jpopelkastest"}'
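If the host and token are valid, HEC typically acknowledges the event with a small JSON body similar to:

    {"text": "Success", "code": 0}

A non-2xx response usually means a wrong token or a disabled collector.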
1 change: 0 additions & 1 deletion openshift/dashboard.yml.j2
@@ -97,7 +97,6 @@ spec:
name: {{ image_dashboard }}
importPolicy:
# Periodically query registry to synchronize tag and image metadata.
# DOES NOT WORK on Openshift Online.
scheduled: {{ auto_import_images }}
lookupPolicy:
# allows all resources pointing to this image stream to use it in the image field
23 changes: 22 additions & 1 deletion openshift/nginx.yml.j2
@@ -6,6 +6,10 @@ kind: Deployment
apiVersion: apps/v1
metadata:
name: nginx
annotations:
# https://docs.openshift.com/container-platform/4.11/openshift_images/triggering-updates-on-imagestream-changes.html
image.openshift.io/triggers: >-
[{"from":{"kind":"ImageStreamTag","name":"nginx:{{ deployment }}"},"fieldPath":"spec.template.spec.containers[?(@.name==\"nginx\")].image"}]
spec:
selector:
matchLabels:
@@ -30,7 +34,7 @@ spec:
secretName: flower-htpasswd
containers:
- name: nginx
image: ghcr.io/nginxinc/nginx-unprivileged
image: nginx:{{ deployment }}
ports:
- containerPort: 8443
volumeMounts:
@@ -135,3 +139,20 @@
tls:
insecureEdgeTerminationPolicy: Redirect
termination: passthrough
---
kind: ImageStream
apiVersion: image.openshift.io/v1
metadata:
name: nginx
spec:
tags:
- name: {{ deployment }}
from:
kind: DockerImage
name: ghcr.io/nginxinc/nginx-unprivileged
importPolicy:
# Periodically query registry to synchronize tag and image metadata.
scheduled: {{ auto_import_images }}
lookupPolicy:
# allows all resources pointing to this image stream to use it in the image field
local: true
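The `image.openshift.io/triggers` annotation added above is a JSON list pairing an `ImageStreamTag` with the container `image` field it should update. A quick local sanity check of the rendered value (with `{{ deployment }}` substituted as `stg` purely for illustration) can be sketched with `jq`:

```shell
# Rendered annotation value; "stg" stands in for {{ deployment }} here.
annotation='[{"from":{"kind":"ImageStreamTag","name":"nginx:stg"},"fieldPath":"spec.template.spec.containers[?(@.name==\"nginx\")].image"}]'

# Confirm it parses and show which image stream tag drives the rollout:
echo "$annotation" | jq -r '.[0].from.name'
# → nginx:stg
```

When the `nginx:stg` tag gets a new image (e.g. after a scheduled import), OpenShift rewrites the matching container's `image` field, which rolls out the Deployment.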
23 changes: 22 additions & 1 deletion openshift/pushgateway.yml.j2
@@ -6,6 +6,10 @@ kind: Deployment
apiVersion: apps/v1
metadata:
name: pushgateway
annotations:
# https://docs.openshift.com/container-platform/4.11/openshift_images/triggering-updates-on-imagestream-changes.html
image.openshift.io/triggers: >-
[{"from":{"kind":"ImageStreamTag","name":"pushgateway:{{ deployment }}"},"fieldPath":"spec.template.spec.containers[?(@.name==\"pushgateway\")].image"}]
spec:
selector:
matchLabels:
@@ -20,7 +24,7 @@ spec:
spec:
containers:
- name: pushgateway
image: ghcr.io/zapier/prom-aggregation-gateway:v0.7.0
image: pushgateway:{{ deployment }}
args:
- "--apiListen=:9091"
imagePullPolicy: IfNotPresent
@@ -54,3 +58,20 @@ spec:
targetPort: 9091
selector:
component: pushgateway
---
kind: ImageStream
apiVersion: image.openshift.io/v1
metadata:
name: pushgateway
spec:
tags:
- name: {{ deployment }}
from:
kind: DockerImage
name: ghcr.io/zapier/prom-aggregation-gateway:v0.7.0
importPolicy:
# Periodically query registry to synchronize tag and image metadata.
scheduled: {{ auto_import_images }}
lookupPolicy:
# allows all resources pointing to this image stream to use it in the image field
local: true
2 changes: 1 addition & 1 deletion playbooks/deploy.yml
@@ -31,7 +31,7 @@
# project_dir is set in tasks/project-dir.yml
path_to_secrets: "{{ project_dir }}/secrets/{{ service }}/{{ deployment }}"
# to be used in Image streams as importPolicy:scheduled value
auto_import_images: "{{(deployment != 'prod')}}"
auto_import_images: true
# used in dev/zuul deployment to tag & push images to cluster
# https://github.com/packit/deployment/issues/112#issuecomment-673343049
# container_engine: "{{ lookup('pipe', 'command -v podman 2> /dev/null || echo docker') }}"
2 changes: 1 addition & 1 deletion playbooks/import-images.yml
@@ -10,7 +10,7 @@
with_fedmsg: true
with_dashboard: true
with_tokman: true
with_fluentd_sidecar: true
with_fluentd_sidecar: false
tasks:
- name: Include variables
ansible.builtin.include_vars: ../vars/{{ service }}/{{ deployment }}.yml
5 changes: 2 additions & 3 deletions secrets/packit/prod/packit-service.yaml.j2
@@ -37,13 +37,12 @@ enabled_projects_for_internal_tf:
command_handler: sandcastle
command_handler_work_dir: /tmp/sandcastle
command_handler_image_reference: quay.io/packit/sandcastle:prod
command_handler_k8s_namespace: packit-prod-sandbox
command_handler_k8s_namespace: packit--prod-sandbox
command_handler_pvc_volume_specs:
- path: /repository-cache
pvc_from_env: SANDCASTLE_REPOSITORY_CACHE_VOLUME
read_only: true
# [TODO]: Switch to <aws-ebs> during migration of prod to MP+
command_handler_storage_class: gp2
command_handler_storage_class: aws-ebs

repository_cache: /repository-cache
# The maintenance of the cache (adding, updating) is done externally,
2 changes: 1 addition & 1 deletion vars/fedora-source-git/prod_template.yml
@@ -33,7 +33,7 @@ with_pushgateway: false

with_repository_cache: false

with_fluentd_sidecar: true
with_fluentd_sidecar: false

# image to use for service
# image: quay.io/packit/packit-service:{{ deployment }}
8 changes: 6 additions & 2 deletions vars/packit/prod_template.yml
@@ -9,7 +9,8 @@
project: packit-prod

# Openshift cluster url
host: https://api.auto-prod.gi0n.p1.openshiftapps.com:6443
# For the URL of the MP+ API endpoint, see Bitwarden Secure note
host: ‹TODO›

# oc login <the above host value>, oc whoami -t
# OR via Openshift web GUI: click on your login in top right corner, 'Copy Login Command', take the part after --token=
@@ -42,7 +43,7 @@ with_flower: true
# with_repository_cache: true
# repository_cache_storage: 4Gi

with_fluentd_sidecar: true
with_fluentd_sidecar: false

# image to use for service
# image: quay.io/packit/packit-service:{{ deployment }}
@@ -70,6 +71,9 @@ with_fluentd_sidecar: true
# If you still want to use docker even when podman is installed, set:
# container_engine: docker

# We're using 15 on MP+
postgres_version: 15

# Celery retry parameters
# celery_retry_limit: 2
# celery_retry_backoff: 3
2 changes: 1 addition & 1 deletion vars/packit/stg_template.yml
@@ -42,7 +42,7 @@ with_flower: true
# with_repository_cache: true
# repository_cache_storage: 4Gi

with_fluentd_sidecar: true
with_fluentd_sidecar: false

# image to use for service
# image: quay.io/packit/packit-service:{{ deployment }}
3 changes: 1 addition & 2 deletions vars/template.yml
@@ -41,8 +41,7 @@ api_key: ""

# with_repository_cache: true

# with_fluentd_sidecar: false

with_fluentd_sidecar: false
# image to use for service
# image: quay.io/packit/packit-service:{{ deployment }}
