Azure Monitor for Containers

Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Release History

Note : The agent version(s) below has dates (ciprod), which indicate the agent build dates (not release dates)

5/19/2022 -

Version microsoft/oms:ciprod05192022 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod05192022 (linux)

Version microsoft/oms:win-ciprod05192022 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod05192022 (windows)

Code change log

Linux Agent
- PodReadyPercentage metric bug fix
- add cifs & fuse file systems to ignore list
- CA Cert Fix for Mariner Hosts in Air Gap
- Disk usage metrics will no longer be collected for the paths "/mnt/containers" and "/mnt/docker"
Windows Agent
- Ruby version upgrade from 2.6.5.1 to 2.7.5.1
- Added Support for Windows Server 2022
- Multi-Arch Image to support both Windows 2019 and Windows 2022
Common (Linux & Windows Agent)
- Telegraf version update from 1.20.3 to 1.22.2 to fix the vulnerabilitis
- Removal of Health feature as part of deprecation plan
- AAD Auth MSI feature support for Arc K8s (not usable externally yet)
- MSI onboarding ARM template updates for both AKS & Arc K8s
- Fixed the bug related to windows metrics in MSI mode for AKS
- Configmap updates for log collection settings for v2 schema
Misc
- Improvements related to CI/CD Multi-arc image
  - Do trivy rootfs checks
  - Disable push to ACR for PR and PR updates
  - Enable batch builds
  - Scope Dev/Prod pipelines to respective branches
  - Shorten datetime component of image tag
- Troubleshooting script updates for MSI onboarding
- Instructions for testing of agent in MSI auth mode
- Add CI Windows Build to MultiArch Dev pipeline
- Updates related to building of Multi-arc image for windows in Build Pipeline and local dev builds
- Test yamls to test container logs and prometheus scraping on both WS2019 & WS2022
- Arc K8s conformance test updates
- Script to collect the Agent logs for troubleshooting
- Force run trivy stage for Linux
- Fix docker msi download link in windows install-build-pre-requisites.ps1 script
- Added Onboarding templates for legacy auth for internal testing
- Update the Build pipelines to have separate phase for Windows

3/17/2022 -

Version microsoft/oms:ciprod03172022 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod03172022 (linux)

Version microsoft/oms:win-ciprod03172022 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod03172022 (windows)

Code change log

Linux Agent
- Multi-Arch Image to support both AMD64 and ARM64
- Ruby upgraded to version 2.7 from 2.6
- Fix Telegraf Permissions
- Fix ADX bug with database name
- Vulnerability fixes
- MDSD updated to 1.17.0
  - HTTP Proxy support
  - Retries for Log Analytics Ingestion
  - ARM64 support
  - Memory leak fixes for network failure scenario
Windows Agent
- Bug fix for FluentBit stdout and stderr log filtering
Common
- Upgrade Go lang version from 1.14.1 to 1.15.14
- MSI onboarding ARM template update
- AKS HTTP Proxy support
- Go packages upgrade to address vulnerabilities

1/31/2022 -

Version microsoft/oms:ciprod01312022 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod01312022 (linux)

Version microsoft/oms:win-ciprod01312022 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod01312022 (windows)

Code change log

Linux Agent
- Configurable DB name via configmap for ADX (default DB name:containerinsights)
- Default to cAdvisor port to 10250 and container runtime to Containerd
- Update AgentVersion annotation in yamls (omsagent and chart) with released MDSD agent version
- Incresing windows agent CPU limits from 200m to 500m
- Ignore new disk path that comes from containerd starting with k8s version >= 1.19.x, which was adding unnecessary InsightsMetrics logs and increasing cost
- Route the AI SDK logs to log file instead of stdout
- Telemetry to collect ContainerLog Records with empty Timestamp
- FluentBit version upgrade from 1.6.8 to 1.7.8
Windows Agent
- Update to use FluentBit for container log collection and removed FluentD dependency for container log collection
- Telemetry to track if any of the variable fields of windows container inventory records has field size >= 64KB
- Add windows os check in in_cadvisor_perf plugin to avoid making call in MDSD in MSI auth mode
- Bug fix for placeholder_hostname in telegraf metrics
- FluentBit version upgrade from 1.4.0 to 1.7.8
Common
- Upgrade FluentD gem version from 1.12.2 to 1.14.2
- Upgrade Telegraf version from 1.18.0 to 1.20.3
- Fix for exception in node allocatable
- Telemetry to track nodeCount & containerCount
Other changes
- Updates to Arc K8s Extension ARM Onboarding templates with GA API version
- Added ARM Templates for MSI Based Onboarding for AKS
- Conformance test updates relates to sidecar container
- Troubelshooting script to detect issues related to Arc K8s Extension onboarding
- Remove the dependency SP for CDPX since configured to use MSI
- Linux Agent Image build improvements
- Update msys2 version to fix windows agent build
- Add explicit exit code 1 across all the PS scripts

10/13/2021 -

Version microsoft/oms:ciprod10132021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod10132021 (linux)

Version microsoft/oms:win-ciprod10132021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod10132021 (windows)

Code change log

Linux Agent
- MDSD Proxy support for non-AKS
- log rotation for mdsd log files {err,warn, info & qos}
- Onboarding status
- AAD Auth MSI changes (not usable externally yet)
- Upgrade k8s and adx go packages to fix vulnerabilities
- Fix missing telegraf metrics (TelegrafMetricsSentCount & TelegrafMetricsSendErrorCount) in mdsd route
- Improve fluentd liveness probe checks to handle both supervisor and worker process
- Fix telegraf startup issue when endpoint is unreachable
Windows Agent
- Windows liveness probe optimization
Common
- Add new metrics to MDM for allocatable % calculation of cpu and memory usage
Other changes
- Helm chart updates for removal of rbac api version and deprecation of.Capabilities.KubeVersion.GitVersion to .Capabilities.KubeVersion.Version
- Updates to build and release ev2
- Scripts to collect troubleshooting logs
- Unit test tooling
- Yaml updates in parity with aks rp yaml
- upgrade golang version for windows in pipelines
- Conformance test updates

09/02/2021 -

Version microsoft/oms:ciprod08052021-1 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod08052021-1 (linux)

Code change log

Bumping image tag for some tooling (no code changes except the IMAGE_TAG environment variable)

08/05/2021 -

Version microsoft/oms:ciprod08052021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod08052021 (linux)

Code change log

Linux Agent
- Fix for CPU spike which occurrs at around 6.30am UTC on every day because of unattended package upgrades
- Update MDSD build which has fixes for the following issues
  - Undeterministic Core dump issue because of the non 200 status code and runtime exception stack unwindings
  - Reduce the verbosity of the error logs for OMS & ODS code paths.
  - Increase Timeout for OMS Homing service API calls from 30s to 60s
- Fix for Azure/AKS#2457
- In replicaset, tailing of the mdsd.err log file to agent telemetry

07/13/2021 -

Version microsoft/oms:win-ciprod06112021-2 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod06112021-2 (windows)

Code change log

Hotfix for fixing NODE_IP environment variable not set issue for non sidecar mode

07/02/2021 -

Version microsoft/oms:ciprod06112021-1 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod06112021-1 (linux)

Version microsoft/oms:win-ciprod06112021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod06112021 (windows)

Code change log

Hotfix for crash in clean_cache in in_kube_node_inventory plugin
We didn't rebuild windows container, so the image version for windows container stays the same as last release (ciprod:win-ciprod06112021) before this hotfix

06/11/2021 -

Version microsoft/oms:ciprod06112021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod06112021 (linux)

Version microsoft/oms:win-ciprod06112021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod06112021 (windows)

Linux Agent
- Removal of base omsagent dependency
- Using MDSD version 1.10.1 as base agent for all the supported LA data types
- Ruby version upgrade to 2.6 i.e. same version as windows agent
- Upgrade FluentD gem version to 1.12.2
- All the Ruby Fluentd Plugins upgraded to v1 as per Fluentd guidance
- Fluent-bit tail plugin Mem_Buf_limit is configurable via ConfigMap
Windows Agent
- CA cert changes for airgapped clouds
- Send perf metrics to MDM from windows daemonset
- FluentD gem version upgrade from 1.10.2 to 1.12.2 to make same version as Linux Agent
Doc updates
README updates related to OSM preview release for Arc K8s
README updates related to recommended alerts

05/20/2021 -

Version microsoft/oms:ciprod05202021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod05202021 (linux)

No Windows changes with this release, win-ciprod04222021 still current.

Code change log

Telegraf now waits 30 seconds on startup for network connections to complete (Linux only)
Change adding telegraf to the liveness probe reverted (Linux only)

05/12/2021 -

Version microsoft/oms:ciprod05122021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod05122021 (linux)

No Windows changes with this release, win-ciprod04222021 still current.

Code change log

Upgrading oneagent to version 1.8 (only for Linux)
Enabling oneagent for container logs for East US 2

04/22/2021 -

Version microsoft/oms:ciprod04222021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod04222021 (linux)

Version microsoft/oms:win-ciprod04222021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod04222021 (windows)

Code change log

Bug fixes for metrics cpuUsagePercentage and memoryWorkingSetPercentage for windows nodes
Added metrics for threshold violation
Made Job completion metric configurable
Udated default buffer sizes in fluent-bit
Updated recommended alerts
Fixed bug where logs written before agent starts up were not collected
Fixed bug which kept agent logs from being rotated
Bug fix for Windows Containerd container log collection
Bug fixes
Doc updates
Minor telemetry changes

03/26/2021 -

Version microsoft/oms:ciprod03262021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod03262021 (linux)

Version microsoft/oms:win-ciprod03262021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod03262021 (windows)

Code change log

Started collecting new metric - kubelet running pods count
Onboarding script fixes to add explicit json output
Proxy and token updates for ARC
Doc updates for Microsoft charts repo release
Bug fixes for trailing whitespaces in enable-monitoring.sh script
Support for higher volume of prometheus metrics by scraping metrics from sidecar
Update to get new version of telegraf - 1.18
Add label and field selectors for prometheus scraping using annotations
Support for OSM integration
Removed wireserver calls to get CA certs since access is removed
Added liveness timeout for exec for linux containers

02/23/2021 -

Version microsoft/oms:ciprod02232021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod02232021 (linux)

Version microsoft/oms:win-ciprod02232021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod02232021 (windows)

Code change log

ContainerLogV2 schema support for LogAnalytics & ADX (not usable externally yet)
Fix nodemetrics (cpuusageprecentage & memoryusagepercentage) metrics not flowing. This is fixed upstream for k8s versions >= 1.19.7 and >=1.20.2.
Fix cpu & memory usage exceeded threshold container metrics not flowing when requests and/or limits were not set
Mute some unused exceptions from going to telemetry
Collect containerimage (repository, image & imagetag) from spec (instead of runtime)
Add support for extension MSI for k8s arc
Use cloud specific instrumentation keys for telemetry
Picked up newer version for apt
Add priority class to daemonset (in our chart only)

01/11/2021 -

Version microsoft/oms:ciprod01112021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod01112021 (linux)

Version microsoft/oms:win-ciprod01112021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod01112021 (windows)

Code change log

Fixes for Linux Agent Replicaset Pod OOMing issue
Update fluentbit (1.14.2 to 1.6.8) for the Linux Daemonset
Make Fluentbit settings: log_flush_interval_secs, tail_buf_chunksize_megabytes and tail_buf_maxsize_megabytes configurable via configmap
Support for PV inventory collection
Removal of Custom metric region check for Public cloud regions and update to use cloud environment variable to determine the custom metric support
For daemonset pods, add the dnsconfig to use ndots: 3 from ndots:5 to optimize the number of DNS API calls made
Fix for inconsistency in the collection container environment variables for the pods which has high number of containers
Fix for disabling of std{out;err} log_collection_settings via configmap issue in windows daemonset
Update to use workspace key from mount file rather than environment variable for windows daemonset agent
Remove per container info logs in the container inventory
Enable ADX route for windows container logs
Remove logging to termination log in windows agent liveness probe

11/09/2020 -

Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod11092020 (linux)

Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod11092020 (windows)

Code change log

Fix for duplicate windows metrics

10/27/2020 -

Version microsoft/oms:ciprod10272020 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod10272020 (linux)

Version microsoft/oms:win-ciprod10272020 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod10272020 (windows)

Code change log

Activate oneagent in few AKS regions (koreacentral,norwayeast)
Disable syslog
Fix timeout for Windows daemonset liveness probe
Make request == limit for Windows daemonset resources (cpu & memory)
Schema v2 for container log (ADX only - applicable only for select customers for piloting)

10/05/2020 -

Version microsoft/oms:ciprod10052020 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod10052020 (linux)

Version microsoft/oms:win-ciprod10052020 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod10052020 (windows)

Code change log

Health CRD to version v1 (from v1beta1) for k8s versions >= 1.19.0
Collection of PV usage metrics for PVs mounted by pods (kube-system pods excluded by default)(doc-link-needed)
Zero fill few custom metrics under a timer, also add zero filling for new PV usage metrics
Collection of additional Kubelet metrics ('kubelet_running_pod_count','volume_manager_total_volumes','kubelet_node_config_error','process_resident_memory_bytes','process_cpu_seconds_total','kubelet_runtime_operations_total','kubelet_runtime_operations_errors_total'). This also includes updates to 'kubelet' workbook to include these new metrics
Collection of Azure NPM (Network Policy Manager) metrics (basic & advanced. By default, NPM metrics collection is turned OFF)(doc-link-needed)
Support log collection when docker root is changed with knode. Tracked by this issue
Support for Pods in 'Terminating' state for nodelost scenarios
Fix for reduction in telemetry for custom metrics ingestion failures
Fix CPU capacity/limits metrics being 0 for Virtual nodes (VK)
Add new custom metric regions (eastus2,westus,australiasoutheast,brazilsouth,germanywestcentral,northcentralus,switzerlandnorth)
Enable strict SSL validation for AppInsights Ruby SDK
Turn off custom metrics upload for unsupported cluster types
Install CA certs from wire server for windows (in certain clouds)

09/16/2020 -

Note: This agent release targetted ONLY for non-AKS clusters via Azure Monitor for containers HELM chart update

Version microsoft/oms:ciprod09162020 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod09162020 (linux)

Version microsoft/oms:win-ciprod09162020 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod09162020 (windows)

Code change log

Collection of Azure Network Policy Manager Basic and Advanced metrics
Add support in Windows Agent for Container log collection of CRI runtimes such as ContainerD
Alertable metrics support Arc K8s cluster to parity with AKS
Support for multiple container log mount paths when docker is updated through knode
Bug fix related to MDM telemetry

08/07/2020 -

Version microsoft/oms:ciprod08072020 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod08072020 (linux)

Version microsoft/oms:win-ciprod08072020 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod08072020 (windows)

Code change log

Collection of KubeState metrics for deployments and HPA
Add the Proxy support for Windows agent
Fix for ContainerState in ContainerInventory to handle Failed state and collection of environment variables for terminated and failed containers
Change /spec to /metrics/cadvisor endpoint to collect node capacity metrics
Disable Health Plugin by default and can enabled via configmap
Pin version of jq to 1.5+dfsg-2
Bug fix for showing node as 'not ready' when there is disk pressure
oneagent integration (disabled by default)
Add region check before sending alertable metrics to MDM
Telemetry fix for agent telemetry for sov. clouds

07/15/2020 -

Version microsoft/oms:ciprod07152020 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod07152020 (linux)

Version microsoft/oms:win-ciprod05262020-2 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod05262020-2 (windows)

Code change log

Following hotfixes which are applicable only for Linux agent
- Fix the issue related to collection of multi-containers in pod for the ContainerInventory table
- Fix the containerhostname field value to have podname rather than nodename in ContainerInventory table
- Fix OOM issue during container startup if there are high number of pods or containers on the node
- Fix the ContainerName field value same as before in ContainerInventory table
We didn't rebuild windows container, so the image version for windows container stays the same as last release (ciprod:win-ciprod05262020-2) before this hotfix

06/30/2020 -

Version microsoft/oms:ciprod06302020 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod06302020 (linux)

Version microsoft/oms:win-ciprod05262020-2 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod05262020-2 (windows)

Code change log

Hotfix for nested JSON log parsing bug (applicable only to Linux Daemonset)
We didn't rebuild windows container, so the image version for windows container stays the same as last release (ciprod:win-ciprod05262020-2) before this hotfix

05/27/2020 -

Version microsoft/oms:win-ciprod05262020 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod05262020-2 (windows)

Code change log

Update application insights instrumentation key for windows image to point to the production instance

05/22/2020 -

Version microsoft/oms:ciprod05222020 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod05222020 (linux)

Version microsoft/oms:win-ciprod05222020 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod05222020 (windows)

Code change log

Windows Daemonset - Collection of Windows std/stderr logs
More Alerable Metrics (going to Metrics Store/custom metrics - see Customer Impact section below for metrics list)
Fix OOM-ing at high prometheus scrape volume
Update fluentbit (0.14.4 to 1.4.2)
Drop non-numeric metrics thru Telegraf
Reduce Health exception (when API server response is nil)
Add 'Computer' dimension to all telemetry (internal use)
Support for specifiying HTTP & HTTPS Proxy for outbound/egress (applicable only for non-AKS clusters)
Move to rbac.authorization.k8s.io/v1 for ClusterRole & ClusterRoleBinding
Move to apiextensions.k8s.io/v1 for Health CRD

Customer Impact

Windows Logs - Customers will see agent automatically start collecting windows container STDOUT/STDERR logs sending them to same loganaytics workspace (containerlogs table)
Alertable metrics - Customers will see the below metrics & namespaces in 'Metrics' TOC for AKS clusters
- Metrics
  - diskUsagePercentage
  - completedJobsCount
  - oomKilledContainerCount
  - podReadyPercentage
  - restartingContainerCount
  - cpuExceededPercentage
  - memoryRssExceededPercentage
  - memoryWorkingSetExceededPercentage
- Metric Namespaces
  - insights.container/containers
HTTP/S Proxy support - For non-AKS clusters, proxy can be configured when installing thru HELM. Please see documentation for more details

04/16/2020 -

Note: This agent release targetted ONLY for non-AKS clusters via Azure Monitor for containers HELM chart update

Version microsoft/oms:ciprod04162020 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod04162020

Code change log

Add support for rate limiting
Add support for Container Runtime Interface compatible container runtime(s) like CRI-O and ContainerD
- cAdvisor APIs are used to collect the container inventory for Docker/Moby and CRI runtime K8s environments
- Based on the container runtime, corresponding container log FluentBit parser(docker/cri) selected

Customer Impact

Ingestion will throttle the workspaces if the agent on the cluster sending the beyond Log Analytics Workspace throttling limits i.e. 500 MB/s
On Docker runtime environments, Inventory of the containers obtained earlier via Docker REST API. Agent now uses the cAdvisor APIs to get the inventory of the containers for Docker and non-Docker container runtime environments.

03/02/2020 -

Version microsoft/oms:ciprod03022020 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod03022020

Code change log

Collection of GPU metrics as InsightsMetrics
Enable config map settings to enable collection of 'Normal' kube events
Fix kubehealth exceptions to handle empty/nil kube api responses
Get resource limits for health and MDM from kubelet instead of kube api
Bug fix for windows node image collection where image name contains multiple slashes
Exclude ARO master node for data collection
Telemetry for kube events flushed
Changes to support msi for mdm if service principal doesnt exist
Changes for AKS telemetry to ping ods endpoint first and then network check
KubeEvents bug fix for KubeEvent type

Customer Impact

Providing capability for customers to collect 'Normal' kube events using config map
Metrics for GPU are collected and ingested to customers workspace if they have GPU enabled nodes
Bug fix for windows container image collection allows customers to get the right data in the ContainerInventory table for windows containers.

01/07/2020 -

Version microsoft/oms:ciprod01072020 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod01072020

Code change log

Switch between 10255(old) and 10250(new) ports for cadvisor for older and newer versions of kubernetes

Customer Impact

Node cpu, node memory, container cpu and container memory metrics were obtained earlier by querying kubelet readonly port(http://$NODE_IP:10255). Agent now supports getting these metrics from kubelet port(https://$NODE_IP:10250) as well. During the agent startup, it checks for connectivity to kubelet port(https://$NODE_IP:10250), and if it fails the metrics source is defaulted to readonly port(http://$NODE_IP:10255).

12/04/2019 -

Version microsoft/oms:ciprod12042019 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod12042019

Fix scheduler for all input plugins
Fix liveness probe
Reduce chunk sizes for all fluentD buffers to support larger clusters (nodes & pods)
Chunk Kubernetes API calls (pods,nodes,events)
Use HTTP.start instead of HTTP.new
Merge KubePerf into KubePods & KubeNodes
Merge KubeServices into KubePod
Use stream based yajl for JSON parsing
Health - Query only kube-system pods
Health - Use keep_if instead of select
Container log enrichment (turned OFF by default for ContainerName & ContainerImage)
Application Insights Telemetry - Async
Fix metricTime to be batch time for all metric input plugins
Close socket connections properly for DockerAPIClient
Fix top un handled exceptions in Kubernetes API Client and pod inventory
Fix retries, wait between retries, chunk size, thread counts to be consistent for all FluentD workflows
Back-off for containerlog enrichment K8S API calls
Add new regions (3) for Azure Monitor Custom metrics
Increase the cpu(1 core) & memory(750Mi) limits for replica-set to support larger clusters (nodes & pods)
Move to Ubuntu 18.04 LTS
Support for Kubernetes 1.16
Use ifconfig for detecting network connectivity issues
Collect eventType != Normal

10/11/2019 -

Version microsoft/oms:ciprod10112019 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod10112019

Update prometheus config scraping capability to restrict collecting metrics from pods in specific namespaces.
Feature to send custom configuration/prometheus scrape errors to KubeMonAgentEvents table in customer's workspace.
Bug fix to collect data for init containers for Container Logs, KubePodInventory and Perf.
Bug fix for empty array being a valid setting in custom config in configmap.
Restrict kubelet_docker_operations and kubelet_docker_operations_errors to create_containers, remove_containers and pull_image operations.
Fix top exceptions in telemetry

08/22/2019 -

Version microsoft/oms:ciprod08222019 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod08222019

Cluster Health Private Preview based on config map setting
Update resource requests for replicaset to 110m and 250Mi
Update custom metrics supported regions
Fix for promethus config map telemetry
Telemetry for controller kind
Update url to use one of the whitelisted urls for cp monitor telemetry
Configmap with clusterid for AKS to be used by Application Insights

07/09/2019 -

Version microsoft/oms:ciprod07092019 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod07092019

Prometheus custom metric collection using config map allowing omsagent to
- Scrape metrics from user defined urls
- Scrape kubernetes pods with prometheus annotations
- Scrape metrics from kubernetes services
Exception fixes in daemonset and replicaset
Container Inventory plugin changes to get image id from the repo digest and populate repository for image with only image digest
Remove telegraf errors from being sent to ApplicationInsights and instead log it to stderr to provide visibility for customers
Bug fixes for region names with spaces being processed incorrectly while sending mdm metrics
Add log size in telemetry
Remove buffer chunk size and buffer max size from fluentbit configuration

06/14/2019 -

Version microsoft/oms:ciprod06142019 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod06142019

MDM pod metrics bug fixes - MDM rejecting pod metrics due to nodename or controllername dimensions being empty
Prometheus metrics collection by default in every node for kubelet docker operations and kubelet docker operation errors
Telegraf metric collection for diskio and networkio metrics
Agent Configuration/ Settings for data collection
- Cluster level log collection enable/disable option
- Ability to enable/disable stdout and/or stderr logs collection per namespace
- Cluster level environment variable collection enable/disable option
- Config file version & config schema version
- Pod annotation for supported config schema version(s)
Log collection optimization/tuning for better performance
- Derive k8s namespaces from log file name (instead of making call to k8s api service)
- Do not tail log files for containers in the excluded namespace list (if excluded both in stdout & stderr)
- Limit buffer size to 1M and flush logs more frequently [every 10 secs (instead of 30 secs)]
- Tuning of several other fluent bit settings
Increase requests

Replica set memory request by 75M (100M to 175M)
Daemonset CPU request by 25m (50m to 75m)

Will be pushing image only to MCR ( no more Docker) starting this release. AKS-engine will also start to pull our agent image from MCR

04/23/2019 -

Version microsoft/oms:ciprod043232019 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod04232019

Windows node monitoring (metrics & inventory)
Telegraf integration (Telegraf metrics to LogAnalytics)
Node Disk usage metrics (used, free, used%) as InsightsMetrics
Resource stamping for all types (inventory, metrics (perf), metrics (InsightsMetrics), logs) [Applicable only for AKS clusters]
Upped daemonset memory request (not limit) from 150Mi to 225 Mi
Added liveness probe for fluentbit
Fix for MDM filter plugin when kubeapi returns non-200 response

03/12/2019 - Version microsoft/oms:ciprod03122019

Fix for closing response.Body in outoms
Update Mem_Buf_Limit to 5m for fluentbit
Tail only files that were modified since 5 minutes
Remove some unwanted logs that are chatty in outoms
Fix for MDM disablement for AKS-Engine
Fix for Pod count metric (same as container count) in MDM

02/21/2019 - Version microsoft/oms:ciprod02212019

Container logs enrichment optimization
- Get container meta data only for containers in current node (vs cluster before)
Update fluent bit 0.13.7 => 0.14.4
- This fixes the escaping issue in the container logs
Mooncake cloud support for agent (AKS only)
- Ability to disable agent telemetry
- Ability to onboard and ingest to mooncake cloud
Add & populate 'ContainerStatusReason' column to KubePodInventory
Alertable (custom) metrics (to AzureMonitor - only for AKS clusters)
- Cpuusagenanocores & % metric
- MemoryWorkingsetBytes & % metric
- MemoryRssBytes & % metric
- Podcount by node, phase & namespace metric
- Nodecount metric
ContainerNodeInventory_CL to fixed type

01/09/2018 - Version microsoft/oms:ciprod01092019

Omsagent - 1.8.1.256 (nov 2018 release)
Persist fluentbit state between container restarts
Populate 'TimeOfCommand' for agent ingest time for container logs
Get node cpu usage from cpuusagenanoseconds (and convert to cpuusgaenanocores)
Container Node Inventory - move to fluentD from OMI
Mount docker.sock (Daemon set) as /var/run/host
Add omsagent user to docker group
Move to fixed type for kubeevents & kubeservices
Disable collecting ENV for our oms agent container (daemonset & replicaset)
Disable container inventory collection for 'sandbox' containers & non kubernetes managed containers
Agent telemetry - ContainerLogsAgentSideLatencyMs
Agent telemetry - PodCount
Agent telemetry - ControllerCount
Agent telemetry - K8S Version
Agent telemetry - NodeCoreCapacity
Agent telemetry - NodeMemoryCapacity
Agent telemetry - KubeEvents (exceptions)
Agent telemetry - Kubenodes (exceptions)
Agent telemetry - kubepods (exceptions)
Agent telemetry - kubeservices (exceptions)
Agent telemetry - Daemonset , Replicaset as dimensions (bug fix)

11/29/2018 - Version microsoft/oms:ciprod11292018

Disable Container Image inventory workflow
Kube_Events memory leak fix for replica-set
Timeout (30 secs) for outOMS
Reduce critical lock duration for quicker log processing (for log enrichment)
Disable OMI based Container Inventory workflow to fluentD based Container Inventory
Moby support for the new Container Inventory workflow
Ability to disable environment variables collection by individual container
Bugfix - No inventory data due to container status(es) not available
Agent telemetry cpu usage & memory usage (for DaemonSet and ReplicaSet)
Agent telemetry - log generation rate
Agent telemetry - container count per node
Agent telemetry - collect container logs from agent (DaemonSet and ReplicaSet) as AI trace
Agent telemetry - errors/exceptions for Container Inventory workflow
Agent telemetry - Container Inventory Heartbeat

10/16/2018 - Version microsoft/oms:ciprod10162018-2

Fix for containerID being 00000-00000-00000
Move from fluentD to fluentbit for container log collection
Seg fault fixes in json parsing for container inventory & container image inventory
Telemetry enablement
Remove ContainerPerf, ContainerServiceLog, ContainerProcess fluentd-->OMI workflows
Update log level for all fluentD based workflows

7/31/2018 - Version microsoft/oms:ciprod07312018

Changes for node lost scenario (roll-up pod & container statuses as Unknown)
Discover unscheduled pods
KubeNodeInventory - delimit multiple true node conditions for node status
UTF Encoding support for container logs
Container environment variable truncated to 200K
Handle json parsing errors for OMI provider for docker
Test mode enablement for ACS-engine testing
Latest OMS agent (1.6.0-163)
Latest OMI (1.4.2.5)

6/7/2018 - Version microsoft/oms:ciprod06072018

Remove node-0 dependency
Remove passing WSID & Key as environment variables and pass them as kubernetes secret (for non-AKS; we already pass them as secret for AKS)
Please note that if you are manually deploying thru yaml you need to -
Provide workspaceid & key as base64 encoded strings with in double quotes (.yaml has comments to do so as well)
Provide cluster name twice (for each container – daemonset & replicaset)

5/8/2018 - Version microsoft/oms:ciprod05082018

Kubernetes RBAC enablement
Latest released omsagent (1.6.0-42)
Bug fix so that we do not collect kube-system namespace container logs when kube api calls fail occasionally (Bug #215107)
.yaml changes (for RBAC)

Files

ReleaseNotes.md

Latest commit

History

ReleaseNotes.md

File metadata and controls

Azure Monitor for Containers

Code of Conduct

Release History

5/19/2022 -

Version microsoft/oms:ciprod05192022 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod05192022 (linux)

Version microsoft/oms:win-ciprod05192022 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod05192022 (windows)

Code change log

3/17/2022 -

Version microsoft/oms:ciprod03172022 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod03172022 (linux)

Version microsoft/oms:win-ciprod03172022 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod03172022 (windows)

Code change log

1/31/2022 -

Version microsoft/oms:ciprod01312022 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod01312022 (linux)

Version microsoft/oms:win-ciprod01312022 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod01312022 (windows)

Code change log

10/13/2021 -

Version microsoft/oms:ciprod10132021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod10132021 (linux)

Version microsoft/oms:win-ciprod10132021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod10132021 (windows)

Code change log

09/02/2021 -

Version microsoft/oms:ciprod08052021-1 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod08052021-1 (linux)

Code change log

08/05/2021 -

Version microsoft/oms:ciprod08052021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod08052021 (linux)

Code change log

07/13/2021 -

Version microsoft/oms:win-ciprod06112021-2 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod06112021-2 (windows)

Code change log

07/02/2021 -

Version microsoft/oms:ciprod06112021-1 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod06112021-1 (linux)

Version microsoft/oms:win-ciprod06112021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod06112021 (windows)

Code change log

06/11/2021 -

Version microsoft/oms:ciprod06112021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod06112021 (linux)

Version microsoft/oms:win-ciprod06112021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod06112021 (windows)

05/20/2021 -

Version microsoft/oms:ciprod05202021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod05202021 (linux)

No Windows changes with this release, win-ciprod04222021 still current.

Code change log

05/12/2021 -

Version microsoft/oms:ciprod05122021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod05122021 (linux)

No Windows changes with this release, win-ciprod04222021 still current.

Code change log

04/22/2021 -

Version microsoft/oms:ciprod04222021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod04222021 (linux)

Version microsoft/oms:win-ciprod04222021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod04222021 (windows)

Code change log

03/26/2021 -

Version microsoft/oms:ciprod03262021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod03262021 (linux)

Version microsoft/oms:win-ciprod03262021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod03262021 (windows)

Code change log

02/23/2021 -

Version microsoft/oms:ciprod02232021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod02232021 (linux)

Version microsoft/oms:win-ciprod02232021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod02232021 (windows)

Code change log

01/11/2021 -

Version microsoft/oms:ciprod01112021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod01112021 (linux)

Version microsoft/oms:win-ciprod01112021 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod01112021 (windows)

Code change log

11/09/2020 -

Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod11092020 (linux)

Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod11092020 (windows)

Code change log

10/27/2020 -

Version microsoft/oms:ciprod10272020 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod10272020 (linux)

Version microsoft/oms:win-ciprod10272020 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod10272020 (windows)

Code change log

10/05/2020 -

Version microsoft/oms:ciprod10052020 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod10052020 (linux)

Version microsoft/oms:win-ciprod10052020 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod10052020 (windows)

Code change log

09/16/2020 -

Version microsoft/oms:ciprod09162020 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod09162020 (linux)

Version microsoft/oms:win-ciprod09162020 Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod09162020 (windows)