Feature/observability: scripts and doc updates #65

Merged
merged 6 commits into from
Oct 19, 2023
6 changes: 5 additions & 1 deletion README.md
@@ -30,7 +30,7 @@ Apart from the above arrangement, the following system modules/pods are part of

This section describes the nested topology design implemented by this solution.

![alt text](architecture/nested-topology-hld-envoy.png "Nested Toplogy")
![alt text](architecture/nested-topology-hld.png "Nested Toplogy")

At the core of the nested topology design, we have reverse proxies which broker the connections between each hypothetical ISA-95 level (Levels 2, 3 and 4 in this instance). These proxies prevent workloads and Arc agents running at lower levels from connecting to the outside world directly, allowing the traffic to be managed or controlled via proxy configuration at each level. Currently, data plane traffic traverses layers directly between brokers; we are evaluating an improvement that forces this communication to pass through the proxy transparently.
Proxying of allowed URI calls from the lower L2 and L3 levels for the AKS host nodes (kubelet, containerd) is implemented using a DNS Server override in each lower Virtual Network.
@@ -68,6 +68,10 @@ Workloads exchange messages locally on the same network layer and Kubernetes clu

For more information about MQTT broker choice and comparison, please see [MQTT Broker for Data Communication Between Workloads and Between Network Layers](/docs/mqttbroker.md).

### Monitoring and Observability

Gathering diverse signals from sources such as operating systems, data components, custom workloads, and the Kubernetes platform itself, and analyzing them, is discussed in a separate document: [Observability for Distributed Edge](./docs/observability.md).

## Solution Deployment

![alt text](architecture/deployment-hld.png "Deployment Strategy")
Binary file modified architecture/nested-topology-hld.png
Binary file added architecture/observability-hld-slide.png
Binary file added architecture/observability-stacked.png
@@ -81,6 +81,9 @@ param aksObjectId string
@description('Whether to close down outbound internet access')
param closeOutboundInternetAccess bool = false

@description('Provision monitoring')
param provisionMonitoring bool = false

var applicationNameWithoutDashes = replace(applicationName, '-', '')
var aksName = take('aks-${applicationNameWithoutDashes}', 20)
var resourceGroupName = applicationName
@@ -131,6 +134,16 @@ module downstreamvnetpeering 'modules/vnetpeering.bicep' = if (!empty(remoteVnet
]
}

module monitoring 'modules/loganalytics.bicep' = if (provisionMonitoring) {
scope: resourceGroup(rg.name)
name: 'monitoringDeployment'
params: {
workspaceAccountName: applicationName
monitorAccountLocation: location
}
}

output aksName string = aksName
output aksResourceGroup string = resourceGroupName
output subnetId string = vnet.outputs.subnetId
output appInsightsInstrumentationKey string = provisionMonitoring ? monitoring.outputs.instrumentationKey : ''
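
Because `appInsightsInstrumentationKey` falls back to an empty string when `provisionMonitoring` is false, a script that consumes the deployment outputs has to treat an empty key as "monitoring not provisioned". A minimal sketch of that check, assuming the outputs arrive in the usual ARM shape (`name` mapped to an object with `type` and `value`); the `Get-InstrumentationKey` helper and the sample JSON are hypothetical and not part of this PR:

```powershell
# Hypothetical helper (not part of this PR): extracts the instrumentation key
# from a deployment-outputs JSON string and returns $null when monitoring was skipped.
function Get-InstrumentationKey {
    param(
        [Parameter(Mandatory = $true)]
        [string]
        $OutputsJson
    )

    $outputs = $OutputsJson | ConvertFrom-Json
    $key = $outputs.appInsightsInstrumentationKey.value

    # The template emits '' when provisionMonitoring is false.
    if ([string]::IsNullOrEmpty($key)) {
        return $null
    }
    return $key
}

# Illustrative outputs for a deployment with provisionMonitoring = false.
$sample = '{ "appInsightsInstrumentationKey": { "type": "String", "value": "" } }'
$key = Get-InstrumentationKey -OutputsJson $sample
if ($null -eq $key) {
    Write-Output 'Monitoring was not provisioned; skipping observability setup.'
}
```

Returning `$null` rather than the empty string keeps the "not provisioned" case explicit for callers.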
2 changes: 1 addition & 1 deletion deployment/bicep/modules/azurestorage.bicep
@@ -4,7 +4,7 @@
// ------------------------------------------------------------
@minLength(3)
@maxLength(24)
@description('Azure Stroage Account name which is not already in use.')
@description('Azure Storage Account name which is not already in use.')
param storageAccountName string

@description('Storage account location')
50 changes: 50 additions & 0 deletions deployment/bicep/modules/loganalytics.bicep
@@ -0,0 +1,50 @@
// ------------------------------------------------------------
// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT License (MIT). See License.txt in the repo root for license information.
// ------------------------------------------------------------
@minLength(3)
@maxLength(24)
@description('Azure Log Analytics and App Insights Account name which is not already in use.')
param workspaceAccountName string

@description('Azure Log Analytics account location')
@maxLength(20)
param monitorAccountLocation string = resourceGroup().location

resource logAnalytics 'Microsoft.OperationalInsights/workspaces@2022-10-01' = {
name: workspaceAccountName
location: monitorAccountLocation
properties: {
sku: {
name: 'PerGB2018'
}
retentionInDays: 30
features: {
enableLogAccessUsingOnlyResourcePermissions: true
}
workspaceCapping: {
dailyQuotaGb: -1
}
publicNetworkAccessForIngestion: 'Enabled'
publicNetworkAccessForQuery: 'Enabled'
}
}

resource appInsightsComponent 'microsoft.insights/components@2020-02-02' = {
name: workspaceAccountName
location: monitorAccountLocation
kind: 'web'
properties: {
Application_Type: 'web'
Flow_Type: 'Bluefield'
Request_Source: 'rest'
RetentionInDays: 90
WorkspaceResourceId: logAnalytics.id
IngestionMode: 'LogAnalytics'
publicNetworkAccessForIngestion: 'Enabled'
publicNetworkAccessForQuery: 'Enabled'
}
}

output instrumentationKey string = appInsightsComponent.properties.InstrumentationKey

21 changes: 17 additions & 4 deletions deployment/bicep/modules/vnet.bicep
@@ -31,8 +31,8 @@ param closeOutboundInternetAccess bool = false
var subnetName = aksName
var subnetNsgName = aksName

var arrayBasicRules = [ allowProxyInboundSecurityRule, allowMqttSslInboundSecurityRule ]
var arrayBaseAndLockRules = [ allowProxyInboundSecurityRule, allowMqttSslInboundSecurityRule, allowK8ApiHTTPSOutbound, allowK8ApiUdpOutbound, allowTagAks9000Outbound, allowTagFrontDoorFirstParty, allowTagMcr, denyOutboundInternetAccessSecurityRule ]
var arrayBasicRules = [ allowProxyInboundSecurityRule, allowMqttSslInboundSecurityRule, allowOtelGrpcInboundSecurityRule ]
var arrayBaseAndLockRules = [ allowProxyInboundSecurityRule, allowMqttSslInboundSecurityRule, allowOtelGrpcInboundSecurityRule, allowK8ApiHTTPSOutbound, allowK8ApiUdpOutbound, allowTagAks9000Outbound, allowTagFrontDoorFirstParty, allowTagMcr, denyOutboundInternetAccessSecurityRule ]

// TODO: We need to do this in a nested manner, e.g. use parent vnet/subnet if this is nested vnet/subnet creation.
var allowProxyInboundSecurityRule = {
@@ -41,15 +41,14 @@ var allowProxyInboundSecurityRule = {
priority: 1010
access: 'Allow'
direction: 'Inbound'
destinationPortRange: '443'
destinationPortRanges: ['443', '8084']
protocol: 'Tcp'
sourcePortRange: '*'
sourceAddressPrefix: 'VirtualNetwork'
destinationAddressPrefix: 'VirtualNetwork'
}
}

// TODO: potentially remove this if going through proxy, for now setup for testing MQTT bridging
var allowMqttSslInboundSecurityRule = {
name: 'AllowMqttSsl'
properties: {
@@ -64,6 +63,20 @@ var allowMqttSslInboundSecurityRule = {
}
}

var allowOtelGrpcInboundSecurityRule = {
name: 'AllowOtelGrpc'
properties: {
priority: 1030
access: 'Allow'
direction: 'Inbound'
// Note: 4318 is the default OTLP/HTTP port; OTLP/gRPC defaults to 4317.
destinationPortRange: '4318'
protocol: 'Tcp'
sourcePortRange: '*'
sourceAddressPrefix: 'VirtualNetwork'
destinationAddressPrefix: 'VirtualNetwork'
}
}

var allowK8ApiHTTPSOutbound = {
name: 'AllowK8ApiHTTPSOutbound'
properties: {
12 changes: 11 additions & 1 deletion deployment/build-and-deploy-images.ps1
@@ -14,7 +14,11 @@ Param(
# leave empty if both workloads are deployed on single cluster L4
[string]
[Parameter(mandatory=$false)]
$L2ResourceGroupName
$L2ResourceGroupName,

[Parameter(Mandatory = $false)]
[bool]
$SetupObservability = $true
)

if(!$env:RESOURCEGROUPNAME -and !$AppResourceGroupName)
@@ -56,6 +60,8 @@ Set-Location -Path $deploymentDir

Write-Title("Upgrade/Install Pod/Containers with Helm charts in Cluster L4")
$datagatewaymoduleimage = $acrName + ".azurecr.io/datagatewaymodule:" + $deploymentId
$observabilityString = ($SetupObservability -eq $true) ? "true" : "false"
$samplingRate = ($SetupObservability -eq $true) ? "1" : "0" # 1 in development; in production use a low rate such as 0.0001; 0 turns observability off

# ----- Get Cluster Credentials for L4 layer
Write-Title("Get AKS Credentials L4 Layer")
@@ -68,6 +74,8 @@ az aks get-credentials `

helm upgrade iot-edge-l4 ./helm/iot-edge-l4 `
--set-string images.datagatewaymodule="$datagatewaymoduleimage" `
--set-string observability.samplingRate="$samplingRate" `
--set observability.enabled=$observabilityString `
--namespace $appKubernetesNamespace `
--reuse-values `
--install
@@ -96,6 +104,8 @@ helm upgrade iot-edge-l2 ./helm/iot-edge-l2 `
--set-string images.simulatedtemperaturesensormodule="$simtempimage" `
--set-string images.opcplcmodule="$opcplcimage" `
--set-string images.opcpublishermodule="$opcpublisherimage" `
--set observability.enabled=$observabilityString `
--set-string observability.samplingRate="$samplingRate" `
--reuse-values `
--namespace $appKubernetesNamespace `
--install
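
The `? :` ternary used for `$observabilityString` and `$samplingRate` requires PowerShell 7 or later. A minimal sketch of the same flag-to-Helm-values mapping written with `if`/`else` expressions, which also run on Windows PowerShell 5.1; `Get-ObservabilityHelmValues` is a hypothetical helper, not part of this PR:

```powershell
# Hypothetical helper (not part of this PR): maps the -SetupObservability flag
# to the two values passed to the Helm charts.
function Get-ObservabilityHelmValues {
    param(
        [Parameter(Mandatory = $true)]
        [bool]
        $SetupObservability
    )

    # if/else expressions work in Windows PowerShell 5.1 and in PowerShell 7+.
    $enabled = if ($SetupObservability) { 'true' } else { 'false' }
    # 1 samples everything (development setting); 0 turns observability off.
    $samplingRate = if ($SetupObservability) { '1' } else { '0' }

    return @{
        Enabled      = $enabled
        SamplingRate = $samplingRate
    }
}

$values = Get-ObservabilityHelmValues -SetupObservability $true
Write-Output "observability.enabled=$($values.Enabled) observability.samplingRate=$($values.SamplingRate)"
```

Keeping the mapping in one function would also avoid repeating the sampling-rate logic in each script that needs it.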
15 changes: 10 additions & 5 deletions deployment/deploy-az-demo-bootstrapper.ps1
@@ -13,7 +13,11 @@ Param(

[string]
[Parameter(mandatory=$false)]
$Location = 'westeurope'
$Location = 'westeurope',

[Parameter(Mandatory = $false)]
[bool]
$SetupObservability = $true
)

mkdir -p modules
@@ -38,18 +42,19 @@ Invoke-WebRequest -Uri "$baseLocation/deployment/deploy-app-l4.ps1" -OutFile "de

mkdir -p bicep/modules
Invoke-WebRequest -Uri "$baseLocation/deployment/bicep/core-infra-aks.bicep" -OutFile "./bicep/core-infra-aks.bicep"
Invoke-WebRequest -Uri "$baseLocation/deployment/bicep/core-infra-vnet.bicep" -OutFile "./bicep/core-infra-vnet.bicep"
Invoke-WebRequest -Uri "$baseLocation/deployment/bicep/core-infra-base.bicep" -OutFile "./bicep/core-infra-base.bicep"
Invoke-WebRequest -Uri "$baseLocation/deployment/bicep/iiot-app.bicep" -OutFile "./bicep/iiot-app.bicep"
Invoke-WebRequest -Uri "$baseLocation/deployment/bicep/modules/acr.bicep" -OutFile "./bicep/modules/acr.bicep"
Invoke-WebRequest -Uri "$baseLocation/deployment/bicep/modules/azurestorage.bicep" -OutFile "./bicep/modules/azurestorage.bicep"
Invoke-WebRequest -Uri "$baseLocation/deployment/bicep/modules/eventhub.bicep" -OutFile "./bicep/modules/eventhub.bicep"
Invoke-WebRequest -Uri "$baseLocation/deployment/bicep/modules/vnet.bicep" -OutFile "./bicep/modules/vnet.bicep"
Invoke-WebRequest -Uri "$baseLocation/deployment/bicep/modules/vnetpeering.bicep" -OutFile "./bicep/modules/vnetpeering.bicep"
Invoke-WebRequest -Uri "$baseLocation/deployment/bicep/modules/loganalytics.bicep" -OutFile "./bicep/modules/loganalytics.bicep"

# Deploy 3 core infrastructure layers i.e. L4, L3, L2, replicating 3 levels of Purdue network topology.
$l4LevelCoreInfra = ./deploy-core-infrastructure.ps1 -ApplicationName ($ApplicationName + "L4") -VnetAddressPrefix "172.16.0.0/16" -SubnetAddressPrefix "172.16.0.0/18" -SetupArc $true -Location $Location
$l3LevelCoreInfra = ./deploy-core-infrastructure.ps1 -ParentConfig $l4LevelCoreInfra -ApplicationName ($ApplicationName + "L3") -VnetAddressPrefix "172.18.0.0/16" -SubnetAddressPrefix "172.18.0.0/18" -SetupArc $true -Location $Location
$l2LevelCoreInfra = ./deploy-core-infrastructure.ps1 -ParentConfig $l3LevelCoreInfra -ApplicationName ($ApplicationName + "L2") -VnetAddressPrefix "172.20.0.0/16" -SubnetAddressPrefix "172.20.0.0/18" -SetupArc $true -Location $Location
$l4LevelCoreInfra = ./deploy-core-infrastructure.ps1 -ApplicationName ($ApplicationName + "L4") -VnetAddressPrefix "172.16.0.0/16" -SubnetAddressPrefix "172.16.0.0/18" -SetupArc $true -Location $Location -SetupObservability $SetupObservability
$l3LevelCoreInfra = ./deploy-core-infrastructure.ps1 -ParentConfig $l4LevelCoreInfra -ApplicationName ($ApplicationName + "L3") -VnetAddressPrefix "172.18.0.0/16" -SubnetAddressPrefix "172.18.0.0/18" -SetupArc $true -Location $Location -SetupObservability $SetupObservability
$l2LevelCoreInfra = ./deploy-core-infrastructure.ps1 -ParentConfig $l3LevelCoreInfra -ApplicationName ($ApplicationName + "L2") -VnetAddressPrefix "172.20.0.0/16" -SubnetAddressPrefix "172.20.0.0/18" -SetupArc $true -Location $Location -SetupObservability $SetupObservability

# Deploy core platform layer (Dapr on L4 and L2, Mosquitto broker bridging on L2, L3 and L4).
$l4CorePlatform = ./deploy-core-platform.ps1 -AksClusterName $l4LevelCoreInfra.AksClusterName -AksClusterResourceGroupName $l4LevelCoreInfra.AksClusterResourceGroupName -DeployDapr $true -MosquittoParentConfig $null
42 changes: 27 additions & 15 deletions deployment/deploy-az-dev-bootstrapper.ps1
@@ -9,7 +9,15 @@ Param(

[string]
[Parameter(mandatory=$false)]
$Location = 'westeurope'
$Location = 'westeurope',

[Parameter(Mandatory = $false)]
[bool]
$SetupObservability = $true,

[Parameter(Mandatory = $false)]
[bool]
$SetupArc = $false
)

# Import text utilities module.
@@ -20,56 +28,60 @@ Import-Module -Name ./modules/process-utils.psm1
Write-Title("Start Deploying")
$startTime = Get-Date
$ApplicationName = $ApplicationName.ToLower()

$samplingRate = ($SetupObservability -eq $true) ? "1" : "0" # 1 in development; in production use a low rate such as 0.0001, or 0 to turn observability off
# --- Ensure Location is set to short name
$Location = Get-AzShortRegion($Location)

# --- Deploying 3 layers: comment below block and uncomment bottom block for single layer:

# 1. Deploy core infrastructure (AKS clusters, VNET)

$l4LevelCoreInfra = ./deploy-core-infrastructure.ps1 -ApplicationName ($ApplicationName + "L4") -VnetAddressPrefix "172.16.0.0/16" -SubnetAddressPrefix "172.16.0.0/18" -SetupArc $false -Location $Location
$l3LevelCoreInfra = ./deploy-core-infrastructure.ps1 -ParentConfig $l4LevelCoreInfra -ApplicationName ($ApplicationName + "L3") -VnetAddressPrefix "172.18.0.0/16" -SubnetAddressPrefix "172.18.0.0/18" -SetupArc $false -Location $Location
$l2LevelCoreInfra = ./deploy-core-infrastructure.ps1 -ParentConfig $l3LevelCoreInfra -ApplicationName ($ApplicationName + "L2") -VnetAddressPrefix "172.20.0.0/16" -SubnetAddressPrefix "172.20.0.0/18" -SetupArc $false -Location $Location
$l4LevelCoreInfra = ./deploy-core-infrastructure.ps1 -ApplicationName ($ApplicationName + "L4") -VnetAddressPrefix "172.16.0.0/16" -SubnetAddressPrefix "172.16.0.0/18" -SetupArc $SetupArc -Location $Location -SetupObservability $SetupObservability
$l3LevelCoreInfra = ./deploy-core-infrastructure.ps1 -ParentConfig $l4LevelCoreInfra -ApplicationName ($ApplicationName + "L3") -VnetAddressPrefix "172.18.0.0/16" -SubnetAddressPrefix "172.18.0.0/18" -SetupArc $SetupArc -Location $Location -SetupObservability $SetupObservability
$l2LevelCoreInfra = ./deploy-core-infrastructure.ps1 -ParentConfig $l3LevelCoreInfra -ApplicationName ($ApplicationName + "L2") -VnetAddressPrefix "172.20.0.0/16" -SubnetAddressPrefix "172.20.0.0/18" -SetupArc $SetupArc -Location $Location -SetupObservability $SetupObservability

# # 2. Deploy core platform in each layer (Dapr, Mosquitto and bridging).
$l4CorePlatform = ./deploy-core-platform.ps1 -AksClusterName $l4LevelCoreInfra.AksClusterName -AksClusterResourceGroupName $l4LevelCoreInfra.AksClusterResourceGroupName -DeployDapr $true -MosquittoParentConfig $null -ArcEnabled $false
$l3CorePlatform = ./deploy-core-platform.ps1 -AksClusterName $l3LevelCoreInfra.AksClusterName -AksClusterResourceGroupName $l3LevelCoreInfra.AksClusterResourceGroupName -MosquittoParentConfig $l4CorePlatform -ArcEnabled $false
./deploy-core-platform.ps1 -AksClusterName $l2LevelCoreInfra.AksClusterName -AksClusterResourceGroupName $l2LevelCoreInfra.AksClusterResourceGroupName -DeployDapr $true -MosquittoParentConfig $l3CorePlatform -ArcEnabled $false
# 2. Deploy core platform in each layer (Dapr, Mosquitto and bridging).
$l4CorePlatform = ./deploy-core-platform.ps1 -AksClusterName $l4LevelCoreInfra.AksClusterName -AksClusterResourceGroupName $l4LevelCoreInfra.AksClusterResourceGroupName -DeployDapr $true -MosquittoParentConfig $null -ArcEnabled $SetupArc
$l3CorePlatform = ./deploy-core-platform.ps1 -AksClusterName $l3LevelCoreInfra.AksClusterName -AksClusterResourceGroupName $l3LevelCoreInfra.AksClusterResourceGroupName -MosquittoParentConfig $l4CorePlatform -ArcEnabled $SetupArc
./deploy-core-platform.ps1 -AksClusterName $l2LevelCoreInfra.AksClusterName -AksClusterResourceGroupName $l2LevelCoreInfra.AksClusterResourceGroupName -DeployDapr $true -MosquittoParentConfig $l3CorePlatform -ArcEnabled $SetupArc

# 3. Deploy app resources in Azure, build images and deploy helm on level L4 and L2.
$l4AppConfig = ./deploy-dev-app-l4.ps1 -ApplicationName $ApplicationName `
-AksClusterResourceGroupName $l4LevelCoreInfra.AksClusterResourceGroupName `
-AksClusterName $l4LevelCoreInfra.AksClusterName -AksServicePrincipalName ($ApplicationName + "L4") `
-Location $Location
-Location $Location `
-SetupObservability $SetupObservability `
-SamplingRate $samplingRate

# Note currently for developer flow we need Azure Container Registry deployed by L4 (via L4AppConfig).
./deploy-dev-app-l2.ps1 -ApplicationName $ApplicationName `
-AksClusterName $l2LevelCoreInfra.AksClusterName `
-AksClusterResourceGroupName $l2LevelCoreInfra.AksClusterResourceGroupName `
-AksServicePrincipalName ($ApplicationName + "L2") `
-L4AppConfig $l4AppConfig
-L4AppConfig $l4AppConfig `
-SetupObservability $SetupObservability `
-SamplingRate $samplingRate

# # --- Deploying just a single layer: comment above block and uncomment below:

# $l4LevelCoreInfra = ./deploy-core-infrastructure.ps1 -ApplicationName ($ApplicationName + "L4") -VnetAddressPrefix "172.16.0.0/16" -SubnetAddressPrefix "172.16.0.0/18" -SetupArc $false -Location $Location
# $l4LevelCoreInfra = ./deploy-core-infrastructure.ps1 -ApplicationName ($ApplicationName + "L4") -VnetAddressPrefix "172.16.0.0/16" -SubnetAddressPrefix "172.16.0.0/18" -SetupArc $false -Location $Location -SetupObservability $SetupObservability

# ./deploy-core-platform.ps1 -AksClusterName $l4LevelCoreInfra.AksClusterName `
# -AksClusterResourceGroupName $l4LevelCoreInfra.AksClusterResourceGroupName `
# -DeployDapr $true -MosquittoParentConfig $null -ArcEnabled $false
# -DeployDapr $true -MosquittoParentConfig $null -ArcEnabled $SetupArc

# $l4AppConfig = ./deploy-dev-app-l4.ps1 -ApplicationName $ApplicationName `
# -AksClusterResourceGroupName $l4LevelCoreInfra.AksClusterResourceGroupName `
# -AksClusterName $l4LevelCoreInfra.AksClusterName `
# -AksServicePrincipalName ($ApplicationName + "L4") `
# -Location $Location
# -Location $Location -SetupObservability $SetupObservability

# # when deploying L2 workload on single cluster in L4, passing in parameters pointing to L4 is intentional
# ./deploy-dev-app-l2.ps1 -ApplicationName $ApplicationName `
# -AksClusterName $l4LevelCoreInfra.AksClusterName `
# -AksClusterResourceGroupName $l4LevelCoreInfra.AksClusterResourceGroupName `
# -AksServicePrincipalName ($ApplicationName + "L4") `
# -L4AppConfig $l4AppConfig
# -L4AppConfig $l4AppConfig -SetupObservability $SetupObservability
# #----------------

$runningTime = New-TimeSpan -Start $startTime