0_MV_Prototype

Steven Dake edited this page Sep 1, 2023 · 5 revisions

Overview

Life of Request

Documenting this is essential. @MostAwesomeDude, are you willing to make a diagram using mermaid wiki markdown syntax and place it in this section?

Resource list

| resource | parent resource | region | endpoints | API |
|---|---|---|---|---|
| InferenceServer | oci-BM.GPU.A10 shape | Oracle US West Cloud Region | 129.153.195.105:9080 | API |
| Ingress server | oci-network-load-balancer -> resource_ingress.step() | Oracle Cloud US West | 144.24.57.25:80, 144.24.57.25:443 | No API, Lifecycle API |
| HTML5 Server | Vercel Serverless Package | Global CDN | 1. Need Oracle IP reservation, 2. Need to link Oracle IP registration to Vercel configuration? :443 | API |
| HTML5 Web Client | not applicable | local | not applicable | none |

Endpoint List

| role | FQDN Endpoint | status |
|---|---|---|
| HTML5 Server | https://chat.artificialwisdom.cloud:443 | activated |
| Ingress Service to multiple routes | https://api.artificialwisdom.cloud:443 | DNS not activated |
| Inference Server | None | not applicable |

Public Routes List

This section documents the routes exposed by the ingress role.

TODO

Resources

Detailed definition of resources within our system

InferenceServer

  • 2 (two) Physical NICs
  • 256 (two-hundred-fifty-six) Virtual NICs
  • TODO

This bare metal shape has two capabilities that require environment management.

  • Two NVMe devices that possess 937684566 sectors, where a sector is defined as 4096 bytes.
  • A virtual NIC per
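As a quick sanity check on the NVMe figures above, the raw capacity implied by the sector count can be worked out with shell arithmetic. This is a minimal sketch that assumes the quoted sector count applies to each device individually; the variable names are illustrative only.

```shell
# Values quoted in the text above.
sectors=937684566
sector_bytes=4096

# POSIX shell arithmetic; requires a shell with 64-bit integers
# (true of bash and dash on 64-bit Linux).
bytes=$((sectors * sector_bytes))
gb=$((bytes / 1000000000))     # decimal gigabytes, rounded down
gib=$((bytes / 1073741824))    # binary gibibytes, rounded down

echo "bytes=$bytes GB=$gb GiB=$gib"
```

Under that assumption, each device comes out at roughly 3.84 TB of raw capacity before partitioning and filesystem overhead.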

Provision

VNICs

Our network topology, which is still undefined, requires VNICs only when they are attached to a virtual machine monitor (VMM). If no VMM is in use, no VNICs need to be provisioned. This virtual NIC provisioning module therefore applies only to hypervisor use cases.

Our virtual machine monitor is Cloud Hypervisor. There are three different device types that the hypervisor will enumerate:

  • virtio-net
  • vhost-user-net
  • VFIO

There are at least seven methods to create a network device that the hypervisor device driver can then use. I have a strong preference for DPDK, the Data Plane Development Kit. Integration with OCI virtual NICs is an additional challenge. The limited flexibility of Oracle's secondary VNIC bash script, compounded by its choice of language, does not inspire confidence.

Key properties of the Artificial Wisdom™ platform network provisioning module:

  • Read configuration from V2 metadata server by executing a GET operation against the endpoint http://169.254.169.254/opc/v2/vnics/.
  • Configure a fixed count of SR-IOV resources.
  • Enforce VLAN configuration on the SR-IOV devices.
  • Rebuild configuration on every restart.
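The properties above can be sketched as a dry run. This is not the module's implementation, only an illustration under stated assumptions: `fetch_vnics` and `plan_sriov` are hypothetical helper names, the interface name `ens300f0` is a placeholder, and the mapping of one VLAN tag per virtual function is a guess at the intended design. The `Authorization: Bearer Oracle` header is what the OCI v2 metadata service expects; the sysfs `sriov_numvfs` file and `ip link set ... vf ... vlan ...` are the standard Linux interfaces for SR-IOV.

```shell
#!/bin/sh
# Sketch only: plan_sriov prints commands instead of running them.

# Endpoint taken from the text above.
METADATA_URL="http://169.254.169.254/opc/v2/vnics/"

# Fetch the VNIC list as JSON from the v2 metadata service.
fetch_vnics() {
    curl --silent --fail --max-time 5 \
        -H "Authorization: Bearer Oracle" "$METADATA_URL"
}

# plan_sriov PF VF_COUNT VLAN...
# Hypothetical helper: emits the commands that would create VF_COUNT
# virtual functions on physical function PF and pin VF i to the i-th
# VLAN tag. Extra VLAN tags beyond VF_COUNT are ignored.
plan_sriov() {
    pf=$1
    count=$2
    shift 2
    # Reset to 0 first: the kernel refuses to change a non-zero VF count
    # directly, and starting from zero rebuilds state on every restart.
    echo "echo 0 > /sys/class/net/$pf/device/sriov_numvfs"
    echo "echo $count > /sys/class/net/$pf/device/sriov_numvfs"
    vf=0
    for vlan in "$@"; do
        [ "$vf" -lt "$count" ] || break
        echo "ip link set dev $pf vf $vf vlan $vlan"
        vf=$((vf + 1))
    done
}
```

For example, `plan_sriov ens300f0 2 11 12` prints the two sysfs writes followed by one `ip link` line per virtual function. In a real module the VLAN tags would presumably come from the `vlanTag` field of each entry returned by `fetch_vnics`, which is still subject to the R&D noted in this section.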

I am unclear on the precise mechanics. Therefore, more R&D is required.

NVMe

There are two NVMe devices. They need to be provisioned only when a hypervisor (the node) is provisioned.

Steps to provision:

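  • Set sector size to 4096 bytes. Check the LBA formats reported by id-ns first; --lbaf=2 assumes that index 2 is the 4096-byte format on this device: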
sudo lsblk
sudo nvme id-ns -H /dev/nvme0n1
sudo nvme format --lbaf=2 /dev/nvme0n1
  • Wipe NVMe partition table:
sudo wipefs --all --json /dev/nvme0n1
  • Partition NVMe:
sudo cfdisk /dev/nvme0n1
Select GPT, create one partition of the maximum size, then write.
TODO - convert to sfdisk
  • Format NVMe with XFS:
sudo mkfs.xfs /dev/nvme0n1p1
  • Mount filesystem:
sudo mkdir -p /awiz
sudo mount /dev/nvme0n1p1 /awiz
  • Persist mount.
TODO
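The two open TODOs in the steps above (non-interactive partitioning and a persistent mount) could be approached along these lines. This is a hedged sketch, not a tested procedure: `sfdisk_script` and `fstab_line` are hypothetical helper names, and the mount options `defaults,nofail` are an assumption. The helpers only print text, so the output can be reviewed before anything touches the disk.

```shell
#!/bin/sh
# Sketch only: nothing here modifies a disk or /etc/fstab.

# sfdisk input: a GPT label and one Linux partition with default start
# and default size, which spans the whole device.
sfdisk_script() {
    printf 'label: gpt\n,,L\n'
}

# fstab_line UUID MOUNTPOINT
# "nofail" lets the node boot even if the device is missing. XFS is not
# checked by fsck at boot, so the final (pass) field is 0.
fstab_line() {
    printf 'UUID=%s %s xfs defaults,nofail 0 0\n' "$1" "$2"
}

# Intended use (destructive, deliberately not executed here):
#   sfdisk_script | sudo sfdisk /dev/nvme0n1
#   sudo mkfs.xfs /dev/nvme0n1p1
#   uuid=$(sudo blkid -s UUID -o value /dev/nvme0n1p1)
#   fstab_line "$uuid" /awiz | sudo tee -a /etc/fstab
#   sudo mount -a
```

Referencing the filesystem by UUID rather than by /dev/nvme0n1p1 avoids depending on NVMe device enumeration order, which may matter on a node with two NVMe devices.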