[GSoC'25]: Add multi-vendor system monitoring and enhanced interface statistics #339
base: develop-abhishek
Conversation
…tem queries MIBs, converted sys MIBs to async
…e polling issue added a max polling timeout
Walkthrough
Introduces an asynchronous SNMP polling stack and removes the prior synchronous poller. Converts numerous MIB query methods to async, adds new Cisco/Juniper process queries, updates requirements, and wires async polling into the systemd poller. Extends the DB schema, ingestion, and GraphQL API to store and expose system CPU/memory stats and enhanced interface counters.
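As a rough illustration of the walkthrough's bounded-concurrency design, here is a minimal, self-contained sketch; poll_device, poll_all, and the device names are illustrative stand-ins rather than the project's actual API:

import asyncio
import random

async def poll_device(hostname):
    # Hypothetical stand-in for the real per-device SNMP work.
    await asyncio.sleep(random.random() / 10)
    return True

async def poll_all(devices, max_concurrent=10):
    # A semaphore caps how many device polls run at once.
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(host):
        async with semaphore:
            return await poll_device(host)

    # return_exceptions=True keeps one failing device from aborting the batch.
    results = await asyncio.gather(
        *(bounded(host) for host in devices), return_exceptions=True
    )
    success = sum(1 for r in results if r is True)
    errors = sum(1 for r in results if isinstance(r, Exception))
    return success, errors

if __name__ == "__main__":
    print(asyncio.run(poll_all([f"switch{i}" for i in range(20)])))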
Sequence Diagram(s)
sequenceDiagram
autonumber
participant Timer as systemd timer
participant Poller as switchmap_poller
participant AsyncPoll as async_poll.devices
participant SNMP as async_snmp_manager.Interact
participant MIB as async_snmp_info.Query
participant API as Central API
Timer->>Poller: trigger loop
Poller->>AsyncPoll: run_devices(max_concurrent)
AsyncPoll->>AsyncPoll: Build device list, semaphores
par for each device
AsyncPoll->>SNMP: Validate credentials / session
SNMP-->>AsyncPoll: Authorized session or fail
alt authorized
AsyncPoll->>MIB: everything()
MIB-->>AsyncPoll: aggregated misc/system/l1/l2/l3
AsyncPoll->>API: POST results
API-->>AsyncPoll: 2xx/4xx/5xx
else unauthorized/fail
AsyncPoll-->>AsyncPoll: mark failure
end
end
AsyncPoll-->>Poller: summary (success/fail counts)
sequenceDiagram
autonumber
participant Ingest as DB Ingest Topology.process()
participant SysStat as Topology.systemstat()
participant Tbl as db.table.systemstat
participant Model as SystemStat Model
Ingest->>SysStat: collect CISCO/JUNIPER process metrics
SysStat->>Tbl: device_exists(idx_device)
alt exists
SysStat->>Tbl: update_row(idx_systemstat, row)
else create
SysStat->>Tbl: insert_row([row])
end
Tbl-->>Ingest: upsert complete
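A hedged sketch of the upsert flow in this second diagram; the in-memory table below is a stand-in for db.table.systemstat, with method names taken from the diagram rather than the real module:

class FakeSystemStatTable:
    # In-memory stand-in for db.table.systemstat (illustrative only).
    def __init__(self):
        self._rows = {}  # idx_device -> stored row
        self._next_idx = 1

    def device_exists(self, idx_device):
        # Return the stored row for this device, or None if absent.
        return self._rows.get(idx_device)

    def update_row(self, idx_systemstat, row):
        self._rows[row["idx_device"]] = {**row, "idx_systemstat": idx_systemstat}

    def insert_row(self, rows):
        for row in rows:
            self._rows[row["idx_device"]] = {**row, "idx_systemstat": self._next_idx}
            self._next_idx += 1

def upsert_systemstat(table, row):
    # Update the device's existing SystemStat row if present, else insert one.
    existing = table.device_exists(row["idx_device"])
    if existing:
        table.update_row(existing["idx_systemstat"], row)
    else:
        table.insert_row([row])

table = FakeSystemStatTable()
upsert_systemstat(table, {"idx_device": 7, "cpu_5min": 42, "mem_used": 512, "mem_free": 256})
upsert_systemstat(table, {"idx_device": 7, "cpu_5min": 55, "mem_used": 600, "mem_free": 168})
print(table.device_exists(7))  # one row for device 7, updated in place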
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
Possibly related PRs
Pre-merge checks and finishing touches
❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
Tip: 👮 Agentic pre-merge checks are now available in preview! Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.
Please see the documentation for more information. Example:
reviews:
  pre_merge_checks:
    custom_checks:
      - name: "Undocumented Breaking Changes"
        mode: "warning"
        instructions: |
          Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).
Our Pull Request Approval Process
We have these basic policies to make the approval process smoother for our volunteer team.
Testing Your Code
Please make sure your code passes all tests. Our test code coverage system will fail if either of these two conditions occur:
The process helps maintain the overall reliability of the code base and is a prerequisite for getting your PR approved. Assigned reviewers regularly review the PR queue and tend to focus on PRs that are passing.
Reviewers
When your PR has been assigned reviewers, contact them to get your code reviewed and approved via:
Reviewing Your Code
Your reviewer(s) will have the following roles:
🎯 Contacting the person who assigned the mentors is not advised unless they make a request. Do not @ the person who did the assignment otherwise.
Actionable comments posted: 13
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
switchmap/poller/snmp/mib/cisco/mib_ciscoietfip.py (1)
71-90: Do not truncate MAC addresses; produce normalized format instead.
mac_value[:12] risks corrupting values if general.octetstr_2_string returns colon‑separated strings (e.g., "00:11:22:33:44:55" becomes "00:11:22:33"). Return the full MAC, optionally normalized.
- for key, mac_value in values.items():
-     final["cInetNetToMediaPhysAddress"][key] = mac_value[:12]
+ for key, mac_value in values.items():
+     # keep full MAC; normalize if needed (strip colons)
+     mac = mac_value.replace(":", "").lower() if ":" in mac_value else mac_value.lower()
+     final["cInetNetToMediaPhysAddress"][key] = mac
switchmap/poller/snmp/mib/generic/mib_if_64.py (1)
73-87: Async migration incomplete — convert remaining accessors to async and await snmp_object calls.
rg output shows multiple occurrences of results = self.snmp_object.swalk(...) in switchmap/poller/snmp/mib/generic/mib_if_64.py inside synchronous accessors — these return coroutines and will break at runtime. Required fixes:
- Make layer1() async and await each _get_data(...) call (or await individual fetchers).
- Make all ifHC*/ifHighSpeed accessors async and await self.snmp_object.swalk(...); update their callers.
- Make _get_data(...) async and await func(); update call sites.
- Mirror concurrency from mib_if.py (async gather + semaphore=10) to limit device load.
switchmap/poller/snmp/mib/cisco/mib_ciscovlaniftablerelationship.py (1)
115-121: Fix VLAN index parsing (nodes[-2] bug).
You're extracting the VLAN ID from the wrong OID node. With normalized=False, the instance index is the last node. Using nodes[-2] will yield 3 for all rows (from the base OID), corrupting results. Apply:
- for oid, ifindex in results.items():
-     nodes = oid.split(".")
-     vlan = int(nodes[-2])
-     data_dict[ifindex] = [vlan]
+ for full_oid, ifindex in results.items():
+     nodes = full_oid.split(".")
+     vlan = int(nodes[-1])
+     data_dict[ifindex] = [vlan]
switchmap/poller/snmp/mib/cisco/mib_ciscovtp.py (1)
400-419: Harden VLAN bitset parsing and trunk lookup; fix VLAN off‑by‑one.
- Use bytes(value) for hexlify.
- Derive bit length from payload (don't assume 1024).
- Avoid KeyError on missing trunk status.
- VLAN IDs start at 1.
- data_dict = defaultdict(dict)
- length_in_bits = 1024
+ from collections import defaultdict
+ data_dict = defaultdict(list)
+ length_in_bits = None
@@
- results = await self.snmp_object.swalk(oid, normalized=True)
+ results = await self.snmp_object.swalk(oid, normalized=True)
  for key, value in results.items():
@@
- vlans_hex = binascii.hexlify(value).decode("utf-8")
- binary_string = bin(int(vlans_hex, base))[2:].zfill(length_in_bits)
+ buf = bytes(value)
+ vlans_hex = binascii.hexlify(buf).decode("utf-8")
+ # Use actual payload size if not preset
+ bitlen = (len(buf) * 8) if not length_in_bits else length_in_bits
+ binary_string = bin(int(vlans_hex, base))[2:].zfill(bitlen)
@@
- if trunkstatus[ifindex] == 1:
-     for svlan, state in enumerate(binary_string):
-         vlan = int(svlan)
+ if trunkstatus.get(ifindex) == 1:
+     for svlan, state in enumerate(binary_string, start=1):
+         vlan = svlan
          if int(state) == 1:
-             if ifindex in data_dict:
-                 data_dict[ifindex].append(vlan)
-             else:
-                 data_dict[ifindex] = [vlan]
+             data_dict[ifindex].append(vlan)
switchmap/poller/snmp/mib/cisco/mib_ciscostack.py (1)
112-117: Fix potential crash: defaultdict(dict) key lookup returns dict, leading to unhashable key.
If a port key is missing in dot1dbaseport, dot1dbaseport[int(key)] yields {} (from defaultdict), and data_dict[ifindex] = value will raise TypeError: unhashable type 'dict'. Use a plain dict in _portifindex and guard with .get() in portduplex. Apply:
@@
- results = await self.snmp_object.swalk(oid, normalized=True)
- for key, value in results.items():
-     # Assign duplex value to ifindex key
-     ifindex = dot1dbaseport[int(key)]
-     data_dict[ifindex] = value
+ results = await self.snmp_object.swalk(oid, normalized=True)
+ for key, value in results.items():
+     # Assign duplex value to ifIndex key; skip if mapping unavailable
+     ifindex = dot1dbaseport.get(int(key))
+     if ifindex is None:
+         continue
+     data_dict[ifindex] = value
@@
- data_dict = defaultdict(dict)
+ data_dict = {}
Also applies to: 131-133
switchmap/poller/snmp/mib/generic/mib_lldp.py (1)
315-321: KeyError risk when mapping baseport→ifIndex
Direct indexing self._baseportifindex[key_index] will raise if the map is incomplete (you noted this in comments). Use .get() and guard.
- else:
-     ifindex = self._baseportifindex[key_index]
+ else:
+     ifindex = self._baseportifindex.get(key_index)
🧹 Nitpick comments (61)
.gitignore (1)
246-249: Anchor ignore to repo root and clarify section header.
Without a leading slash, this pattern ignores any run_local_tests.sh anywhere in the tree. If you only mean the root file, anchor it. Also align the header with existing section naming.
-#temp ignores
-run_local_tests.sh
+# Temporary ignores (developer-specific)
+# If only the root copy should be ignored, anchor the pattern:
+/run_local_tests.sh
If the intent is to ignore developer-local variants across the repo instead, consider a generic pattern like *.local.sh and documenting it near the top "Developer Section".
bin/tools/create_db_tables.py (1)
65-73: Remove commented-out seed block; keep repo clean
Since the dummy OUI insertion is intentionally retired, delete the commented code instead of leaving it in place. Apply this diff within this block:
- # oui.insert_row(
- #     IOui(
- #         oui=None,
- #         organization=None,
- #         enabled=1,
- #     )
- # )
- # Skip dummy OUI insertion - real data will be populated from file
+ # Skip dummy OUI insertion — real data will be populated from file
requirements.txt (1)
13-13: Pin aiohttp (and ideally requests) for reproducible builds.
Unpinned network libs risk breakage at deploy time. Please pin aiohttp (and consider pinning requests alongside it). Example:
-requests
-aiohttp
+requests==2.32.3
+aiohttp==3.9.5
If you prefer range pins, enforce an upper bound (e.g., <4.0).
switchmap/poller/snmp/mib/generic/mib_if_64.py (1)
339-371: Redundant safe parameter branches; both paths are identical.
Both ifhcinoctets and ifhcoutoctets execute the same code in the if safe is False branch and the else branch. Either implement actual failsafe logic or remove the parameter/branch. Example simplification:
- def ifhcinoctets(self, safe=False, oidonly=False):
+ async def ifhcinoctets(self, oidonly=False):
      # ...
-     if safe is False:
-         results = self.snmp_object.swalk(oid, normalized=True)
-     else:
-         results = self.snmp_object.swalk(oid, normalized=True)
+     results = await self.snmp_object.swalk(oid, normalized=True)
Apply a similar change to ifhcoutoctets.
Also applies to: 372-404
switchmap/poller/snmp/base_query.py (1)
47-65: Async supported(): good; add explicit type hints and minor micro‑polish.
Implementation is correct. Consider adding a return type and simplifying the boolean assignment.
- async def supported(self):
+ async def supported(self) -> bool:
      """Return device's support for the MIB."""
-     validity = False
-     if await self.snmp_object.oid_exists(self.test_oid) is True:
-         validity = True
-     return validity
+     return await self.snmp_object.oid_exists(self.test_oid)
switchmap/poller/snmp/mib/cisco/mib_ciscoietfip.py (1)
92-129: Nit: fix comment typos. "deximal" → "decimal".
- # Convert deximal value to hex,
+ # Convert decimal value to hex,
switchmap/poller/snmp/mib/generic/mib_ipv6.py (1)
71-90
: Async conversion LGTM; minor doc/comment nits.
- Docstring header says "CISCO-IETF-IP-MIB" though this is IPv6; consider aligning.
- Fix "deximal" → "decimal".
-"""Class interacts with CISCO-IETF-IP-MIB.""" +"""Class interacts with IPV6-MIB (IPv6 NetToMedia).""" @@ - # Convert deximal value to hex, + # Convert decimal value to hex,Also applies to: 92-138
switchmap/poller/snmp/mib/cisco/mib_ciscovlaniftablerelationship.py (1)
105-106: Use list accumulator for possible multi‑VLAN mapping.
Safer to store a list and append to avoid silent overwrite on duplicates.
- data_dict = defaultdict(dict)
+ from collections import defaultdict
+ data_dict = defaultdict(list)
And adjust the loop:
- data_dict[ifindex] = [vlan]
+ data_dict[ifindex].append(vlan)
switchmap/poller/snmp/mib/juniper/mib_juniperprocess.py (3)
49-52: Vendor/log message mismatch and stray print. Avoid:
- except Exception as e:
-     print(f"Error in Cisco system queries: {e}")
-     log.log2warning(1316, f" Error in Cisco Process MIB: {e}")
+ except Exception as e:
+     log.log2warning(1316, f"Juniper process/memory query error: {e}")
81-84: Coerce pyasn1 values to int before summation.
Guard against non‑native ints from SNMP.
- used_memory = sum(results.values())
+ used_memory = sum(int(v) for v in results.values())
93-95: Same here for free memory.
- free_memory = sum(results.values())
+ free_memory = sum(int(v) for v in results.values())
switchmap/poller/snmp/mib/cisco/mib_ciscovtp.py (1)
131-135: Catching bare Exception.
Prefer narrowing to expected SNMP exceptions or at least log the class name for triage.
- except Exception as e:
-     log.log2warning(
-         1001, f"CISCO-VTP layer1 query failed: {name}: {e}"
-     )
+ except Exception as e:
+     log.log2warning(
+         1001,
+         f"CISCO-VTP layer1 query failed: {name}: {e.__class__.__name__}: {e}",
+     )
switchmap/poller/snmp/mib/generic/mib_entity.py (1)
92-111: Make gather resilient to partial failures.
Returning exceptions currently aborts the whole call; handle them and proceed.
  ) = await asyncio.gather(
      self.entphysicalhardwarerev(),
      self.entphysicalfirmwarerev(),
      self.entphysicalsoftwarerev(),
      self.entphysicalname(),
      self.entphysicalmodelname(),
      self.entphysicalserialnum(),
      self.entphysicalclass(),
      self.entphysicaldescr(),
+     return_exceptions=True,
  )
+
+ # Normalize exceptions to empty dicts
+ def _ok(x): return {} if isinstance(x, Exception) else x
+ hw_rev, fw_rev, sw_rev, name, model, serial, classtype, description = map(
+     _ok, (hw_rev, fw_rev, sw_rev, name, model, serial, classtype, description)
+ )
switchmap/server/db/attributes.py (1)
14-14: Remove unused import. graphql.pyutils.description isn't used.
-from graphql.pyutils import description
switchmap/poller/snmp/mib/generic/mib_ip.py (3)
84-109: Fetch both ARP tables concurrently
Small perf win and simpler structure.
- # Get interface ipNetToMediaTable data
- values = await self.ipnettomediatable()
- for key, value in values.items():
-     final["ipNetToMediaTable"][key] = value
-
- # Get interface ipNetToPhysicalPhysAddress data
- values = await self.ipnettophysicalphysaddress()
- for key, value in values.items():
-     final["ipNetToPhysicalPhysAddress"][key] = value
+ ipv4_task = self.ipnettomediatable()
+ ipv6_task = self.ipnettophysicalphysaddress()
+ v4, v6 = await asyncio.gather(ipv4_task, ipv6_task, return_exceptions=True)
+ if isinstance(v4, Exception):
+     log.log2warning(1007, f"IP-MIB IPv4 ARP walk failed: {v4}")
+     v4 = {}
+ if isinstance(v6, Exception):
+     log.log2warning(1008, f"IP-MIB IPv6 ARP walk failed: {v6}")
+     v6 = {}
+ for k, v in v4.items():
+     final["ipNetToMediaTable"][k] = v
+ for k, v in v6.items():
+     final["ipNetToPhysicalPhysAddress"][k] = v
110-146: Add error handling around swalk for IPv4 ARP
Avoids bubbling exceptions that can abort whole layer3 collection.
- results = await self.snmp_object.swalk(oid, normalized=False)
+ try:
+     results = await self.snmp_object.swalk(oid, normalized=False)
+ except Exception as e:
+     log.log2warning(1009, f"ipNetToMediaTable walk failed: {e}")
+     return data_dict
147-203: Likewise add error handling around the IPv6 ARP walk
- results = await self.snmp_object.swalk(oid, normalized=False)
+ try:
+     results = await self.snmp_object.swalk(oid, normalized=False)
+ except Exception as e:
+     log.log2warning(1010, f"ipNetToPhysicalPhysAddress walk failed: {e}")
+     return data_dict
switchmap/poller/snmp/mib/cisco/mib_ciscovlanmembership.py (1)
70-95: Run vmVlan and vmPortStatus concurrently to reduce poll latency.
These two awaits are independent; gather them for a small but free win and consistency with other async MIBs. Apply:
- # Get interface vmVlan data
- values = await self.vmvlan()
- for key, value in values.items():
+ # Get interface vmVlan and vmPortStatus data concurrently
+ vmvlan_values, vmportstatus_values = await asyncio.gather(
+     self.vmvlan(), self.vmportstatus()
+ )
+ for key, value in vmvlan_values.items():
      final[key]["vmVlan"] = value
- # Get interface vmPortStatus data
- values = await self.vmportstatus()
- for key, value in values.items():
+ for key, value in vmportstatus_values.items():
      final[key]["vmPortStatus"] = value
Add import at top of file:
+import asyncio
switchmap/poller/snmp/mib/cisco/mib_ciscoc2900.py (1)
70-94: Parallelize the two per-port queries.
Both queries are independent; gather them to cut round-trip time. Apply:
@@
- # Get interface c2900PortDuplexStatus data
- values = await self.c2900portduplexstatus()
- for key, value in values.items():
-     final[key]["c2900PortDuplexStatus"] = value
-
- # Get interface c2900PortLinkbeatStatus data
- values = await self.c2900portlinkbeatstatus()
- for key, value in values.items():
-     final[key]["c2900PortLinkbeatStatus"] = value
+ # Get both datasets concurrently
+ duplex_values, linkbeat_values = await asyncio.gather(
+     self.c2900portduplexstatus(), self.c2900portlinkbeatstatus()
+ )
+ for key, value in duplex_values.items():
+     final[key]["c2900PortDuplexStatus"] = value
+ for key, value in linkbeat_values.items():
+     final[key]["c2900PortLinkbeatStatus"] = value
Add import at top of file if missing:
+import asyncio
switchmap/poller/snmp/async_poller.py (2)
85-87: Unreachable return after fatal log.
log.log2die likely raises; the subsequent return _data is dead code. Remove it, or downgrade to a non-fatal log if you intend to return.
- if bool(self.snmp_object) is False:
-     log.log2die(1001, f"No valid SNMP object for {self._hostname} ")
-     return _data
+ if bool(self.snmp_object) is False:
+     log.log2die(1001, f"No valid SNMP object for {self._hostname} ")
44-46: Docstring punctuation nit. Typo: "Returns;" → "Returns:".
- Returns;
+ Returns:
131-159
: Ensure bridge mapping is available when dot1qPvid is called directly.If dot1qpvid() is invoked without prior layer1(), baseportifindex may be None. Load it defensively.
@@ if oidonly is True: return oid - results = await self.snmp_object.swalk(oid, normalized=True) + # Ensure bridge mapping is prepared + await self._get_bridge_data() + + results = await self.snmp_object.swalk(oid, normalized=True) for key, value in results.items(): ifindex = self.baseportifindex.get(int(key)) if ifindex is not None: data_dict[ifindex] = value
160-169: Correct docstring: keys are VLAN IDs, not ifIndex.
dot1qVlanStaticName is indexed by VLAN; update wording.
- """Return dict of Q-BRIDGE-MIB dot1qVlanStaticName per port.
+ """Return dict of Q-BRIDGE-MIB dot1qVlanStaticName per VLAN.
@@
-     data_dict: Dict of dot1qVlanStaticName using ifIndex as key
+     data_dict: Dict of dot1qVlanStaticName using VLAN ID as key
180-183: Safer decoding of OctetStrings.
Avoid bytes(value) on str; decode bytes/bytearray and fall back to str otherwise.
- for key, value in results.items():
-     data_dict[key] = str(bytes(value), encoding="utf-8")
+ for key, value in results.items():
+     name = (
+         value.decode("utf-8", errors="replace")
+         if isinstance(value, (bytes, bytearray))
+         else str(value)
+     )
+     data_dict[key] = name
switchmap/server/db/table/systemstat.py (5)
3-3: Remove unused imports. and_ and null are not used.
-from sqlalchemy import select, update, and_, null
+from sqlalchemy import select, update
10-35: Clarify behavior and clean up comments.
- Function name suggests a boolean, but it returns a row object or False; update the docstring to match behavior or rename the function later.
- Remove #! comments (they trigger Ruff EXE00x "shebang" warnings).
  def idx_exists(idx):
-     """Determine whether primary key exists.
+     """Fetch SystemStat by primary key.
      Args:
          idx: idx_systemstat
      Returns:
-         results: SystemStat object
+         result: SystemStat object or False if not found
      """
@@
-     # Return
+     # Return first match (or False)
      for row in rows:
-         #! i think we are just finding if row exist we are ret bool
          result = _rows.systemstat(row)
-         #! why breaking just after appending result for single row
-         #! are we just checking if there exists a single row or not
          break
      return result
73-78: Avoid set() on possibly unhashable objects; dedupe deterministically.
set(rows) can raise TypeError if objects override eq/hash. Deduplicate by a stable key instead.
- # Remove any duplicates
- rows = list(set(rows))
+ # Remove duplicates by device index (last one wins)
+ _by_device = {}
+ for r in rows:
+     _by_device[r.idx_device] = r
+ rows = list(_by_device.values())
81-88: Guard against missing attributes.
If a caller passes an unexpected object, this will raise. Consider validating required fields.
- for row in rows:
+ for row in rows:
+     if not hasattr(row, "idx_device"):
+         continue
      inserts.append(
          {
              "idx_device": row.idx_device,
              "cpu_5min": row.cpu_5min,
              "mem_used": row.mem_used,
              "mem_free": row.mem_free
          }
      )
94-121: Don't update the foreign key unless intentionally re-linking.
Updating idx_device on an existing SystemStat can silently reassign the record. Omit it unless there's a real migration.
  .values(
      {
-         "idx_device": row.idx_device,
          "cpu_5min": None if row.cpu_5min is None else row.cpu_5min,
          "mem_used": None if row.mem_used is None else row.mem_used,
          "mem_free": None if row.mem_free is None else row.mem_free
      }
  )
switchmap/server/db/ingest/update/device.py (2)
8-8: Remove unused import.
-from sqlalchemy import false
156-165: Replace #! notes with standard comments.
These trigger Ruff EXE00x warnings and aren't actionable in code.
- #! this is also tricky to understand, understand this as well
+ # Tracks completion status for SystemStat updates.
@@
- def systemstat(self):
-     """Provide the value of the 'systemstat' property."""
+ def systemstat(self):
+     """Value of the 'systemstat' property."""
@@
- def systemstat(self,value):
-     """Set the 'systemstat' property."""
+ def systemstat(self, value):
+     """Set the 'systemstat' property."""
switchmap/poller/snmp/mib/cisco/mib_ciscoprocess.py (5)
52-58: Simplify exception handling.
With return_exceptions=True, asyncio.gather won't raise. Drop the outer try/except; log per-metric if needed.
- try:
-     cpu_data, memory_used_data, memory_free_data = await asyncio.gather(
+ cpu_data, memory_used_data, memory_free_data = await asyncio.gather(
      self.cpmcputotal5minrev(),
      self.memorypoolused(),
      self.memorypoolfree(),
      return_exceptions=True
- )
+ )
@@
- except Exception as e:
-     print(f"Error in Cisco system queries: {e}")
-     log.log2warning(1316, f" Error in Cisco Process MIB: {e}")
- return final
72-80: Remove noisy prints; use debug logging if needed.
- results = await self.snmp_object.swalk(oid, normalized=True)
- print(f"results: {results.items()}")
- print(f"results: {results}")
+ results = await self.snmp_object.swalk(oid, normalized=True)
83-110: Tighten exception handling and types for memory used.
- try:
-     results = await self.snmp_object.swalk(oid, normalized=True)
-     used_memory = sum(results.values())
-
-     return used_memory
- except Exception as e:
-     print(f"Error querying Cisco MemoryPoolUsed: {e}")
-     return None
+ try:
+     results = await self.snmp_object.swalk(oid, normalized=True)
+ except Exception:
+     log.log2warning(1316, "Cisco MemoryPoolUsed walk failed")
+     return None
+ try:
+     return int(sum(int(v) for v in results.values()))
+ except Exception:
+     return None
122-141: Remove unused variable and mirror used-branch handling.
- # Initialize key variables
- data_dict = defaultdict(dict)
-
  # Process OID - Enhanced memory pool free (high capacity)
  oid = ".1.3.6.1.4.1.9.9.48.1.1.1.6"
@@
- try:
-     results = await self.snmp_object.swalk(oid, normalized=True)
-     free_memory = sum(results.values())
-
-     return free_memory
-
- except Exception as e:
-     print(f"Error querying Cisco MemoryPoolFree: {e}")
-     return None
+ try:
+     results = await self.snmp_object.swalk(oid, normalized=True)
+ except Exception:
+     log.log2warning(1316, "Cisco MemoryPoolFree walk failed")
+     return None
+ try:
+     return int(sum(int(v) for v in results.values()))
+ except Exception:
+     return None
60-81: Minor: consistent return type and key casting.
int(key) assumes a numeric OID suffix; keep, but consider guarding with try/except to skip bad items.
switchmap/poller/async_poll.py (3)
79-87: Count "skipped" separately to avoid inflating failures.
Skip-file returns False and is counted as failed. Track skipped explicitly for clearer summaries.
- success_count = sum(1 for r in results if r is True)
- error_count = sum(1 for r in results if isinstance(r, Exception))
- failed_count = len(results) - success_count - error_count
+ success_count = sum(1 for r in results if r is True)
+ error_count = sum(1 for r in results if isinstance(r, Exception))
+ skipped_count = sum(1 for r in results if r == "SKIPPED")
+ failed_count = len(results) - success_count - error_count - skipped_count
@@
- f"{success_count} succeeded, {failed_count} failed, "
- f"{error_count} errors"
+ f"{success_count} succeeded, {failed_count} failed, "
+ f"{error_count} errors, {skipped_count} skipped"
And in device(), return "SKIPPED" instead of False when the skip-file is present.
- log.log2debug(1404, log_message)
- return False
+ log.log2debug(1404, log_message)
+ return "SKIPPED"
150-170: Treat all 2xx as success and set a client timeout.
- async with session.post(url, json=data) as res:
-     if res.status == 200:
+ timeout = aiohttp.ClientTimeout(total=30)
+ async with session.post(url, json=data, timeout=timeout) as res:
+     if 200 <= res.status < 300:
          log_message = (
              f"Successfully polled and posted data "
              f"for {hostname}"
          )
          log.log2debug(1407, log_message)
      else:
189-192: Add retry with backoff for transient SNMP/HTTP timeouts (per prior preference).
Implement 2–3 retries on asyncio.TimeoutError / aiohttp.ClientError and SNMP timeouts around initialize_snmp() and query().
Would you like me to submit a follow-up patch adding exponential backoff (e.g., 3 retries at 1s/2s/4s) specifically for timeout errors?
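A minimal sketch of what that backoff could look like, using the 1s/2s/4s schedule mentioned above; with_backoff and the coroutine-factory calling convention (so each attempt awaits a fresh coroutine) are illustrative, not existing project API:

import asyncio

import aiohttp

async def with_backoff(make_coro, retries=3, base_delay=1.0):
    # Retry transient timeout/client errors with exponential backoff.
    for attempt in range(retries):
        try:
            return await make_coro()
        except (asyncio.TimeoutError, aiohttp.ClientError):
            if attempt == retries - 1:
                raise  # out of retries; surface the error
            await asyncio.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s

A call site would pass a factory, e.g. await with_backoff(lambda: session.post(url, json=data)), so every retry issues a fresh request rather than re-awaiting a spent coroutine.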
switchmap/poller/snmp/mib/juniper/mib_junipervlan.py (4)
167-174: Guard baseport→ifIndex mapping.
Use .get to avoid KeyError when bridge data is incomplete.
- for key, value in results.items():
-     ifindex = self.baseportifindex.get(int(key))
-     if ifindex is not None:
-         data_dict[ifindex] = value
+ for key, value in results.items():
+     ifindex = self.baseportifindex.get(int(key))
+     if ifindex is not None:
+         data_dict[ifindex] = value
191-207: Avoid KeyError on missing vlan_map/baseport entries.
Both lookups should be safe; skip entries when mappings are absent.
- for key in sorted(results.keys()):
+ for key in sorted(results.keys()):
      # The key is the full OID. Split this into its component nodes
      nodes = key.split(".")
@@
-     vlan_id = nodes[-2]
-     vlan_tag = self.vlan_map[int(vlan_id)]
+     vlan_id = nodes[-2]
+     vlan_tag = self.vlan_map.get(int(vlan_id))
+     if vlan_tag is None:
+         continue
@@
-     baseport_value = nodes[-1]
-     ifindex = self.baseportifindex[int(baseport_value)]
+     baseport_value = nodes[-1]
+     ifindex = self.baseportifindex.get(int(baseport_value))
+     if ifindex is None:
+         continue
226-233: Robust decode of VLAN names.
Handle non-bytes gracefully.
- for vlan_id, value in results.items():
+ for vlan_id, value in results.items():
      # Get VLAN tag
-     vlan_tag = self.vlan_map[int(vlan_id)]
+     vlan_tag = self.vlan_map.get(int(vlan_id))
+     if vlan_tag is None:
+         continue
@@
-     data_dict[vlan_tag] = str(bytes(value), encoding="utf-8")
+     if isinstance(value, (bytes, bytearray)):
+         data_dict[vlan_tag] = bytes(value).decode("utf-8", errors="ignore")
+     else:
+         data_dict[vlan_tag] = str(value)
247-258: Minor: type-cast to int explicitly.
Ensure values are ints even if the agent returns strings.
- for key, value in results.items():
+ for key, value in results.items():
      # Process OID
-     data_dict[int(key)] = int(value)
+     try:
+         data_dict[int(key)] = int(value)
+     except Exception:
+         continue
switchmap/poller/snmp/mib/cisco/mib_ciscocdp.py (2)
70-100: Parallelize the three CDP walks and harden against failures
Run the three helpers concurrently and log per-call failures. This matches the async pattern used elsewhere and reduces poll latency.
+from switchmap.core import log
+import asyncio
@@
- # Get interface cdpCacheDeviceId data
- values = await self.cdpcachedeviceid()
- for key, value in values.items():
-     final[key]["cdpCacheDeviceId"] = value
-
- # Get interface cdpCachePlatform data
- values = await self.cdpcacheplatform()
- for key, value in values.items():
-     final[key]["cdpCachePlatform"] = value
-
- # Get interface cdpCacheDevicePort data
- values = await self.cdpcachedeviceport()
- if values is not None:
-     for key, value in values.items():
-         final[key]["cdpCacheDevicePort"] = value
+ # Query all in parallel
+ results = await asyncio.gather(
+     self.cdpcachedeviceid(),
+     self.cdpcacheplatform(),
+     self.cdpcachedeviceport(),
+     return_exceptions=True,
+ )
+ method_names = [
+     "cdpCacheDeviceId",
+     "cdpCachePlatform",
+     "cdpCacheDevicePort",
+ ]
+ for name, values in zip(method_names, results):
+     if isinstance(values, Exception):
+         log.log2warning(1301, f"CISCO-CDP layer1 query failed: {name}: {values}")
+         continue
+     if values:
+         for key, value in values.items():
+             final[key][name] = value
123-129: Use direct bytes decode; avoid nested str(bytes(...)) conversions
pysnmp returns bytes; decoding directly is clearer and avoids surprises if value is already bytes.
- data_dict[ifindex] = str(bytes(value), encoding="utf-8")
+ data_dict[ifindex] = value.decode("utf-8", errors="ignore")
@@
- data_dict[ifindex] = str(bytes(value), encoding="utf-8")
+ data_dict[ifindex] = value.decode("utf-8", errors="ignore")
@@
- data_dict[ifindex] = str(bytes(value), encoding="utf-8")
+ data_dict[ifindex] = value.decode("utf-8", errors="ignore")
Also applies to: 152-158, 181-187
switchmap/poller/snmp/mib/generic/mib_lldp.py (1)
138-171: Normalize string cleaning for lldpRemSysName (match other getters)
Use general.cleanstring for consistency with lldpRemSysDesc/PortDesc.
- data_dict[ifindex] = str(bytes(value), encoding="utf-8")
+ data_dict[ifindex] = general.cleanstring(
+     value.decode("utf-8", errors="ignore")
+ )
204-228
: SystemStat model: add engine args for consistency and review FK default
- Consider adding
__table_args__ = {"mysql_engine": "InnoDB"}
for parity with other tables.idx_device
default/server_default of 1 risks orphaning stats on the wrong device if not explicitly set.-class SystemStat(BASE): +class SystemStat(BASE): @@ - __tablename__ = "smap_systemstat" + __tablename__ = "smap_systemstat" + __table_args__ = {"mysql_engine": "InnoDB"} @@ - idx_device = Column( + idx_device = Column( ForeignKey(Device.idx_device, ondelete="CASCADE"), - nullable=False, - index=True, - default=1, - server_default=text("1") + nullable=False, + index=True )switchmap/poller/snmp/async_snmp_info.py (4)
45-46: Top-level container: prefer defaultdict to match downstream assumptions
Past code relied on defaultdict at the top level for safe nested writes.
- data = {}
+ data = defaultdict(lambda: defaultdict(dict))
348-356: Use get_running_loop in coroutines (modern asyncio best practice)
asyncio.get_event_loop() is deprecated in coroutines on newer Python.
- loop = asyncio.get_event_loop()
+ loop = asyncio.get_running_loop()
Apply similarly in the _add_layer1/_layer2/_layer3 helpers.
318-336: _add_data doesn't need to be async
Make it a regular function and drop unnecessary awaits at call sites.
-async def _add_data(source, target):
+def _add_data(source, target):
@@
      return target
And update callers (do this in _add_layer1/_layer2/_layer3):
- data = await _add_data(result, data)
+ data = _add_data(result, data)
372-375: Exception handling style nits
- Narrow exceptions where feasible or at least log full context (the MIB name is already logged).
- Ruff TRY300: move the final return data into an else to emphasize the normal path.
Also applies to: 405-407, 437-439, 469-471
switchmap/poller/snmp/async_snmp_manager.py (4)
216-219: Avoid blind Exception in reachability check.
Catching bare Exception can hide real faults (e.g., coding errors). Limit to expected network/SNMP errors and log. Apply this diff:
- except Exception:
-     # Not Contactable
-     contactable = False
+ except (PySnmpError, asyncio.TimeoutError, OSError) as e:
+     # Not contactable due to SNMP/transport error
+     log.log2debug(1302, f"Reachability check failed: {e}")
+     contactable = False
324-329: Narrow exception scope in _oid_exists_get.
Same concern: a broadened catch masks defects; prefer expected failures.
- except Exception as e:
+ except (PySnmpError, asyncio.TimeoutError, OSError) as e:
      log.log2warning(
          1305, f"OID existence check failed for {oid_to_get}: {e}"
      )
      return False
354-359: Narrow exception scope in _oid_exists_walk.
Mirror the get-path handling.
- except Exception as e:
+ except (PySnmpError, asyncio.TimeoutError, OSError) as e:
      log.log2warning(
          1306, f"Walk existence check failed for {oid_to_get}: {e}"
      )
      return False
836-846: Extract BULK walk limits into named constants.
Hardcoded 50 iterations, 25 reps, and a 10k safety cap are fine defaults; promote them to module-level constants for clarity and easier tuning.
+# Walk tuning constants
+_BULK_MAX_ITERATIONS = 50
+_BULK_MAX_REPETITIONS = 25
+_WALK_RESULT_CAP = 10_000
...
- max_iterations = 50
+ max_iterations = _BULK_MAX_ITERATIONS
...
- 25,
+ _BULK_MAX_REPETITIONS,
...
- if len(results) > 10000:
+ if len(results) > _WALK_RESULT_CAP:
Also applies to: 870-876, 924-933
switchmap/poller/snmp/mib/generic/mib_bridge.py (2)
347-351: Use bytes.hex() for MACs to avoid Unicode decode errors.
general.octetstr_2_string() decodes as UTF‑8 first; raw MAC bytes aren't guaranteed UTF‑8 and may raise UnicodeDecodeError.
- data_dict[new_key] = general.octetstr_2_string(mac_value)
+ # mac_value is bytes; convert to lowercase hex safely
+ data_dict[new_key] = bytes(mac_value).hex()
If you prefer a central fix, update switchmap/core/general.py:octetstr_2_string to return bytes_string.hex(). I can open a follow-up PR if you want.
440-452: Remove unused helper.
_snmp_octetstr_2_string is unused after the async refactor.
-def _snmp_octetstr_2_string(binary_value):
-    """Convert SNMP OCTETSTR to string.
-    ...
-    """
-    # Convert and return
-    result = "".join(["%0.2x" % ord(_) for _ in binary_value.decode("utf-8")])
-    return result.lower()
switchmap/poller/snmp/mib/generic/mib_if.py (4)
115-136: Add egress errors/discards to meet "enhanced interface statistics."
Issue #340 asks for error and discard stats; adding ifOutErrors and ifOutDiscards completes the set.
  queries = [
@@
      (self.ifOutMulticastPkts, "ifOutMulticastPkts"),
      (self.ifLastChange, "ifLastChange"),
      (self.ifInUcastPkts,"ifInUcastPkts"),
      (self.ifOutUcastPkts,"ifOutUcastPkts"),
      (self.ifInErrors, "ifInErrors"),
-     (self.ifInDiscards, "ifInDiscards")
+     (self.ifInDiscards, "ifInDiscards"),
+     (self.ifOutErrors, "ifOutErrors"),
+     (self.ifOutDiscards, "ifOutDiscards"),
  ]
Add these methods near the bottom (mirroring existing patterns):
+ async def ifOutErrors(self, oidonly=False):
+     data_dict = defaultdict(dict)
+     oid = ".1.3.6.1.2.1.2.2.1.20"
+     if oidonly is True:
+         return oid
+     results = await self.snmp_object.swalk(oid, normalized=True)
+     for k, v in results.items():
+         data_dict[int(k)] = v
+     return data_dict
+
+ async def ifOutDiscards(self, oidonly=False):
+     data_dict = defaultdict(dict)
+     oid = ".1.3.6.1.2.1.2.2.1.19"
+     if oidonly is True:
+         return oid
+     results = await self.snmp_object.swalk(oid, normalized=True)
+     for k, v in results.items():
+         data_dict[int(k)] = v
+     return data_dict
185-205: Unused 'safe' param in octet methods.
Both ifInOctets/ifOutOctets ignore safe. Either remove the param or route safe=True to an alternate path; as-is it's dead API surface.
Do any callers pass safe= today? If not, I recommend dropping the arg to avoid confusion.
Also applies to: 216-247
273-276: Decode bytes defensively.
For OctetString fields, use .decode(..., errors="replace") to avoid surprises.
- data_dict[int(key)] = str(bytes(value), encoding="utf-8")
+ data_dict[int(key)] = (
+     value.decode("utf-8", errors="replace")
+     if isinstance(value, (bytes, bytearray))
+     else str(value)
+ )
Apply similarly in ifAlias (lines 418-421) and ifName (lines 447-450).
486-509: Prefer bytes.hex() for MAC addresses.
Hex-encoding bytes avoids UTF-8 decode assumptions and is faster/simpler.
- data_dict[int(key)] = general.octetstr_2_string(value)
+ data_dict[int(key)] = bytes(value).hex()
from switchmap.core import general
from switchmap.poller.configuration import ConfigPoller
from switchmap.poller import poll
from switchmap.poller import async_poll
🧩 Analysis chain
Ensure lockfile removal and error logging around async poll run
If async_poll.run_devices raises, the lockfile persists and future runs are blocked. Wrap the call with try/finally and log exceptions.
@@
- # Post data to the remote server
+ # Post data to the remote server
while True:
@@
- # Create lockfile
- open(self.lockfile, "a").close()
-
- # Poll after sleeping
- async_poll.run_devices(max_concurrent_devices=max_concurrent)
-
- # Delete lockfile
- os.remove(self.lockfile)
+ # Create lockfile
+ open(self.lockfile, "a").close()
+ try:
+ # Poll after sleeping
+ async_poll.run_devices(max_concurrent_devices=max_concurrent)
+ except Exception as e:
+ log.log2warning(1126, f"Polling run failed: {e}")
+ finally:
+ # Always remove lockfile
+ try:
+ os.remove(self.lockfile)
+ except FileNotFoundError:
+ pass
Also confirm run_devices is sync or async:
And confirm SNMP concurrency semaphore remains 10 per prior guidance:
Also applies to: 77-77, 92-92
🏁 Script executed:
#!/bin/bash
rg -nP '^(async\s+)?def\s+run_devices\(' switchmap/poller/async_poll.py -n -C2
Length of output: 114
🏁 Script executed:
#!/bin/bash
rg -nP 'asyncio\.Semaphore\(\s*10\s*\)' switchmap/poller/snmp/async_snmp_manager.py -n
Length of output: 51
Ensure lockfile removal and exception logging around async_poll.run_devices
Wrap the run_devices call in try/except/finally so the lockfile is always removed and failures are logged. File: bin/systemd/switchmap_poller (lines 35, 77, 92).
Confirmed: run_devices is synchronous (def run_devices(max_concurrent_devices=None) in switchmap/poller/async_poll.py). Confirmed: SNMP semaphore is asyncio.Semaphore(10) in switchmap/poller/snmp/async_snmp_manager.py:156.
Committable suggestion skipped: line range outside the PR's diff.
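For context, the sync-to-async bridge that makes this possible has roughly the following shape (illustrative; the real run_devices body differs): a plain function drives the async poller with asyncio.run, so the systemd loop never needs its own event loop.

import asyncio

async def _run_devices_async(max_concurrent):
    # Placeholder for the real per-device polling work.
    await asyncio.sleep(0)
    return {"success": 0, "failed": 0}

def run_devices(max_concurrent_devices=None):
    # Synchronous entry point: asyncio.run drives the async stack to completion.
    return asyncio.run(_run_devices_async(max_concurrent_devices or 10))

if __name__ == "__main__":
    print(run_devices())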
# oui.insert_row(
#     IOui(
#         oui=None,
#         organization=None,
#         enabled=1,
#     )
# )
# Skip dummy OUI insertion - real data will be populated from file
🧩 Analysis chain
Fail fast if OUI source file is missing/empty; removal of dummy seed changes init guarantees
By skipping the dummy IOui seed, DB init now depends entirely on setup/data/mac_address_file.txt. If that file is absent/empty or parsing fails, IOui may end up empty and downstream lookups could break. Add prechecks and fail fast, and consider wrapping the populate step with basic error handling.
Example guard (place just before calling _oui.update_db_oui(oui_filepath)):
# Pre-check OUI source before population
if not os.path.isfile(oui_filepath) or os.path.getsize(oui_filepath) == 0:
print(f"ERROR: OUI source not found or empty at: {oui_filepath}")
sys.exit(2)
try:
_oui.update_db_oui(oui_filepath)
except Exception as exc:
print(f"ERROR: Failed to populate OUI data from {oui_filepath}: {exc}")
sys.exit(2)
To verify impacts and file presence, run:
🏁 Script executed:
#!/bin/bash
# 1) Ensure the OUI file exists and is non-empty
test -f setup/data/mac_address_file.txt && stat -c '%n %s bytes' setup/data/mac_address_file.txt || echo "MISSING: setup/data/mac_address_file.txt"
# 2) Inspect code paths that may rely on a sentinel/any OUI rows
rg -nP -C2 '\bupdate_db_oui\b|\bIOui\b|mac_address_file\.txt|OUI' --type=py
Length of output: 22593
Fail fast if OUI source file is missing/empty; add error handling around _oui.update_db_oui in bin/tools/create_db_tables.py
Skipping the dummy IOui seed makes DB init depend solely on setup/data/mac_address_file.txt (which is missing in the repo), and _oui.update_db_oui can raise FileNotFoundError or pandas parse/empty-data errors that are currently uncaught. Add a pre-check and wrap the population call with the guard shown above.
Location: bin/tools/create_db_tables.py (around lines 64–76)
#Async SNMP polling
pysnmp==6.2.6
# pysnmp 7.x also availble but 6.2.6 is more stable as of now
pysnmp note is good; also pin pyasn1 to avoid 6.2.6 breakage and fix typos.
With pysnmp 6.2.6, newer pyasn1 (>=0.6.1) can break APIs. Please pin pyasn1 and tidy the comments.
Apply:
-#Async SNMP polling
-pysnmp==6.2.6
-# pysnmp 7.x also availble but 6.2.6 is more stable as of now
+# Async SNMP polling
+pysnmp==6.2.6
+pyasn1==0.6.0 # required for pysnmp 6.2.x compatibility
+# pysnmp 7.x also available, but 6.2.6 is more stable in our stack
from .mib.cisco import CiscoProcessQuery
🧩 Analysis chain
Align MIB keys and complete Juniper system query before enabling in QUERIES
- Cisco memory keys are currently nested under "CISCO-PROCESS-MIB" in CiscoProcessQuery.system(), but ingestion expects "CISCO-MEMORY-POOL-MIB" for memory. This will result in mem_used/mem_free being None in device.systemstat().
- JuniperProcessQuery.system() appears incomplete (empty gather, Cisco‑specific keys/msgs). Adding it to QUERIES will call an unfinished collector.
Action:
- Move Cisco memory to "CISCO-MEMORY-POOL-MIB".
- Implement Juniper.system() with proper tasks and "JUNIPER-MIB" keys.
Proposed fixes (other files):
Cisco memory keys (mib_ciscoprocess.py):
- if memory_used_data and not isinstance(memory_used_data, Exception):
- final["CISCO-PROCESS-MIB"]["ciscoMemoryPoolUsed"] = memory_used_data
-
- if memory_free_data and not isinstance(memory_free_data, Exception):
- final["CISCO-PROCESS-MIB"]["ciscoMemoryPoolFree"] = memory_free_data
+ if memory_used_data and not isinstance(memory_used_data, Exception):
+ final["CISCO-MEMORY-POOL-MIB"]["ciscoMemoryPoolUsed"] = memory_used_data
+ if memory_free_data and not isinstance(memory_free_data, Exception):
+ final["CISCO-MEMORY-POOL-MIB"]["ciscoMemoryPoolFree"] = memory_free_data
Juniper.system() (mib_juniperprocess.py):
- async def system(self):
- """Get system resource data from Juniper devices."""
- # Initialize key variables
- final = defaultdict(lambda: defaultdict(dict))
-
- # Get CPU and memory data concurrently
- try:
- cpu_data, memory_used_data, memory_free_data = await asyncio.gather(
-
- )
-
- # Populate final results
- if cpu_data and not isinstance(cpu_data, Exception):
- final["JUNIPER-PROCESS-MIB"]["cpmCPUTotal5minRev"] = cpu_data
-
- if memory_used_data and not isinstance(memory_used_data, Exception):
- final["JUNIPER-MEMORY-POOL-MIB"]["juniperMemoryPoolUsed"] = memory_used_data
-
- if memory_free_data and not isinstance(memory_free_data, Exception):
- final["JUNIPER-MEMORY-POOL-MIB"]["juniperMemoryPoolFree"] = memory_free_data
-
- except Exception as e:
- print(f"Error in Cisco system queries: {e}")
- log.log2warning(1316, f" Error in Cisco Process MIB: {e}")
- return final
+ async def system(self):
+ """Get system resource data from Juniper devices."""
+ final = defaultdict(lambda: defaultdict(dict))
+ try:
+ cpu_data, mem_used, mem_free = await asyncio.gather(
+ self.operatingcpu(),
+ self.operatingmemoryused(),
+ self.operatingmemoryfree(),
+ return_exceptions=True,
+ )
+ if cpu_data and not isinstance(cpu_data, Exception):
+ final["JUNIPER-MIB"]["jnxOperatingCPU"] = cpu_data
+ if mem_used and not isinstance(mem_used, Exception):
+ final["JUNIPER-MIB"]["jnxOperatingMemoryUsed"] = mem_used
+ if mem_free and not isinstance(mem_free, Exception):
+ final["JUNIPER-MIB"]["jnxOperatingMemoryFree"] = mem_free
+ except Exception as e:
+ log.log2warning(1316, f"Error in Juniper MIB: {e}")
+ return final
return final
Verification:
Also applies to: 25-26, 38-51
🏁 Script executed:
#!/bin/bash
# Confirm ingestion expects CISCO-MEMORY-POOL-MIB
rg -nP 'CISCO-MEMORY-POOL-MIB|ciscoMemoryPool(Used|Free)' switchmap/server/db/ingest/update/device.py
# Inspect Juniper system() body
rg -nP 'class\s+JuniperProcessQuery|async\s+def\s+system' -n -C3 switchmap/poller/snmp/mib/juniper/mib_juniperprocess.py
Length of output: 812
Move Cisco memory keys to CISCO-MEMORY-POOL-MIB and implement Juniper.system() before enabling in QUERIES
- Ingest expects CISCO-MEMORY-POOL-MIB (device reads ciscoMemoryPoolUsed/ciscoMemoryPoolFree — switchmap/server/db/ingest/update/device.py lines 685, 688, 690–691). Change assignments in switchmap/poller/snmp/mib/cisco/mib_ciscoprocess.py from "CISCO-PROCESS-MIB" to "CISCO-MEMORY-POOL-MIB".
- Implement JuniperProcessQuery.system() (switchmap/poller/snmp/mib/juniper/mib_juniperprocess.py) — the current method only initializes final and does not gather/assign CPU/memory. Add concurrent calls to operatingcpu(), operatingmemoryused(), operatingmemoryfree() and populate the appropriate JUNIPER keys (e.g., jnxOperatingCPU, jnxOperatingMemoryUsed, jnxOperatingMemoryFree). Do not add Juniper to QUERIES until done.
if isinstance(result, Exception):
    log.log2warning(f"{key} failed: {result}")
elif result:
    data[key] = result
print(f"final data: {data}")
Incorrect log2warning signature; this will raise at runtime
log.log2warning requires (code, message). Pass a code and remove the bare print below.
- if isinstance(result, Exception):
- log.log2warning(f"{key} failed: {result}")
+ if isinstance(result, Exception):
+ log.log2warning(1004, f"{key} failed: {result}")
@@
- print(f"final data: {data}")
+ log.log2debug(1003, f"final data sections: {list(data.keys())}")
  # Get ifindex data directly
  oid = ".1.3.6.1.2.1.2.2.1.1"
  ifindex_results = await self.snmp_object.swalk(oid, normalized=True)
  ifindex_data = {int(k): v for k, v in ifindex_results.items()}
  # Get the difference between ifIndex and dot1dBasePortIfIndex
  oid = ".1.3.6.1.2.1.17.1.4.1.2"
- results = self._snmp_object.swalk(oid, normalized=True)
+ results = await self.snmp_object.swalk(oid, normalized=True)
  for _bridge_index, ifindex in results.items():
      bridge_index = int(_bridge_index)
      offset = int(ifindex) - bridge_index
      break

  # Populate the dictionary keyed by dot1dBasePortIfIndex
- for ifindex, _ in sorted(self._ifindex.items()):
+ for ifindex, _ in sorted(ifindex_data.items()):
      bridge_index = ifindex - offset
      data_dict[bridge_index] = ifindex
Incorrect baseport→ifIndex mapping (constant-offset assumption).
dot1dBasePortIfIndex is not guaranteed to be a linear offset from ifIndex. Computing a single offset and applying it globally will mis-map ports on many devices.
Replace the offset logic with a direct mapping from BRIDGE-MIB:
- # Get ifindex data directly
- oid = ".1.3.6.1.2.1.2.2.1.1"
- ifindex_results = await self.snmp_object.swalk(oid, normalized=True)
- ifindex_data = {int(k): v for k, v in ifindex_results.items()}
- # Get the difference between ifIndex and dot1dBasePortIfIndex
- oid = ".1.3.6.1.2.1.17.1.4.1.2"
- results = await self.snmp_object.swalk(oid, normalized=True)
- for _bridge_index, ifindex in results.items():
- bridge_index = int(_bridge_index)
- offset = int(ifindex) - bridge_index
- break
-
- # Populate the dictionary keyed by dot1dBasePortIfIndex
- for ifindex, _ in sorted(ifindex_data.items()):
- bridge_index = ifindex - offset
- data_dict[bridge_index] = ifindex
+ # Build direct mapping: dot1dBasePortIfIndex.{bridgePort} = ifIndex
+ oid = ".1.3.6.1.2.1.17.1.4.1.2"
+ results = await self.snmp_object.swalk(oid, normalized=True)
+ for bridge_index_str, ifindex in results.items():
+ bridge_index = int(bridge_index_str)
+ data_dict[bridge_index] = int(ifindex)
super().__init__(snmp_object, "", tags=["layer3"])
Don’t override supported() to always True; pass a real test_oid and defer to base
Unconditional True can cause unnecessary queries/timeouts on unsupported devices. Provide a test OID and call the base check.
- super().__init__(snmp_object, "", tags=["layer3"])
+ # Use a stable IP-MIB OID to probe support (ipNetToMediaTable)
+ super().__init__(snmp_object, ".1.3.6.1.2.1.4.22.1.2", tags=["layer3"])
@@
- async def supported(self):
- """Return device's support for the MIB.
- ...
- """
- # Support OID
- validity = True
-
- # Return
- return validity
+ async def supported(self):
+ """Return device's support for the MIB."""
+ return await super().supported()
Also applies to: 68-83
🤖 Prompt for AI Agents
In switchmap/poller/snmp/mib/generic/mib_ip.py around lines 66-67 (and similarly lines 68-83), the supported() method currently returns True unconditionally. Change it to provide a concrete test_oid for an IP-related OID and call the base class supported() implementation (or otherwise perform a real SNMP test), so the check defers to the base logic and only returns True when the test OID is present, avoiding needless queries and timeouts on unsupported devices.
cpu_data, memory_used_data, memory_free_data = await asyncio.gather(
)
Broken gather: no awaitables → unpack will crash.
await asyncio.gather() with no args returns (), causing ValueError when unpacking into three vars.
- cpu_data, memory_used_data, memory_free_data = await asyncio.gather(
-
- )
+ cpu_data, memory_used_data, memory_free_data = await asyncio.gather(
+ self.operatingcpu(),
+ self.operatingmemoryused(),
+ self.operatingmemoryfree(),
+ return_exceptions=True,
+ )
📝 Committable suggestion (review carefully before committing; ensure it accurately replaces the highlighted code and test it thoroughly):

cpu_data, memory_used_data, memory_free_data = await asyncio.gather(
    self.operatingcpu(),
    self.operatingmemoryused(),
    self.operatingmemoryfree(),
    return_exceptions=True,
)
final["JUNIPER-PROCESS-MIB"]["cpmCPUTotal5minRev"] = cpu_data | ||
|
||
if memory_used_data and not isinstance(memory_used_data, Exception): | ||
final["JUNIPER-MEMORY-POOL-MIB"]["juniperMemoryPoolUsed"] = memory_used_data | ||
|
||
if memory_free_data and not isinstance(memory_free_data, Exception): | ||
final["JUNIPER-MEMORY-POOL-MIB"]["juniperMemoryPoolFree"] = memory_free_data |
Wrong MIB/metric labels (Cisco names used for Juniper).
Use Juniper names to avoid confusing downstream aggregations.
- if cpu_data and not isinstance(cpu_data, Exception):
- final["JUNIPER-PROCESS-MIB"]["cpmCPUTotal5minRev"] = cpu_data
+ if cpu_data and not isinstance(cpu_data, Exception):
+ final["JUNIPER-MIB"]["jnxOperatingCPU"] = cpu_data
- if memory_used_data and not isinstance(memory_used_data, Exception):
- final["JUNIPER-MEMORY-POOL-MIB"]["juniperMemoryPoolUsed"] = memory_used_data
+ if memory_used_data and not isinstance(memory_used_data, Exception):
+ final["JUNIPER-MIB"]["jnxOperatingMemoryUsed"] = memory_used_data
- if memory_free_data and not isinstance(memory_free_data, Exception):
- final["JUNIPER-MEMORY-POOL-MIB"]["juniperMemoryPoolFree"] = memory_free_data
+ if memory_free_data and not isinstance(memory_free_data, Exception):
+ final["JUNIPER-MIB"]["jnxOperatingMemoryFree"] = memory_free_data
📝 Committable suggestion (review carefully before committing; ensure it accurately replaces the highlighted code and test it thoroughly):

if cpu_data and not isinstance(cpu_data, Exception):
    final["JUNIPER-MIB"]["jnxOperatingCPU"] = cpu_data
if memory_used_data and not isinstance(memory_used_data, Exception):
    final["JUNIPER-MIB"]["jnxOperatingMemoryUsed"] = memory_used_data
if memory_free_data and not isinstance(memory_free_data, Exception):
    final["JUNIPER-MIB"]["jnxOperatingMemoryFree"] = memory_free_data
🤖 Prompt for AI Agents
In switchmap/poller/snmp/mib/juniper/mib_juniperprocess.py around lines 41-47, the code assigns Cisco-style MIB/metric labels; replace them with the correct Juniper MIB names and metric keys. Specifically, change the MIB and metric keys to the Juniper equivalents (e.g., use Juniper CPU and memory names such as jnx... entries instead of the Cisco cpm* names), update the dict keys to match the exact Juniper MIB field names, and validate those names against the Juniper MIB definitions before committing. Only change the label strings (no logic change), and run tests to confirm downstream aggregations use the new Juniper names.
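To show how the corrected labels combine with the exception filtering above, here is a small standalone sketch; the assemble helper and the sample data are illustrative, not the module's actual code:

```python
from collections import defaultdict

def assemble(cpu_data, memory_used_data, memory_free_data):
    """Keep only successful, non-empty results under Juniper-named keys."""
    final = defaultdict(dict)
    metrics = [
        ("jnxOperatingCPU", cpu_data),
        ("jnxOperatingMemoryUsed", memory_used_data),
        ("jnxOperatingMemoryFree", memory_free_data),
    ]
    for name, data in metrics:
        # gather(..., return_exceptions=True) may hand us an Exception.
        if data and not isinstance(data, Exception):
            final["JUNIPER-MIB"][name] = data
    return dict(final)

# The failed memory-used poll is dropped; the rest is kept.
print(assemble({"1": 37}, RuntimeError("timeout"), {"1": 512}))
# {'JUNIPER-MIB': {'jnxOperatingCPU': {'1': 37},
#                  'jnxOperatingMemoryFree': {'1': 512}}}
```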
class SystemStatAttribute:
    """Descriptive attributes of the SystemStat table.

    A generic class to mutualize description of attributes for both queries
    and mutations.
    """

    idx_systemstat = graphene.Int(description="Primary key index")
    idx_device = graphene.Int(description="Device Index Foreign key ")
    cpu_5min = graphene.Int(description=" Device cpu_5min_usage")
    mem_used = graphene.Int(description="Device memory_used")
    mem_free = graphene.Int(description="Device memory_free")
🧩 Analysis chain
Avoid GraphQL Int overflow for memory metrics
GraphQL Int is 32‑bit. mem_used/mem_free in bytes can exceed 2^31‑1, causing serialization errors. Use a safer scalar (e.g., String) or expose values in KB/MB as Int.
Option A: switch to String (bytes)
- mem_used = graphene.Int(description="Device memory_used")
- mem_free = graphene.Int(description="Device memory_free")
+ mem_used = graphene.String(description="Device memory_used (bytes)")
+ mem_free = graphene.String(description="Device memory_free (bytes)")
Option B: keep Int but change units (update ingestion accordingly)
- mem_used = graphene.Int(description="Device memory_used")
- mem_free = graphene.Int(description="Device memory_free")
+ mem_used = graphene.Int(description="Device memory_used (KB)")
+ mem_free = graphene.Int(description="Device memory_free (KB)")
If you prefer a BigInt scalar, confirm availability and wire it in:
🌐 Web query:
Does Graphene (v2/v3) provide a built-in BigInt scalar, or should we use a custom scalar/library?
💡 Result:
Short answer: There’s no guaranteed documented BigInt in the public Graphene docs — implement a custom scalar (or use a community scalar library) unless your installed graphene package exposes a BigInt scalar.
Details:
- Official Graphene docs list only the standard GraphQL scalars (Int, Float, String, Boolean, ID) and show how to create custom scalars. [1]
- The Graphene codebase contains a BigInt scalar implementation in the types module in some releases, so some installed graphene versions may already expose graphene.types.scalars.BigInt — check your installed package/version. [2]
- If your version doesn’t include BigInt, create a custom Scalar (serialize as string/int) or use a maintained scalar library (e.g. graphql-scalars / other community packages). [1][3]
References:
[1] Graphene scalars docs.
[2] Graphene types module (BigInt scalar in source).
[3] graphql-scalars (community scalar library).
Fix GraphQL Int overflow for memory metrics (switchmap/server/db/attributes.py:406-419)
GraphQL Int is 32‑bit; mem_used/mem_free in bytes may exceed 2^31‑1 and cause serialization errors. Change the field type or units.
Option A: switch to String (bytes)
- mem_used = graphene.Int(description="Device memory_used")
- mem_free = graphene.Int(description="Device memory_free")
+ mem_used = graphene.String(description="Device memory_used (bytes)")
+ mem_free = graphene.String(description="Device memory_free (bytes)")
Option B: keep Int but change units (KB)
- mem_used = graphene.Int(description="Device memory_used")
- mem_free = graphene.Int(description="Device memory_free")
+ mem_used = graphene.Int(description="Device memory_used (KB)")
+ mem_free = graphene.Int(description="Device memory_free (KB)")
If you prefer a BigInt scalar: Graphene does not guarantee a built-in BigInt — confirm whether your installed package exposes graphene.types.scalars.BigInt; if not, add a custom BigInt scalar or use a community scalar library (e.g., graphql-scalars) and wire it in before switching.
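If the BigInt route is taken, a minimal custom scalar could look like the sketch below. It assumes graphene 3 with graphql-core 3 (the AST node classes are named IntValue/StringValue in graphene 2), so verify against the installed version before wiring it in. For scale: 4 GiB is 4294967296 bytes, already past the Int ceiling of 2147483647.

```python
import graphene
from graphql.language import ast


class BigInt(graphene.Scalar):
    """64-bit-safe integer scalar; values pass through as Python ints."""

    @staticmethod
    def serialize(value):
        return int(value)

    @staticmethod
    def parse_value(value):
        return int(value)

    @staticmethod
    def parse_literal(node, _variables=None):
        # Accept both integer and quoted-string literals in queries.
        if isinstance(node, (ast.IntValueNode, ast.StringValueNode)):
            return int(node.value)
        return None


class SystemStat(graphene.ObjectType):
    mem_used = BigInt(description="Device memory_used (bytes)")
    mem_free = BigInt(description="Device memory_free (bytes)")
```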
📝 Committable suggestion (review carefully before committing; ensure it accurately replaces the highlighted code and test it thoroughly):

class SystemStatAttribute:
    """Descriptive attributes of the SystemStat table.

    A generic class to mutualize description of attributes for both queries
    and mutations.
    """

    idx_systemstat = graphene.Int(description="Primary key index")
    idx_device = graphene.Int(description="Device Index Foreign key ")
    cpu_5min = graphene.Int(description=" Device cpu_5min_usage")
    mem_used = graphene.String(description="Device memory_used (bytes)")
    mem_free = graphene.String(description="Device memory_free (bytes)")
Please fix the conflicting files |
Please also review the CodeRabbit suggestions.
What kind of change does this PR introduce?
Issue Number:
Fixes #340
Snapshots/Videos:
If relevant, did you update the documentation?
Summary
Does this PR introduce a breaking change?
Checklist
CodeRabbit AI Review
Test Coverage
Other information
Have you read the contributing guide?
Summary by CodeRabbit
New Features
Performance
Chores