Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed the issues with sonic-clear queuecounter for egress queue and voq #3671

Merged
merged 1 commit into from
Dec 12, 2024

Conversation

saksarav-nokia
Copy link
Contributor

@saksarav-nokia saksarav-nokia commented Dec 10, 2024

What I did

Fixed the sonic-clear queuecounter issues reported in sonic-net/sonic-buildimage#20913
Fixed the following 2 issues

  1. For the egress queue counters, after the queue counter is cleared and show is issued with --nonzero option, the code takes the elif path at line 323 (old code) even if the diff between cached counter and current counter is 0, prints the current counter values from counter_db . This issue was for both egress queue counter and voq counter.
  2. When the sonic-clear queuecounter is issued , the queuestat is called first without --voq option and this gets the port names and queue ids for each port in each asic , reads the egress queue counters and cache the values in /tmp/cache/queuestat/. Then queuestat is called with --voq option and this gets all the system ports names and voq id's for each asic , reads the voq counters and cache the values in /tmp/cache/queuestat. Since each asic has the all the system ports and all the voqs, and since the cache file name is queuestat+system_port_name with out asic namespace, caching asic1's voq counters overwrites the asic0's voq counters .

How I did it

1)Corrected the logic mistake for issue 1. Since cnstat_diff_print is called only if cache file is present and this should print only if non-zero & valid diff or (not non-zero )
2) Added asic namespace with the cache file name for vow counters.

How to verify it

Previous command output (if the output of a command-line utility has changed)

show command output is not changed.

New command output (if the output of a command-line utility has changed)

show command output is not changed.

@saksarav-nokia
Copy link
Contributor Author

saksarav-nokia commented Dec 10, 2024

@rlhui @vmittal-msft @arista-nwolfe @abdosi for viz

@saksarav-nokia saksarav-nokia force-pushed the saksarav-nokia-queue-counter branch from dc373b0 to a80e9e6 Compare December 11, 2024 15:35
@arista-hpandya
Copy link
Contributor

@saksarav-nokia Can you elaborate on the usecase of namespace/asic specific clear commands? I recall when I was working on adding multi-asic support to queuestat we decided to go forward with the goal that a clear would "clear all namespaces".


cnstat_fqn_file_name = cnstat_fqn_file + port
cache_ns = ''
if self.voq and self.namespace is not None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also handle cases when namespace arg is provided but the voq flag is absent

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also handle cases when namespace arg is provided but the voq flag is absent

@arista-hpandya, the asic-namespace is required to be added to the cache file only for the --voq case and not required when --voq is absent. Please check my previous comment #3671 (comment)

cntr['totalpacket'], cntr['totalbytes'],
cntr['droppacket'], cntr['dropbytes']))

table.append((port, cntr['queuetype'] + str(cntr['queueindex']),
Copy link
Contributor

@arista-hpandya arista-hpandya Dec 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please add a unit test to verify the logic change here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please add a unit test to verify the logic change here?

There are tests already in https://github.com/sonic-net/sonic-utilities/blob/master/tests/queue_counter_test.py which covers these changes.

@saksarav-nokia
Copy link
Contributor Author

@saksarav-nokia Can you elaborate on the usecase of namespace/asic specific clear commands? I recall when I was working on adding multi-asic support to queuestat we decided to go forward with the goal that a clear would "clear all namespaces".

@arista-hpandya , queuestat is called for the following 2 scenarios for sonic-clear queuecounter.

  1. without --voq: - In this case, the egress queue corresponding to front panel ports in the respective asic is cached when QueueStatWrapper calls QueueStat for each asic. In this case asic0 has only asic0's front panel ports and asic1 has only asic1's front panel ports

  2. with --voq: - In this case, the voq counters corresponding to system ports in the respective asic is cached when QueueStatWrapper calls QueueStat for each asic. But as you know, each asic has all the system ports of entire chassis (For J2C+, per asic in each IMM, there are system port for 18 front panel ports, 1 cpu port, 2 recycle ports) . That means asic0 and asic1 has the same set of system ports . So when QueueStat is called for asic0, the cache files are created for all system ports and when the QueueStat is called for asic1, the cache files are created with exact same name which overwrites the cache files created for asic0. So when you show after clear, the show queue counter --voq doesn't work as expected.

Copy link
Contributor

@arlakshm arlakshm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@rlhui rlhui added the p0 label Dec 12, 2024
@saksarav-nokia saksarav-nokia force-pushed the saksarav-nokia-queue-counter branch from a80e9e6 to 8e224c5 Compare December 12, 2024 14:05
@mssonicbld
Copy link
Collaborator

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@bingwang-ms
Copy link
Contributor

Readding label to trigger auto-cherry-pick

mssonicbld pushed a commit to mssonicbld/sonic-utilities that referenced this pull request Dec 24, 2024
…oq (sonic-net#3671)

Fixed the sonic-clear queuecounter issues reported in sonic-net/sonic-buildimage#20913

For the egress queue counters, after the queue counter is cleared and show is issued with --nonzero option, the code takes the elif path at line 323 (old code) even if the diff between cached counter and current counter is 0, prints the current counter values from counter_db . This issue was for both egress queue counter and voq counter.
When the sonic-clear queuecounter is issued , the queuestat is called first without --voq option and this gets the port names and queue ids for each port in each asic , reads the egress queue counters and cache the values in /tmp/cache/queuestat/. Then queuestat is called with --voq option and this gets all the system ports names and voq id's for each asic , reads the voq counters and cache the values in /tmp/cache/queuestat. Since each asic has the all the system ports and all the voqs, and since the cache file name is queuestat+system_port_name with out asic namespace, caching asic1's voq counters overwrites the asic0's voq counters .
How I did it
1)Corrected the logic mistake for issue 1. Since cnstat_diff_print is called only if cache file is present and this should print only if non-zero & valid diff or (not non-zero )
2) Added asic namespace with the cache file name for vow counters.

Signed-off-by: saksarav <[email protected]>
@mssonicbld
Copy link
Collaborator

Cherry-pick PR to 202405: #3697

mssonicbld pushed a commit that referenced this pull request Dec 24, 2024
…oq (#3671)

Fixed the sonic-clear queuecounter issues reported in sonic-net/sonic-buildimage#20913

For the egress queue counters, after the queue counter is cleared and show is issued with --nonzero option, the code takes the elif path at line 323 (old code) even if the diff between cached counter and current counter is 0, prints the current counter values from counter_db . This issue was for both egress queue counter and voq counter.
When the sonic-clear queuecounter is issued , the queuestat is called first without --voq option and this gets the port names and queue ids for each port in each asic , reads the egress queue counters and cache the values in /tmp/cache/queuestat/. Then queuestat is called with --voq option and this gets all the system ports names and voq id's for each asic , reads the voq counters and cache the values in /tmp/cache/queuestat. Since each asic has the all the system ports and all the voqs, and since the cache file name is queuestat+system_port_name with out asic namespace, caching asic1's voq counters overwrites the asic0's voq counters .
How I did it
1)Corrected the logic mistake for issue 1. Since cnstat_diff_print is called only if cache file is present and this should print only if non-zero & valid diff or (not non-zero )
2) Added asic namespace with the cache file name for vow counters.

Signed-off-by: saksarav <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

6 participants