Describe the bug
We observed the misbehavior with the MP modules because our running TKGi installation still binds us to this API. We did not investigate whether the Policy API modules are also affected. I'm not sure whether the problem was completely fixed for the Policy API in issue #439.
Background: the NSX API returns at most 1000 objects per call. If there are more objects of one type (in our case, for example, logical switches or logical switch ports), the response contains a "cursor". The cursor is then passed in the next call to get the next page. You have to loop until no "cursor" comes back; at that point you have reached the last page and collected all objects.
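The loop described above could be sketched roughly as follows. This is only an illustration against a simulated API: `collect_all`, `make_fake_api`, and the stand-in `fetch_page` callable are hypothetical names, and the response keys "results" and "cursor" are assumed to match what the NSX API returns.

```python
from typing import Callable, Optional


def collect_all(fetch_page: Callable[[Optional[str]], dict]) -> list:
    """Call fetch_page repeatedly until a response carries no cursor."""
    results = []
    cursor = None
    while True:
        page = fetch_page(cursor)
        results.extend(page.get("results", []))
        cursor = page.get("cursor")
        if not cursor:
            # No cursor in the response: this was the last page.
            return results


def make_fake_api(total: int, page_size: int = 1000):
    """Simulated API (not NSX itself) that hands out at most page_size objects per call."""
    objects = [{"id": f"uuid-{i}", "display_name": f"ls-{i}"} for i in range(total)]

    def fetch_page(cursor: Optional[str]) -> dict:
        start = int(cursor) if cursor else 0
        end = start + page_size
        page = {"results": objects[start:end], "result_count": total}
        if end < total:
            # More objects remain, so tell the caller where to continue.
            page["cursor"] = str(end)
        return page

    return fetch_page
```

With 1050 simulated objects, a single call yields only the first 1000, while `collect_all(make_fake_api(1050))` gathers all 1050.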
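This cursor mechanism is not implemented by the MP modules. The error is in the request() function in https://github.com/vmware/ansible-for-nsxt/blob/master/plugins/module_utils/vmware_nsxt.py. There, after some simple error handling, the result of the single API call is returned to the calling function. The callers in turn assume that a call to request(), for example for a GET, returns all objects, so they do not check whether there are more than 1000 objects either.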
The usual procedure of the MP modules, e.g. when changing the parameters of a logical switch:
1. A GET call via request() determines all existing logical switches (incorrectly, without the cursor mechanism).
2. The returned list is searched for the given display name; if a match exists, the UUID of that object is stored.
3. If an ID was found, the object is changed.
4. If no ID was found, a new object is created.
If there are more than 1000 objects, the search will not find the target object unless it is among the first 1000. The Ansible module then assumes that no object with that name exists yet and creates a new one on each playbook run. The resulting objects with identical names then lead to further problems.
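To make the failure mode concrete, here is a small simulation of the find-or-create flow. It is not the modules' actual code: `find_id_by_name`, `ensure_switch`, and the `visible_limit` parameter (which stands in for the truncated GET result) are hypothetical names chosen for this sketch.

```python
def find_id_by_name(objects, display_name):
    """Return the id of the first object with a matching display name, else None."""
    for obj in objects:
        if obj.get("display_name") == display_name:
            return obj["id"]
    return None


def ensure_switch(store, display_name, visible_limit=None):
    """Find-or-create against an in-memory store.

    visible_limit simulates a GET that only returns the first N objects.
    """
    visible = store if visible_limit is None else store[:visible_limit]
    if find_id_by_name(visible, display_name) is not None:
        return "updated"
    # The lookup saw no match, so a new object is created.
    store.append({"id": f"uuid-{len(store)}", "display_name": display_name})
    return "created"


# 1050 simulated switches; "ls-1049" sits beyond the first 1000.
store = [{"id": f"uuid-{i}", "display_name": f"ls-{i}"} for i in range(1050)]
first_run = ensure_switch(store, "ls-1049", visible_limit=1000)
second_run = ensure_switch(store, "ls-1049", visible_limit=1000)
duplicates = [o for o in store if o["display_name"] == "ls-1049"]
```

In this simulation both runs create a new object, so `duplicates` ends up with three entries of the same name. Calling `ensure_switch` without `visible_limit` finds the existing object instead.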
Reproduction steps
Development environments or labs are usually small installations with few NSX objects, so the error is usually not noticed there.
To reproduce the error, more than 1000 objects (e.g. 1050) of one type must exist (for example logical switches, created with nsxt_logical_switches).
Then try to change something in one of the most recently created objects (for example the description or the VLAN ID, using nsxt_logical_switches). It can be a bit difficult to find a suitable object, because it must sit beyond position 1000 in the list. But because the GET calls set "sort_by": "create_time", the most recently created object should be a good candidate.
If you look at the objects in the NSX GUI after one or more playbook runs, you will find several of them with identical display names...
Expected behavior
The MP modules are expected to handle sets of more than 1000 objects correctly.
For this purpose, the request() function mentioned above has to be modified to handle the "cursor" (pagination) mechanism correctly.
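One possible shape for such a change, sketched under assumptions: a wrapper re-issues the GET with the cursor as a query parameter and hands callers a single merged response, so they would not need to change. `with_cursor`, `get_all_pages`, and the `get_json(url)` callable are hypothetical and do not reflect the real signature of request(). The host name and the stand-in `fake_get_json` are illustrative only, and passing the cursor as a `cursor` query parameter is an assumption.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_cursor(url, cursor):
    """Return url with its 'cursor' query parameter set to the given value."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "cursor"]
    query.append(("cursor", cursor))
    return urlunsplit(parts._replace(query=urlencode(query)))


def get_all_pages(get_json, url):
    """Follow cursors and return one response dict holding all results."""
    first = get_json(url)
    results = list(first.get("results", []))
    cursor = first.get("cursor")
    while cursor:
        page = get_json(with_cursor(url, cursor))
        results.extend(page.get("results", []))
        cursor = page.get("cursor")
    merged = dict(first, results=results)
    merged.pop("cursor", None)  # the merged response is complete
    return merged


def fake_get_json(url):
    """Stand-in for a real GET: serves three small pages keyed by cursor."""
    pages = {
        None: {"results": [1, 2], "cursor": "c1"},
        "c1": {"results": [3], "cursor": "c2"},
        "c2": {"results": [4]},
    }
    cursor = dict(parse_qsl(urlsplit(url).query)).get("cursor")
    return dict(pages[cursor])


base = "https://nsx.example/api/v1/logical-switches?sort_by=create_time"
all_objects = get_all_pages(fake_get_json, base)
```

Returning a merged dictionary keeps the existing search-by-display-name logic in the callers untouched, at the cost of holding all objects in memory at once.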
Additional context
No response
Problem definition: Pagination did not work as expected for more than 1000 objects in the Ansible MP modules.
Symptoms: If there are more than 1000 objects, the search does not find the target object unless it is among the first 1000. The Ansible module then assumes that no object with that name exists yet and creates a new one on each playbook run. The resulting objects with identical names lead to further problems.
Impact to customer: Duplicate objects were created.
Steps to reproduce: As in the reproduction steps above: with more than 1000 objects of one type present (e.g. 1050 logical switches), change one of the most recently created objects with nsxt_logical_switches and run the playbook one or more times. The NSX GUI then shows several objects with identical display names.
Workaround: None
Resolution: Fixed as part of #469 and #471
Versions where this is a known issue: master branch