
Speeding up OPC-UA node traversal from hours to minutes #1787

Open
kzawad1-ces opened this issue Feb 12, 2025 · 11 comments

Comments

@kzawad1-ces

Describe the bug

I am using the asyncua library to connect to an OPC-UA server and get a list of all the tags available on that server. The Python application takes almost 4 hours to traverse the OPC-UA server; here is the output of the app:

Starting OPC-UA Client Application

Started at: 2025-02-09 06:28:42.764038 
URL: opc.tcp://172.16.1.5:62541
File (Output): Tags_2025-02-09_062842_UTC.txt

Client(url=url)
Set User
Set Password
Set Application_URI
Client set_security
USE_TRUST_STORE: True
if USE_TRUST_STORE
async with client -> get nodes
Called client.node.root
Called to get_child 0:Objects
Call asyncio.wait_for()

Runtime Information
Start Time: 2025-02-09 06:28:42.764038 
End Time:   2025-02-09 10:22:53.466855 
Duration:   3:54:10.702817

The code calls:

await extract_node_paths(objects, "/Objects", output_file)

and the main function looks like this:

async def extract_node_paths(node, path, output_file):
    children = await node.get_children()
    for child in children:
        browse_name = await child.read_browse_name()
        child_path = f"{path}/{browse_name.Name}"
        if await child.read_node_class() == 2:  # Variable node class
            with open(output_file, 'a') as f:
                f.write(child_path + '\n')
        await extract_node_paths(child, child_path, output_file)

The output has the following syntax:

/Objects/Tag Providers/ZLT/ZLT/Dispatcher/Configuration/Limits/LimitBESS
/Objects/Tag Providers/ZLT/ZLT/Dispatcher/Configuration/Limits/LimitEnergyBESS
/Objects/Tag Providers/ZLT/ZLT/Dispatcher/Configuration/Limits/LimitPOI

The output file has 104,529 lines. So 104,529 tags.

The questions I have are:

  1. Is it possible to speed this up so that it can do this work in 20 minutes?
  2. Is there a built-in API that can get a tree list of the tags in much less time?

To Reproduce

Code is shown above.

Expected behavior

I was expecting that there would be a relatively fast (minutes, not hours) way to get an OPC-UA server tag list.

Screenshots

None, but I can get some if needed.

Version

Python-Version: Python 3.10.12

opcua-asyncio Version (e.g. master branch, 0.9): asyncua==1.1.5

@AndreasHeine
Member

AndreasHeine commented Feb 13, 2025

looks like you've fallen for some false assumptions ^^

first of all, if you want performance you should not rely on the "convenience" code of the "Node" Python class nor the high-level client...

added some comments:

async def extract_node_paths(node, path, output_file):
    children = await node.get_children()  # one call to return all child nodes (hierarchical references)
    for child in children:
        browse_name = await child.read_browse_name()  # reads one browse name per network request in a loop instead of reading all browse names at once (keep the server's operational limits in mind, but that is a value, e.g. 10000 per read/write/browse, you can extract from the Server object -> capabilities -> operational limits)
        child_path = f"{path}/{browse_name.Name}"
        if await child.read_node_class() == 2:  # Variable node class; another single network request to extract just one node class
            with open(output_file, 'a') as f:  # opens and closes the file synchronously for each child
                f.write(child_path + '\n')
        await extract_node_paths(child, child_path, output_file)

My conclusion: your code is not truly/efficiently async (opening and closing the file all the time, synchronously, is really bad and blocks the event loop!!!) and you have not thought about what happens on the wire (the number of requests and too much overhead).
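
For reference, a minimal sketch of reading those operational limits, assuming the standard numeric NodeIds from the OPC UA specification (the same convention as the "i=11714" used below); verify the ids against your server:

from asyncua import Client

# standard OperationLimits NodeIds (Server -> ServerCapabilities -> OperationLimits);
# the ids are an assumption based on the OPC UA spec
OPERATION_LIMIT_IDS = {
    "MaxNodesPerRead": "i=11705",
    "MaxNodesPerBrowse": "i=11710",
    "MaxMonitoredItemsPerCall": "i=11714",
}

async def read_operation_limits(client: Client) -> dict:
    limits = {}
    for name, nodeid in OPERATION_LIMIT_IDS.items():
        try:
            limits[name] = await client.get_node(nodeid).read_value()
        except Exception:
            limits[name] = None  # server may not expose this limit
    return limits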

@AndreasHeine
Member

AndreasHeine commented Feb 13, 2025

here is something similar... it's for subscriptions, to chunk when the operational limits of the server are smaller than the number of nodes I want to subscribe to... but you can build it for a read/browse call as well

from typing import Iterable, Optional, Union

from asyncua import Node, ua
from asyncua.common.subscription import Subscription

async def subscribe(
    subscription: Subscription,
    nodes: Union[Node, Iterable[Node]],
    queuesize: int = 0,
    maxpercall: Optional[int] = None,
    attr: ua.AttributeIds = ua.AttributeIds.Value,
    monitoring: ua.MonitoringMode = ua.MonitoringMode.Reporting,
):
    '''
    Subscribing to large numbers of nodes without overwhelming the server!
    '''
    if not isinstance(nodes, list):
        nodes = [nodes]

    if not maxpercall:
        try:
            # try to read the "MaxMonitoredItemsPerCall" value
            maxpercall = await subscription.server.get_node("i=11714").read_value()
            if maxpercall == 0:
                maxpercall = None
        except Exception:
            maxpercall = None

    if not maxpercall:
        return await subscription.subscribe_data_change(nodes)
    else:
        handle_list = []
        # chunk the node list so no single call exceeds the server limit
        for each in [nodes[i:i + maxpercall] for i in range(0, len(nodes), maxpercall)]:
            handles = await subscription.subscribe_data_change(
                nodes=each,
                attr=attr,
                queuesize=queuesize,
                monitoring=monitoring,
            )
            handle_list.append(handles)
        return handle_list
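
For context, a hypothetical usage of the helper above (client, handler, and nodes are illustrative names, not from the thread):

# inside an async context, after the client has connected:
sub = await client.create_subscription(500, handler)  # 500 ms publish interval
handles = await subscribe(sub, nodes)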

@oroulet
Member

oroulet commented Feb 13, 2025

Arrrg, when I got that email I hoped it was an MR ;-).

@oroulet
Member

oroulet commented Feb 13, 2025

And yes, do not use several network calls per node. Use Node.get_children_descriptions() (or whatever it is called). That will reduce time.
And then, doing these things is slow anyway due to network calls
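
A hedged sketch of the traversal rebuilt around get_children_descriptions(): one Browse request per parent returns ReferenceDescription objects that already carry BrowseName and NodeClass, so both per-child reads disappear (passing the description's NodeId straight to client.get_node() is an assumption that holds in current asyncua):

from asyncua import Client, ua

async def extract_node_paths(client: Client, node, path: str, paths: list):
    # one Browse request per parent; each ReferenceDescription already
    # contains the BrowseName and NodeClass of the child
    for desc in await node.get_children_descriptions():
        child_path = f"{path}/{desc.BrowseName.Name}"
        if desc.NodeClass == ua.NodeClass.Variable:
            paths.append(child_path)
        await extract_node_paths(client, client.get_node(desc.NodeId), child_path, paths)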

@kzawad1-ces
Author

@AndreasHeine and @oroulet, I get all of your criticisms.

Yes, there is a lot to consider here. I've written hundreds of thousands of lines of code in C/C++. I get it about writing to a file every time and the network traffic for parsing the nodes, but my approach here is not from a purely programming mindset. The reason I am writing to a file every time is that if the app runs for 2 hours and gets an exception, I lose all the tags.

It is with an end-outcome mindset. I look at tools like Unified Automation's UaExpert and others, and they don't have a way to dump a tag list to a file, and I thought someone must have figured this out already.

I will give your subscribe function a try. The 20+ OPC-UA servers I am working with are configured with Ignition SCADA, and the servers are configured in such a way that the client has no way to request polling or subscription at certain rates. The server configures the policy and pushes it down. The servers have over 100k tags.

Now that I have a baseline, I want to see how your function compares to it.

@AndreasHeine
Member

AndreasHeine commented Feb 13, 2025

fix for the sync file stuff:
just use aiofile instead of the sync calls... mixing sync and asyncio code is always a bad idea (it's either/or!!!)
see: https://pypi.org/project/aiofile/
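
A minimal sketch of that fix, assuming aiofile's async_open API (collecting paths and appending them in batches instead of reopening the file per tag):

from aiofile import async_open

async def write_paths(output_file: str, paths: list):
    # one async append for a whole batch of paths, keeping the
    # event loop free while the write is in flight
    async with async_open(output_file, "a") as f:
        await f.write("\n".join(paths) + "\n")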

fix for unnecessary calls (browse):

#    for child in children:
#        browse_name = await child.read_browse_name() 
bname_data_values = await client.read_attributes(children, ua.AttributeIds.BrowseName)

# if await child.read_node_class() == 2:
nodeclass_data_values = await client.read_attributes(children, ua.AttributeIds.NodeClass)
# do your for loop stuff here

remember, read_attributes returns DataValue objects!!!
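
So the actual payloads need one more unwrapping step, e.g.:

# each entry is a ua.DataValue; the payload sits in .Value.Value
browse_names = [dv.Value.Value for dv in bname_data_values]      # ua.QualifiedName
node_classes = [dv.Value.Value for dv in nodeclass_data_values]  # ua.NodeClass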


edit:

the only thing the example above does not do is respect the operational limits of the server...
in order to do that you need to read the limit values for read and browse!
then, if the list of nodes is longer than the limit value, you need to chunk and make multiple requests
e.g. what I did for subscriptions:

        handle_list = []
        for each in [nodes[i:i + maxpercall] for i in range(0, len(nodes), maxpercall)]:
            handles = await subscription.subscribe_data_change(
                nodes=each,
                attr=attr,
                queuesize=queuesize,
                monitoring=monitoring,
            )
            handle_list.append(handles)


@kzawad1-ces
Author

For this particular Ignition SCADA OPC-UA server, the values are:

MaxNodesPerBrowse: 250
MaxNodesPerRead: 10000


@AndreasHeine
Member

AndreasHeine commented Feb 13, 2025

should look like this:

        # maxperbrowse = the value of "MaxNodesPerBrowse"; your client can read that after connecting!
        results = []
        for each in [children[i:i + maxperbrowse] for i in range(0, len(children), maxperbrowse)]:
            res = await client.read_attributes(each, ua.AttributeIds.BrowseName)
            results.append(res)

and

        # maxperread = the value of "MaxNodesPerRead"; your client can read that after connecting!
        results = []
        for each in [children[i:i + maxperread] for i in range(0, len(children), maxperread)]:
            res = await client.read_attributes(each, ua.AttributeIds.NodeClass)
            results.append(res)
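
Those two loops could also be folded into one helper; a sketch (names illustrative) that chunks any node list to a given server limit and returns a flat list of unwrapped values, index-aligned with the input:

from asyncua import ua

async def read_attr_chunked(client, nodes, attr, limit: int):
    values = []
    for i in range(0, len(nodes), limit):
        dvs = await client.read_attributes(nodes[i:i + limit], attr)
        values.extend(dv.Value.Value for dv in dvs)  # unwrap the DataValues
    return values

# e.g. (illustrative):
# browse_names = await read_attr_chunked(client, children, ua.AttributeIds.BrowseName, maxperbrowse)
# node_classes = await read_attr_chunked(client, children, ua.AttributeIds.NodeClass, maxperread)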

@kzawad1-ces
Author

@AndreasHeine, thank you. I will give it a try.

@milosmatovic

milosmatovic commented Feb 27, 2025

Out of curiosity, is there any reason for executing the read_attributes calls one by one? If we want async, then it should be something like this, or am I missing something?

import asyncio

# maxperbrowse = the value of "MaxNodesPerBrowse"; your client can read that after connecting!
tasks = []
for each in [children[i:i + maxperbrowse] for i in range(0, len(children), maxperbrowse)]:
    tasks.append(client.read_attributes(each, ua.AttributeIds.BrowseName))

results = await asyncio.gather(*tasks, return_exceptions=True)

for res in results:
    if isinstance(res, BaseException):
        ...  # handle the error here
    else:
        ...  # process the result here

@AndreasHeine
Member

@milosmatovic you can do this... however, some servers do not like too many simultaneous requests...
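
One way to reconcile both points is to fire the chunks concurrently but cap how many are in flight with a semaphore; a hedged sketch (max_inflight is an illustrative knob, not from the thread):

import asyncio

from asyncua import ua

async def bounded_reads(client, chunks, max_inflight: int = 4):
    # cap the number of concurrent read_attributes calls so the
    # server is not flooded with simultaneous requests
    sem = asyncio.Semaphore(max_inflight)

    async def read_chunk(chunk):
        async with sem:
            return await client.read_attributes(chunk, ua.AttributeIds.BrowseName)

    return await asyncio.gather(*(read_chunk(c) for c in chunks))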
