[taxii2] Infinite loop in handling "next" value #1683

netbroom · 2024-01-10T22:20:14Z

Description

This pull request adds code to check "next" value in TAXII 2.1 against a list of available collections.

However, "next" is not a collection, it's a value that the server provides for pagination based on each server's individual implementation.

From the TAXII 2.1 docs:

This property identifies the server provided value of the next record or set of records in the paginated data set. This property MAY be populated if the more property is set to true.

This value is opaque to the client and represents something that the server knows how to deal with and process.

For example, for a relational database this could be the index autoID, for elastic search it could be the Scroll ID, for other systems it could be a cursor ID, or it could be any string (or int represented as a string) depending on the requirements of the server and what it is doing in the background.

This causes the connector to enter an infinite loop:

# Assuming newer versions will support next
while True:
    objects = process_response(objects, response, version)

    # Check if "more" exists in response and its value is True
    if "more" in response and response["more"] == True:
        filters["next"] = response["next"]

        ###
        ### checks for "next" value in list of collections
        ###
        if (
            response["next"] in self.available_collections
            and len(self.available_collections) > 0
        ):
            response = collection.get_objects(**filters)

        ###
        ### "next" value is not a collection so it will continue in the loop
        ###

    ###
    ### "response" is not updated, meaning "more" is still True, so this will never execute to break the loop
    ###
    else:
        # "more" doesn't exist or is not True, exit the loop
        break

Environment

Reproduced logic with samples of raw taxii2.py code.

Reproducible Steps

Add TAXII2.1 feed using connector.

The text was updated successfully, but these errors were encountered:

annoyingapt · 2024-01-11T11:25:09Z

I think the spirit of the pr was to have a check for next. I would probably recommend rolling back the code and changing it to something like..

if "more" in response and "next" in response and response["more"] == True:

annoyingapt · 2024-01-11T11:36:44Z

@daniele2010 what are your thoughts on this? I haven't come across next not being in root, did you have an example response I could review?

daniele2010 · 2024-01-11T14:23:57Z

Hi, at the time of the PR the problem happened while I was trying to integrate https://osint.digitalside.it/taxiiserver.html

I've debugged the code, below an extract of the response in the PR part:

{'more': True, 'next': 'a73b346e-33df-4b13-b878-c5538aba8d7c', 'objects': [...]

that next key is checked against a list of id:

self.available_collections.extend([c.id for c in api_root.collections])

maybe @netbroom has a different output from the server, but this is the case where I was working at the time and it seems to work.

netbroom · 2024-01-11T17:09:28Z

@annoyingapt I think the safest and quickest option at this time is to roll back the commit before adding additional code as

This check is not needed, as "next" can be any value determined by the server, according to the specification; it does not need to be a collection ID.
This may be impacting or preventing other parts of the platform from running, or causing performance issues, because this connector gets caught in an infinite loop, processing the same response objects each iteration.

Once the issue is fixed though I agree the connector should check if "next" has a value. I believe "next" should also be different than the current "next" value but not totally sure about that. Otherwise the same request will be made in a loop.

@daniele2010 What was the reasoning that "next" must be a collection ID and the job must abort if it isn't?

annoyingapt · 2024-01-11T20:54:44Z

I'm not near a laptop for another couple of weeks, so someone else will need to raise a pr to fix this. I haven't looked at the digital side feed yet, but it might be that it more appropriate to use the misp feed connector to consume their feed. Note to whoever fixes this, if you roll back the code, can you merge the change I made here #1657

netbroom · 2024-01-11T22:58:39Z

@annoyingapt I created #1686 to apply on top of the latest code in master in the interest of time and so your additional changes would not be lost.

netbroom · 2024-01-12T08:06:43Z

#1686 merged into master, closing as completed.

daniele2010 · 2024-01-12T08:51:32Z

@netbroom the problem is that DigitalSide provides a next string that is a collection id but it is not valid:

The response contains this id:

response["next"]
'89b239cf-949b-42f1-b66f-90c7ca5f92eb'

while the root server has only these two collections:

self.available_collections
0: 'e98d6c94-fbce-11ed-b5dd-3bad2ffe9ebf'
1: 'c1f43330-103b-11ee-9ee3-4b022e286589'

so the next instruction would fail because there is no 89b239cf-949b-42f1-b66f-90c7ca5f92eb in the server:

response = collection.get_objects(**filters)

I know this is a problem with DigitalSide server but some sort of control on the parameter is still needed, maybe a control on the validity of the UUID?
It could be implemented a check if the next parameter is a UUID, something like: "if the next is a UUID enter if block in my PR otherwise continue normally"

netbroom · 2024-01-12T09:24:32Z

@daniele2010 I'm not familiar with DigitalSide but it's possible that is a UUID for an object in the collection, not a collection ID.

The next value is determined purely by the server to specify to the client what value should be provided to retrieve the next objects in the collection.

next can be any value. It does not need to be a UUID.

For example if the server uses an integer for indicator IDs stored in the database, the server may provide the last indicator ID in the response so that when the client requests the next page of objects, the server knows where it left off.

In our TAXII feed next is a combination of a timestamp and an indicator ID, for example: 2023-11-16 22:40:05|23070925

The format of next should be validated by the TAXII server, not the client. The client should send back whatever next was provided by the server. If the next value is invalid for whatever reason, the server should respond with an error which the connector should then handle.

I think the only validation that should be done on next is to check if it's the same value as the current next. If the value is the same, the connector and server may get caught in a loop requesting the same objects.

annoyingapt · 2024-01-12T10:13:04Z

@daniele2010 can you try the misp-feed connector instead? https://osint.digitalside.it/Threat-Intel/digitalside-misp-feed/

daniele2010 · 2024-01-12T11:26:51Z

@netbroom So since next can be any value, before calling a collection.get_objects against it, it should be validated to be sure that is a valid id of a collection, in your TAXII feed how it is handled by the clients?

filters["next"] = response["next"]
response = collection.get_objects(**filters)

@annoyingapt thank you, I'll try in the next days, I would have liked to avoid it since I already have a MISP instance and it'd be cleaner to import it from there instead of OpenCTI

annoyingapt · 2024-01-12T11:29:26Z

The opencti misp-feed connector pulls misp feeds like circl. No misp needed. That is a different connector.

netbroom · 2024-01-12T17:20:26Z

So since next can be any value, before calling a collection.get_objects against it, it should be validated to be sure that is a valid id of a collection, in your TAXII feed how it is handled by the clients?

@daniele2010

next is not a collection ID.

Do not validate it.

The connector will break.

annoyingapt · 2024-01-12T21:09:24Z

I really want to have a deep dive into this feed when I get back. But next should only be used for pagination of a single collection which have more objects than what is available in the request. So the server is basically telling you, use this value as a search and I will bring up the next 100 objects for example. The server should not be trying to daisy chain collections together, there is other code in the python code to cycle between collection ids.

netbroom · 2024-01-13T07:18:29Z

@annoyingapt yes exactly.

For TAXII 2 the collection ID is already in the URL, IE /taxii2/[api-root]/collections/[collection ID]/objects/?[filters, including next]

We have a free TAXII 2.1 test collection available to test with here if anyone would like to use it to test this connector.

daniele2010 · 2024-01-13T09:26:14Z

@annoyingapt you're right, it shouldn't be done that way but it is still valid, since in next parameter you can put anything as @netbroom said.

The thing that I think is not correct, but please explain to me if I'm wrong, is that in case next is not a collection id the method collection.get_objects shouldn't get called because it is not a valid id.

Moreover, in my case if the client asks for those objects it couldn't find them and it would crash.

So, to recap, before my PR there was a bug with the integration that would not let me extract data from digitalside TAXII server, with my PR I have introduced a bug, but the solution I think is not reverse the PR but introduce a fix that would let anyone use any TAXII sever, otherwise we would lose the benefit of using this connector.

annoyingapt · 2024-01-13T09:52:14Z

The next variable is being sent as a filter argument. Don't worry, I'll have a look at the feed soon and can update you with what I find. There is also a taxii client built into the opencti gui you can use too, which is independent of this connector.

annoyingapt · 2024-01-13T10:26:52Z

I get the feeling that when you changed the code, the uuid in next might not have been a collection id like you thought, so by not matching, you aren't pulling the second request and creating the error you were seeing. But this is all hypothetical.

netbroom · 2024-01-13T21:00:48Z

The thing that I think is not correct, but please explain to me if I'm wrong, is that in case next is not a collection id the method collection.get_objects shouldn't get called because it is not a valid id.

@daniele2010 collection is the Python object that handles each collection ID and requests to that collection. The TAXII client creates a collection object when you specify which collection you want to get objects from.

So when collection.get_objects() is called, the objects only from that collection are queried. The filters are only applied for that collection.

next is just a filter on the same collection used for keeping track of which objects in that collection were already received. It doesn't tell the server which collection to query; that is a different part of the code that initializes the collection object.

annoyingapt · 2024-01-22T20:29:06Z

I just spun up the misp feed connector with the digitalside feed and it works out of the box on OpenCTI v5.11.14. I'll test their taxii server now.

annoyingapt · 2024-01-24T08:45:42Z

For visibility.
davidonzo/Threat-Intel#45

netbroom added a commit to netbroom/connectors that referenced this issue Jan 11, 2024

Rolled back 34eb547 as per OpenCTI-Platform#1683

16307f6

netbroom mentioned this issue Jan 11, 2024

Rolled back 34eb547 as per #1683 #1686

Merged

4 tasks

SamuelHassine pushed a commit that referenced this issue Jan 12, 2024

[taxii]Rolled back 34eb547 as per #1683 (#1686)

933fa60

netbroom closed this as completed Jan 12, 2024

SamuelHassine added bug use for describing something not working as expected solved use to identify issue that has been solved (must be linked to the solving PR) labels Jan 12, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[taxii2] Infinite loop in handling "next" value #1683

[taxii2] Infinite loop in handling "next" value #1683

netbroom commented Jan 10, 2024

annoyingapt commented Jan 11, 2024

annoyingapt commented Jan 11, 2024

daniele2010 commented Jan 11, 2024

netbroom commented Jan 11, 2024

annoyingapt commented Jan 11, 2024

netbroom commented Jan 11, 2024

netbroom commented Jan 12, 2024

daniele2010 commented Jan 12, 2024

netbroom commented Jan 12, 2024

annoyingapt commented Jan 12, 2024

daniele2010 commented Jan 12, 2024

annoyingapt commented Jan 12, 2024

netbroom commented Jan 12, 2024 •

edited

Loading

annoyingapt commented Jan 12, 2024 •

edited

Loading

netbroom commented Jan 13, 2024

daniele2010 commented Jan 13, 2024

annoyingapt commented Jan 13, 2024

annoyingapt commented Jan 13, 2024 •

edited

Loading

netbroom commented Jan 13, 2024

annoyingapt commented Jan 22, 2024 •

edited

Loading

annoyingapt commented Jan 24, 2024

[taxii2] Infinite loop in handling "next" value #1683

[taxii2] Infinite loop in handling "next" value #1683

Comments

netbroom commented Jan 10, 2024

Description

Environment

Reproducible Steps

annoyingapt commented Jan 11, 2024

annoyingapt commented Jan 11, 2024

daniele2010 commented Jan 11, 2024

netbroom commented Jan 11, 2024

annoyingapt commented Jan 11, 2024

netbroom commented Jan 11, 2024

netbroom commented Jan 12, 2024

daniele2010 commented Jan 12, 2024

netbroom commented Jan 12, 2024

annoyingapt commented Jan 12, 2024

daniele2010 commented Jan 12, 2024

annoyingapt commented Jan 12, 2024

netbroom commented Jan 12, 2024 • edited Loading

annoyingapt commented Jan 12, 2024 • edited Loading

netbroom commented Jan 13, 2024

daniele2010 commented Jan 13, 2024

annoyingapt commented Jan 13, 2024

annoyingapt commented Jan 13, 2024 • edited Loading

netbroom commented Jan 13, 2024

annoyingapt commented Jan 22, 2024 • edited Loading

annoyingapt commented Jan 24, 2024

netbroom commented Jan 12, 2024 •

edited

Loading

annoyingapt commented Jan 12, 2024 •

edited

Loading

annoyingapt commented Jan 13, 2024 •

edited

Loading

annoyingapt commented Jan 22, 2024 •

edited

Loading