Feat/recursive parallel indexing #728

Open
nikooo777 wants to merge 4 commits into neo/edge from feat/recursive-parallel-indexing
@nikooo777
Collaborator

@charmful0x this is an alternative for you: it adds parallel block processing and Ln unbundling.

It also has Redstone bundle detection and a few other optimizations.

A config that goes with it could be:

{
  "ao-types": "generate_index=atom",
  "generate_index": false,
  "max_connections": 20000,
  "num_acceptors": 512,
  "conn_pool_read_size": 100,
  "arweave_index_ids": true,
  "arweave_index_workers": 32,
  "arweave_block_workers": 20,
  "arweave_index_depth": 2,
  "store": [
    {
      "store-module": "hb_store_arweave",
      "ao-types": "store-module=atom,scope=atom",
      "scope": "remote",
      "index-store": [
        {
          "ao-types": "store-module=atom",
          "store-module": "hb_store_lmdb",
          "name": "/tmp/indexer_alpha",
          "max-readers": 512
        }
      ]
    }
  ],
  "routes": [
    {
      "template": "/graphql",
      "nodes": [
        {
          "prefix": "https://arweave.net",
          "opts": { "ao-types": "http_client=atom,protocol=atom", "http_client": "gun", "protocol": "http2" }
        }
      ]
    },
    {
      "template": "^/arweave",
      "node": {
        "match": "^/arweave",
        "with": "https://arweave.net",
        "opts": { "ao-types": "http_client=atom,protocol=atom", "http_client": "gun", "protocol": "http2" }
      }
    }
  ]
}

It should also run without overriding the routes, but I wanted to bypass the node shuffling logic for my test.
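A quick sanity check for a config like the one above, as a minimal Python sketch. It assumes (inferred only from this example, not from any documented spec) that each `ao-types` string is a comma-separated list of `key=type` pairs naming sibling keys in the same object. `missing_typed_keys` is a hypothetical helper, not part of this PR.

```python
import json


def missing_typed_keys(node, path="$"):
    """Return a list of messages for `ao-types` entries whose key is absent.

    Assumption: `ao-types` is "key=type,key=type" and refers to keys in the
    same JSON object. Walks nested objects and arrays recursively.
    """
    problems = []
    if isinstance(node, dict):
        types = node.get("ao-types")
        if isinstance(types, str):
            for pair in filter(None, types.split(",")):
                key = pair.split("=", 1)[0].strip()
                if key not in node:
                    problems.append(f"{path}: ao-types names missing key {key!r}")
        for key, value in node.items():
            problems.extend(missing_typed_keys(value, f"{path}.{key}"))
    elif isinstance(node, list):
        for i, value in enumerate(node):
            problems.extend(missing_typed_keys(value, f"{path}[{i}]"))
    return problems


if __name__ == "__main__":
    # "config.json" is a placeholder path for wherever the config is saved.
    with open("config.json", encoding="utf-8") as fh:
        for line in missing_typed_keys(json.load(fh)):
            print(line)
```

Under that assumption, an empty result means every typed key is present.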

You can kick it off with: curl -v "http://localhost:PORTNUMBER/~copycat@1.0/arweave&from=-1&to=1867672"
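If you want to script the kickoff instead of typing the curl command, here is a minimal Python sketch that only builds the same URL. `copycat_url` is a hypothetical helper, and the port used in the usage comment is a placeholder for your node's port.

```python
def copycat_url(port: int, from_height: int, to_height: int,
                host: str = "localhost") -> str:
    """Build the kickoff URL in the same shape as the curl example.

    The `&from=...&to=...` suffix is copied as-is from the curl command;
    it is appended directly to the path rather than after a `?`.
    """
    return (
        f"http://{host}:{port}/~copycat@1.0/arweave"
        f"&from={from_height}&to={to_height}"
    )


# Usage (port 8734 is a placeholder):
#   import urllib.request
#   urllib.request.urlopen(copycat_url(8734, -1, 1867672))
```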

HB_PRINTS values that could be useful: http_server_short,copycat_short,debug_copycat

- Add parallel block processing for bounded ranges (arweave_block_workers)
- Add header prefetch for auto-stop mode
- Add block completion markers (block/<height>) to LMDB
- Add cutover height for marker-aware auto-stop
- Add per-block error isolation with failure summary events
- Add arweave_marker_cutover_height opt for manual override
- Add configurable arweave_index_depth for multi-level bundle indexing
- Recurse into nested ANS-104 bundles using header-only approach
- Store achieved depth in block markers, reindex when depth increases
- Skip Redstone bundles at L1 and L2+ (log events for observability)
- Add read-ahead chunk cache scoped per recursion level
- Clamp fetched data to ItemSize before parsing across item boundaries
- Guard against inner bundle headers exceeding item data size
- Retry with smaller fetch size when read-ahead overshoots available data
- Track achieved depth as min() across all items/TXs per block
- Batched parallel_map for nested items
- Bounded by arweave_nested_workers (default 10)
- Disable 256KB read-ahead in parallel mode
- take_batch streams items to bound memory
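
To illustrate how a few of the points above fit together (bounded parallel workers, `take_batch` streaming, per-block error isolation, and achieved depth as `min()`), here is a rough Python sketch. It is not the Erlang code in this PR: `index_range`, the `index_block` callback, and the choice to treat an empty block as having reached `target_depth` are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import Callable, Iterable, Iterator


def take_batch(items: Iterable[int], size: int) -> Iterator[list[int]]:
    """Yield lists of at most `size` items without materializing the input."""
    it = iter(items)
    while batch := list(islice(it, size)):
        yield batch


def index_range(
    heights: Iterable[int],
    index_block: Callable[[int], list[int]],
    workers: int = 20,
    target_depth: int = 2,
) -> tuple[dict[int, int], dict[int, str]]:
    """Index blocks in bounded parallel batches.

    `index_block(height)` is a hypothetical callback returning the depth
    reached for each item/TX in that block. Returns (markers, failures):
    markers map height -> achieved depth, failures map height -> error text.
    """
    markers: dict[int, int] = {}
    failures: dict[int, str] = {}

    def safe(height: int):
        # Per-block error isolation: one bad block must not abort the range.
        try:
            depths = index_block(height)
            # Achieved depth is the minimum across all items in the block.
            # Assumption: an empty block counts as fully indexed.
            return height, min(depths, default=target_depth), None
        except Exception as exc:
            return height, None, exc

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Only one batch of heights is held in memory at a time.
        for batch in take_batch(heights, workers):
            for height, depth, err in pool.map(safe, batch):
                if err is None:
                    markers[height] = depth
                else:
                    failures[height] = repr(err)
    return markers, failures
```

In this sketch a marker is written only for blocks that succeed, so a later run can retry the heights listed in `failures` or reindex any block whose stored depth is below the configured depth.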