
feat: id/depth nested bundles indexing #722

Closed
charmful0x wants to merge 11 commits into neo/edge from feat/nested-bundles


@charmful0x

about

id/depth flow:

  • ~copycat@1.0/arweave&id=ID&depth=N
  • depth=1 => index direct children
  • depth>1 => recurse nested bundle children
  • works for indexed tx@1.0 and ans104@1.0 items
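
The depth rule above can be illustrated with a minimal sketch. It is not the PR's implementation: the module name, the `index/3` function, and the in-memory `Tree` map (bundle ID to direct child IDs) are hypothetical stand-ins for fetching and parsing real bundles.

```erlang
-module(depth_sketch).
-export([index/3]).

%% Tree is a hypothetical in-memory stand-in for bundle contents:
%% a map from a bundle ID to the list of its direct child IDs.
%% IDs with no entry in the map are treated as plain data items.
%% Returns the IDs that would be indexed, in visit order.
index(_Tree, _Id, Depth) when Depth < 1 ->
    [];
index(Tree, Id, Depth) ->
    Children = maps:get(Id, Tree, []),
    lists:flatmap(
        fun(Child) ->
            %% Always index the direct child; descend into it only
            %% while there is depth budget left.
            [Child | index(Tree, Child, Depth - 1)]
        end,
        Children
    ).
```

With an outer bundle that contains one nested bundle holding one data item, depth=1 yields only the nested bundle, while depth=2 also yields the data item inside it.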

you might see prometheus noise like unknown_metric hb_store_arweave_requests_partition. indexing works regardless: i also tried disabling the metric locally in hb_store_arweave.erl (record_partition_metric/1), and indexing works with it disabled too

how i tested it locally:

dataitem in nested bundle: https://viewblock.io/arweave/tx/81d_5S8oas728eoaIa30rKmg7xczZ68kjFFTgpfAdBI

bundle 1: https://viewblock.io/arweave/tx/hbi3lraO53m8i4bj2HGvrL7Ye0WD7bcVZ7uM4JqW5GU
bundle 2: https://viewblock.io/arweave/tx/SWFCDZeTOOVtbAvHF5t7vUev-TtEKWnh2t77ANUfwWc

  TestStore = hb_test_utils:test_store().
  StoreOpts = #{<<"index-store">> => [TestStore]}.
  Store = [
    TestStore,
    #{
      <<"store-module">> => hb_store_arweave,
      <<"name">> => <<"cache-arweave">>,
      <<"index-store">> => [TestStore],
      <<"arweave-node">> => <<"https://arweave.net">>
    }
  ].
  Opts = #{
    store => Store,
    arweave_index_ids => true,
    arweave_index_store => StoreOpts
  }.
  Node = hb_http_server:start_node(Opts).
  % index the +1/-1 block range around the dataitem's block height: 1_866_948
  hb_http:get(
    Node,
    <<"/~copycat@1.0/arweave&from=1866949&to=1866947&mode=write">>,
    #{}
  ).

  hb_http:get(
    Node,
    <<"/~copycat@1.0/arweave&id=SWFCDZeTOOVtbAvHF5t7vUev-TtEKWnh2t77ANUfwWc&depth=2">>,
    #{}
  ).
  % hb_ao:resolve(<<"SWFCDZeTOOVtbAvHF5t7vUev-TtEKWnh2t77ANUfwWc">>, Opts).
  % hb_ao:resolve(<<"hbi3lraO53m8i4bj2HGvrL7Ye0WD7bcVZ7uM4JqW5GU">>, Opts).
  hb_ao:resolve(<<"81d_5S8oas728eoaIa30rKmg7xczZ68kjFFTgpfAdBI">>, Opts).

response:

8> hb_ao:resolve(<<"81d_5S8oas728eoaIa30rKmg7xczZ68kjFFTgpfAdBI">>, Opts).
=ERROR REPORT==== 2-Mar-2026::22:35:42.259154 ===
Error in process <0.2695.0> with exit value:
{{unknown_metric,default,hb_store_arweave_requests_partition},
 [{prometheus_metric,check_mf_exists,4,
                     [{file,"/home/charmful0x/Desktop/fwd-workspace/hyperbeam/_build/default/lib/prometheus/src/prometheus_metric.erl"},
                      {line,191}]},
  {prometheus_counter,insert_metric,6,
                      [{file,"/home/charmful0x/Desktop/fwd-workspace/hyperbeam/_build/default/lib/prometheus/src/metrics/prometheus_counter.erl"},
                       {line,371}]}]}

{ok,#{<<"action">> => <<"Balance">>,
      <<"commitments">> =>
          #{<<"81d_5S8oas728eoaIa30rKmg7xczZ68kjFFTgpfAdBI">> =>
                #{<<"bundle">> => <<"false">>,
                  <<"commitment-device">> => <<"ans104@1.0">>,
                  <<"committed">> =>
                      [<<"action">>,<<"target">>,<<"data-protocol">>,
                       <<"variant">>,<<"type">>,<<"sdk">>],
                  <<"committer">> =>
                      <<"D7NdL5sDGU6q3HahHg-9Un2MpjH4Kj6IFp9k2r_LzCM">>,
                  <<"field-target">> =>
                      <<"0syT13r0s0tgPmIed95bJnuSqaD29HQNN8D3ElLSrsc">>,
                  <<"keyid">> =>
                      <<"publickey:pJCd4wE71hpb0Cc5DIMIbHT_IWL-x-gkrwvmipUop4NotHc6bOJs_b2SdGCz2CeU_F8P-VvUYQN1uCAMi4QF9L"...>>,
                  <<"original-tags">> =>
                      #{<<"1">> =>
                            #{<<"name">> => <<"Action">>,<<"value">> => <<"Balance">>},
                        <<"2">> =>
                            #{<<"name">> => <<"Target">>,
                              <<"value">> =>
                                  <<"D7NdL5sDGU6q3HahHg-9Un2MpjH4Kj6IFp9k2r_LzCM">>},
                        <<"3">> =>
                            #{<<"name">> => <<"Data-Protocol">>,<<"value">> => <<"ao">>},
                        <<"4">> =>
                            #{<<"name">> => <<"Variant">>,<<"value">> => <<"ao.TN.1">>},
                        <<"5">> =>
                            #{<<"name">> => <<"Type">>,<<"value">> => <<"Message">>},
                        <<"6">> =>
                            #{<<"name">> => <<"SDK">>,<<"value">> => <<"aoconnect">>}},
                  <<"signature">> =>
                      <<"Mi_-K3hbEeS3TxcFstxe6eEzZ61-iQGbTiNzU1Ss7hOl5VWZfVu_p6T11B_Tm__r3HvbFcVP-T79F2VH9HDE82PAIIH_eSo3"...>>,
                  <<"type">> => <<"rsa-pss-sha256">>},
            <<"X_uFip3-15ZmKwVK3V7Dik2300nfBJGL7DIjifLKk44">> =>
                #{<<"commitment-device">> => <<"httpsig@1.0">>,
                  <<"committed">> =>
                      [<<"action">>,<<"data-protocol">>,<<"sdk">>,<<"target">>,
                       <<"type">>,<<"variant">>],
                  <<"keyid">> => <<"constant:ao">>,
                  <<"signature">> =>
                      <<"X_uFip3-15ZmKwVK3V7Dik2300nfBJGL7DIjifLKk44">>,
                  <<"type">> => <<"hmac-sha256">>}},
      <<"data-protocol">> => <<"ao">>,
      <<"sdk">> => <<"aoconnect">>,
      <<"target">> =>
          <<"0syT13r0s0tgPmIed95bJnuSqaD29HQNN8D3ElLSrsc">>,
      <<"type">> => <<"Message">>,<<"variant">> => <<"ao.TN.1">>}}
9> 

additionally, i added a bundle header guard (invalid_bundle_header orelse HeaderSize > Size of) in download_bundle_header/3
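
A guard of this kind could look roughly like the sketch below. It is not the code in download_bundle_header/3: the module and function names are hypothetical, and the header layout (a 32-byte little-endian item count followed by one 64-byte entry per item) is assumed from the ANS-104 bundle format.

```erlang
-module(bundle_header_sketch).
-export([header_size/2]).

%% Assumed ANS-104 layout: a 32-byte little-endian item count, then one
%% 64-byte entry (32-byte size + 32-byte ID) per item.
%% Size is the total size of the enclosing bundle in bytes.
header_size(<<Count:256/little-unsigned-integer, _/binary>>, Size) ->
    HeaderSize = 32 + 64 * Count,
    case HeaderSize > Size of
        %% A header that claims to be larger than the bundle itself
        %% cannot be valid, so reject it before reading any entries.
        true -> {error, invalid_bundle_header};
        false -> {ok, HeaderSize}
    end;
header_size(_TooShort, _Size) ->
    %% Fewer than 32 bytes: the item count itself is missing.
    {error, invalid_bundle_header}.
```

Rejecting early keeps a corrupt or hostile item count from driving an oversized header download.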

@charmful0x
Author

tests were working locally (the flow in the PR body). after rebasing, every ~arweave@2.9 client request fails. i didn't touch that module or adjacent ones.

httpc returns 200, while gun+http2 returns {error, client_error}

14>   hb_http:get(    <<"http://chain-1.arweave.xyz:1984">>,    <<"/block/height/1864412">>,    #{http_client => httpc}  ).
{ok,#{<<"access-control-allow-origin">> => <<"*">>,
      <<"body">> =>
          <<"{\"replica_format\":1,\"packing_difficulty\":10,\"unpacked_chunk_hash\":\"RJqs9hV3UOv5Li4NmmPIpedOQ30FozUkaby21"...>>,
      <<"content-length">> => <<"731845">>,
      <<"date">> => <<"Tue, 03 Mar 2026 14:10:31 GMT">>,
      <<"server">> => <<"Cowboy">>,<<"status">> => 200}}
15>   hb_http:get(    <<"http://chain-1.arweave.xyz:1984">>,    <<"/block/height/1864412">>,    #{http_client => gun, protocol => http2}  ).
=== HB DEBUG ===[654119ms in <0.85.0> @ hb_http:108 / hb_http_client:39 / hb_http_client:196 / hb_http_client:856]==>
unknown_status_class: status_class: error: client_error
{error,client_error}
16>   hb_http:get(    <<"http://tip-1.arweave.xyz:1984">>,    <<"/chunk/384493833199863">>,    #{http_client => httpc}  ).
{ok,#{<<"access-control-allow-origin">> => <<"*">>,
      <<"body">> =>
          <<"{\"data_path\":\"FvUnvHISqEj4LkcbNPDXSpuFBqEzTzhPK_z8sF4O_kUAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAC8PA\",\"t"...>>,
      <<"content-length">> => <<"65185">>,
      <<"date">> => <<"Tue, 03 Mar 2026 14:11:00 GMT">>,
      <<"server">> => <<"Cowboy">>,<<"status">> => 200}}
17>   hb_http:get(    <<"http://tip-1.arweave.xyz:1984">>,    <<"/chunk/384493833199863">>,    #{http_client => gun, protocol => http2}  ).
=== HB DEBUG ===[27340ms in <0.85.0> @ hb_http:108 / hb_http_client:39 / hb_http_client:196 / hb_http_client:856]==>
unknown_status_class: status_class: error: client_error
{error,client_error}
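
To make the comparison above repeatable, a small helper along these lines could run one request through both client configurations. This is only a sketch: the module, `compare/3`, and `summarize/1` are hypothetical names, and `Get` is assumed to be a fun with the same shape as the hb_http:get/3 calls in the transcript.

```erlang
-module(client_compare_sketch).
-export([compare/3]).

%% Get is a fun taking (Node, Path, Opts), like the hb_http:get/3 calls
%% in the transcript. Passing it in keeps the sketch dependency-free.
compare(Get, Node, Path) ->
    Clients = [
        {httpc, #{http_client => httpc}},
        {gun_http2, #{http_client => gun, protocol => http2}}
    ],
    [{Name, summarize(Get(Node, Path, Opts))} || {Name, Opts} <- Clients].

%% Reduce a response to its status so the two clients are easy to diff.
summarize({ok, #{<<"status">> := Status}}) -> {ok, Status};
summarize({error, Reason}) -> {error, Reason};
summarize(Other) -> {unexpected, Other}.
```

In a HyperBEAM shell this might be called as client_compare_sketch:compare(fun hb_http:get/3, <<"http://chain-1.arweave.xyz:1984">>, <<"/block/height/1864412">>).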

cc @nikooo777 @samcamwilliams

@charmful0x
Author

current filter-protocol=ao policy behavior:

L1 admission

  • only L1 txs whose owner matches one of:
    • ao_bundler_turbo_addr (AO_BUNDLER_ADDR) or
    • ao_bundler_legacy_addr (AO_LEGACY_BUNDLER).

Turbo policy (AO_BUNDLER_ADDR)

  • L2 immediate child indexing:
    • index if the child's owner is ao_legacy_authority_addr or the child has a Data-Protocol=ao tag (this indexes both ao bundles and direct ao dataitems, if any)
  • depth 3 recursion:
    • recurse only into nested children that are both:
      • a bundle and
      • owned by ao_legacy_authority_addr
  • inside those recursed nested bundles:
    • children are assumed to be AO dataitems and are indexed via the fast path (ao_assume_authority_children=true). this assumption is derived from onchain patterns and gives faster indexing with fewer tag-header checks

Legacy policy (AO_LEGACY_BUNDLER) - neo uploader

the neo uploader seems to use mixed layouts: dataitems inside nested bundles, and L1 bundles with dataitems as direct children. since everything here comes from the neo uploader, an optimistic AO-filter assumption is safe and results in quicker indexing

  • L2 immediate child indexing:
    • index all children under that L1 bundle
  • depth 3 recursion:
    • recurse into any nested bundle child
  • inside recursed nested-bundles:
    • descendants assumed AO (same fast path)

non-AO mode

original indexing logic remains, no gating or filtering out
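
The policy described above can be condensed into three predicates. This is a minimal sketch, not the PR's code: the module, the function names, the `turbo`/`legacy`/`skip` atoms, and the shape of the `Child` map are all hypothetical, and the tag key is assumed to be matched exactly as `Data-Protocol`.

```erlang
-module(ao_filter_sketch).
-export([policy_for/1, index_child/2, recurse_into/2]).

-define(AO_LEGACY_AUTHORITY, <<"fcoN_xJeisVsPXA-trzVAuIiqO3ydLQxM-L4XbrQKzY">>).
-define(AO_BUNDLER_ADDR, <<"JNC6vBhjHY1EPwV3pEeNmrsgFMxH5d38_LHsZ7jful8">>).
-define(AO_LEGACY_BUNDLER, <<"FPjbN_btYKzcf8QASjs30v5C0FPv7XpwKXENBW8dqVw">>).

%% L1 admission: only txs owned by one of the two bundlers are considered.
policy_for(?AO_BUNDLER_ADDR) -> turbo;
policy_for(?AO_LEGACY_BUNDLER) -> legacy;
policy_for(_Owner) -> skip.

%% L2 indexing. Child is a hypothetical map with the keys
%% owner (binary), is_bundle (boolean) and tags (map of binaries).
index_child(turbo, #{owner := ?AO_LEGACY_AUTHORITY}) -> true;
index_child(turbo, #{tags := #{<<"Data-Protocol">> := <<"ao">>}}) -> true;
index_child(turbo, _Child) -> false;
%% Legacy policy: every child under the L1 bundle is indexed.
index_child(legacy, _Child) -> true.

%% Depth-3 recursion: Turbo descends only into bundles owned by the
%% legacy authority; legacy descends into any nested bundle.
recurse_into(turbo, #{is_bundle := true, owner := ?AO_LEGACY_AUTHORITY}) -> true;
recurse_into(legacy, #{is_bundle := true}) -> true;
recurse_into(_Policy, _Child) -> false.
```

Children found inside a recursed bundle would then skip these checks entirely, which is the fast path the policy describes.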

code reference

%% Policies filters
-define(AO_LEGACY_AUTHORITY, <<"fcoN_xJeisVsPXA-trzVAuIiqO3ydLQxM-L4XbrQKzY">>).
%% policy 1: AO messages are L3 dataitems:
%% 1- AO_BUNDLER_ADDR (Ardrive Turbo) sends bundles on Arweave (L1 txs)
%% 2- those bundles (1) are direct parents to nested bundles (nested with AO_LEGACY_AUTHORITY as owner)
%% 3- the (2) nested bundles are parents of L3 dataitems (ao messages)
%% 4- full path: AO_BUNDLER_ADDR L1 TXs (bundles) -> nested bundles owned by AO_LEGACY_AUTHORITY -> ao messages
%% example: hXztSyj_V6PXttCfzkeCWrgul7owCGcmYnz58ydgMCU
-define(AO_BUNDLER_ADDR, <<"JNC6vBhjHY1EPwV3pEeNmrsgFMxH5d38_LHsZ7jful8">>).
%% policy 2.1: AO messages are L2 dataitems:
%% 1- AO_LEGACY_BUNDLER sends bundles on Arweave (L1 txs)
%% 2- those bundles are direct parents to ao messages (L2 messages, have AO_LEGACY_AUTHORITY as owner)
%% 3- full path: L1 TXs (bundles) -> L2 dataitems (ao messages)
%% example: 8DcCpFij5Dpfd2P7EjeGKZWSpOmpyT1COAM9MNc5VII

%% policy 2.2: AO messages are L3 dataitems
%% 1- AO_LEGACY_BUNDLER sends bundles on Arweave (L1 txs)
%% 2- those bundles (1) are direct parents to nested bundles (nested with AO_LEGACY_AUTHORITY as owner)
%% 3- the (2) nested bundles are parents of L3 dataitems (ao messages)
%% 4- full path: AO_LEGACY_BUNDLER L1 TXs (bundles) -> nested bundles owned by AO_LEGACY_AUTHORITY -> ao messages
%% example: -MpPRIUBCBsWaGebFj-BtD42GKmuw8Wmkw37t_f63-I
-define(AO_LEGACY_BUNDLER, <<"FPjbN_btYKzcf8QASjs30v5C0FPv7XpwKXENBW8dqVw">>).

@JamesPiechota does ao use, or has it used, any uploader addresses besides AO_LEGACY_BUNDLER (neo uploader) and AO_BUNDLER_ADDR (Ardrive Turbo)?

@charmful0x
Author

closing this PR as the work continues in #734

@charmful0x closed this on Mar 6, 2026