Member

@hacheigriega hacheigriega commented Aug 13, 2025

Explanation of Changes

Implementation of x/core with full data request flow.

  • This PR only implements the message types that are necessary to simulate a full data request flow. The rest of the messages will be implemented in a separate PR.
  • x/tally has been merged into x/core.
  • Adds a wrapper of x/wasm to continue supporting Core Contract messages that are required for data request fulfillment: AddToAllowList (may not be necessary), Stake, PostDataRequest, CommitDataResult, and RevealDataResult.

States

allowlist:    0x00                            -> []PublicKey
stakers:      0x01 | PublicKey                -> Staker
dataRequests: 0x02 | DR_ID                    -> DataRequest
revealBodies: 0x03 | DR_ID | PublicKey        -> RevealBody
committing:   0x04 | DataRequestIndex         -> ()
revealing:    0x05 | DataRequestIndex         -> ()
tallying:     0x06 | DataRequestIndex         -> ()
timeoutQueue: 0x07 | DR_ID | Timeout_Height   -> ()
params:       0x08                            -> Params
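
The layout above reads as prefix-byte keys followed by the listed components. A minimal sketch of how two of these keys could be assembled, assuming a DR_ID is a 32-byte hash and Timeout_Height is an 8-byte big-endian integer (neither encoding is confirmed by this PR, and the helper names are hypothetical):

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	prefixDataRequests byte = 0x02
	prefixTimeoutQueue byte = 0x07
)

// dataRequestKey returns 0x02 | DR_ID.
func dataRequestKey(drID [32]byte) []byte {
	key := make([]byte, 0, 1+len(drID))
	key = append(key, prefixDataRequests)
	return append(key, drID[:]...)
}

// timeoutQueueKey returns 0x07 | DR_ID | Timeout_Height.
// The 8-byte big-endian height encoding is an assumption for illustration.
func timeoutQueueKey(drID [32]byte, timeoutHeight uint64) []byte {
	key := make([]byte, 0, 1+len(drID)+8)
	key = append(key, prefixTimeoutQueue)
	key = append(key, drID[:]...)
	return binary.BigEndian.AppendUint64(key, timeoutHeight)
}

func main() {
	var id [32]byte
	id[0] = 0xab
	fmt.Printf("%x\n", dataRequestKey(id))
	fmt.Printf("%x\n", timeoutQueueKey(id, 100))
}
```

Because every collection has its own prefix byte, a prefix scan over 0x07 visits only timeout queue entries.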

Messages

Owner

  • AddToAllowlist

Staking

  • Stake
  • SetStakingConfig -> UpdateParams

Data Requests

  • PostDataRequest
  • Commit
  • Reveal
  • SetDrConfig -> UpdateParams

Testing

(Write your test plan here)

Dependency Chain

Think about the changes (msgs, events, limits, etc.) made in this PR and see if you need to make an issue or a PR on the following repos.

Related PRs and Issues

Closes #587

@hacheigriega hacheigriega force-pushed the hy/core branch 5 times, most recently from d797d07 to 06b662d Compare August 15, 2025 01:11
@hacheigriega
Member Author

Oops, I thought I had fixed all the interchain test failures, but I see one is still failing. Looking into it now.

@hacheigriega hacheigriega force-pushed the hy/core branch 2 times, most recently from 444f094 to 1716ec1 Compare August 15, 2025 13:50
@hacheigriega hacheigriega changed the base branch from main to dev August 15, 2025 14:18
Contributor

@gluax gluax left a comment

Taking a break for now so I can give this PR the attention to detail it deserves.

Comment on lines 66 to 118
if err != nil {
	telemetry.SetGauge(1, types.TelemetryKeyDRFlowHalt)
	k.Logger(ctx).Error("[HALTS_DR_FLOW] failed to retrieve data request", "err", err)
	return nil
}
Contributor

Can this ever happen now that it's also stored in the module? Or I guess we still have the shim, so this stays for a while?

Comment on lines 113 to 134
	return err
}

err = k.UpdateDataRequestIndexing(ctx, dr.Index(), dr.Status, types.DATA_REQUEST_STATUS_UNSPECIFIED)
if err != nil {
	return err
}
dr.Status = types.DATA_REQUEST_STATUS_UNSPECIFIED

err = k.RemoveRevealBodies(ctx, dr.Id)
if err != nil {
	return err
}
err = k.RemoveDataRequest(ctx, dr.Id)
if err != nil {
	return err
}

dataResults[i].GasUsed = gasMeter.TotalGasUsed()
dataResults[i].Id, err = dataResults[i].TryHash()
if err != nil {
	return err
Contributor

These errors would fully exit the processing... is that what we want? Not sure how the chain side handled it before. The contract side only errored when it couldn't load the staking config, escrow, or staker, or when it failed to set the response.

Member Author

For crucial errors like state write errors or hashing errors, we returned an error to halt the chain. If the contract returned an error, we caught the error without halting the chain.
Once we add pausability, we could also pause data requests when a serious, unexpected error like this is encountered?

Contributor

That's a good food-for-thought question. I'm not sure we'd want to do that automatically, but there's merit in it too. Hmm.


// Phase 1: Filtering
//nolint:gosec // G115: No overflow guaranteed by validation logic.
filterResult, filterErr := ExecuteFilter(reveals, dr.ConsensusFilter, uint16(dr.ReplicationFactor), params, gasMeter)
Contributor

Oh I would prefer we don't cast fields every time we use them, and I assume we do the same for other similar fields. Is it painful to have the Protobuf struct have a validation method that would then ensure correct types?
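
The suggestion above could look roughly like the following: validate the range once on the struct, then expose a typed accessor so call sites need neither a cast nor a `//nolint` directive. This is only a sketch. The `DataRequest` type here is a stand-in for the generated Protobuf struct (Protobuf has no 16-bit integer type, so the field is assumed to be a `uint32`), and `ReplicationFactor16` is a hypothetical name.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// DataRequest is a simplified stand-in for the generated Protobuf struct.
type DataRequest struct {
	ReplicationFactor uint32
}

var errReplicationFactorRange = errors.New("replication factor exceeds uint16 range")

// Validate rejects values that would not survive a narrowing conversion.
func (dr DataRequest) Validate() error {
	if dr.ReplicationFactor > math.MaxUint16 {
		return errReplicationFactorRange
	}
	return nil
}

// ReplicationFactor16 validates and converts in one place.
func (dr DataRequest) ReplicationFactor16() (uint16, error) {
	if err := dr.Validate(); err != nil {
		return 0, err
	}
	return uint16(dr.ReplicationFactor), nil
}

func main() {
	rf, err := DataRequest{ReplicationFactor: 3}.ReplicationFactor16()
	fmt.Println(rf, err)

	_, err = DataRequest{ReplicationFactor: 70000}.ReplicationFactor16()
	fmt.Println(err)
}
```

The same pattern would apply to other narrowed fields, such as an exit code that has to fit in a `uint8`.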

StdOut []string
StdErr []string
Result []byte
ExitCode uint32
Contributor

Ideally a uint8

Member Author

The VM returns a uint32 value, so I think we should keep it until we change it from the VM side?

@hacheigriega hacheigriega force-pushed the hy/core branch 3 times, most recently from 59f7878 to 206282d Compare August 16, 2025 12:56
Contributor

@gluax gluax left a comment

Next batch of review comments.

return dists
}

func (k Keeper) ProcessDistributions(ctx sdk.Context, dists []types.Distribution, dr *types.DataRequest, minimumStake math.Int) error {
Contributor

Considering that in endblock.go in the same module we call GetGasMeterResults and then immediately call this function, it would seem we can just merge them, i.e. there is no need to build the distribution list in the first place. Or do we ever plan to have logic in between those two steps?

Member Author

Yeah, we could. Merging them just made testing harder, because the distribution amounts become difficult to check.

Contributor

I think it would be worth merging them, maybe we can make some test utilities to make testing that flow easier.

amount = math.MinInt(dist.DataProxyReward.Amount, remainingEscrow)
payoutAddr, err := sdk.AccAddressFromBech32(dist.DataProxyReward.PayoutAddress)
if err != nil {
// Should not be reachable because the address has been validated.
Contributor

I know we aren't doing this in this PR, but maybe add a log that says unreachable and gives the reason. That way, if it ever pops up in the logs, we can appropriately panic lol

Member Author

Well the validate function of data proxy registration msg does already verify that the payout address is valid.

Contributor

Yeee, just saying we should have a log statement that says unreachable, more as a paranoia thing than anything else.

amount = math.MinInt(dist.ExecutorReward.Amount, remainingEscrow)
staker, err := k.GetStaker(ctx, dist.ExecutorReward.Identity)
if err != nil {
return err
Contributor

The contract did not error in this case. It would simply emit an error event and move on to the next iteration of the loop. The return statement here would instead exit the function early, not doing the payouts at all.

Or is this another unreachable situation due to earlier checks? Or can executors ever unregister so their proxy no longer exists?

TLDR this is a change from the contract.

}

remainingEscrow = remainingEscrow.Sub(amount)
}
Contributor

We should be cautious with the errors in this loop. The contract would never error in this loop; instead it would skip over the distribution and move on to the next one. The returns here, if I understand correctly, would prevent that from happening.

Contributor

Though I think when we copy more of the tests over from the contract any issues here should be caught

Comment on lines +39 to +43
drID, err := msg.MsgHash()
if err != nil {
return nil, err
}
exists, err := m.HasDataRequest(ctx, drID)
Contributor

Difference from the contract here: the hash function on the contract returns a [u8; 32], and the map stores entries by that rather than by the hex string. I commented on the keeper about why we would prefer this.

Comment on lines +36 to +40
stakers collections.Map[string, types.Staker]

// Data request-related states:
// dataRequests is a map of data request IDs to data request objects.
dataRequests collections.Map[string, types.DataRequest]
Contributor

So far, looking at these, this is a change: we stored these by fixed-length u8 arrays. The reason is that this data structure converts the keys to a byte array under the hood anyway, so we are going from byte array -> hex string -> byte array again, which is wasteful.

Maybe some of the other collections are using string keys where they should be byte arrays as well, not sure. Checking as I go through.

Member Author

Yeah I think you're right. Will try changing the data request ID to bytes

Contributor

Sorry, I should have linked to it. Anywhere the key is used, it's used as bytes. If you drill down, this is done with a copy, which is O(n) every time, just like []byte(s) for a string.
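
To make the round trip concrete, here is a small sketch that keys a map by a fixed-size byte array and contrasts it with the hex-string detour. It is illustrative only: `drID`, `hexKey`, and `fromHexKey` are hypothetical names, and SHA-256 is used purely as a stand-in for whatever hash the module actually uses.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
)

// drID is a fixed-length data request ID, comparable and usable as a map key.
type drID [32]byte

// hashRequest uses SHA-256 only as a stand-in hash for illustration.
func hashRequest(payload []byte) drID {
	return sha256.Sum256(payload)
}

// hexKey is the extra encoding step a string-keyed map requires.
func hexKey(id drID) string {
	return hex.EncodeToString(id[:])
}

// fromHexKey is the decoding step needed to get back to raw bytes.
func fromHexKey(s string) (drID, error) {
	var id drID
	raw, err := hex.DecodeString(s)
	if err != nil {
		return id, err
	}
	if len(raw) != len(id) {
		return id, errors.New("data request ID must be 32 bytes")
	}
	copy(id[:], raw)
	return id, nil
}

func main() {
	id := hashRequest([]byte("example request"))
	byBytes := map[drID]string{id: "stored under the raw 32-byte ID"}

	// The hex form doubles the key length and has to be decoded again
	// before it can be used as bytes.
	back, err := fromHexKey(hexKey(id))
	if err != nil {
		panic(err)
	}
	fmt.Println(byBytes[back], len(hexKey(id)), len(id))
}
```

A fixed-size array type also lets the compiler enforce the key length, which a string key cannot do.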

if err != nil {
return nil, err
}
publicKey, err := hex.DecodeString(msg.PublicKey)
Contributor

We should be using the byte array version to check/update; see the comment on the keeper for the rationale.

Comment on lines +94 to +95
Escrow: msg.Funds.Amount,
TimeoutHeight: ctx.BlockHeight() + int64(drConfig.CommitTimeoutInBlocks),
Contributor

I'm still unsure if these should be attached here or not. I think we can leave it for now but we should do the gas calculation stuff again to see if it improved/worsened things.

@hacheigriega hacheigriega force-pushed the hy/core branch 2 times, most recently from 240a2a7 to 7f9067a Compare August 18, 2025 15:29
Contributor

@gluax gluax left a comment

Good job HY!

Comment on lines +238 to +245
// Verify against the stored commit.
expectedCommit, err := msg.RevealHash()
if err != nil {
	return nil, err
}
if !bytes.Equal(commit, expectedCommit) {
	return nil, types.ErrRevealMismatch
}
Contributor

Let's move this check to before the msg.Validate call. Checking this is cheaper than checking a possibly large number of proxy public keys, so it's better to get this check out of the way first and fail the commit sooner, rather than checking all of those and then failing.

Comment on lines +32 to +39
func (k Keeper) GetStakersCount(ctx sdk.Context) (uint32, error) {
	count := uint32(0)
	err := k.stakers.Walk(ctx, nil, func(_ string, _ types.Staker) (stop bool, err error) {
		count++
		return false, nil
	})
	return count, err
}
Contributor

We should definitely modify the stakers on the keeper struct to have a count so we don't have to iterate it every time. It's okay-ish while things are allowlisted, but definitely not once we turn that off.
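
One way to read this suggestion is a counter that is updated on every insert and removal, so that reading the count is O(1) instead of a full walk. The sketch below uses an in-memory map as a stand-in for the keeper's store, where counting would otherwise require iteration; `stakerStore` and its methods are hypothetical names, not part of the PR.

```go
package main

import "fmt"

// stakerStore keeps a running count next to the staker set.
type stakerStore struct {
	stakers map[string]struct{}
	count   uint32
}

func newStakerStore() *stakerStore {
	return &stakerStore{stakers: make(map[string]struct{})}
}

// Add inserts a staker and bumps the count only for new entries.
func (s *stakerStore) Add(publicKey string) {
	if _, exists := s.stakers[publicKey]; exists {
		return
	}
	s.stakers[publicKey] = struct{}{}
	s.count++
}

// Remove deletes a staker and decrements the count only if it existed.
func (s *stakerStore) Remove(publicKey string) {
	if _, exists := s.stakers[publicKey]; !exists {
		return
	}
	delete(s.stakers, publicKey)
	s.count--
}

// Count returns the stored counter without iterating.
func (s *stakerStore) Count() uint32 {
	return s.count
}

func main() {
	store := newStakerStore()
	store.Add("02aa")
	store.Add("02aa") // duplicate, not counted twice
	store.Add("03bb")
	store.Remove("03bb")
	fmt.Println(store.Count())
}
```

In a real store the counter and the entry would have to be written in the same state transition so they cannot drift apart.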

Comment on lines +60 to +65
dr, err := k.GetDataRequest(ctx, drID)
if err != nil {
return err
}

err = k.UpdateDataRequestIndexing(ctx, dr.Index(), dr.Status, types.DATA_REQUEST_STATUS_TALLYING)
Contributor

I do think this is a drawback of storing the status on the Data Request itself. We have to get the data request as well every time instead of just updating a Status map.

Comment on lines +73 to +75
if !semver.IsValid("v"+m.Version) || semver.Prerelease("v"+m.Version) != "" || semver.Build("v"+m.Version) != "" {
	return ErrInvalidVersion.Wrapf("%s", m.Version)
}
Contributor

I don't remember if we need to do this in multiple places, but if we do, it could be a good idea to pull this into a separate function and reuse it.
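
A shared helper along these lines might look like the sketch below. To stay runnable without external modules it swaps the `semver` package calls for a standard-library regular expression, so it is an approximation of the check in the diff rather than an equivalent: the pattern accepts MAJOR, MAJOR.MINOR, or MAJOR.MINOR.PATCH without leading zeros and rejects any prerelease or build suffix. The function name `validateReleaseVersion` is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// releaseVersion matches MAJOR[.MINOR[.PATCH]] with no leading zeros and no
// prerelease or build metadata. It is a stdlib stand-in for the semver calls.
var releaseVersion = regexp.MustCompile(`^(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*)){0,2}$`)

// validateReleaseVersion is the single place callers would use for this check.
func validateReleaseVersion(version string) error {
	if !releaseVersion.MatchString(version) {
		return fmt.Errorf("invalid version: %q", version)
	}
	return nil
}

func main() {
	for _, v := range []string{"1.2.3", "1.2.3-rc1", "1.2.3+build", "01.0.0"} {
		fmt.Printf("%-12s valid=%v\n", v, validateReleaseVersion(v) == nil)
	}
}
```

In the module itself the helper body would more likely keep the existing `semver` calls and simply be reused from each `Validate` method.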

return nil
}

func (dc *DataRequestConfig) Validate() error {
Contributor

Ah, you can ignore my earlier comment about the NonZero stuff; I hadn't seen this yet.

@hacheigriega hacheigriega marked this pull request as draft September 4, 2025 15:57
- Rename some field names for clarity.
- Use methods instead of direct access to collections.
- Improve comments.
- Improve PostDataRequest message validation. Add relevant tests.
- Change block height types to int64, except for hashing.
- Lint.
- Fixes a bug where status transition from committing to tallying was not considered as a possibility.
- Committing, revealing, and tallying are combined into a single collection with different prefixes.
- DataRequestStatus enum type now follows proto3 requirements.
- Invalid status transition is now identified as an error in UpdateDataRequestIndexing.
- DataRequest's TimeoutHeight is set to -1 if it is not in timeout queue.
- Stop distribution if data request escrow runs out.
- RevealBody.Reveal is now a byte slice instead of a base64 string.
- Core EndBlock now completes the full tally process of a given data request in one loop iteration.
- Validate that a reveal's exit code can fit in a uint8.
- Validate version in PostDataRequest.
- Basic module parameter validation.
@hacheigriega hacheigriega changed the base branch from dev to main September 5, 2025 13:26
@hacheigriega
Member Author

I will split up this PR and open new ones since this is getting way too long. I will keep this in draft and make sure to address @gluax's comments.


Successfully merging this pull request may close these issues.

✨ x/core essentials