USA to drop universal application: implement per-wallet application #135

Open
eprbell opened this issue Oct 26, 2024 · 70 comments

Comments

@eprbell
Owner

eprbell commented Oct 26, 2024

Discussion on this topic started here: #134

@eprbell added the "help wanted" label Oct 26, 2024
@eprbell
Owner Author

eprbell commented Nov 4, 2024

Some useful links on this topic:

@macanudo527
Collaborator

macanudo527 commented Nov 11, 2024

This is what the great magnificent supercomputer in the sky aka CoPilot had to say about this:

To modify the project to track transactions by individual wallets and calculate the cost basis based on the purchase price for the crypto asset in each wallet, you will need to make changes to several parts of the codebase. Here is a step-by-step guide to help you implement this feature:

  1. Update the TransactionSet class to include wallet information.

  2. Modify the InputData class to handle wallet-specific transactions.

  3. Update the GainLossSet class to calculate gain/loss based on wallet-specific transactions.

  4. Modify the tax_engine.py to use the updated classes and methods.

Step 1: Update the TransactionSet Class

Add wallet information to the TransactionSet class and ensure it can handle transactions for different wallets.

Step 2: Modify the InputData Class

Update the InputData class to handle wallet-specific transactions. This involves adding methods to filter transactions by wallet.

Step 3: Update the GainLossSet Class

Modify the GainLossSet class to calculate gain/loss based on wallet-specific transactions.

Step 4: Modify tax_engine.py

Update the tax_engine.py to use the updated classes and methods.

Here is an example of how you can modify the tax_engine.py:

def _create_unfiltered_taxable_event_set(configuration: Configuration, input_data: InputData) -> Dict[str, TransactionSet]:
    wallet_taxable_event_sets: Dict[str, TransactionSet] = {}
    for wallet in input_data.wallets:
        taxable_event_set: TransactionSet = TransactionSet(configuration, "MIXED", input_data.asset, MIN_DATE, MAX_DATE)
        for transaction_set in [
            input_data.get_unfiltered_in_transaction_set(wallet),
            input_data.get_unfiltered_out_transaction_set(wallet),
            input_data.get_unfiltered_intra_transaction_set(wallet),
        ]:
            for entry in transaction_set:
                transaction = cast(AbstractTransaction, entry)
                if transaction.is_taxable():
                    taxable_event_set.add_entry(transaction)
        wallet_taxable_event_sets[wallet] = taxable_event_set
    return wallet_taxable_event_sets


def _create_unfiltered_gain_and_loss_set(
    configuration: Configuration, accounting_engine: AccountingEngine, input_data: InputData, wallet_taxable_event_sets: Dict[str, TransactionSet]
) -> Dict[str, GainLossSet]:
    wallet_gain_loss_sets: Dict[str, GainLossSet] = {}
    for wallet, taxable_event_set in wallet_taxable_event_sets.items():
        gain_loss_set: GainLossSet = GainLossSet(configuration, input_data.asset, MIN_DATE, MAX_DATE)
        new_accounting_engine: AccountingEngine = accounting_engine.__class__(accounting_engine.years_2_methods)
        taxable_event_iterator: Iterator[AbstractTransaction] = iter(cast(Iterable[AbstractTransaction], taxable_event_set))
        acquired_lot_iterator: Iterator[InTransaction] = iter(cast(Iterable[InTransaction], input_data.get_unfiltered_in_transaction_set(wallet)))

        new_accounting_engine.initialize(taxable_event_iterator, acquired_lot_iterator)

        try:
            gain_loss: GainLoss
            taxable_event: AbstractTransaction
            acquired_lot: Optional[InTransaction]
            taxable_event_amount: RP2Decimal
            acquired_lot_amount: RP2Decimal
            total_amount: RP2Decimal = ZERO

            (taxable_event, acquired_lot, taxable_event_amount, acquired_lot_amount) = _get_next_taxable_event_and_acquired_lot(
                new_accounting_engine, None, None, ZERO, ZERO
            )

            while taxable_event:
                AbstractTransaction.type_check("taxable_event", taxable_event)
                if acquired_lot is None:
                    raise RP2RuntimeError("Parameter 'acquired_lot' is None")
                InTransaction.type_check("acquired_lot", acquired_lot)

                if taxable_event_amount == acquired_lot_amount:
                    gain_loss = GainLoss(configuration, taxable_event_amount, taxable_event, acquired_lot)
                    total_amount += taxable_event_amount
                    gain_loss_set.add_entry(gain_loss)
                    (taxable_event, acquired_lot, taxable_event_amount, acquired_lot_amount) = _get_next_taxable_event_and_acquired_lot(
                        new_accounting_engine, None, None, ZERO, ZERO
                    )
                elif taxable_event_amount < acquired_lot_amount:
                    gain_loss = GainLoss(configuration, taxable_event_amount, taxable_event, acquired_lot)
                    total_amount += taxable_event_amount
                    gain_loss_set.add_entry(gain_loss)
                    (taxable_event, acquired_lot, taxable_event_amount, acquired_lot_amount) = _get_next_taxable_event_and_acquired_lot(
                        new_accounting_engine, taxable_event, acquired_lot, ZERO, acquired_lot_amount - taxable_event_amount
                    )
                else:
                    gain_loss = GainLoss(configuration, acquired_lot_amount, taxable_event, acquired_lot)
                    total_amount += acquired_lot_amount
                    gain_loss_set.add_entry(gain_loss)
                    (taxable_event, acquired_lot, taxable_event_amount, acquired_lot_amount) = _get_next_taxable_event_and_acquired_lot(
                        new_accounting_engine, taxable_event, None, taxable_event_amount - acquired_lot_amount, ZERO
                    )
        except AcquiredLotsExhaustedException:
            raise RP2ValueError("Total in-transaction crypto value < total taxable crypto value") from None
        except TaxableEventsExhaustedException:
            pass

        wallet_gain_loss_sets[wallet] = gain_loss_set

    return wallet_gain_loss_sets


def compute_tax(configuration: Configuration, accounting_engine: AccountingEngine, input_data: InputData) -> Dict[str, ComputedData]:
    Configuration.type_check("configuration", configuration)
    AccountingEngine.type_check("accounting_engine", accounting_engine)
    InputData.type_check("input_data", input_data)

    wallet_taxable_event_sets = _create_unfiltered_taxable_event_set(configuration, input_data)
    wallet_gain_loss_sets = _create_unfiltered_gain_and_loss_set(configuration, accounting_engine, input_data, wallet_taxable_event_sets)

    computed_data: Dict[str, ComputedData] = {}
    for wallet, gain_loss_set in wallet_gain_loss_sets.items():
        computed_data[wallet] = ComputedData(
            input_data.asset,
            wallet_taxable_event_sets[wallet],
            gain_loss_set,
            input_data,
            configuration.from_date,
            configuration.to_date,
        )

    return computed_data

This example assumes that the InputData class has methods like get_unfiltered_in_transaction_set(wallet), get_unfiltered_out_transaction_set(wallet), and get_unfiltered_intra_transaction_set(wallet) to retrieve transactions for a specific wallet. You will need to implement these methods and ensure that the TransactionSet and GainLossSet classes can handle wallet-specific transactions.

By following these steps, you can modify the project to track transactions by individual wallets and calculate the cost basis based on the purchase price for the crypto asset in each wallet.

It goes on to say that AccountingEngine of course needs to be modified:

class AccountingEngine:
    def __init__(self, years_2_methods: Dict[int, str]):
        self.years_2_methods = years_2_methods
        self.wallets = {}

    def initialize(self, taxable_event_iterator: Iterator[AbstractTransaction], acquired_lot_iterator: Iterator[InTransaction], wallet: str):
        self.wallets[wallet] = {
            "taxable_event_iterator": taxable_event_iterator,
            "acquired_lot_iterator": acquired_lot_iterator,
        }

    def get_next_taxable_event_and_amount(
        self,
        wallet: str,
        taxable_event: Optional[AbstractTransaction],
        acquired_lot: Optional[InTransaction],
        taxable_event_amount: RP2Decimal,
        acquired_lot_amount: RP2Decimal,
    ) -> Tuple[AbstractTransaction, Optional[InTransaction], RP2Decimal, RP2Decimal]:
        wallet_data = self.wallets[wallet]
        taxable_event_iterator = wallet_data["taxable_event_iterator"]
        acquired_lot_iterator = wallet_data["acquired_lot_iterator"]

        # Logic to get the next taxable event and acquired lot for the specified wallet
        # ...
        raise NotImplementedError("Placeholder: the selection logic is not part of this sketch")

    def get_acquired_lot_for_taxable_event(
        self,
        wallet: str,
        taxable_event: AbstractTransaction,
        acquired_lot: Optional[InTransaction],
        taxable_event_amount: RP2Decimal,
        acquired_lot_amount: RP2Decimal,
    ) -> Tuple[AbstractTransaction, Optional[InTransaction], RP2Decimal, RP2Decimal]:
        wallet_data = self.wallets[wallet]
        acquired_lot_iterator = wallet_data["acquired_lot_iterator"]

        # Logic to get the acquired lot for the specified taxable event and wallet
        # ...
        raise NotImplementedError("Placeholder: the selection logic is not part of this sketch")

Basically, just adding wallet storage to each part of the engine. Is there something I'm missing? This is AI generated, so please go over it carefully.

@eprbell
Owner Author

eprbell commented Nov 12, 2024

Thanks, I'll read it. I had some rough ideas on how to approach the problem:

  • do queue analysis to understand which wallet each lot ends up at after transfers;
  • do the tax analysis wallet by wallet, rather than globally (as we do now);
  • somehow normalize/unify the results in the same report.

My only problem right now is finding the time to work on it...

@macanudo527
Collaborator

macanudo527 commented Nov 12, 2024

I think we can take it piece by piece. The first piece is to modify the initial reading in of the data to sort the different lots into different wallets. We can probably start there and build out some tests for it. I should have about the next month or so to work on and submit code. I don't think it will take that much time as long as we are pretty systematic about it.

For example, I think the first step is to create a function in tax_engine.py that sorts the transactions into wallets. We can create the function now, write an isolated test for it, and open a PR for that.

I guess this function would sort in and out transactions pretty easily, just by whatever exchange they happened on. Then, intra transactions will be split into non-taxable in and out transactions in their respective wallets.
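A minimal sketch of that sorting step, assuming a simplified, hypothetical `Tx` record rather than RP2's real InTransaction/OutTransaction/IntraTransaction classes (fees and the sent/received difference on transfers are ignored here):

```python
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Tx:
    """Hypothetical, simplified transaction record (not an RP2 class)."""

    kind: str  # "in", "out" or "intra"
    timestamp: str  # ISO date string, which sorts chronologically
    amount: Decimal
    wallet: Optional[str] = None  # used by "in" and "out"
    from_wallet: Optional[str] = None  # used by "intra"
    to_wallet: Optional[str] = None  # used by "intra"
    taxable: bool = True


def sort_into_wallets(transactions: List[Tx]) -> Dict[str, List[Tx]]:
    """Group transactions by wallet, splitting each transfer in two."""
    wallets: Dict[str, List[Tx]] = defaultdict(list)
    for tx in sorted(transactions, key=lambda t: t.timestamp):
        if tx.kind in ("in", "out"):
            wallets[tx.wallet].append(tx)
        elif tx.kind == "intra":
            # A transfer becomes a non-taxable "out" in the source wallet
            # and a non-taxable "in" in the destination wallet.
            wallets[tx.from_wallet].append(
                Tx("out", tx.timestamp, tx.amount, wallet=tx.from_wallet, taxable=False)
            )
            wallets[tx.to_wallet].append(
                Tx("in", tx.timestamp, tx.amount, wallet=tx.to_wallet, taxable=False)
            )
        else:
            raise ValueError(f"Unknown transaction kind: {tx.kind}")
    return dict(wallets)
```

Because the function only depends on its input list, it can be unit tested in isolation before anything else in the engine changes.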

I think this handles this first step right?

  • do queue analysis to understand which wallet each lot ends up at after transfers;

Then we can cycle through the wallets, probably in a multithreaded way, to process all the transactions using the current engine. That will cover the next step:

  • do the tax analysis wallet by wallet, rather than globally (as we do now);

And finally merge all the GainLossSets for the final report. Am I missing something big?

I can probably put the code together as long as you can review it by the end of the year.

@eprbell
Owner Author

eprbell commented Nov 13, 2024

Yes, that sounds reasonable. However, I'll add a few more considerations that complicate the picture slightly:

  • Queue analysis (or transfer analysis) isn't simply about tracking lots and where they go: transferring can split a lot into parts. E.g. if I buy 1 BTC on CB and then send 0.5 BTC to a HW, I started with one lot and, after transferring, I ended up with two lots.
  • The tax engine should be able to work using either universal application or per wallet application.
  • Selection of which one to use should be left to the country plugin.
  • Additionally, some countries (like the US) support universal up to a certain year (2024 for the US), then per wallet: this should also be reflected in the country plugin.
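The year-based selection could be sketched as follows. The `Application` enum, the table layout, and `application_for_year` are hypothetical names for illustration, not RP2's country plugin API; the only fact taken from the thread is that the US is universal through 2024 and per-wallet afterwards.

```python
from enum import Enum
from typing import Dict


class Application(Enum):
    UNIVERSAL = "universal"
    PER_WALLET = "per_wallet"


# Hypothetical per-country table: first year in which each application applies.
# Year 0 stands for "from the beginning".
US_APPLICATIONS: Dict[int, Application] = {
    0: Application.UNIVERSAL,
    2025: Application.PER_WALLET,
}


def application_for_year(year: int, table: Dict[int, Application]) -> Application:
    """Return the entry with the latest start year that is <= year."""
    start_years = [start for start in table if start <= year]
    if not start_years:
        raise ValueError(f"No application defined for year {year}")
    return table[max(start_years)]
```

A country that only supports universal application would carry a single-entry table, so the tax engine would not need country-specific branches.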

There are additional complications such as which method to use for transfers (FIFO, LIFO, etc.). Some options:

  • Just use FIFO;
  • Same as the accounting method;
  • Let user select a method that may be different than the accounting method.

I think we should start with option one or two.
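To make the options concrete, here is a small sketch of picking the lot to transfer under each method. `Lot` and `select_lot_to_transfer` are hypothetical names, and the ordering rules are the usual definitions (FIFO/LIFO by acquisition date, HIFO/LOFO by spot price), not code taken from RP2:

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Lot:
    """Hypothetical acquired lot with funds still available."""

    timestamp: date
    amount: Decimal
    spot_price: Decimal


_SELECTORS: Dict[str, Callable[[List[Lot]], Lot]] = {
    "FIFO": lambda lots: min(lots, key=lambda lot: lot.timestamp),
    "LIFO": lambda lots: max(lots, key=lambda lot: lot.timestamp),
    "HIFO": lambda lots: max(lots, key=lambda lot: lot.spot_price),
    "LOFO": lambda lots: min(lots, key=lambda lot: lot.spot_price),
}


def select_lot_to_transfer(lots: List[Lot], method: str = "FIFO") -> Lot:
    """Pick the next lot to move out of a wallet under the given method."""
    available = [lot for lot in lots if lot.amount > 0]
    if not available:
        raise ValueError("No lots with funds left")
    return _SELECTORS[method](available)
```

Option one corresponds to always passing "FIFO"; option two corresponds to passing whatever accounting method is configured for that year.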

I think we should write a brief high-level design of this feature first: let me see if I can come up with a quick document over the weekend, to start the discussion.

@macanudo527
Collaborator

Sorry, I guess I didn't realize until just now that this only applies to 2025, so for when we file taxes in 2026. For some reason, I thought we had to have this ready for filing taxes in 2025. I was in a panic. I guess we still have time, but if you outline what is needed I can try to have a whack at it.

@eprbell
Owner Author

eprbell commented Nov 13, 2024

Yes, according to the Reddit thread, new rules are effective from 1/1/2025. So we have 1 year to figure it out.

@eprbell
Owner Author

eprbell commented Nov 18, 2024

I'm making progress on the design of per-wallet application but it is still unfinished. I realized we can apply the existing accounting method layer to the new logic to pick which lot to transfer, which is nice. However we need a few additions to the existing infrastructure:

  • add add_acquired_lot() method to AbstractAcquiredLotCandidates
  • add a get_artificial_id_from_row() function to unify management of artificial transaction row ids (which are negative): the new design creates one artificial InTransaction for each IntraTransaction and this new transaction needs an artificial id.

Here's the unfinished design so far. How to switch from universal to per-wallet from one year to the next is still TBD.

@macanudo527
Collaborator

Okay, I just looked up details about Japan: they use universal application, and you can use FIFO, LIFO, etc. based on all of your holdings as a whole. So, we will have to combine universal application with FIFO, etc. Does your plan account for that?
Honestly, I'm okay with the current system of FIFO and universal application if we can't implement, for example, universal application with LIFO. But it is something that we will probably need in the future to support all countries.

@eprbell
Owner Author

eprbell commented Nov 20, 2024

Universal application + FIFO/LIFO/HIFO/LOFO is already supported today (you can even change from one accounting method to another year over year). See: https://github.com/eprbell/rp2/blob/main/docs/user_faq.md#can-i-change-accounting-method. Based on what you're saying it sounds like we can add more accounting methods here: https://github.com/eprbell/rp2/blob/main/src/rp2/plugin/country/jp.py#L41

The design I'm working on supports any combination of per-wallet/universal application + FIFO/LIFO/HIFO/LOFO (including changing from one combination to another year over year). This high-generality approach is proving both interesting and hard to do, so it will require some more time to iron out all the details. It's good that we can reuse the existing accounting method infrastructure for lot selection in transfers, but the problem goes deeper than I thought. When I'm closer to something presentable, we can discuss it and brainstorm a bit.

@eprbell
Owner Author

eprbell commented Dec 1, 2024

The per-wallet application design is more or less complete. It can still be improved, but it captures most of the concepts: feel free to take a look and let me know what you think. Next I will probably do a bit of prototyping to make sure the ideas behind the design hold water.

@macanudo527
Collaborator

@eprbell I read through it and it looks pretty sound. I'll have to give it some time and read through it again just to double check, but I think this will handle what we need. Thanks for working it out. It looks like a lot of work and you gave some good examples.

@eprbell removed the "help wanted" label Dec 14, 2024
@eprbell
Owner Author

eprbell commented Dec 14, 2024

I'm making good progress on the implementation and unit testing of the transfer analysis function. Sorry, @macanudo527, I know you expressed some interest in working on this: I wanted to write a prototype to verify the design, but ended up finding more and more corner cases, and adjusting the code accordingly to deal with them. So basically what started as a prototype is becoming the real thing. I will open a PR for this though: it would be good to get your feedback before merging.

@macanudo527
Collaborator

No worries, I can just spend time on some of the other core functions instead. Looking forward to it. I'll be out for the end of the year (Dec 23rd - Jan 7th), but after that I can look at it.

@eprbell
Owner Author

eprbell commented Dec 16, 2024

Sounds good (we're in no rush). Have a great time during the holidays!

@eprbell
Owner Author

eprbell commented Dec 16, 2024

US taxpayers, watch this informative interview by Andreas Antonopoulos on per-wallet application in the US.

@gbtorrance

Hi. Hope you don't mind me commenting as a non-code-contributing member of the RP2 community.

I watched the Andreas Antonopoulos interview. Honestly, I found it extremely difficult to follow due to him constantly interrupting the guest. I found this video by "Crypto Tax Girl" to be much clearer and easier to follow. If you have a chance to watch it, I'd be interested to know if you feel it lines up with your understanding of these changes.

I have a few questions if I may:

  1. Under the new per-wallet tracking with FIFO, if one transfers crypto from wallet A to wallet B, presumably the first crypto in wallet A will be transferred to wallet B. But when it gets to wallet B, where does it go in the FIFO queue? Does it automatically go to the end with a new date (i.e. the date of transfer), or does it get slotted into the queue at some point in the middle based on the original date of purchase? (Hopefully that makes sense.)
  2. There has been talk (including in both videos) about the "safe harbor" provision. I've seen two extremes for what one needs to do by the end of 2024 to comply. One is to simply declare the rules one is going to use going forward. (This is like what Crypto Tax Girl proposes.) The other is to fully document one's crypto holdings and how cost basis is allocated to each wallet. The latter approach would seem to be infeasible for RP2 users, given that the code changes are still in process, right? Is there a way -- even if somewhat manual -- to know at this point how RP2 will allocate cost basis to each wallet so that this information can be documented by the end of 2024?

Thoughts?

Thanks for all you do to make RP2 available and up-to-date! Much appreciated.

@eprbell
Owner Author

eprbell commented Dec 19, 2024

Feedback is always welcome from anybody (code-contributing or not). Thanks for the video: I haven't watched it yet, but I will over the weekend. I think the main takeaway from Andreas' video is to follow the procedure he and his guest recommended before 12/31/24: basically consolidate everything into one wallet (if possible). The description of the video summarizes the steps to take. This will simplify accounting going forward.

As for your questions:

  1. The current thinking is to let the user select two separate things: the transfer semantics and the accounting method. The first one is part of the new feature being developed to support per-wallet application, the second one is already available today (and doesn't change). Either of these two can be FIFO, LIFO, HIFO or LOFO (see the design document). So you could use FIFO for transfers and then HIFO as the per-wallet accounting method after transfers. However this feature is still in development and may change: take the above with a grain of salt. Again, it may be a good idea to consolidate everything before EOY as suggested by Andreas.
  2. The first feature (document each lot's cost basis) won't be supported. The second one will (see answer above).

Hope this helps.

@gbtorrance

Thank you for the reply!

FWIW, I reviewed the design document -- twice -- but since it's very code-centric, and I'm not familiar with the overall application design, I wasn't able to understand all that much about the approach being taken. (That's not an issue. It just is what it is.)

Regarding your reply, let me see if I understand:

So there are "transfer semantics" and "accounting method", each of which could be FIFO, LIFO, HIFO, or LOFO. Does that mean that if "transfer" is set to FIFO and "accounting" is set to HIFO that, when a transfer is done, the basis with the oldest date (first in) will be moved to the new wallet. And, similarly, when a token is sold within a wallet, the basis with the highest price (highest in) will be associated with the sale?

Am I understanding that correctly?

Assuming I am, I'm still not clear what happens to a token when it is transferred to another wallet and "transfer semantics" is FIFO. Does it get assigned a new "in date" in the destination wallet or does it retain its original "in date" from when it was purchased?

In my original post I asked it this way:

Under the new per-wallet tracking with FIFO, if one transfers crypto from wallet A to wallet B, presumably the first crypto in wallet A will be transferred to wallet B. But when it gets to wallet B, where does it go in the FIFO queue? Does it automatically go to the end with a new date (i.e. the date of transfer), or does it get slotted into the queue at some point in the middle based on the original date of purchase?

At the time I imagined each wallet would have a "queue" of transactions, but now I understand it's probably more like a pool of transactions that can be sorted in any way at runtime, as needed, based on whether a transfer is being done ("transfer semantics") or a sell is being done ("accounting method"). Is that correct?

That being the case, I would guess that the transferred token (and corresponding basis) would retain the original purchase date even after it is moved to the destination wallet. Here's an example (a variant of yours from the design):

  • 1/1: InTransaction of 10 BTC on Coinbase
  • 2/1: InTransaction of 5 BTC on Kraken
  • 3/1: IntraTransaction of 4 BTC from Coinbase to Kraken
  • 4/1: OutTransaction of 2 BTC from Kraken

If both "transfer semantics" and "accounting method" are FIFO, does that mean that the OutTransaction on 4/1 will use the Coinbase transaction basis from 1/1 or the Kraken transaction basis from 2/1? I would assume the former. In other words, the original "in date" associated with the 4 BTC moved from Coinbase to Kraken will be retained, and when FIFO is used for the OutTransaction, 2 of those 4 BTC will be sold, since 1/1 is the new "first in" date of the Kraken wallet (even though, technically, the 5 BTC bought on Kraken on 2/1 were "first in" prior to the transfer on 3/1 having occurred).

Does that make any sense? Hopefully. Thoughts?

As for the second question, I'm not sure what you mean by this:

  1. The first feature (document each lot's cost basis) won't be supported. The second one will (see answer above).

I understand that, per Andreas, it probably makes sense to try and consolidate wallets as much as possible. But are you going to actually make a "declaration" before 1/1/25 and either email it to yourself or use the blockchain time-stamping approach Andreas suggested to have something that can be provided to the IRS, if needed, as proof of claiming "safe harbor"? And, if so, what is that declaration going to contain?

Thanks!

@eprbell
Owner Author

eprbell commented Dec 20, 2024

I'm replying inline. However, keep in mind that what I'm saying is based on my current ideas for a design that is still in flux. So don't make tax decisions solely based on what I'm describing here, because it may change. This is why I was highlighting Andreas' solution: it makes your crypto accounting simple and clear, regardless of what happens with tax software.

So there are "transfer semantics" and "accounting method", each of which could be FIFO, LIFO, HIFO, or LOFO. Does that mean that if "transfer" is set to FIFO and "accounting" is set to HIFO that, when a transfer is done, the basis with the oldest date (first in) will be moved to the new wallet. And, similarly, when a token is sold within a wallet, the basis with the highest price (highest in) will be associated with the sale?

Yes to both questions. The transfer semantics is what is used to populate per-wallet queues from the universal queue that is used up to the end of 2024 (it is also used when transferring from one per-wallet queue to another).

Assuming I am, I'm still not clear what happens to a token when it is transferred to another wallet and "transfer semantics" is FIFO. Does it get assigned a new "in date" in the destination wallet or does it retain its original "in date" from when it was purchased?

Good question. The current idea is to create an artificial InTransaction in the "to" wallet. This artificial InTransaction has:

  • timestamp: same as the IntraTransaction;
  • crypto_in: minimum of IntraTransaction crypto_received and remaining amount in the from lot that was selected with transfer semantics;
  • spot_price: same as the from lot that was selected with transfer semantics.
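The three fields above can be sketched like this, as the design stood at the time of this comment. `SimpleIn` and `make_artificial_in` are hypothetical stand-ins, not RP2's InTransaction or its constructor:

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class SimpleIn:
    """Hypothetical, stripped-down stand-in for an InTransaction."""

    timestamp: date
    crypto_in: Decimal
    spot_price: Decimal
    wallet: str


def make_artificial_in(
    transfer_timestamp: date,
    crypto_received: Decimal,
    to_wallet: str,
    from_lot: SimpleIn,
    from_lot_remaining: Decimal,
) -> SimpleIn:
    """Build the artificial acquisition created in the "to" wallet."""
    return SimpleIn(
        timestamp=transfer_timestamp,  # same as the IntraTransaction
        # Never more than what is left in the selected "from" lot.
        crypto_in=min(crypto_received, from_lot_remaining),
        spot_price=from_lot.spot_price,  # cost basis follows the "from" lot
        wallet=to_wallet,
    )
```

When the transfer is larger than what remains in the selected lot, the `min` caps the artificial transaction at the lot's remainder, so covering the rest presumably takes further lots chosen with the same transfer semantics.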

Under the new per-wallet tracking with FIFO, if one transfers crypto from wallet A to wallet B, presumably the first crypto in wallet A will be transferred to wallet B. But when it gets to wallet B, where does it go in the FIFO queue? Does it automatically go to the end with a new date (i.e. the date of transfer), or does it get slotted into the queue at some point in the middle based on the original date of purchase?

I think this was already answered above. It goes into a new queue that is specific to wallet B (that's the whole idea behind per-wallet application). However you need to consider the case in which you have 1 BTC in wallet A and you transfer 0.5 BTC to wallet B. In this case you're splitting the original lot. Currently this is captured by leaving the 1 BTC in the queue of wallet A and creating an artificial transaction for 0.5 BTC in wallet B. The two transactions are linked and are updated together by the tax engine (for more on this check this).

At the time I imagined each wallet would have a "queue" of transactions, but now I understand it's probably more like a pool of transactions that can be sorted in any way at runtime, as needed, based on whether a transfer is being done ("transfer semantics") or a sell is being done ("accounting method"). Is that correct?

Not quite: see explanation above about one queue per wallet.

That being the case, I would guess that the transferred token (and corresponding basis) would retain the original purchase date even after it is moved to the destination wallet. Here's an example (a variant of yours from the design):

  • 1/1: InTransaction of 10 BTC on Coinbase
  • 2/1: InTransaction of 5 BTC on Kraken
  • 3/1: IntraTransaction of 4 BTC from Coinbase to Kraken
  • 4/1: OutTransaction of 2 BTC from Kraken

If both "transfer semantics" and "accounting method" are FIFO, does that mean that the OutTransaction on 4/1 will use the Coinbase transaction basis from 1/1 or the Kraken transaction basis from 2/1? I would assume the former. In other words, the original "in date" associated with the 4 BTC moved from Coinbase to Kraken will be retained, and when FIFO is used for the OutTransaction, 2 of those 4 BTC will be sold, since 1/1 is the new "first in" date of the Kraken wallet (even though, technically, the 5 BTC bought on Kraken on 2/1 were "first in" prior to the transfer on 3/1 having occurred).

In your example, using FIFO for everything, the 4/1 OutTransaction would use the 2/1 InTransaction. Note that the Kraken queue would also have an artificial InTransaction on 3/1 containing 4 BTC and linked to the 1/1 transaction. But in your example the artificial transaction is not exercised because the 2/1 transaction has enough funds to cover the 2 BTC of the OutTransaction.

If the OutTransaction had, say, 7 BTC instead of 2, then the code would first use the entire 2/1 lot and then 2 BTC from the artificial InTransaction (this also causes its parent transaction 1/1 to be updated as explained here).
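The two cases can be checked with a small sketch of walking the Kraken queue in order. `consume_fifo` and the labeled tuples are hypothetical simplifications; the real engine also updates the linked parent lot, which is not modeled here:

```python
from decimal import Decimal
from typing import List, Tuple

# (label, amount available), already ordered as described in this comment
Queue = List[Tuple[str, Decimal]]


def consume_fifo(queue: Queue, amount: Decimal) -> Queue:
    """Return (label, amount taken) pairs, walking the queue front to back."""
    taken: Queue = []
    remaining = amount
    for label, available in queue:
        if remaining == 0:
            break
        used = min(available, remaining)
        if used > 0:
            taken.append((label, used))
            remaining -= used
    if remaining > 0:
        raise ValueError("Not enough funds in the wallet queue")
    return taken


kraken: Queue = [
    ("2/1 InTransaction", Decimal("5")),
    ("3/1 artificial InTransaction", Decimal("4")),
]

print(consume_fifo(kraken, Decimal("2")))  # only the 2/1 lot is touched
print(consume_fifo(kraken, Decimal("7")))  # all of 2/1, then 2 BTC of the artificial lot
```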

As for the second question, I'm not sure what you mean by this:

  1. The first feature (document each lot's cost basis) won't be supported. The second one will (see answer above).

I mean that RP2 won't let the user select which lot goes into which wallet queue arbitrarily. RP2 will take an algorithmic approach: the user selects the transfer semantics and the code moves the funds around.

I understand that, per Andreas, it probably makes sense to try and consolidate wallets as much as possible. But are you going to actually make a "declaration" before 1/1/25 and either email it to yourself or use the blockchain time-stamping approach Andreas suggested to have something that can be provided to the IRS, if needed, as proof of claiming "safe harbor"? And, if so, what is that declaration going to contain?

This is probably a question for your tax advisor, but the rough idea is to move everything to one single wallet and then take snapshots of all accounts, generate an RP2 report and timestamp everything on Dec 31st. By moving to a single wallet you're essentially causing the universal and per-wallet approaches to match, because there is now only one wallet having one queue with all the funds.

Thanks for asking questions and engaging in conversation. It's good to brainstorm the ideas behind the design and see if they hold water.

@gbtorrance

gbtorrance commented Dec 20, 2024

Thanks for the reply!

Agreed about the conversation and brainstorming. Even though I'm not coding this, I find it very helpful for my own understanding.

If both "transfer semantics" and "accounting method" are FIFO, does that mean that the OutTransaction on 4/1 will use the Coinbase transaction basis from 1/1 or the Kraken transaction basis from 2/1? I would assume the former. In other words, the original "in date" associated with the 4 BTC moved from Coinbase to Kraken will be retained, and when FIFO is used for the OutTransaction, 2 of those 4 BTC will be sold, since 1/1 is the new "first in" date of the Kraken wallet (even though, technically, the 5 BTC bought on Kraken on 2/1 were "first in" prior to the transfer on 3/1 having occurred).

In your example, using FIFO for everything, the 4/1 OutTransaction would use the 2/1 InTransaction. Note that the Kraken queue would also have an artificial InTransaction on 3/1 containing 4 BTC and linked to the 1/1 transaction. But in your example the artificial transaction is not exercised because the 2/1 transaction has enough funds to cover the 2 BTC of the OutTransaction.

This was a surprise to me, so I decided to post basically this exact question on the Reddit forum to see if @JustinCPA would respond, which he did. He seems to say the opposite of what you've said.

image

I feel like this could be an issue for the current design -- at least for US taxpayers, and assuming @JustinCPA is correct.

Thoughts?

@eprbell
Owner Author

eprbell commented Dec 21, 2024

Sounds like you found a bug in the design! Thanks for getting in the weeds and asking JustinCPA: his explanation is convincing. The bug is that the artificial InTransaction was using the timestamp of the transfer instead of the timestamp of the from InTransaction. I already fixed the code so that the behavior is as explained by Justin. I will be posting some initial code for the transfer analysis algorithm in a PR soon (together with unit tests).
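To make the corrected behavior concrete, here is a minimal sketch (not the RP2 code) of the Kraken queue from the example above. The `Lot` class, the `fifo_pick` helper and the year 2024 are assumptions made for illustration only; dates such as 1/1 are read as month/day. The transferred lot keeps its original 1/1 acquisition date, so FIFO picks it before the 2/1 Kraken buy.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


@dataclass
class Lot:
    amount: int
    acquired_on: date  # original acquisition date: drives FIFO order
    label: str


def fifo_pick(queue: List[Lot], amount: int) -> List[Tuple[str, int]]:
    """Return (label, amount) pairs consumed in FIFO order by acquisition date."""
    picked = []
    remaining = amount
    for lot in sorted(queue, key=lambda l: l.acquired_on):
        if remaining <= 0:
            break
        taken = min(lot.amount, remaining)
        picked.append((lot.label, taken))
        remaining -= taken
    return picked


# Kraken queue after the 3/1 transfer: the artificial lot inherits the 1/1 date.
kraken_queue = [
    Lot(5, date(2024, 2, 1), "Kraken buy 2/1"),
    Lot(4, date(2024, 1, 1), "Coinbase buy 1/1 (transferred 3/1)"),
]

# The 4/1 OutTransaction of 2 BTC is covered by the transferred lot.
result = fifo_pick(kraken_queue, 2)
print(result)
```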

@gbtorrance

gbtorrance commented Dec 21, 2024

Great! Thanks!

One thing that comes to mind: You may well be taking care of this already, so forgive me if you are, but you may want to make sure that transfers, if they are using the timestamp of the "from" InTransaction, do not inadvertently allow other invalid transactions to occur. An example is probably needed to explain:

  • 1/1: InTransaction of 10 BTC on Coinbase
  • 2/1: OutTransaction of 1 BTC from Kraken (!! INVALID !!)
  • 3/1: IntraTransaction of 3 BTC from Kraken to Binance (!! INVALID !!)
  • 4/1: IntraTransaction of 5 BTC from Coinbase to Kraken

Of course the transactions on 2/1 and 3/1 seem obviously invalid when looked at like that. But, I could imagine a scenario where the 5 BTC transferred on 4/1 from Coinbase to Kraken inherit the "from" InTransaction date of 1/1, making the Kraken OutTransaction on 2/1 and IntraTransaction on 3/1 seem valid. But obviously they're not (because the 5 BTC has not been transferred yet). Make sense?

Keep in mind I don't understand the overall app design or the design for these changes. I'm just looking at this as an outsider.

Thanks!

@gbtorrance

So there are "transfer semantics" and "accounting method", each of which could be FIFO, LIFO, HIFO, or LOFO. Does that mean that if "transfer" is set to FIFO and "accounting" is set to HIFO that, when a transfer is done, the basis with the oldest date (first in) will be moved to the new wallet. And, similarly, when a token is sold within a wallet, the basis with the highest price (highest in) will be associated with the sale?

Yes to both questions. The transfer semantics is what is used to populate per-wallet queues from the universal queue that is used up to the end of 2024 (it is also used when transferring from one per-wallet queue to another).

One more thought: Is it possible to use different transfer semantics to:

  • populate per-wallet queues from the universal queue, and,
  • transfer from one wallet to another in 2025 and beyond?

Don't think that I would need this, but -- if I'm understanding correctly -- others may. Take a look at this post for context. Basically, I think some may want to use HIFO for populating the wallets from the universal queue, and then FIFO going forward (as I understand that is required for "global allocation" / non-spec-ID).

image

@eprbell
Owner Author

eprbell commented Dec 21, 2024

One thing that comes to mind: You may well be taking care of this already, so forgive me if you are, but you may want to make sure that transfers, if they are using the timestamp of the "from" InTransaction, do not inadvertently allow other invalid transactions to occur. An example is probably needed to explain:

* 1/1: `InTransaction` of 10 BTC on Coinbase

* 2/1: `OutTransaction` of 1 BTC from Kraken (!! INVALID !!)

* 3/1: `IntraTransaction` of 3 BTC from Kraken to Binance (!! INVALID !!)

* 4/1: `IntraTransaction` of 5 BTC from Coinbase to Kraken

Of course the transactions on 2/1 and 3/1 seem obviously invalid when looked at like that. But, I could imagine a scenario where the 5 BTC transferred on 4/1 from Coinbase to Kraken inherit the "from" InTransaction date of 1/1, making the Kraken OutTransaction on 2/1 and IntraTransaction on 3/1 seem valid. But obviously they're not (because the 5 BTC has not been transferred yet). Make sense?

Ah, good point. This means that my previous approach was only 50% wrong :-) because the artificial transaction needs both timestamps: one for holding period and the other for fund availability. Let me think a bit on how to best model this: we probably need a subclass of InTransaction to capture this.
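As a rough illustration of the fund-availability side, here is a sketch (not the RP2 implementation) that replays the four transactions above in chronological order and flags any that spend funds the wallet does not hold yet. The `Event` class, the `find_invalid` helper and the year 2024 are hypothetical. The key point is that availability is checked against the date of the transfer, regardless of any acquisition date the transferred lot inherits.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Event:
    day: str                        # ISO date (hypothetical year)
    kind: str                       # "in", "out" or "intra"
    amount: int
    wallet: str                     # receiving wallet for "in", sending wallet otherwise
    to_wallet: Optional[str] = None  # only used by "intra"


def find_invalid(events: List[Event]) -> List[str]:
    """Return the dates of events that spend more than the wallet holds at that time."""
    balances = defaultdict(int)
    invalid = []
    for event in sorted(events, key=lambda e: e.day):
        if event.kind == "in":
            balances[event.wallet] += event.amount
            continue
        if balances[event.wallet] < event.amount:
            invalid.append(event.day)  # funds have not arrived yet
            continue
        balances[event.wallet] -= event.amount
        if event.kind == "intra":
            balances[event.to_wallet] += event.amount
    return invalid


events = [
    Event("2024-01-01", "in", 10, "Coinbase"),
    Event("2024-02-01", "out", 1, "Kraken"),
    Event("2024-03-01", "intra", 3, "Kraken", "Binance"),
    Event("2024-04-01", "intra", 5, "Coinbase", "Kraken"),
]
invalid_days = find_invalid(events)
print(invalid_days)
```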

Keep in mind I don't understand the overall app design or the design for these changes. I'm just looking at this as an outsider.

No worries: your feedback as a user has been very valuable. Keep it coming!

@eprbell
Owner Author

eprbell commented Dec 22, 2024

So there are "transfer semantics" and "accounting method", each of which could be FIFO, LIFO, HIFO, or LOFO. Does that mean that if "transfer" is set to FIFO and "accounting" is set to HIFO that, when a transfer is done, the basis with the oldest date (first in) will be moved to the new wallet. And, similarly, when a token is sold within a wallet, the basis with the highest price (highest in) will be associated with the sale?

Yes to both questions. The transfer semantics is what is used to populate per-wallet queues from the universal queue that is used up to the end of 2024 (it is also used when transferring from one per-wallet queue to another).

One more thought: Is it possible to use different transfer semantics to:

* populate per-wallet queues from the universal queue, and,

* transfer from one wallet to another in 2025 and beyond?

Don't think that I would need this, but -- if I'm understanding correctly -- others may. Take a look at this post for context. Basically, I think some may want to use HIFO for populating the wallets from the universal queue, and then FIFO going forward (as I understand that is required for "global allocation" / non-spec-ID).

image

Interesting: the current design allows for changing transfer semantics and accounting method year over year in any combination supported by the country plugin. What you're describing would be an extra one-time only transfer semantics for initial per-wallet queue population: this is not supported yet. With the existing design you could select HIFO transfer semantics in 2025 and then switch to FIFO in following years: not exactly what you're asking for but it's an approximation.
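A small sketch of what "changing year over year" could look like, assuming a hypothetical per-year table (`METHODS_BY_YEAR`) and a hypothetical `ordered` helper; neither name comes from RP2, and the lots are made-up data. It only shows how the four methods order lots and how the chosen method can differ by year and by purpose.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Lot:
    acquired: str    # ISO date of acquisition
    spot_price: int  # price per coin at acquisition


def ordered(lots: List[Lot], method: str) -> List[Lot]:
    """Order lots according to FIFO, LIFO, HIFO or LOFO."""
    if method == "FIFO":
        return sorted(lots, key=lambda lot: lot.acquired)
    if method == "LIFO":
        return sorted(lots, key=lambda lot: lot.acquired, reverse=True)
    if method == "HIFO":
        return sorted(lots, key=lambda lot: lot.spot_price, reverse=True)
    if method == "LOFO":
        return sorted(lots, key=lambda lot: lot.spot_price)
    raise ValueError(f"unknown method: {method}")


# Hypothetical configuration: HIFO transfer semantics in 2025, FIFO afterwards.
METHODS_BY_YEAR: Dict[int, Dict[str, str]] = {
    2025: {"transfer": "HIFO", "accounting": "FIFO"},
    2026: {"transfer": "FIFO", "accounting": "FIFO"},
}


def next_lot(lots: List[Lot], year: int, purpose: str) -> Lot:
    """Return the lot that would be used first for the given year and purpose."""
    return ordered(lots, METHODS_BY_YEAR[year][purpose])[0]


lots = [Lot("2023-01-10", 10_000), Lot("2024-01-15", 50_000), Lot("2024-06-01", 30_000)]
transfer_2025 = next_lot(lots, 2025, "transfer")  # highest price first
transfer_2026 = next_lot(lots, 2026, "transfer")  # oldest first
print(transfer_2025, transfer_2026)
```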

I think we should finish the basic design and implementation of per-wallet application and then we can think about something like this as a potential advanced feature.

@gbtorrance

I think we should finish the basic design and implementation of per-wallet application and then we can think about something like this as a potential advanced feature.

Of course. Just figured I'd mention it for your awareness. (I personally don't anticipate needing it.)

With the existing design you could select HIFO transfer semantics in 2025 and then switch to FIFO in following years: not exactly what you're asking for but it's an approximation.

I suppose, though I don't think that would comply with the IRS requirements. I'm no expert, but my understanding is that it has to either be FIFO or specific identification going forward.

I totally get your point, though. "One step at a time" :-)

@gbtorrance

gbtorrance commented Dec 30, 2024

I would say: let's assume the date doesn't change unless we have overwhelming evidence to the contrary.

I think that is the safest approach. Almost certainly it will be accepted by the IRS. (The other approach, I'm not so sure.)

@gbtorrance

This should cover all the methods described by Justin (and more). Let me know if you find any holes in this logic.

I think this makes sense.

I do have a bit of concern that the IRS may not like the custom wallet ordering. I posted a question to Justin about this, and will let you know what he says. Hopefully it's allowed. But if it is used, it does put a bit more burden on the user to make sure they clearly and correctly document their chosen Global Allocation Method, and that it is aligned with the configured wallet ordering in RP2.

Transfer analysis is performed but instead of passing the per-wallet InTransactions that have been discovered so far by the transfer analysis algorithm, we pass the per-wallet subqueue from the previous bullet (and using FIFO because the subqueue is already sorted).

Can you elaborate a bit on how wallet queues are used in RP2? It seems to me you should be able to use any Accounting Method for the Global Allocation (to determine what lots go into what wallets). But then you should also be able to choose any Accounting Method for subsequent transfer processing (which presumably includes processing of sales). Important note: For the Accounting Method for transfer processing, it is understood that U.S. taxpayers would likely have to use FIFO going forward. The flexibility to choose another method would be for non-U.S. taxpayers.

To restate the question: Can you clarify how sorted ordering of queues plays into the ability to select any Accounting Method for any tax year (for transfer processing)?


On a different subject, I was thinking that some of the Global Allocation Methods suggested by Justin can be open to some confusion and, if they are going to be handled directly by RP2 (rather than by a custom wallet ordering), it's worth making sure everyone is on the same page.

What does wallet "highest balance" mean? Highest balance of the particular coin or highest balance of all coins in the wallet? I would assume of the particular coin being processed. (It doesn't really make sense otherwise.)

What does "most active wallet" mean? Is it based on number of transactions or recency of transactions? Is it number of transactions for the particular coin or for all coins? Is it number of transactions since inception of the wallet or number of transactions in a more recent time period? (If I had to guess I'd say number of transactions since inception for the particular coin being processed.)

Just throwing that out for consideration. RP2 implementing the wallet selection deterministically does require being clear on what the Global Allocation Methods actually mean. But if RP2 just supports custom wallet selection order, then it's not really an issue. It would be up to the user.

@gbtorrance

This should cover all the methods described by Justin (and more).

Now that I think about it, there is a potential issue. It depends on one's interpretation of the Global Allocation methods.

If, for example, one interprets wallet "highest balance" as meaning "highest balance of the particular coin", then having a single custom ordering of wallets in RP2 would not allow this method to be properly replicated. Wallet A may have the highest balance of Token X, but Wallet B may have the highest balance of Token Y. It's possible to manually order the wallets to guarantee "Lowest cost basis to highest balance" for Token X or for Token Y, but not for both. Make sense?

Sorry to always be the bearer of bad news :-(

Maybe having a deterministic wallet allocation is going to be required?

@gbtorrance

Sorry to always be the bearer of bad news :-(

On the assumption that custom wallet ordering will not be suitable due to "highest balance of the particular coin" issue referenced above... do you think it's comfortably doable to implement deterministic ordering of wallets by per-coin balance for the Global Allocation? If so, I'd like to go ahead and create a Safe Harbor declaration as follows: "Highest Cost Basis to Highest Balance" (or maybe highest to lowest or lowest to highest; not 100% sure yet). Just want to make sure you foresee no issues with implementation.

Thoughts? Thanks!

@gbtorrance

gbtorrance commented Dec 30, 2024

FWIW, Justin confirmed that he believes custom wallet ordering would be acceptable. It's unfortunate about the per-coin issue.

https://www.reddit.com/r/CryptoTax/comments/1hk31yd/comment/m4i60ew/
image

@gbtorrance

Sorry for all the messages. I'm just thinking out loud/online here.

I suppose custom wallet ordering would be an acceptable approach for RP2, on the understanding that it wouldn't be able to replicate the more generic Global Allocation Methods Justin mentioned.

Whatever you think is best. Please just let me know when you can so I can make an appropriate declaration before EOY. Thanks.

@eprbell
Owner Author

eprbell commented Dec 30, 2024

Sorry to always be the bearer of bad news :-(

On the assumption that custom wallet ordering will not be suitable due to "highest balance of the particular coin" issue referenced above... do you think it's comfortably doable to implement deterministic ordering of wallets by per-coin balance for the Global Allocation? If so, I'd like to go ahead and create a Safe Harbor declaration as follows: "Highest Cost Basis to Highest Balance" (or maybe highest to lowest or lowest to highest; not 100% sure yet). Just want to make sure you foresee no issues with implementation.

Thoughts? Thanks!

Yes, per-coin ordering of wallets should be doable. Perhaps we can have a default order that can be optionally overridden on a per-coin basis.
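Something along these lines, as a rough sketch: `order_by_balance` and `wallet_order` are hypothetical helpers (not existing RP2 functions) and the balances are made-up data. The first derives a deterministic per-coin order from balances, the second applies a default order with optional per-coin overrides.

```python
from typing import Dict, List


def order_by_balance(balances: Dict[str, Dict[str, int]], asset: str) -> List[str]:
    """Wallets sorted by their balance of `asset`, highest first (ties broken by name)."""
    return sorted(balances, key=lambda wallet: (-balances[wallet].get(asset, 0), wallet))


def wallet_order(asset: str, default_order: List[str], overrides: Dict[str, List[str]]) -> List[str]:
    """Use the per-coin override if one exists, otherwise the default order."""
    return overrides.get(asset, default_order)


# Hypothetical balances: Coinbase holds the most BTC, Kraken the most ETH.
balances = {
    "Coinbase": {"BTC": 3, "ETH": 1},
    "Kraken": {"BTC": 1, "ETH": 10},
}
default_order = ["Coinbase", "Kraken"]
overrides = {"ETH": order_by_balance(balances, "ETH")}

btc_order = wallet_order("BTC", default_order, overrides)
eth_order = wallet_order("ETH", default_order, overrides)
print(btc_order, eth_order)
```

With a single global order, "highest balance" could only be honored for one of the two coins; the per-coin override is what lets both be honored.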

@gbtorrance

Yes, per-coin ordering of wallets should be doable.

Great! Thanks for confirming. I'll assume the ability to order wallets by holding of each coin for Global Allocation when filling out my Safe Harbor declaration.

Perhaps we can have a default order that can be optionally overridden on a per-coin basis.

Sounds good.

@eprbell
Owner Author

eprbell commented Jan 13, 2025

I opened a PR with the first implementation of the transfer analysis algorithm: see #138. @macanudo527, PTAL when you get a chance and let me know if you find any issues.

The unit tests use a Go-style, table-driven format, which I find far superior to custom code: it makes tests very easy to write, read and maintain, and even non-programmers could potentially read them and contribute ideas for new ones. Each unit test is described by a _Test structure containing an input field and a want field: the first is a set of input transactions and the second is the result we expect. There is also a want_error field, which is mutually exclusive with want: when it is specified, want cannot be specified and the test is expected to fail with the given message.
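For readers unfamiliar with the pattern, here is a toy sketch of the table-driven shape. Only the field names `input`, `want` and `want_error` come from the description above; the `description` field, the `total_balance` function under test and the `run_tests` loop are hypothetical stand-ins, not the code in the PR.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class _Test:
    description: str
    input: List[int]
    want: Optional[int] = None        # expected result
    want_error: Optional[str] = None  # expected error message (exclusive with want)


def total_balance(amounts: List[int]) -> int:
    """Toy function under test: sum amounts, rejecting a negative running balance."""
    total = 0
    for amount in amounts:
        total += amount
        if total < 0:
            raise ValueError("balance went negative")
    return total


TESTS = [
    _Test(description="two deposits", input=[1, 2], want=3),
    _Test(description="overdraw", input=[1, -2], want_error="negative"),
]


def run_tests(tests: List[_Test]) -> List[str]:
    """Run every table entry and return the descriptions of the tests that passed."""
    passed = []
    for test in tests:
        # Exactly one of want / want_error must be set.
        assert (test.want is None) != (test.want_error is None), test.description
        if test.want_error is not None:
            try:
                total_balance(test.input)
            except ValueError as exc:
                assert test.want_error in str(exc), test.description
            else:
                raise AssertionError(f"{test.description}: expected an error")
        else:
            assert total_balance(test.input) == test.want, test.description
        passed.append(test.description)
    return passed


passed = run_tests(TESTS)
print(passed)
```

Adding a scenario then means adding one more entry to the table, with no new test logic.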

See the following files in the PR:

  • tests/test_per_wallet_tax_engine_semantics_dependent.py
  • tests/test_per_wallet_tax_engine_semantics_independent.py

@gbtorrance I added the scenarios you brought up (look for "This test is from the discussion at" in the two files above to find them). If you have any other scenarios in mind that you think should be covered, let me know.

@eprbell
Owner Author

eprbell commented Jan 13, 2025

Next I'll start working on Global Allocation, which is built on top of transfer analysis, as discussed here.

@eprbell
Owner Author

eprbell commented Jan 13, 2025

I have one question on global allocation. Imagine the following scenario:

  • I choose HIFO with two accounts and the account order is Coinbase, then Kraken;
  • Highest cost buys happened on Kraken and are also the most recent ones;
  • The earliest sales happened on Coinbase and are also the oldest ones (in particular Coinbase sales are earlier than Kraken high-cost buys).

When switching from universal to per-wallet application the first thing that runs is global allocation (to reassign cost bases) and the result is that early sales on Coinbase (which is first in the order) are paired with high cost Kraken buys which happened after the sales. The question is how to handle this? Normally pairing a sale with a future buy would not be allowed. There are two options AFAICT (described below).

To clarify further here's a detailed example:

  • 1/23 Buy 2 BTC on CB, spot=$10K
  • 2/23 Sell 1 BTC on CB, spot=$20K
  • 1/24 Buy 3 BTC on Kraken, spot=$50K
  • 2/24 Sell 1 BTC on CB, spot=$60K

Option 1)
When running global allocation with HIFO/CB,Kraken, in-lots are ordered as 1/24, 1/23. Then, since the order is CB first, the 2/23 sale gets paired with the 1/24 buy. So the cost basis would be in the future for this sale.

Option 2)
However we could impose that HIFO is applied only to in lots that occurred before the sale (which seems to make more sense to me and is how accounting methods work in RP2 today), so results would be:

  • the 2/23 sale is paired with 1 BTC from the 1/23 buy
  • the 2/24 sale is paired with 1 BTC from the 1/24 buy
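Option 2 can be sketched as follows (a simplified illustration, not RP2's accounting engine). The `InLot` class and `pair_hifo` helper are hypothetical names, and dates such as 1/23 are written as "2023-01". HIFO is applied only to in-lots dated before each sale.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class InLot:
    date: str        # "YYYY-MM"
    amount: int
    spot_price: int


def pair_hifo(in_lots: List[InLot], sales: List[Tuple[str, int]]) -> List[Tuple[str, str, int]]:
    """Pair each sale with the highest-priced in-lots acquired before the sale date.

    Returns (sale_date, in_lot_date, amount) triples.
    """
    remaining = [lot.amount for lot in in_lots]
    pairs = []
    for sale_date, sale_amount in sorted(sales):
        needed = sale_amount
        # Only lots that already exist at the time of the sale are candidates.
        candidates = [i for i, lot in enumerate(in_lots) if lot.date < sale_date and remaining[i] > 0]
        candidates.sort(key=lambda i: in_lots[i].spot_price, reverse=True)
        for i in candidates:
            if needed == 0:
                break
            taken = min(needed, remaining[i])
            remaining[i] -= taken
            needed -= taken
            pairs.append((sale_date, in_lots[i].date, taken))
        if needed:
            raise ValueError(f"sale on {sale_date} is not covered by earlier in-lots")
    return pairs


in_lots = [InLot("2023-01", 2, 10_000), InLot("2024-01", 3, 50_000)]
sales = [("2023-02", 1), ("2024-02", 1)]
pairs = pair_hifo(in_lots, sales)
print(pairs)
```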

I'm leaning toward option 2: any objections? CC: @gbtorrance.

@eprbell
Owner Author

eprbell commented Jan 14, 2025

Here are my initial thoughts on how global allocation would work in RP2. Global allocation is called only when the user switches from universal application to per-wallet application (note that global allocation != universal application). After that normal transfer_analysis is used.

Global allocation receives as input:

  • universal application transactions (i.e. single-queue data),
  • allocation method (FIFO, LIFO, HIFO, etc.),
  • list of accounts, representing user-specified account order.

It outputs a Dict Account->InputData (just like transfer_analysis), representing per-wallet input data, but patched to reflect the new cost basis assignments from the allocation method.

It is built on top of transfer_analysis and it works as follows:

  • perform a normal transfer_analysis using FIFO transfer semantics. This generates a Dict Account->InputData. Note that the choice of FIFO transfer semantics doesn't matter because it will be overwritten by the steps below (which have to patch the output of transfer analysis to reflect the cost bases from global allocation). We use FIFO because it's faster than the others: O(n) vs O(n*log(n));
  • loop over the list of accounts:
    • get the InputData corresponding to the current account from the output of transfer analysis;
    • compute the balance for this account and store it: these are the amounts of funds that need their cost basis reassigned via global allocation;
    • remove all artificial InTransactions (because they reflect FIFO transfer semantics) and clear all from_lot, to_lots and originates_from fields from non-artificial InTransactions.
  • loop again over the list of accounts:
    • get the InputData corresponding to the current account from the output of transfer analysis;
    • retrieve the balance for this account;
    • get acquired lots from the universal queue using the allocation method until we have enough lots to cover the balance;
    • for each acquired lot create an artificial InTransaction and adjust from_lot, to_lots and originates_from so that the acquired lot and the artificial InTransaction point to each other properly (as they would if they were the result of a zero-fee transfer).

Essentially this patches the InputData structures that result from transfer analysis so that the cost bases are allocated based on the allocation method rather than on transfers.
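Stripped of the InputData patching, the core of the second loop can be sketched like this. It is a simplification under stated assumptions: `Lot`, `sort_lots` and `global_allocation` are hypothetical names, the lots and balances are made-up data, and the real implementation would create artificial InTransactions rather than tuples.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Lot:
    acquired: str    # "YYYY-MM"
    spot_price: int
    amount: int      # unused amount left in the universal queue


def sort_lots(lots: List[Lot], method: str) -> List[Lot]:
    """Order lots by the allocation method (FIFO, LIFO, HIFO or LOFO)."""
    if method not in ("FIFO", "LIFO", "HIFO", "LOFO"):
        raise ValueError(f"unknown method: {method}")
    by_date = method in ("FIFO", "LIFO")
    descending = method in ("LIFO", "HIFO")
    key = (lambda lot: lot.acquired) if by_date else (lambda lot: lot.spot_price)
    return sorted(lots, key=key, reverse=descending)


def global_allocation(
    unused_lots: List[Lot],
    balances: Dict[str, int],
    wallet_order: List[str],
    method: str,
) -> Dict[str, List[Tuple[str, int, int]]]:
    """Assign (acquired, spot_price, amount) pieces of the sorted lots to wallets in order."""
    queue = [[lot, lot.amount] for lot in sort_lots(unused_lots, method)]
    allocation: Dict[str, List[Tuple[str, int, int]]] = {wallet: [] for wallet in wallet_order}
    index = 0
    for wallet in wallet_order:
        needed = balances.get(wallet, 0)
        while needed > 0:
            if index == len(queue):
                raise ValueError("unused basis does not cover wallet balances")
            lot, left = queue[index]
            taken = min(left, needed)
            allocation[wallet].append((lot.acquired, lot.spot_price, taken))
            queue[index][1] = left - taken
            needed -= taken
            if queue[index][1] == 0:
                index += 1
    return allocation


unused = [Lot("2023-01", 10_000, 1), Lot("2024-01", 50_000, 2), Lot("2024-06", 30_000, 2)]
allocation = global_allocation(unused, {"Coinbase": 3, "Kraken": 2}, ["Coinbase", "Kraken"], "HIFO")
print(allocation)
```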

@gbtorrance

@eprbell, first of all, so sorry for taking so long to reply. About a week ago I got a whole bunch of RP2 notification emails from Github and figured I'd look at it later ... but then, before I knew it, a week had passed. My apologies!

1/23 Buy 2 BTC on CB, spot=$10K

Just want to be clear about your convention. "1/23" here means "the first transaction of 2023", right?

And also to be clear, it's "universal application", "per-wallet application", and "global allocation", right? (I find the terms so confusing, but I'll try to go with this for now. Hopefully I'm not confusing matters.)

When I switch from universal to per-wallet allocation I run global allocation and the result is that early sales on Coinbase (which is first in the order) are paired with high cost Kraken buys which happened after the sales.

I'm confused by the talk of pairing buys and sells. (I'll get back to that later...)

You're talking about switching from universal to per-wallet application and running global allocation. "Global allocation" refers to the one-off processing that runs on all pre-2025 transactions to get them ready for per-wallet application, right? (Assuming U.S. users and rules.)

This assumes that all pre-"global allocation" processing has already completed, right?

So it's something like this (for U.S. users):

  1. Run universal application processing on all transactions pre-2025.
  2. Run global allocation to allocate unused cost basis to wallets.
  3. Run per-wallet application on all transactions 2025 and later.

Right?

The question is how to handle this? Normally pairing a sale with a future buy would not be allowed.

I think I agree that it shouldn't be allowed, but I want to make sure I understand what you mean by "pairing" buys and sells.

Are you talking about what happens in steps 1 and 3, where the basis attributed to a sale is determined by matching the sale with a buy using the chosen accounting method (presumably HIFO in your example). This being the case, I would definitely say that only earlier buys should be considered for matching. (Otherwise we're getting into dangerous Back To The Future territory.)

Or are you talking about what happens in step 2, where unused basis (as at EOY 2024) is "globally allocated" to the wallets in the specified order using the specified accounting method (again, presumably HIFO for this example)? (I don't think you're referring to this as "pairing", but I could be wrong.)

I think I need to try and write up the example in more detail to see how it shakes out. Bear with me...

Here's your example as a starting point.

1/23 Buy 2 BTC on CB, spot=$10K
2/23 Sell 1 BTC on CB, spot=$20K
1/24 Buy 3 BTC on Kraken, spot=$50K
2/24 Sell 1 BTC on CB, spot=$60K

In Step 1 ("universal application") with HIFO the 2/23 sell on CB should be paired with the 1/23 buy on CB for the purpose of determining the cost basis attributed to the sell. (No other transactions have occurred yet, and sells have to be paired with earlier buys.) Then the 2/24 sell on CB should be paired with the 1/24 buy on Kraken, as that has the highest unused basis at that point in time.

What remains for unused basis should be the following:

1 x BTC, spot=$10K, originally bought 1/23 on CB
2 x BTC, spot=$50K, originally bought 1/24 on Kraken

What remains for actual coins on actual wallets should be the following:

0 x BTC on CB
3 x BTC on Kraken

Now it's time to run Step 2 ("global allocation"). You've said you want to allocate first to CB, then Kraken, right? There are no coins remaining on CB, so all unused basis should go to the coins on Kraken. But to step this through methodically (assuming HIFO as the accounting method) you'd first allocate the 2 x BTC bought for $50K to two of the coins on Kraken, and then you'd allocate the 1 x BTC bought for $10k to the remaining coin on Kraken.

As I understand it, the original purchase dates and basis should remain associated in a fixed relationship with one another. Even though the 3 coins that remain on Kraken were purchased on 1/24, once step 2 ("global allocation") is complete, that date is essentially irrelevant. The dates associated with the coins post-"global allocation" should be the dates associated with the unused basis that has been allocated to that wallet.

So, after global allocation, Kraken should essentially hold the following:

1 x BTC, spot=$10K, bought 1/23 (the fact it was bought on CB is now irrelevant and ancient history)
2 x BTC, spot=$50k, bought 1/24 (this was coincidentally bought on Kraken, but again, not relevant)

For step 3 ("per-wallet application"), say for example the user wanted to use FIFO (which may be a requirement for U.S. users) as the accounting method. At this point if there were a sell of 2 x BTC on 1/25, the first basis to be used would be from the 1 x BTC at $10k as it's "first in" at 1/23. And the second basis to be used would be from the 2 x BTC at $50K as it is "second in" at 1/24.

What would remain in the Kraken wallet subsequent to the 1/25 sell would be:

1 x BTC, spot=$50k, bought 1/24
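To double-check the step 3 arithmetic, here is a tiny sketch of my own (not RP2 code; `sell_fifo` is a made-up helper and dates are written as "YYYY-MM"). It takes the Kraken lots as they stand after global allocation and applies a FIFO sell of 2 BTC.

```python
from typing import List, Tuple

# (acquired "YYYY-MM", spot price, amount)
LotTuple = Tuple[str, int, int]


def sell_fifo(lots: List[LotTuple], amount: int) -> Tuple[List[LotTuple], List[LotTuple]]:
    """Consume `amount` from the lots, oldest acquisition date first.

    Returns (consumed pieces, remaining pieces).
    """
    consumed, remaining = [], []
    needed = amount
    for acquired, price, lot_amount in sorted(lots):
        taken = min(needed, lot_amount)
        if taken:
            consumed.append((acquired, price, taken))
        if lot_amount - taken:
            remaining.append((acquired, price, lot_amount - taken))
        needed -= taken
    if needed:
        raise ValueError("not enough funds in wallet")
    return consumed, remaining


# Kraken after global allocation (HIFO order of allocation, dates preserved).
kraken = [("2024-01", 50_000, 2), ("2023-01", 10_000, 1)]

consumed, remaining = sell_fifo(kraken, 2)
cost_basis = sum(price * amount for _, price, amount in consumed)
print(consumed, remaining, cost_basis)
```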

Is this making any sense? Thoughts?

@gbtorrance

Here are my initial thoughts on how universal allocation would work in RP2. Universal allocation is called only when the user switches from universal application to per-wallet application (note that universal allocation != universal application).

I'm super confused about this. Don't you mean "global allocation" and "universal application"? Having both "universal allocation" and "universal application" is not going to end well ;-)

Honestly, haven't read the rest of this particular post in much detail, as I don't understand the RP2 design, so most of the references go over my head.

@eprbell
Owner Author

eprbell commented Jan 21, 2025

No need for apologies: we're all volunteers here, so any contribution is appreciated, but without any pressure or obligation! Thanks for asking to clarify terminology: I'm guilty of mixing up these confusing safe harbor terms sometimes. Answers inline below.

Just want to be clear about your convention. "1/23" here means "the first transaction of 2023", right?

Yes, it means January 2023.

And also to be clear, it's "universal application", "per-wallet application", and "global allocation", right? (I find the terms so confusing, but I'll try to go with this for now. Hopefully I'm not confusing matters.)

Correct: I went back and corrected my earlier replies in which I mixed them up a bit.

You're talking about switching from universal to per-wallet application and running global allocation. "Global allocation" refers to the one-off processing that runs on all pre-2025 transactions to get them ready for per-wallet application, right? (Assuming U.S. users and rules.)

Yes, all correct.

This assumes that all pre-"global allocation" processing has already completed, right?

So it's something like this (for U.S. users):
1. Run universal application processing on all transactions pre-2025.
2. Run global allocation to allocate unused cost basis to wallets.
3. Run per-wallet application on all transactions 2025 and later.
Right?

Yes, that's the idea, with step 2 only occurring when switching from universal application to per-wallet application.

The question is how to handle this? Normally pairing a sale with a future buy would not be allowed.

I think I agree that it shouldn't be allowed, but I want to make sure I understand what you mean by "pairing" buys and sells.

Are you talking about what happens in steps 1 and 3, where the basis attributed to a sale is determined by matching the sale with a buy using the chosen accounting method (presumably HIFO in your example). This being the case, I would definitely say that only earlier buys should be considered for matching. (Otherwise we're getting into dangerous Back To The Future territory.)

Or are you talking about what happens in step 2, where unused basis (as at EOY 2024) is "globally allocated" to the wallets in the specified order using the specified accounting method (again, presumably HIFO for this example)? (I don't think you're referring to this as "pairing", but I could be wrong.)

I think I need to try and write up the example in more detail to see how it shakes out. Bear with me...

Here's your example as a starting point.

1/23 Buy 2 BTC on CB, spot=$10K
2/23 Sell 1 BTC on CB, spot=$20K
1/24 Buy 3 BTC on Kraken, spot=$50K
2/24 Sell 1 BTC on CB, spot=$60K

I should have clarified better. The example I gave is the universal data that step 1 generates, before it is fed to step 2. However, the "pairing" I talked about refers to step 3.

Does this make sense? Let me know if anything is still unclear.

@gbtorrance

Does this make sense? Let me know if anything is still unclear.

Thanks for clarifying @eprbell. But I still feel unsure about whether we're "on the same page".

Would you mind responding to the second part of my post, beginning "In Step 1 ("universal application") with HIFO ...", where I walk through the detailed example of what would occur in the 3 steps. I want to make sure you agree and, if you don't agree on any points, I feel we should hash out "why". If I'm misunderstanding, I want to make sure I can correct that misunderstanding.

Also, are we in agreement on the following?

  • Steps 1 ("universal application"), 2 ("global allocation"), and 3 ("per-wallet application") can all have different accounting methods (e.g. HIFO, LIFO, FIFO, etc.), configurable by the user.
  • The accounting method for U.S. taxpayers for step 3 ("per-wallet application") will probably need to be FIFO, but that will be configurable, and may change based on upcoming IRS guidance.
  • Pairing of transactions in steps 1 and 3 will always pair sell transactions with earlier buy transactions for the purpose of assigning unused basis to the sell transactions.
  • But, to further clarify the point immediately above, in step 3 ("per-wallet application"), the dates used for determining whether buy transactions are earlier than sell transactions should be the dates associated with the basis allocated to the wallet in step 2 ("global allocation"), not the dates when the "physical" coins in that wallet were originally purchased. (As I understand it, those dates are now irrelevant, because global allocation has "re-written" the coins with new dates and associated basis.)

Something else I'd like to clarify (related to the "earlier date" thing). Forgive me if this is obvious, and already handled this way, but I'm thinking about it and just want to be sure: When processing steps 1 and 3, my assumption is that transactions should be processed in date order (earliest to latest), regardless of the accounting method used, and any transactions that occurred after the transaction currently being processed should be "invisible" and entirely ignored. For example, if the accounting method is HIFO, and there is a sell on 5/25, the effective HIFO queue of unused basis should only include transactions that occurred prior to the 5/25 transaction for the purpose of assigning basis to the 5/25 sell. As I understand it, this is super important, because if you run RP2 on a set of transactions in the middle of the year, it should allocate basis to sell transactions (that have occurred up to that point) in exactly the same way as if you run it at the end of the year (when additional buy transactions may have occurred). Stated another way, I don't believe it would be correct to build a HIFO queue of all transactions (essentially "basis lots") for a wallet, and then pull from that queue when processing individual sell transactions and ignore date. Sell transactions should only use unused basis with an earlier date, regardless of accounting method.
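Here is a small sketch of the property I mean, with made-up numbers and a hypothetical `assign_basis` helper (nothing here is RP2 code). With the date filter on, a mid-year run and a full-year run pair the 5/25 sell identically; with the filter off, a later high-priced buy changes the result.

```python
from typing import List, Tuple

Buy = Tuple[str, int, int]   # (date "YYYY-MM", spot price, amount)
Sell = Tuple[str, int]       # (date "YYYY-MM", amount)


def assign_basis(buys: List[Buy], sells: List[Sell], respect_dates: bool) -> List[Tuple[str, str, int]]:
    """HIFO pairing; returns (sell_date, buy_date, amount) triples.

    When respect_dates is False, buys dated after the sell are (incorrectly) eligible.
    """
    left = [amount for _, _, amount in buys]
    hifo_order = sorted(range(len(buys)), key=lambda i: buys[i][1], reverse=True)
    pairs = []
    for sell_date, sell_amount in sorted(sells):
        needed = sell_amount
        for i in hifo_order:
            if needed == 0:
                break
            if respect_dates and buys[i][0] >= sell_date:
                continue  # this buy is still in the future: invisible to the sell
            taken = min(needed, left[i])
            if taken == 0:
                continue
            left[i] -= taken
            needed -= taken
            pairs.append((sell_date, buys[i][0], taken))
    return pairs


mid_year_buys = [("2025-01", 10_000, 2), ("2025-03", 20_000, 1)]
full_year_buys = mid_year_buys + [("2025-09", 90_000, 1)]
sells = [("2025-05", 1)]

dated_mid = assign_basis(mid_year_buys, sells, respect_dates=True)
dated_full = assign_basis(full_year_buys, sells, respect_dates=True)
naive_full = assign_basis(full_year_buys, sells, respect_dates=False)
print(dated_mid, dated_full, naive_full)
```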

My writing can be a bit "stream of consciousness" and, in retrospect, I could probably have structured the above a bit more logically. But hopefully it makes sense.

Thoughts?

@eprbell
Owner Author

eprbell commented Jan 24, 2025

Would you mind responding to the second part of my post, beginning "In Step 1 ("universal application") with HIFO ...", where I walk through the detailed example of what would occur in the 3 steps. I want to make sure you agree and, if you don't agree on any points, I feel we should hash out "why". If I'm misunderstanding, I want to make sure I can correct that misunderstanding.

I am about to leave for a few hours so I don't have time to answer this right now, but I will in the next day or two in a separate message. Meanwhile I answered your other questions below inline.

* Steps 1 ("universal application"), 2 ("global allocation"), and 3 ("per-wallet application") can all have different accounting methods (e.g. HIFO, LIFO, FIFO, etc.), configurable by the user.

Correct.

* The accounting method for U.S. taxpayers for step 3 ("per-wallet application") will probably need to be FIFO, but that will be configurable, and may change based on upcoming IRS guidance.

Correct.

* Pairing of transactions in steps 1 and 3 will always pair sell transactions with _earlier_ buy transactions for the purpose of assigning unused basis to the sell transactions.

Correct: that's also how RP2 works today. And I think it should work the same way for steps 1, 2 and 3.

* But, to further clarify the point immediately above, in step 3 ("per-wallet application"), the dates used for determining whether buy transactions are _earlier_ than sell transactions should be the dates associated with the basis allocated to the wallet in step 2 ("global allocation"), _not_ the dates when the "physical" coins in that wallet were originally purchased. (As I understand it, those dates are now irrelevant, because global allocation has "re-written" the coins with new dates and associated basis.)

I'm not sure if we understand this in the same way. The way I was planning to do this is (still working on it so it may change):

  • global allocation would be modeled by adding a number of artificial intra-transactions to the universal application data set: these transfers define in detail what the allocation looks like (as if the user actually transferred funds across wallets using FIFO, LIFO, HIFO or LOFO).
  • then during per-wallet application, each intra-transaction (including the artificial ones above) is handled as follows:
    • create a new artificial in-transaction in the "to" per-wallet data set, modeling the reception of funds. This artificial in-transaction has the same spot price (and therefore the same cost basis) as the originating funds, the timestamp of the intra-transaction and an additional field called cost_basis_timestamp, containing the timestamp of the original transaction. So the new artificial in-transactions have both the timestamp at which the transfer occurred and the one at which the funds were originally acquired. The cost_basis_timestamp is important to distinguish long from short-term capital gains.
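
As a rough illustration of the artificial in-transaction described above, here is a minimal sketch. The `InLot` class and the `receive_transfer` helper are hypothetical names invented for this example (they are not RP2 classes); only the `cost_basis_timestamp` field name comes from the description, and the dates are made up.

```python
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class InLot:
    """Hypothetical, simplified stand-in for an in-transaction (not RP2's class)."""
    unique_id: str
    timestamp: datetime  # when the funds became available in this wallet
    exchange: str
    spot_price: Decimal
    crypto_in: Decimal
    cost_basis_timestamp: Optional[datetime] = None  # set only on artificial lots

    @property
    def acquired_on(self) -> datetime:
        # Date the funds were originally acquired (used for long vs short term).
        if self.cost_basis_timestamp is not None:
            return self.cost_basis_timestamp
        return self.timestamp


def receive_transfer(origin: InLot, transfer_id: str, transfer_time: datetime,
                     to_exchange: str, amount: Decimal) -> InLot:
    """Model the reception of `amount` coins from `origin` in the "to" wallet."""
    if amount <= 0 or amount > origin.crypto_in:
        raise ValueError("amount must be positive and not exceed the origin lot")
    if transfer_time < origin.timestamp:
        raise ValueError("a transfer cannot precede the acquisition of the funds")
    return InLot(
        unique_id=f"{transfer_id}/artificial",
        timestamp=transfer_time,  # the transfer time, not the purchase time
        exchange=to_exchange,
        spot_price=origin.spot_price,  # same spot price, hence same cost basis
        crypto_in=amount,
        cost_basis_timestamp=origin.acquired_on,  # keep the original acquisition date
    )


# Made-up data: buy on Coinbase, then move 1 coin to Kraken months later.
buy = InLot("1", datetime(2024, 1, 10), "Coinbase", Decimal("10000"), Decimal("2"))
moved = receive_transfer(buy, "2", datetime(2024, 6, 1), "Kraken", Decimal("1"))
print(moved)
```

The artificial lot carries both dates, so the per-wallet queue can order it by arrival time while still knowing when the funds were first acquired.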

Something else I'd like to clarify (related to the "earlier date" thing). Forgive me if this is obvious, and already handled this way, but I'm thinking about it and just want to be sure: When processing steps 1 and 3, my assumption is that transactions should be processed in date order (earliest to latest), regardless of the accounting method used, and any transactions that occurred after the transaction currently being processed should be "invisible" and entirely ignored. For example, if the accounting method is HIFO, and there is a sell on 5/25, the effective HIFO queue of unused basis should only include transactions that occurred prior to the 5/25 transaction for the purpose of assigning basis to the 5/25 sell. As I understand it, this is super important, because if you run RP2 on a set of transactions in the middle of the year, it should allocate basis to sell transactions (that have occurred up to that point) in exactly the same way as if you run it at the end of the year (when additional buy transactions may have occurred). Stated another way, I don't believe it would be correct to build a HIFO queue of all transactions (essentially "basis lots") for a wallet, and then pull from that queue when processing individual sell transactions and ignore date. Sell transactions should only use unused basis with an earlier date, regardless of accounting method.

Right, this is the same as your third bullet above, I believe, and it's correct: this is how RP2 behaves today and I think universal application, global allocation and per-wallet application should continue to behave in the same way.
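
To make the "later transactions are invisible" rule concrete, here is a small sketch (not RP2 code). `Lot`, `Sale` and `pair_hifo` are hypothetical names and the data is made up; it only checks that a mid-year run and an end-of-year run assign the same basis to the same sale.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import List, Tuple


@dataclass
class Lot:
    acquired: date
    spot_price: Decimal
    remaining: Decimal


@dataclass(frozen=True)
class Sale:
    sold: date
    amount: Decimal


def pair_hifo(lots: List[Lot], sales: List[Sale]) -> List[Tuple[date, date, Decimal]]:
    """Return (sale date, lot date, amount) triples, processing sales oldest first."""
    pairs = []
    for sale in sorted(sales, key=lambda s: s.sold):
        needed = sale.amount
        # Lots acquired on or after the sale date are invisible to this sale.
        visible = [lot for lot in lots if lot.acquired < sale.sold and lot.remaining > 0]
        # HIFO: among the visible lots, use the highest spot price first.
        for lot in sorted(visible, key=lambda l: l.spot_price, reverse=True):
            if needed == 0:
                break
            used = min(needed, lot.remaining)
            lot.remaining -= used
            needed -= used
            pairs.append((sale.sold, lot.acquired, used))
        if needed > 0:
            raise ValueError(f"sale on {sale.sold} exceeds the basis available before it")
    return pairs


sales = [Sale(date(2024, 5, 25), Decimal("1"))]

# Run in the middle of the year: only the January buy exists.
mid_year = pair_hifo([Lot(date(2024, 1, 10), Decimal("10000"), Decimal("2"))], sales)

# Run at the end of the year: a later, more expensive buy has been added.
end_of_year = pair_hifo(
    [
        Lot(date(2024, 1, 10), Decimal("10000"), Decimal("2")),
        Lot(date(2024, 9, 1), Decimal("50000"), Decimal("3")),
    ],
    sales,
)
print(mid_year == end_of_year)
```

If the date filter were dropped, the end-of-year run would pair the 5/25 sale with the September lot, which is the behavior the paragraph above warns against.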

@gbtorrance

gbtorrance commented Jan 24, 2025

I am about to leave for a few hours so I don't have time to answer this right now, but I will in the next day or two in a separate message.

Cool. Thanks!

Pairing of transactions in steps 1 and 3 will always pair sell transactions with earlier buy transactions

Correct: that's also how RP2 works today. And I think it should work the same way for steps 1, 2 and 3.

and I think universal application, global allocation and per-wallet application should continue to behave in the same way.

FWIW, though I don't think there is any harm in implicitly including step 2 in this, as I understand it, step 2 doesn't really have the same logic as steps 1 and 3, as there is no pairing of buy and sell transactions in step 2 and, therefore, no requirement to enforce that "only earlier dates are considered". Step 2 is simply using the accounting method to order transactions for global allocation to wallets. (If I'm misunderstanding, please let me know.)

I'm not sure if we understand this in the same way. The way I was planning to do this is (still working on it so it may change):

  • global allocation would be modeled by adding a number of artificial intra-transactions to the universal application data set: these transfers define in detail what the allocation looks like (as if the user actually transferred funds across wallets using FIFO, LIFO, HIFO or LOFO).
  • then during per-wallet application, each intra-transaction (including the artificial ones above) is handled as follows:
    • create a new artificial in-transaction in the "to" per-wallet data set, modeling the reception of funds. This artificial in-transaction has the same spot price (and therefore the same cost basis) as the originating funds, the timestamp of the intra-transaction and an additional field called cost_basis_timestamp, containing the timestamp of the original transaction. So the new artificial in-transactions have both the timestamp at which the transfer occurred and the one at which the funds were originally acquired. The cost_basis_timestamp is important to distinguish long from short-term capital gains.

I hope you don't mind me being super "forward" here, but since you've always seemed very open to feedback, I'll press on:

Though I'm not helping to implement RP2, I do have a software engineering background (mostly with Java), and I have done a decent amount of design and refactoring work in my time. As I read the above, honestly, it makes me nervous. The idea of permanently adding artificial transactions and a second date (cost_basis_timestamp) to the design is the sort of thing that "gives me pause". Though I can't claim to totally understand the approach you're suggesting, if I were in a similar situation this sort of added complexity would cause me to take a step back and re-consider: Is there a better, cleaner way to approach this? Is there a way to do this that will ensure I don't have to deal with this extra "baggage" (the artificial transactions and extra date) forever more? Because with this added complexity comes many more opportunities to miss something and introduce a logic bug. And it could be the sort of bug that might not easily be caught.

I may just be speaking nonsense here, but maybe what I'm saying will at least trigger some useful ideas:

If it were me, I'd take a step back and ask myself, "if I were designing RP2 from scratch, what data structure would I design that would support both universal and per-wallet application in the cleanest, most logical way?" Presumably such a data structure would not include artificial transactions and extra dates. Then I'd do something along the following lines:

  • Refactor the code as necessary to ensure that universal application works correctly with the new data structure.
  • Add any additional logic to support per-wallet application with the new data structure. (Hopefully it wouldn't require a lot of additional code. Ideally universal and per-wallet should be common code with very little custom logic to support the two "applications".)
  • Write "global allocation" as a one-off data transform process that changes the pre-"global allocation" "universal" data into "per-wallet" data. (This could require writing to a temporary file, or simply transforming in-memory.)

The idea here is that the bulk of what RP2 does -- represented by steps 1 ("universal application") and 3 ("per-wallet application") -- should be as clean, easy to understand, unified, and hopefully immune to logic bugs as possible. That code is what you (and other RP2 devs) are going to have to live with long after these "global allocation" changes are in the distant past. Is dealing with artificial transactions and extra dates years from now the best approach? Is there a better alternative? (Anything that's going to require devs to also understand "global allocation" years from now seems less than ideal.)

All of that said, maybe what you have is the cleanest, best approach for supporting the requirements. It may be. (Maybe I just don't understand it well enough.)

You're the expert here! Guess I just want to maybe stir some thought about possible alternatives. Hopefully it's at least somewhat helpful as you plan the next steps.

Thanks for considering.

@gbtorrance

One more comment:

But, to further clarify the point immediately above, in step 3 ("per-wallet application"), the dates used for determining whether buy transactions are earlier than sell transactions should be the dates associated with the basis allocated to the wallet in step 2 ("global allocation"), not the dates when the "physical" coins in that wallet were originally purchased. (As I understand it, those dates are now irrelevant, because global allocation has "re-written" the coins with new dates and associated basis.)

I'm not sure if we understand this in the same way.

To summarize/re-state this, I think that per-wallet application should only ever use one date, and it should be the date associated with the cost basis lots that were assigned to the wallet during global allocation. (Any other date that even exists in the data structure is liable to cause confusion and result in bugs, IMO.)

Thoughts?

@eprbell
Owner Author

eprbell commented Jan 27, 2025

Answers inline below (they reflect my latest understanding of the rules, which has changed since I first wrote that example).

Here's your example as a starting point.

1/23 Buy 2 BTC on CB, spot=$10K
2/23 Sell 1 BTC on CB, spot=$20K
1/24 Buy 3 BTC on Kraken, spot=$50K
2/24 Sell 1 BTC on CB, spot=$60K

In Step 1 ("universal application") with HIFO the 2/23 sell on CB should be paired with the 1/23 buy on CB for the purpose of determining the cost basis attributed to the sell. (No other transactions have occurred yet, and sells have to be paired with earlier buys.) Then the 2/24 sell on CB should be paired with the 1/24 buy on Kraken, as that has the highest unused basis at that point in time.

What remains for unused basis should be the following:

1 x BTC, spot=$10K, originally bought 1/23 on CB
2 x BTC, spot=$50K, originally bought 1/24 on Kraken

What remains for actual coins on actual wallets should be the following:

0 x BTC on CB
3 x BTC on Kraken

Now it's time to run Step 2 ("global allocation"). You've said you want to allocate first to CB, then Kraken, right? There are no coins remaining on CB, so all unused basis should go to the coins on Kraken. But to step this through methodically (assuming HIFO as the accounting method) you'd first allocate the 2 x BTC bought for $50K to two of the coins on Kraken, and then you'd allocate the 1 x BTC bought for $10K to the remaining coin on Kraken.

Correct so far.
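
The allocation step just walked through can be sketched as follows. This is only an illustration under stated assumptions: `BasisLot` and `allocate_hifo` are hypothetical names (not RP2's implementation), and the "1/23" and "1/24" dates are read as January 2023 and January 2024, with the day of the month chosen arbitrarily.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Dict, List


@dataclass(frozen=True)
class BasisLot:
    acquired: date
    spot_price: Decimal
    amount: Decimal


def allocate_hifo(lots: List[BasisLot], balances: Dict[str, Decimal],
                  wallet_order: List[str]) -> Dict[str, List[BasisLot]]:
    """Assign unused basis lots (highest price first) to wallets, in wallet_order."""
    if sum(lot.amount for lot in lots) != sum(balances.values()):
        raise ValueError("unused basis must match the coins actually held")
    # Each entry is [lot, amount of the lot not yet allocated].
    queue = [[lot, lot.amount] for lot in sorted(lots, key=lambda l: l.spot_price, reverse=True)]
    result: Dict[str, List[BasisLot]] = {wallet: [] for wallet in wallet_order}
    for wallet in wallet_order:
        to_fill = balances[wallet]
        for entry in queue:
            if to_fill == 0:
                break
            lot, left = entry
            if left == 0:
                continue
            used = min(left, to_fill)
            entry[1] -= used
            to_fill -= used
            # The date and the basis stay together when the lot moves to a wallet.
            result[wallet].append(BasisLot(lot.acquired, lot.spot_price, used))
    return result


unused = [
    BasisLot(date(2023, 1, 1), Decimal("10000"), Decimal("1")),  # bought 1/23 on CB
    BasisLot(date(2024, 1, 1), Decimal("50000"), Decimal("2")),  # bought 1/24 on Kraken
]
held = {"CB": Decimal("0"), "Kraken": Decimal("3")}
allocation = allocate_hifo(unused, held, ["CB", "Kraken"])
print(allocation)
```

With these inputs CB receives nothing and Kraken receives the 2 x $50K lot first, then the 1 x $10K lot.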

As I understand it, the original purchase dates and basis should remain associated in a fixed relationship with one another. Even though the 3 coins that remain on Kraken were purchased on 1/24, once step 2 ("global allocation") is complete, that date is essentially irrelevant. The dates associated with the coins post-"global allocation" should be the dates associated with the unused basis that has been allocated to that wallet.

So, after global allocation, Kraken should essentially hold the following:

1 x BTC, spot=$10K, bought 1/23 (the fact it was bought on CB is now irrelevant and ancient history)
2 x BTC, spot=$50k, bought 1/24 (this was coincidentally bought on Kraken, but again, not relevant)

No, I think the original date of purchase is still relevant, because it allows RP2 to distinguish long vs short-term capital gains. I think global allocation should not have the effect of resetting the type of gains. So the new per-wallet model keeps both dates around.

For step 3 ("per-wallet application"), say for example the user wanted to use FIFO (which may be a requirement for U.S. users) as the accounting method. At this point if there were a sell of 2 x BTC on 1/25, the first basis to be used would be from the 1 x BTC at $10k as it's "first in" at 1/23. And the second basis to be used would be from the 2 x BTC at $50K as it is "second in" at 1/24.

Correct.

What would remain in the Kraken wallet subsequent to the 1/25 sell would be:

1 x BTC, spot=$50k, bought 1/24

Correct.
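
For reference, the step 3 FIFO example above works out as in this sketch. `BasisLot` and `sell_fifo` are hypothetical names, the dates are read as month/year with an arbitrary day of the month, and ordering the lots by their original acquisition date is an assumption taken from the example itself rather than a settled design decision.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import List, Tuple


@dataclass(frozen=True)
class BasisLot:
    acquired: date
    spot_price: Decimal
    amount: Decimal


def sell_fifo(lots: List[BasisLot], sold_on: date,
              amount: Decimal) -> Tuple[Decimal, List[BasisLot]]:
    """Return (cost basis of the sale, lots left in the wallet) using FIFO."""
    cost_basis = Decimal(0)
    needed = amount
    remaining = []
    for lot in sorted(lots, key=lambda l: l.acquired):  # first in, first out
        usable = lot.acquired < sold_on  # only earlier basis can be used
        used = min(needed, lot.amount) if usable else Decimal(0)
        cost_basis += used * lot.spot_price
        needed -= used
        if lot.amount > used:
            remaining.append(BasisLot(lot.acquired, lot.spot_price, lot.amount - used))
    if needed > 0:
        raise ValueError("not enough earlier basis in this wallet")
    return cost_basis, remaining


# Kraken after global allocation, as listed in the example.
kraken = [
    BasisLot(date(2024, 1, 1), Decimal("50000"), Decimal("2")),
    BasisLot(date(2023, 1, 1), Decimal("10000"), Decimal("1")),
]
basis, left = sell_fifo(kraken, date(2025, 1, 1), Decimal("2"))
print(basis, left)
```

The sale of 2 BTC consumes the $10K lot and one coin of the $50K lot (cost basis $60K), leaving 1 BTC at $50K bought 1/24.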

@gbtorrance

No, I think the original date of purchase is still relevant, because it allows RP2 to distinguish long vs short-term capital gains. I think global allocation should not have the effect of resetting the type of gains. So the new per-wallet model keeps both dates around.

Let's be clear about the difference between the original purchase date associated with a "cost basis lot" (always a fixed date+basis pair) and the original purchase date associated with a "physical coin". (Forgive my odd use of "physical" here, but it's how I think of an actual coin residing in an actual wallet.)

I believe we disagree here. You're saying that the original purchase date of the "physical coin" is relevant for determining long-term vs. short term, right? I disagree. All that should matter is the original purchase date of the cost basis lot that is assigned to the coin. If it were as you're saying, that would enable the taxpayer to massively affect how much they pay in taxes by reassigning cost basis (and therefore gains/losses when selling) between "physical coins" with different original purchase dates, effectively turning large short-term gains into long-term gains (or vice versa). I can't imagine the IRS ever being OK with that.

Am I missing something?

@eprbell
Owner Author

eprbell commented Jan 27, 2025

Pairing of transactions in steps 1 and 3 will always pair sell transactions with earlier buy transactions

Correct: that's also how RP2 works today. And I think it should work the same way for steps 1, 2 and 3.

and I think universal application, global allocation and per-wallet application should continue to behave in the same way.

FWIW, though I don't think there is any harm in implicitly including step 2 in this, as I understand it, step 2 doesn't really have the same logic as steps 1 and 3, as there is no pairing of buy and sell transactions in step 2 and, therefore, no requirement to enforce that "only earlier dates are considered". Step 2 is simply using the accounting method to order transactions for global allocation to wallets. (If I'm misunderstanding, please let me know.)

Yes, this is correct. I think, semantically speaking, global allocation is equivalent to a bunch of zero-fee transfers at the end of the year to reallocate funds according to the allocation method.

Though I'm not helping to implement RP2, I do have a software engineering background (mostly with Java), and I have done a decent amount of design and refactoring work in my time. As I read the above, honestly, it makes me nervous. The idea of permanently adding artificial transactions and a second date (cost_basis_timestamp) to the design is the sort of thing that "gives me pause". Though I can't claim to totally understand the approach you're suggesting, if I were in a similar situation this sort of added complexity would cause me to take a step back and re-consider: Is there a better, cleaner way to approach this? Is there a way to do this that will ensure I don't have to deal with this extra "baggage" (the artificial transactions and extra date) forever more? Because with this added complexity comes many more opportunities to miss something and introduce a logic bug. And it could be the sort of bug that might not easily be caught.

I may just be speaking nonsense here, but maybe what I'm saying will at least trigger some useful ideas:

If it were me, I'd take a step back and ask myself, "if I were designing RP2 from scratch, what data structure would I design that would support both universal and per-wallet application in the cleanest, most logical way?" Presumably such a data structure would not include artificial transactions and extra dates. Then I'd do something along the following lines:

* Refactor the code as necessary to ensure that universal application works correctly with the new data structure.

* Add any additional logic to support per-wallet application with the new data structure. (Hopefully it wouldn't require a lot of additional code. Ideally universal and per-wallet should be common code with very little custom logic to support the two "applications".)

* Write "global allocation" as a one-off data transform process that changes the pre-"global allocation" "universal" data into "per-wallet" data. (This could require writing to a temporary file, or simply transforming in-memory.)

The idea here is that the bulk of what RP2 does -- represented by steps 1 ("universal application") and 3 ("per-wallet application") -- should be as clean, easy to understand, unified, and hopefully immune to logic bugs as possible. That code is what you (and other RP2 devs) are going to have to live with long after these "global allocation" changes are in the distant past. Is dealing with artificial transactions and extra dates years from now the best approach? Is there a better alternative? (Anything that's going to require devs to also understand "global allocation" years from now seems less than ideal.)

All of that said, maybe what you have is the cleanest, best approach for supporting the requirements. It may be. (Maybe I just don't understand it well enough.)

You're the expert here! Guess I just want to maybe stir some thought about possible alternatives. Hopefully it's at least somewhat helpful as you plan the next steps.

These are fairly generic statements. The design already has over 100 hours of thinking that went into it and, while I'm not saying it's perfect, it's the simplest, cleanest I could make it so far. I'm sure it will still change, based on my evolving understanding and feedback from others. However, any change at this point will require pinpointing a specific part of the design and giving a strong reason that justifies the change.

BTW, I do need to update the design Wiki document with the latest developments: it's a bit outdated at this point.

The reasons the design adds extra objects and attributes are to capture concepts that are not covered otherwise, once we add support for the per-wallet model:

  • The universal model doesn't have (or need) an artificial InTransaction capturing the reception of funds after a transfer: the per-wallet model does, because otherwise when operating on a per-wallet basis the RP2 tax engine would run into this error.
  • The new original cost basis timestamp is needed because otherwise RP2 would not be able to distinguish long vs short-term capital gains after global allocation.
  • The artificial IntraTransactions added by global allocation are the cleanest way to capture the semantics of that operation: no need to add much ad-hoc logic there, just use the IntraTransaction and AccountingEngine classes we already have.

If you have time/interest, I would encourage you to look at the tests to convince yourself if the design (and implementation) work or not. They are quite easy to read and don't require familiarity with the internals: they are a set of tables, each of which shows inputs to a particular function (like transfer analysis or global allocation) and expected output. So far I uploaded the transfer analysis ones (see tests/test_transfer_analysis_semantics_dependent.py and tests/test_transfer_analysis_semantics_independent.py in #138), but global allocation is coming.

This way you could:

  • validate the design and implementation,
  • find bugs (if you notice a test isn't doing the right thing),
  • improve the CI, by suggesting new tests.

@eprbell
Owner Author

eprbell commented Jan 27, 2025

No, I think the original date of purchase is still relevant, because it allows RP2 to distinguish long vs short-term capital gains. I think global allocation should not have the effect of resetting the type of gains. So the new per-wallet model keeps both dates around.

Let's be clear about the difference between the original purchase date associated with a "cost basis lot" (always a fixed date+basis pair) and the original purchase date associated with a "physical coin". (Forgive my odd use of "physical" here, but it's how I think of an actual coin residing in an actual wallet.)

I believe we disagree here. You're saying that the original purchase date of the "physical coin" is relevant for determining long-term vs. short term, right? I disagree. All that should matter is the original purchase date of the cost basis lot that is assigned to the coin. If it were as you're saying, that would enable the taxpayer to massively affect how much they pay in taxes by reassigning cost basis (and therefore gains/losses when selling) between "physical coins" with different original purchase dates, effectively turning large short-term gains into long-term gains (or vice versa). I can't imagine the IRS ever being OK with that.

Am I missing something?

Not sure I'm following, could you produce an example so it's easier to reason about this?

@gbtorrance

gbtorrance commented Jan 27, 2025

Thanks for putting up with all my "devil's advocate" messages. I'm genuinely not trying to be difficult for the fun of it, but to help with considering everything from different angles. (We all have blind spots.)

Not sure I'm following, could you produce an example so it's easier to reason about this?

I was actually in the process of doing so when you replied :-) But it took me a while to think it through more thoroughly myself in order to come up with a reasonable example:

1/1/24 Bought 1 BTC on Coinbase for $40k
9/1/24 Bought 1 BTC on Kraken for $20k
2/1/25 Sold 1 BTC on Coinbase for $60k
2/1/25 Sold 1 BTC on Kraken for $60k

If you'll bear with me creating some new terminology for the purposes of discussion, let's say we have:

  • "physical date": refers to the original purchase date of a coin in a particular wallet (unaffected by global allocation).
  • "basis date": refers to the original purchase date associated with cost basis $ in a "basis lot"; a basis lot can be moved to a different wallet as part of global allocation, but the basis date and the cost basis $ will always remain associated in a fixed relationship to one another as part of the "basis lot".

So, with the above terminology, at the end of the year 2024 we'd have the following:

1 BTC on Coinbase (Physical date: 1/1/24; Basis date: 1/1/24; Basis lot: 1/1/24:$40k)
1 BTC on Kraken (Physical date: 9/1/24; Basis date: 9/1/24; Basis lot: 9/1/24:$20k)

If we completely ignore global allocation for now and just consider the 2/1/25 sell, I think we can probably agree that the sell of the 1 BTC on Coinbase will result in a long-term gain of $20k ($60k-$40k) and the sell of the 1 BTC on Kraken will result in a short-term gain of $40k ($60k-$20k).

But now let's add global allocation into the mix. And let's say, for example, that global allocation in this case results in the "swapping" of basis lots between Coinbase and Kraken. Global allocation runs at the end of 2024 (for U.S. taxpayers), so at the beginning of 2025 we'd have this:

1 BTC on Coinbase (Physical date: 1/1/24; Basis date: 9/1/24; Basis lot: 9/1/24:$20k)
1 BTC on Kraken (Physical date: 9/1/24; Basis date: 1/1/24; Basis lot: 1/1/24:$40k)

Now what happens with the sell on 2/1/25?

If I'm understanding what you're saying correctly, the "physical date" would be used to determine long-term vs. short-term capital gains. So the 1 BTC on Coinbase would be sold for a long-term gain of $40k ($60k-$20k) and the 1 BTC on Kraken would be sold for a short-term gain of $20k ($60k-$40k).

I don't think this is correct, as it would allow the taxpayer to significantly manipulate how much they pay in taxes. Rather than paying LT $20k and ST $40k (as in the earlier example), now they're paying LT $40k and ST $20k.

The way I see it, "physical date" is not relevant. The only date that should matter is "basis date" (including for determining long-term vs. short-term capital gains).

If that is the case, then the 1 BTC on Coinbase would be sold for a short-term gain of $40k ($60k-$20k) and the 1 BTC on Kraken would be sold for a long-term gain of $20k ($60k-$40k). From a tax perspective, that's exactly the situation further up in this post before we considered global allocation. And I think that's correct, because it doesn't allow the taxpayer to significantly change what they pay in taxes simply by swapping around some dates as part of global allocation. (Keep in mind that, in all these examples, all the BTC has been sold, so this is the final picture from a tax perspective.)
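
The two readings compared above can be checked with a short sketch. `Holding`, `classify` and `totals` are hypothetical names, and the one-year test (including its leap-day handling) is a simplification made for illustration, not tax advice or RP2 behavior.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Holding:
    wallet: str
    physical_date: date  # when the coin in this wallet was originally bought
    basis_date: date     # purchase date that travels with the basis lot
    basis: Decimal


def one_year_after(day: date) -> date:
    try:
        return day.replace(year=day.year + 1)
    except ValueError:  # Feb 29: assume Feb 28 of the following year
        return day.replace(year=day.year + 1, day=28)


def classify(holding: Holding, sold_on: date, proceeds: Decimal,
             use_basis_date: bool) -> Tuple[str, Decimal]:
    acquired = holding.basis_date if use_basis_date else holding.physical_date
    # Simplified rule: long-term only if held for more than one year.
    term = "long" if sold_on > one_year_after(acquired) else "short"
    return term, proceeds - holding.basis


def totals(holdings: List[Holding], sold_on: date, proceeds: Decimal,
           use_basis_date: bool) -> Dict[str, Decimal]:
    result = {"long": Decimal(0), "short": Decimal(0)}
    for holding in holdings:
        term, gain = classify(holding, sold_on, proceeds, use_basis_date)
        result[term] += gain
    return result


# State at the beginning of 2025, after the basis lots were swapped.
after_swap = [
    Holding("Coinbase", date(2024, 1, 1), date(2024, 9, 1), Decimal("20000")),
    Holding("Kraken", date(2024, 9, 1), date(2024, 1, 1), Decimal("40000")),
]
sold_on, price = date(2025, 2, 1), Decimal("60000")
by_basis_date = totals(after_swap, sold_on, price, use_basis_date=True)
by_physical_date = totals(after_swap, sold_on, price, use_basis_date=False)
print(by_basis_date, by_physical_date)
```

Using the basis date gives LT $20k and ST $40k, the same as without global allocation; using the physical date gives LT $40k and ST $20k.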

Does this make sense? Am I understanding what you were saying? Thoughts? Thanks.


One more addition to this: I think "basis date" should also be used for ordering transactions for accounting method, when a date is applicable (such as for FIFO or LIFO).

@gbtorrance

These are fairly generic statements. The design already has over 100 hours of thinking that went into it and, while I'm not saying it's perfect, it's the simplest, cleanest I could make it so far. I'm sure it will still change, based on my evolving understanding and feedback from others, however any change at this point will require precise pinpointing at a part of it and a strong reason justifying the change.

Totally fair! Definitely didn't mean to minimize how much work has gone into all of this.

BTW, I do need to update the design Wiki document with the latest developments: it's a bit outdated at this point.

If you have time/interest, I would encourage you to look at the tests to convince yourself if the design (and implementation) work or not.

I would definitely like to help, and I can maybe try looking at the tests, but knowing the way my brain works I think that would be a struggle. I really need to get a top-down understanding of the design first (not just for these changes, but for RP2 as a whole) before I can properly dig into the tests.

I'll dig in some more and see if I can get a better high-level understanding of how RP2 is designed. (If there are any particular resources you can point me to, that would be helpful. But either way I'll investigate...)

@eprbell
Owner Author

eprbell commented Jan 28, 2025

If you have time/interest, I would encourage you to look at the tests to convince yourself if the design (and implementation) work or not.

I would definitely like to help, and I can maybe try looking at the tests, but knowing the way my brain works I think that would be a struggle. I really need to get a top-down understanding of the design first (not just for these changes, but for RP2 as a whole) before I can properly dig into the tests.

I'll dig in some more and see if I can get a better high-level understanding of how RP2 is designed. (If there are any particular resources you can point me to, that would be helpful. But either way I'll investigate...)

The pointers for the new per-wallet semantics are the Wiki design (in need of update), the code and this discussion. RP2 dev docs are at: https://github.com/eprbell/rp2/blob/main/README.dev.md

However, my point is that understanding the design, while helpful, should not be necessary for analyzing the tests: what's needed is an understanding of the tax rules. For example, if you understand how the process of transforming a universal queue to multiple per-wallet queues is supposed to work at a high level, you can look at the transfer analysis tests and find out if the code works as expected or not.

For example, let's consider the first test in tests/test_transfer_analysis_semantics_dependent.py in #138 (it is part of the test_transfer_analysis_success_using_multiple_accounts_and_fifo function, which uses FIFO as transfer semantics):

            _Test(
                description="Interlaced in and intra transactions",
                input=[
                    InTransactionDescriptor("1", 1, 1, "Coinbase", "Bob", 110, 10),
                    IntraTransactionDescriptor("2", 2, 2, "Coinbase", "Bob", "Kraken", "Bob", 120, 4, 4),
                    InTransactionDescriptor("3", 3, 3, "Coinbase", "Bob", 130, 4),
                    IntraTransactionDescriptor("4", 4, 4, "Coinbase", "Bob", "Kraken", "Bob", 140, 10, 10),
                ],
                want={
                    Account("Coinbase", "Bob"): [
                        InTransactionDescriptor("1", 1, 1, "Coinbase", "Bob", 110, 10, to_lot_unique_ids={Account("Kraken", "Bob"): ["2/-1", "4/-2"]}),
                        IntraTransactionDescriptor("2", 2, 2, "Coinbase", "Bob", "Kraken", "Bob", 120, 4, 4),
                        InTransactionDescriptor("3", 3, 3, "Coinbase", "Bob", 130, 4, to_lot_unique_ids={Account("Kraken", "Bob"): ["4/-3"]}),
                        IntraTransactionDescriptor("4", 4, 4, "Coinbase", "Bob", "Kraken", "Bob", 140, 10, 10),
                    ],
                    Account("Kraken", "Bob"): [
                        InTransactionDescriptor("2/-1", 2, -1, "Kraken", "Bob", 110, 4, from_lot_unique_id="1", cost_basis_day=1),
                        InTransactionDescriptor("4/-2", 4, -2, "Kraken", "Bob", 110, 6, from_lot_unique_id="1", cost_basis_day=1),
                        InTransactionDescriptor("4/-3", 4, -3, "Kraken", "Bob", 130, 4, from_lot_unique_id="3", cost_basis_day=3),
                    ],
                },
                want_error="",
            ),

It's compact, self-documenting and hopefully not too hard to read:

  • description describes what the test does;
  • input contains the transactions passed as input to transfer_analysis (a single queue from universal application). The first three numbers in the transaction descriptors are:
    • transaction unique id,
    • day from the beginning of the year (the test code derives the transaction timestamp from this): you can consider this as the timestamp,
    • row number in the imaginary .odt file we read data from.
  • want contains the results we expect to get after running transfer analysis on the input data: this would be a set of transaction lists (one per account or wallet).
  • want_error is used in tests that have bad input and therefore generate an error. This is mutually exclusive with the want field (if one is defined the other is empty).

Here's how we would reason about this test:

  • The universal queue contains 4 transactions: 2 in (on Coinbase) and 2 intra (from Coinbase to Kraken).
  • The two in-transactions are chronologically interlaced with the two intra-transactions.
  • Two accounts are referenced (Coinbase and Kraken), so we expect two lists of output transactions: one for Coinbase and one for Kraken.
  • The Coinbase account would have the two in-transactions (unchanged) and the two intra-transactions (also unchanged).
  • The Kraken account would have some artificial in-transactions to represent the new funds that arrived from Coinbase:
    • the first one would have day 2 as timestamp (because the intra-transaction occurred on day 2 and that's when the funds become available on Kraken), a cost basis day of 1 (because that's the day in which the funds were acquired via the original CB in-transaction), and an amount of 4 coins (the amount transferred with the intra-transaction);
    • the second one would have day 4 as timestamp (because the intra-transaction occurred on day 4 and that's when the funds become available on Kraken), a cost basis day of 1 (because that's the day in which the funds were acquired via the original CB in-transaction), and an amount of 6 coins (the amount transferred with the second intra-transaction is actually 10, but 6 is the leftover coins from the in-transaction on day 1)
    • the third one would have day 4 as timestamp (because the intra-transaction is from day 4 and that's when the funds become available on Kraken), a cost basis day of 3 (because that's the day in which the funds were acquired via the original CB in-transaction), and an amount of 4 coins (which are now coming from the second CB in-transaction, unlike previous funds which came from the first).

There are a few more fields in the descriptors but they are not essential for the purposes of this exercise (and I want to keep it as simple as possible).

This test is repeated once for each of the 4 supported accounting methods. If we were to look at the same test in function test_transfer_analysis_success_using_multiple_accounts_and_lifo, which uses LIFO as the transfer analysis method instead of FIFO, the input is the same, but the want field is a bit different from the FIFO case:

                want={
                    Account("Coinbase", "Bob"): [
                        InTransactionDescriptor("1", 1, 1, "Coinbase", "Bob", 110, 10, to_lot_unique_ids={Account("Kraken", "Bob"): ["2/-1", "4/-3"]}),
                        IntraTransactionDescriptor("2", 2, 2, "Coinbase", "Bob", "Kraken", "Bob", 120, 4, 4),
                        InTransactionDescriptor("3", 3, 3, "Coinbase", "Bob", 130, 4, to_lot_unique_ids={Account("Kraken", "Bob"): ["4/-2"]}),
                        IntraTransactionDescriptor("4", 4, 4, "Coinbase", "Bob", "Kraken", "Bob", 140, 10, 10),
                    ],
                    Account("Kraken", "Bob"): [
                        InTransactionDescriptor("2/-1", 2, -1, "Kraken", "Bob", 110, 4, from_lot_unique_id="1", cost_basis_day=1),
                        InTransactionDescriptor("4/-2", 4, -2, "Kraken", "Bob", 130, 4, from_lot_unique_id="3", cost_basis_day=3),
                        InTransactionDescriptor("4/-3", 4, -3, "Kraken", "Bob", 110, 6, from_lot_unique_id="1", cost_basis_day=1),
                    ],
                },

The Coinbase account is the same, but the Kraken one is a bit shuffled compared to its FIFO counterpart, due to LIFO.
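
To make the FIFO/LIFO difference concrete, here is a simplified sketch of the reasoning (it is not the RP2 implementation). Lot, transfer_analysis_sketch and the tuple-based event format are hypothetical names invented for this example; spot prices and fees are ignored, and only the artificial lots created at the destination account are returned:

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Lot:
    unique_id: str
    cost_basis_day: int
    remaining: int  # coins of this lot still held in the source account


def transfer_analysis_sketch(events: List[tuple], method: str) -> Dict[str, List[Tuple[str, int, int, str, int]]]:
    """events are chronological: ("in", id, day, account, amount) or
    ("intra", id, day, from_account, to_account, amount)."""
    lots: Dict[str, List[Lot]] = {}  # per-account lots, in acquisition order
    artificial: Dict[str, List[Tuple[str, int, int, str, int]]] = {}
    counter = 0
    for event in events:
        if event[0] == "in":
            _, unique_id, day, account, amount = event
            lots.setdefault(account, []).append(Lot(unique_id, day, amount))
            continue
        _, unique_id, day, source, destination, amount = event
        candidates = lots.get(source, [])
        # FIFO drains the oldest lot first, LIFO the newest one.
        ordered = candidates if method == "fifo" else list(reversed(candidates))
        left = amount
        for lot in ordered:
            if left == 0:
                break
            taken = min(lot.remaining, left)
            if taken == 0:
                continue
            lot.remaining -= taken
            left -= taken
            counter += 1
            # (artificial id, timestamp day, amount, from_lot_unique_id, cost_basis_day)
            artificial.setdefault(destination, []).append(
                (f"{unique_id}/-{counter}", day, taken, lot.unique_id, lot.cost_basis_day)
            )
        if left:
            raise ValueError(f"transfer {unique_id} exceeds the funds in {source}")
    return artificial


events = [
    ("in", "1", 1, "Coinbase", 10),
    ("intra", "2", 2, "Coinbase", "Kraken", 4),
    ("in", "3", 3, "Coinbase", 4),
    ("intra", "4", 4, "Coinbase", "Kraken", 10),
]
fifo = transfer_analysis_sketch(events, "fifo")["Kraken"]
# [("2/-1", 2, 4, "1", 1), ("4/-2", 4, 6, "1", 1), ("4/-3", 4, 4, "3", 3)]
lifo = transfer_analysis_sketch(events, "lifo")["Kraken"]
# [("2/-1", 2, 4, "1", 1), ("4/-2", 4, 4, "3", 3), ("4/-3", 4, 6, "1", 1)]
```

On day 2 only the first lot exists, so both methods agree. On day 4 LIFO drains the day-3 lot before going back to the leftover 6 coins from day 1, which is why the last two Kraken entries are swapped.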

Anyway, this is a long post, but hopefully it's helpful for anybody (including non programmers) who wants to:

  • verify the code is working as expected (or not);
  • improve the CI, by proposing new tests covering new scenarios.

@gbtorrance

For example, if you understand how the process of transforming a universal queue to multiple per-wallet queues is supposed to work at a high level, you can look at the transfer analysis tests and find out if the code works as expected or not.

Thank you for this. I read through it a few times (and likely will a few more), and it is very helpful. I think I have a better understanding of how these particular tests work, and can hopefully spend some time going through them in more detail in the days ahead.

However my suggestion is that hopefully understanding the design, while helpful, is not necessary for analyzing the tests

I get that, and I don't mean to try and push you into doing more detailed documentation at this stage (as I realize you've got a lot on your plate, and documentation probably isn't top priority right now). However, it would help me a lot if I could understand the big picture just a little better, as I'm struggling to grasp how the high-level pieces of logic and data work together (or are intended to work together once this is all in place):

I think of the new RP2 processing as being made up of 3 sequential steps: step 1 ("universal application"), step 2 ("global allocation"), and step 3 ("per-wallet application").

Is the per-wallet transfer logic (tested by the above tests) only going to be used in step 3? (I think so.)

Can you tell me what the "universal queue" looks like? Can you tell me where I'd find that in the source? Is it just a list of unused basis lots, or is it an ordered list of all InTransactions, IntraTransactions, and OutTransactions?

And how do you know (in step 1) how much of each coin is in each wallet at any point in time? (That is known based on the IntraTransactions, right?)

Would you mind briefly describing the big picture of how this processing will occur through these 3 steps, and how the main data structures will be used?

Assume you have as input a set of RP2 spreadsheets with data from 2022 through 2026. And assume the user has configured to use LIFO for step 1 ("universal application"), HIFO for step 2 ("global allocation"), and FIFO for step 3 ("per-wallet application").

As a starting point I'll try to describe what I understand (or think I understand):

In step 1 you'd want to apply cost basis to sell transactions from the universal queue using LIFO. I assume this is all existing logic and data structures (i.e. no artificial InTransactions or "cost basis dates"), right? And you should be left with a list of unused basis for each coin, and knowledge of how many coins reside in each wallet (based on processing the IntraTransactions). Is that right?

Step 2 ("global allocation") -- which would be processed as at end of day 12/31/24 (for U.S. taxpayers) -- is the most vague for me. You'd use HIFO to allocate the unused basis to the coins in each wallet. But how? Does this involve creating artificial InTransactions and setting "cost basis date" as a starting point for subsequent processing that will be done in step 3?

And then in step 3 ("per-wallet application") I think I have a high level understanding of how that would work going forward. Obviously the accounting method would be FIFO (in this example), and it would be applied from starting point of what is output by step 2, right?

If you could help me put these various pieces together in my mind -- without needing to write a book -- I'd really appreciate it.

Thanks for putting up with all of my messages. I feel like I've probably been quite exasperating at times. (Sorry about that!) I do feel like I'm getting a better understanding, though, and I'm hoping to be able to put it to use in the days ahead (hopefully without needing to bother you as much). Thanks again!

@eprbell
Owner Author

eprbell commented Jan 29, 2025

Answers inline below.

However my suggestion is that hopefully understanding the design, while helpful, is not necessary for analyzing the tests

I get that, and I don't mean to try and push you into doing more detailed documentation at this stage (as I realize you've got a lot on your plate, and documentation probably isn't top priority right now). However, it would help me a lot if I could understand the big picture just a little better, as I'm struggling to grasp how the high-level pieces of logic and data work together (or are intended to work together once this is all in place):

Yes, I understand. Current priority is finishing global allocation, then updating the tax engine to support the new per-wallet model. I'll try to update the design doc but not sure when I'll be able to. Thanks for the words of appreciation. I'm also grateful for your engagement: it was very helpful in shaping my understanding of transfer analysis (you found a couple of issues in the way I initially understood it).

I think of the new RP2 processing as being made up of 3 sequential steps: step 1 ("universal application"), step 2 ("global allocation"), and step 3 ("per-wallet application").

Yes, with the extra complication that it will be possible to also go back from 3 to 1: the US plug-in won't support that but the RP2 tax engine will allow other country plug-ins to do so if they want.

Is the per-wallet transfer logic (tested by the above tests) only going to be used in step 3? (I think so.)

Yes. And these tests exercise precisely step 3, which receives the universal data that is output by global allocation and transforms it into a set of per-wallet data.

Can you tell me what the "universal queue" looks like? Can you tell me where I'd find that in the source? Is it just a list of unused basis lots, or is it an ordered list of all InTransactions, IntraTransactions, and OutTransactions?

It's InputData in the code.

And how do you know (in step 1) how much of each coin is in each wallet at any point in time? (That is known based on the IntraTransactions, right?)

All the accounting is managed by the accounting engine, which splits taxable events and acquired lots into fractions and pairs them, keeping track of partial amounts as well.

Would you mind briefly describing the big picture of how this processing will occur through these 3 steps, and how the main data structures will be used?

That's in the Wiki design document (needs updating). Unfortunately it can't be done briefly.

Assume you have as input a set of RP2 spreadsheets with data from 2022 through 2026. And assume the user has configured to use LIFO for step 1 ("universal application"), HIFO for step 2 ("global allocation"), and FIFO for step 3 ("per-wallet application").

As a starting point I'll try to describe what I understand (or think I understand):

In step 1 you'd want to apply cost basis to sell transactions from the universal queue using LIFO. I assume this is all existing logic and data structures (i.e. no artificial InTransactions or "cost basis dates"), right?

Yes: the entry point for tax computation is the tax engine. There are artificial transactions in the current (universal-application-only) version of RP2, but they are used to model other tax situations.

And you should be left with a list of unused basis for each coin, and knowledge of how many coins reside in each wallet (based on processing the IntraTransactions). Is that right?

Yes. You're left with a partial amount for each acquired lot, containing the amount left in it (an InTransaction may be matched only partially to a tax event by the accounting engine, so the remainder needs to be tracked at the in-transaction level).

Step 2 ("global allocation") -- which would be processed as at end of day 12/31/24 (for U.S. taxpayers) -- is the most vague for me. You'd use HIFO to allocate the unused basis to the coins in each wallet. But how? Does this involve creating artificial InTransactions and setting "cost basis date" as a starting point for subsequent processing that will be done in step 3?

I have a first, unfinished version of global allocation that correctly processes a simple test. The rough idea is:

  • compute per-wallet balances using BalanceSet;
  • loop over the wallets in the order given by the user;
    • extract the next acquired lot from the universal queue using the accounting method and use it to create an artificial zero-fee intra-transaction from the wallet of the current acquired lot to the current wallet from the loop;
    • repeat until the sum of all artificially transferred funds matches the balance for the current wallet, then move to the next wallet.
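
The steps above could be sketched roughly as follows. This is only an illustration of the idea, not the unfinished RP2 code: AcquiredLot and global_allocation_sketch are hypothetical names, a plain dict stands in for BalanceSet, and the accounting method is reduced to a sort key (highest price first for HIFO in the example):

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class AcquiredLot:
    unique_id: str
    wallet: str
    price: float
    remaining: float  # partial amount left after universal application


def global_allocation_sketch(
    lots: List[AcquiredLot],
    balances: Dict[str, float],  # stand-in for the per-wallet balances
    wallet_order: List[str],     # order given by the user
    key: Callable[[AcquiredLot], float],
) -> List[Tuple[str, str, str, float]]:
    """Return artificial zero-fee transfers as (lot id, from wallet, to wallet, amount).
    Note: the remaining amount of each lot is consumed in place."""
    queue = sorted(lots, key=key)  # order in which the method hands out lots
    transfers: List[Tuple[str, str, str, float]] = []
    position = 0
    for wallet in wallet_order:
        needed = balances[wallet]
        while needed > 0:
            # Skip lots that are already fully allocated.
            while position < len(queue) and queue[position].remaining == 0:
                position += 1
            if position == len(queue):
                raise ValueError("balances exceed the unused basis")
            lot = queue[position]
            taken = min(lot.remaining, needed)
            lot.remaining -= taken
            needed -= taken
            # The sketch does not special-case a lot that already sits in this wallet.
            transfers.append((lot.unique_id, lot.wallet, wallet, taken))
    return transfers


lots = [
    AcquiredLot("A", "Coinbase", 100.0, 5.0),
    AcquiredLot("B", "Kraken", 300.0, 3.0),
    AcquiredLot("C", "Coinbase", 200.0, 2.0),
]
balances = {"Coinbase": 4.0, "Kraken": 6.0}
hifo_transfers = global_allocation_sketch(
    lots, balances, ["Coinbase", "Kraken"], key=lambda lot: -lot.price
)
# [("B", "Kraken", "Coinbase", 3.0), ("C", "Coinbase", "Coinbase", 1.0),
#  ("C", "Coinbase", "Kraken", 1.0), ("A", "Coinbase", "Kraken", 5.0)]
```

Coinbase comes first in the user-given order, so it receives the highest-priced lots; Kraken gets whatever the method hands out next, until its balance is covered.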

And then in step 3 ("per-wallet application") I think I have a high level understanding of how that would work going forward. Obviously the accounting method would be FIFO (in this example), and it would be applied from starting point of what is output by step 2, right?

Yes.

If you could help me put these various pieces together in my mind -- without needing to write a book -- I'd really appreciate it.

Thanks for putting up with all of my messages. I feel like I've probably been quite exasperating at times. (Sorry about that!) I do feel like I'm getting a better understanding, though, and I'm hoping to be able to put it to use in the days ahead (hopefully without needing to bother you as much). Thanks again!

No worries!

@gbtorrance

Thank you! This is great! I'll take some time to dig deeper in the coming days.
