Using kinsumer cross aws accounts delays/ lose the data #52

Open
rhythmsharma opened this issue Feb 18, 2020 · 6 comments

Comments

@rhythmsharma

I am using kinsumer across AWS accounts and see significant delays, and sometimes lost data, when reading with Next(). In the same account as Kinesis, kinsumer works fine, but not when it runs in a different AWS account.
There are multiple consumers (3 in total), so I increased throttleDelay to 750ms, but that does not help much. My use case is to initialize kinsumer, run it, and stop it once every 500ms.
Is this a known issue with kinsumer? Is there a solution?

@garethlewin
Contributor

Hi.

At Twitch we use kinsumer and Kinesis across accounts a lot without any impact; the Kinesis queues themselves are not hosted in your account.

There are two things I would check that I believe could cause the issues you describe:

  1. Cross region? Are you consuming from queues in the same region in both tests?
  2. Shard counts? Are your streams configured with the same shard counts in both tests?

@rhythmsharma
Author

rhythmsharma commented Feb 19, 2020

Hi Gareth, thank you for the response!
To your questions:

  1. Consuming in the same region
  2. Consuming from the same Kinesis stream, so I believe the shard count is the same.

My use case is to run a kinsumer consumer every 250ms, and the number of records can be in the thousands per second. Once a run starts, I listen to the stream for 500ms using a ticker and then stop the consumer.
Then the cycle repeats: I initialize the kinsumer consumer with a new config and appName, but the same DynamoDB tables, the same Kinesis connection, and the same DynamoDB connection.

My guess at what is going wrong with kinsumer:

Since the Run() method checks whether the DynamoDB tables are in the 'ACTIVE' state, I guess my full cycle is getting skipped there. The DynamoDB table shows the 'UPDATING' state quite often.

Is there any way to pause a kinsumer consumer while it is reading from a stream, instead of stopping, initializing, and running it every time?

@garethlewin
Contributor

Hi.

For DynamoDB tables to be UPDATING it means that they were just created. I recommend not deleting them and recreating them. Kinsumer is also not really designed to be run for 250ms and then shut off, so your testing might not be a valid test of kinsumer (or Kinesis) throughput.

If all you want to do is to test how fast reading from Kinesis is without using a store for checkpoints, I would recommend just calling the kinesis API directly.

@rhythmsharma
Author

I need to keep track of checkpoints.

@garethlewin
Contributor

In that case you shouldn't need to delete the dynamo tables. Those tables are where the checkpoints are stored.

@rhythmsharma
Author

Yes, the checkpoints are in the tables and I am not deleting them. But when the DynamoDB tables are updated too quickly, the 'UPDATING' state shows up, and it takes a while to return to 'ACTIVE'.
It looks like kinsumer is a better fit for other use cases than for this one. Thanks for all the information.
