Using SelectFromDatabase with table IDs finds nils instead of ID numbers. #38
Comments
Hey, I released version 0.8.0.rc1 of data-anonymization, upgraded for Rails 5. I fixed and tested SelectFromDatabase in it, and it is working fine. See the example at https://github.com/sunitparekh/test-anonymization/blob/master/dell_whitelist.rb. Please try it and send me feedback.
Thank you very much for looking into this. I appreciate it. Some background information: I'm using data-anonymization in a rake task, and as I'm not ready to upgrade my entire project to Rails 5, I am testing this in a copied project that tries to approximate the original situation. I'm also using mysql2 as my database adapter. Here's my test table: And this is my error: It's tempting to think it's a mysql2 error, but I don't think that's the case. I debugged in the activerecord gem, activerecord-5.0.0.1/lib/active_record/relation/batches.rb. In the in_batches method, the reorder call appears to be where this breaks. The problem is that batch_order returns this: Looking at how batch_order is defined, the cause appears to be that quoted_primary_key is empty, and my debugging confirms this. So it seems that something about SelectFromDatabase is losing track of the primary key, even though I set it manually in this example.
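To illustrate the reasoning above, here is a minimal sketch of how an empty primary key could produce a malformed ORDER BY clause. It assumes the order clause is built by interpolating the quoted table name and the quoted primary key. `batch_order_sketch` and the `features` table name are hypothetical stand-ins, not code copied from ActiveRecord or from the test project.

```ruby
# Hypothetical stand-in for the order clause built during batching.
# Assumption: the clause has the shape "<quoted table>.<quoted primary key> ASC".
def batch_order_sketch(quoted_table_name, quoted_primary_key)
  "#{quoted_table_name}.#{quoted_primary_key} ASC"
end

# With a primary key, the clause is valid SQL.
with_key = batch_order_sketch('`features`', '`id`')

# With a nil (empty) primary key, the column name is simply missing,
# which the database would reject as a syntax error.
without_key = batch_order_sketch('`features`', nil)

puts with_key
puts without_key
```

If the real clause is assembled in a similar way, the database never sees a column to order by, which would explain an error that looks like it comes from the adapter.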
I'm also experiencing this issue when using Rails v5.1.5. It only occurs when a
I managed to fix this by altering See: https://gist.github.com/olly/6388e7db348d1023e340109ea9ce0362
@olly Thanks for sending the patch to fix this issue. I merged it and released gem version v0.8.2.
This is a great gem. I'm using 0.7.3 right now and the Blacklist strategy to anonymize some sensitive data. I'm looking forward to using the parallel table execution strategy, but for the sake of this bug report I am using the normal sequential execution.
I got very excited about the SelectFromDatabase method and endeavored to use it for a number of things in order to "scramble" table references. First, and this is minor, the example at http://www.rubydoc.info/github/sunitparekh/data-anonymization/DataAnon/Strategy/Field/SelectFromDatabase does not make it clear that I should reuse the connection_spec for each SelectFromDatabase call, though I did figure out that I should keep passing that in from the constructor details.
The actual issue is that I have tables such as features_trainings (in this example) where I want to scramble the feature referred to in the linking table. I want to take the feature_id foreign key and replace it such that it refers to any random feature in the features table. So for example, I have this:
(Though the composite key is used here, I observe the same issue with tables that have primary_key 'id' in use.)
What happens is that the values returned by anonymizing are all nils. I tracked this down to select_from_database.rb, line 17. The odd thing is that source.select(field_name) returns a value like this:
For some reason, all the IDs are nil. Interestingly,
returns nil, but
returns the actual id number (1, 2, etc.). Part of me thinks that changing that call on that line slightly would fix it, but I have a sense that what is going on may be more subtle than that, since source produces all those nils.
Of course, one workaround is obvious, and I will move to doing my own separate query for feature ids and using SelectFromList. But it seemed to me that SelectFromDatabase could be a powerful tool, and I thought this was worth reporting. Thanks again for this gem and its documentation.
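As a rough sketch of that workaround, the snippet below replaces each feature_id in a linking table with a random id drawn from a pre-fetched list. It runs without the gem or a database: `RandomFromList` is a hypothetical stand-in for a select-from-list style field strategy, and the ids and rows are made-up sample data standing in for the results of a separate query on the features table.

```ruby
# Hypothetical stand-in for a select-from-list style field strategy.
class RandomFromList
  def initialize(values, random: Random.new)
    raise ArgumentError, 'values must not be empty' if values.empty?

    @values = values
    @random = random
  end

  # Ignores the original value and returns a random entry from the list.
  def anonymize(_original_value)
    @values.sample(random: @random)
  end
end

# Assumption: these ids came from a separate query on the features table;
# they are hard-coded here so the sketch runs without a database.
feature_ids = [1, 2, 3, 4, 5]

# Made-up rows of the features_trainings linking table.
features_trainings = [
  { feature_id: 1, training_id: 10 },
  { feature_id: 2, training_id: 10 },
  { feature_id: 5, training_id: 11 }
]

picker = RandomFromList.new(feature_ids, random: Random.new(42))

# Scramble only the foreign key; leave the other column untouched.
scrambled = features_trainings.map do |row|
  row.merge(feature_id: picker.anonymize(row[:feature_id]))
end

scrambled.each { |row| puts row.inspect }
```

Because the ids are fetched once up front, every replacement value is a real id from the list and never nil. With the gem, the same list of ids would be what gets handed to SelectFromList.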