-
Notifications
You must be signed in to change notification settings - Fork 2
Use generics in create_hashes #42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use generics in create_hashes #42
Conversation
| .with_precision_and_scale(20, 3) | ||
| .unwrap(); | ||
| let array_ref = Arc::new(array); | ||
| let array_ref: ArrayRef = Arc::new(array); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the major user facing change -- if you explicitly create an Arc ... something to create_hashes, before this PR the type inference algorithm would automatically determine ArrayRef. Now you need to explicitly declare that you want an ArrayRef
| &random_state, | ||
| hashes_buff, | ||
| )?; | ||
| let hashes = create_hashes([&left.columns()[0]], &random_state, hashes_buff)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it makes the callsites look much nicer
|
Wrong target repo ?! |
Hi @martin-g -- actually in this case I meant to make a PR in this repo (this is a PR to a branch in the pydantic fork -- so basically merging it will update the PR in the apache repo) |
|
OK! I noticed your comment at apache#18448 (comment) and thought that it might have been a mistake. |
| let mut hashes2 = vec![0; array.len()]; | ||
| create_hashes_from_arrays(&[array.as_ref()], &random_state, &mut hashes2) | ||
| .unwrap(); | ||
| create_hashes([array], &random_state, &mut hashes2).unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this works. The whole point is that I want to call create_hashes from a function where I only have &dyn Array, which does not implement AsRef<dyn Array>.
As far as I know there's no way to convert an &dyn Array into AsRef<dyn Arary without some sort of wrapper type:
struct AsRefWrapper<'a>(&'a dyn Array);
impl<'a> AsRef<dyn Array> for AsRefWrapper<'a> {
fn as_ref(&self) -> &dyn Array {
self.0
}
}Or something like that. Which seems... a bit annoying to have to do at each call site.
I think I tried the generics method and ended up with two functions because of this limitation.
Do you know a better way around this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see
I think we should at least add a test for the specific usecase. Here is what I have:
#[test]
fn test_create_hashes_from_dyn_arrays() {
let int_array: ArrayRef = Arc::new(Int32Array::from(vec![1, 2, 3, 4]));
let float_array: ArrayRef =
Arc::new(Float64Array::from(vec![1.0, 2.0, 3.0, 4.0]));
// Verify that we can call create_hashes with only &dyn Array
fn test(arr1: &dyn Array, arr2: &dyn Array) {
let random_state = RandomState::with_seeds(0, 0, 0, 0);
let hashes_buff = &mut vec![0; arr1.len()];
let hashes =
create_hashes([arr1, arr2], &random_state, hashes_buff).unwrap();
assert_eq!(hashes.len(), 4,);
}
test(&*int_array, &*float_array);
}Maybe we can do this with a special trait -- I'll play around with it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made a special trait for this -- I don't really know why it is necessary but it does seem to work 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
creative solution!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"creative" could either be good or bad in this context 😆
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think good! It's an internal implementation detail that users won't have to think about, but it allows us to make the improvements to this function for all callers without introducing a new API. It's good 👍🏻
| Arc::new(Float64Array::from(vec![1.0, 2.0, 3.0, 4.0])); | ||
|
|
||
| // Verify that we can call create_hashes with only &dyn Array | ||
| fn test(arr1: &dyn Array, arr2: &dyn Array) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here is a test showing calling create_hashes with only &dyn Arrays
| // combine hashes with `combine_hashes` for all columns besides the first | ||
| let rehash = i >= 1; | ||
| hash_single_array(array, random_state, hashes_buffer, rehash)?; | ||
| pub trait AsDynArray { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i assume we can't implement this for T: AsRef<dyn Array>?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure 🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The answer is no:
conflicting implementations of trait `AsDynArray` for type `&dyn arrow::array::Array`
upstream crates may add a new impl of trait `std::convert::AsRef<(dyn arrow::array::Array + 'static)>` for type `&dyn arrow::array::Array` in future versions
🤷🏻
|
Merged let's continue in apache#18448, thanks! |
This is a proposal for updating this PR
Merging this PR will update apache#18448
Basically the idea is to change the signature of the
create_hashesso it can take any iterator of things that can turn into a &dyn Arrayrather than having multiple functionsI think it makes create_hashes a little more complicated, but the callsites are much easier to understand
This is also mostly backwards compatibable -- the only code I needed to change was tests