
implement dot_product, cosine_similarity/matrix for tensor ops; add shape mismatch e… #242

Open
wants to merge 10 commits into main

Conversation

Force1ess

feat: implement vector operations and error handling

Resolves #136

  • Add dot_product function for tensor multiplication
  • Implement cosine_similarity and cosine_distance metrics
  • Add ShapeMismatch error type for tensor dimension validation
  • Include comprehensive tests with edge cases for all operations

This supersedes previous work in #138 and #157

@Force1ess
Author

Hi @edgarriba,
I've implemented the optimized cosine_similarity function as described in the blog post and successfully reproduced comparable performance results. The benchmarks confirm significant speed improvements on my Mac M1 Pro, as shown in the attached screenshots.
One question: Since this optimized implementation doesn't use the dot_product function, should we remove that function from the codebase to reduce redundancy?

[benchmark screenshots]

/// # Errors
///
/// If the shapes of the tensors don't match, an error is returned.
pub fn cosine_similarity_optimized<T, const N: usize, A>(
Member

Great! So what do you think about providing a low-level kernel function like fn cosine_similarity_kernel_float<T>(a: &[T], b: &[T]) -> T? This would make it easy to have a very clean low-level API to expose to raw Python types (numpy comes with an overhead when passing data back and forth) and avoid exposing the whole Tensor API for now. We can do this in a future PR.

Author

Sounds great to me!

Author

By the way, should I put it in the 'tensor-ops' crate too?

Member

Yep, create a new mod kernels and place it there. This is my suggestion:

use num_traits::Zero;

fn dot_product1_float_kernel<T>(a: &[T], b: &[T]) -> T
where
    T: Zero + Copy + Clone + std::ops::Add<Output = T> + std::ops::Mul<Output = T>,
{
    // Accumulate the element-wise products into a running sum.
    let mut result = T::zero();
    for (a_val, b_val) in a.iter().zip(b.iter()) {
        result = result + *a_val * *b_val;
    }
    result
}

or

fn product1_float_kernel<T>(a: &[T], b: &[T]) -> T
where
    T: Zero + Copy + Clone + std::ops::Add<Output = T> + std::ops::Mul<Output = T>,
{
    let result = a
        .iter()
        .zip(b.iter())
        .fold(T::zero(), |acc, (a_val, b_val)| acc + *a_val * *b_val);
    result
}

Probably the second better leverages Rust internals to auto-vectorise where possible.
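
As a quick sanity check (a minimal sketch, assuming the two kernels above and num_traits are in scope), both variants should agree on a known dot product:

fn main() {
    let a = [1.0f32, 2.0, 3.0];
    let b = [4.0f32, 5.0, 6.0];
    // 1*4 + 2*5 + 3*6 = 32
    assert_eq!(dot_product1_float_kernel(&a, &b), 32.0);
    assert_eq!(product1_float_kernel(&a, &b), 32.0);
}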

@edgarriba
Member


Results are amazing! I'm pretty sure we can even optimise a bit more later. I would like to keep the two functions, especially if we can make them very performant, as they are very useful for ML in general.

@Force1ess
Author

Force1ess commented Mar 8, 2025


Hi, I've finished the improvements we discussed.
Here are the newest benchmark results:
[benchmark screenshot]

@Force1ess
Author

By the way, I am pretty interested in optimizing these functions under your guidance.
If you have any ideas, please feel free to let me know.

@edgarriba
Member


Lots of things to do. Curating all these low-level kernels needed for ML could be an interesting direction, and then efficiently exposing them to Python.
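
To illustrate that direction (purely a hedged sketch, not code from this PR): a flat-slice kernel could be exposed to Python with PyO3 roughly as below. The module name kernels_py and the Vec<f32> conversion are assumptions, and the exact #[pymodule] signature depends on the PyO3 version.

use pyo3::prelude::*;

// Assumes dot_product1_float_kernel(a: &[f32], b: &[f32]) -> f32 from the suggestion above.
#[pyfunction]
fn dot_product(a: Vec<f32>, b: Vec<f32>) -> f32 {
    dot_product1_float_kernel(a.as_slice(), b.as_slice())
}

#[pymodule]
fn kernels_py(_py: Python, m: &PyModule) -> PyResult<()> {
    // Register the kernel wrapper as a plain Python function.
    m.add_function(wrap_pyfunction!(dot_product, m)?)?;
    Ok(())
}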


// Calculate cosine similarity: dot_product/(sqrt(magnitude_a)*sqrt(magnitude_b))
let denominator = magnitude_a.sqrt() * magnitude_b.sqrt();
Ok(dot_product / denominator)
}
Member

Same kernel strategy here:

fn cosine_similarity_kernel<T>(a: &[T], b: &[T]) -> T
where
    T: num_traits::Float,
{
    // Single pass: accumulate the dot product and both squared magnitudes together.
    let (dot_product, magnitude_a, magnitude_b) = a.iter().zip(b.iter()).fold(
        (T::zero(), T::zero(), T::zero()),
        |(dot_product, magnitude_a, magnitude_b), (a_val, b_val)| {
            let a = *a_val;
            let b = *b_val;
            (
                dot_product + a * b,
                magnitude_a + a * a,
                magnitude_b + b * b,
            )
        },
    );

    // Guard against zero-magnitude inputs to avoid division by zero.
    if magnitude_a == T::zero() || magnitude_b == T::zero() {
        return T::zero();
    }

    let denominator = magnitude_a.sqrt() * magnitude_b.sqrt();
    dot_product / denominator
}
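
As a quick check of the suggested kernel (again just a sketch, assuming the function above is in scope): orthogonal inputs should give 0 and identical inputs should give 1.

fn main() {
    let x = [1.0f32, 0.0];
    let y = [0.0f32, 1.0];
    // Orthogonal vectors -> similarity ~0; identical vectors -> similarity ~1.
    assert!(cosine_similarity_kernel(&x, &y).abs() < 1e-6);
    assert!((cosine_similarity_kernel(&x, &x) - 1.0).abs() < 1e-6);
}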

@Force1ess
Author

Hi @edgarriba,
I've relocated the dependency as per your guidance.
Is there anything else we should address in this PR?

@Force1ess
Author

Force1ess commented Mar 9, 2025


While I plan to work on implementing low-level kernels for machine learning, maybe I should focus on resolving GSoC-related issues first.

@edgarriba
Member


Definitely! But feel free to open a ticket if you think of other interesting things to be done later. Maybe others can help too. BTW, are you in the Discord?

@Force1ess
Author

Force1ess commented Mar 9, 2025


Of course, my nickname is forceless!
Also, please check if there's anything else we should complete in this PR.

@edgarriba
Member

Let's just move the internal functions into a kernel.rs and we're good to go.

@Force1ess
Author


Hi, I've implemented them and placed them in the kernel crate.

@edgarriba
Member

OK, then update tensor-ops to use the new kernel functions.
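
A rough sketch of what that wiring might look like (the Tensor accessors, the TensorAllocator bound, and the ShapeMismatch fields below are illustrative assumptions, not the PR's actual code):

// Hypothetical wrapper in tensor-ops delegating the numeric work to the kernel.
pub fn cosine_similarity<const N: usize, A: TensorAllocator>(
    a: &Tensor<f32, N, A>,
    b: &Tensor<f32, N, A>,
) -> Result<f32, TensorOpsError> {
    // Validate shapes up front and surface a ShapeMismatch error on disagreement.
    if a.shape != b.shape {
        return Err(TensorOpsError::ShapeMismatch {
            expected: a.shape.to_vec(),
            found: b.shape.to_vec(),
        });
    }
    // The flat-slice kernel does the actual computation.
    Ok(kernels::cosine_similarity_kernel(a.as_slice(), b.as_slice()))
}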

@edgarriba edgarriba left a comment

Check that the tests pass on your local machine.

@Force1ess
Author

@edgarriba, I've used the kernel functions and all tests passed on my machine.

[test output screenshots]


@@ -11,4 +12,12 @@ pub enum TensorOpsError {
/// Tensor error
#[error("Error with the tensor: {0}")]
TensorError(#[from] TensorError),

/// Kernel error
#[error("Error with the kernel: {0}")]
Member

Suggested change
#[error("Error with the kernel: {0}")]
#[error(transparent)]

I recently learnt this trick to forward the underlying error directly.
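
For reference, a minimal self-contained sketch of that trick with thiserror (the KernelError type here is a stand-in; the actual type and variant names in the PR may differ):

use thiserror::Error;

/// Stand-in for the kernels module's error type (name assumed).
#[derive(Error, Debug)]
#[error("kernel error: {0}")]
pub struct KernelError(pub String);

#[derive(Error, Debug)]
pub enum TensorOpsError {
    /// Forwarded as-is: Display and source come from the inner KernelError.
    #[error(transparent)]
    KernelError(#[from] KernelError),
}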

fn test_cosine_similarity_3d_tensors() -> Result<(), TensorOpsError> {
    let a = Tensor::<f32, 3, CpuAllocator>::from_shape_slice(
        [2, 2, 2],
        &[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
Member

Now seeing this test, I think we should also limit this operator to Tensor1. I don't think it makes a lot of sense here, as it applies to the whole tensor instead of to a specific axis dimension.

Successfully merging this pull request may close these issues: Add cosine similarity metric