Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

matmult fails on empty row #14

Closed
jo-37 opened this issue Dec 19, 2024 · 22 comments
Closed

matmult fails on empty row #14

jo-37 opened this issue Dec 19, 2024 · 22 comments

Comments

@jo-37
Copy link

jo-37 commented Dec 19, 2024

When matmulting a PDL::CCS::Nd matrix with a vector where all matrix rows corresponding to non-zero vector values are empty, this results in a runtime error:

pdl> $m = PDL::CCS::Nd->newFromDense(identity(3))

pdl> $m->dice_axis(0, 2) .= 0

pdl> $m->inplace->recode

pdl> p $m->decode

[
 [1 0 0]
 [0 1 0]
 [0 0 0]
]
 

pdl> $v = zeroes(1, 3)->set(0, 2, 1)

pdl> p $v

[
 [0]
 [0]
 [1]
]
 

pdl> $m x $v
Runtime error: $vals1 is empty: $which="PDL: Indx D [3,0] -   0.00KB", $vals1="PDL: Double D [0] -C   0.00KB", $vals="PDL: Double D [0] -C   0.00KB" at /usr/local/lib/x86_64-linux-gnu/perl/5.32.1/PDL/CCS/Nd.pm line 1071.
	PDL::CCS::Nd::__ANON__(PDL::CCS::Nd=ARRAY(0x5585d13c5fd8)) called at /usr/local/lib/x86_64-linux-gnu/perl/5.32.1/PDL/CCS/Nd.pm line 1775
	PDL::CCS::Nd::inner(PDL::CCS::Nd=ARRAY(0x5585d13c5fd8), PDL::CCS::Nd=ARRAY(0x5585d179d128)) called at /usr/local/lib/x86_64-linux-gnu/perl/5.32.1/PDL/CCS/Nd.pm line 1801
	PDL::CCS::Nd::matmult(PDL::CCS::Nd=ARRAY(0x5585d17a9958), PDL=SCALAR(0x5585d1142d78), "") called at (eval 587) line 5

Can work around this with an ugly pre-check:

$m->xsubset1d(which $v)->[$WHICH]->isempty
@moocow-the-bovine
Copy link
Owner

oops ... thanks for the report! hopefully I'll have some time to look into it in the next few days.

moocow-the-bovine added a commit that referenced this issue Dec 20, 2024
@moocow-the-bovine
Copy link
Owner

@jo-37 I believe this is fixed by cdd34dc, and ought to work in current master (which should be appearing soon on CPAN as PDL-CCS v1.23.26).

@jo-37
Copy link
Author

jo-37 commented Dec 20, 2024

Thank you for the quick response!

Removing the pre-check now reveals another problem:

pdl> $u = PDL::CCS::Nd->newFromDense(zeroes(1, 3))

pdl> $v = PDL::CCS::Nd->newFromDense(ones(1, 3))

pdl> $u * $v
Runtime error: Tried to convert(null) at /usr/local/lib/x86_64-linux-gnu/perl/5.32.1/PDL/Core.pm line 3083.
	PDL::convert(PDL=SCALAR(0x55d13cda6278), PDL::Type=ARRAY(0x55d13bd846e8)) called at /usr/local/lib/x86_64-linux-gnu/perl/5.32.1/PDL/Core.pm line 2496
	PDL::Core::alltopdl("PDL", PDL=SCALAR(0x55d13cda6278), PDL::Type=ARRAY(0x55d13bd846e8)) called at /usr/local/lib/x86_64-linux-gnu/perl/5.32.1/PDL/Core.pm line 71
	PDL::Core::__ANON__(PDL=SCALAR(0x55d13cda6278)) called at /usr/local/lib/x86_64-linux-gnu/perl/5.32.1/PDL/CCS/Nd.pm line 1410
	PDL::CCS::Nd::__ANON__(PDL::CCS::Nd=ARRAY(0x55d13c9aefd8), PDL::CCS::Nd=ARRAY(0x55d13cda1740), "") called at (eval 419) line 5

pdl> $v * $u
Runtime error: Tried to convert(null) at /usr/local/lib/x86_64-linux-gnu/perl/5.32.1/PDL/Core.pm line 3083.
	PDL::convert(PDL=SCALAR(0x55d13cdad238), PDL::Type=ARRAY(0x55d13bd846e8)) called at /usr/local/lib/x86_64-linux-gnu/perl/5.32.1/PDL/Core.pm line 2496
	PDL::Core::alltopdl("PDL", PDL=SCALAR(0x55d13cdad238), PDL::Type=ARRAY(0x55d13bd846e8)) called at /usr/local/lib/x86_64-linux-gnu/perl/5.32.1/PDL/Core.pm line 71
	PDL::Core::__ANON__(PDL=SCALAR(0x55d13cdad238)) called at /usr/local/lib/x86_64-linux-gnu/perl/5.32.1/PDL/CCS/Nd.pm line 1431
	PDL::CCS::Nd::__ANON__(PDL::CCS::Nd=ARRAY(0x55d13cda1740), PDL::CCS::Nd=ARRAY(0x55d13c9aefd8), "") called at (eval 420) line 5

Looks like PDL->null->ccs_indx() is the common cause.

@moocow-the-bovine
Copy link
Owner

@jo-37

Looks like PDL->null->ccs_indx() is the common cause.

indeed ... that's a klunky idiom I sometimes used to create and empty pdl of a particular type ... looks like PDL core doesn't support type-conversion on nulls any more ... sigh.

Happily, it turns out that other changes to PDL core have made that klunkiness unnecessary. This case should work now in 6972a9f (current master).

In general, I think you can expect more weirdness if your CCS::Nd objects have only "missing" values (like $u in the previous example or the result of $m x $v in the original report). The library should be able to deal with that sort of thing, but it's been not heavily tested. (The above is meant as a helpful heads-up -- these are definitely real bugs you've identified and I'm grateful for the reports & insights, but I also feel obliged to let you know that you appear to be venturing into unexplored territory).

Let me know if this works for you & I can put out another release soon.

@jo-37
Copy link
Author

jo-37 commented Dec 20, 2024

Once again: Thank you very much!
With the current master I do not see any problems anymore.

Just one thing I want you to know:
I do not spend my time multiplying 3x3 null-matrices with 3-d null vectors 😃
Actually, I'm experimenting with an algorithm where the expression $v * ($m x $u) is applied to 10000-d vectors $u and $v and a 10000x10000 matrix $m. It just happened that $m x $u became zero and I'm glad I was able to track down the problem to the size of 3 😎

@mohawk2
Copy link
Collaborator

mohawk2 commented Dec 20, 2024

Just one thing I want you to know: I do not spend my time multiplying 3x3 null-matrices with 3-d null vectors

We only have your word for that 🤔 (😉)

Thank you @moocow-the-bovine for being so quick to sort stuff out with all your libraries!

@mohawk2
Copy link
Collaborator

mohawk2 commented Dec 20, 2024

@jo-37 A new version is on CPAN with the fix. If that now works for you, do you feel like closing this?

@moocow-the-bovine
Copy link
Owner

Just one thing I want you to know: I do not spend my time multiplying 3x3 null-matrices with 3-d null vectors

We only have your word for that 🤔 (😉)

😆²

fwiw, I didn't really suspect that you were ... and I do appreciate the minimal broken examples 🙂. There's always a chance something will "bottom out".

the expression $v * ($m x $u) is applied to 10000-d vectors $u and $v and a 10000x10000 matrix $m.

🤔 .... I think the CCS::Nd x operator ought to work correctly for that ... at least if your $missing value is always zero ... but I'm not 100% sure anymore. I do remember having issues at one point with x because it can wind up producing more non-zeros in its return value than in either operand. I believe it should work correctly with the current implementation (using dimension-twiddling, *, and sumover), but you might want to spend a bit more time with those 3x3 matrices to ensure correctness 🫣.

Also, for the $m x $u expression, you might want to have a look at ccs_matmult2d_zdd() (sparse missing-is-zero 2d matrix x dense pdl) and/or ccs_matmult2d_sdd (sparse missing-is-nonzero 2d matrix x dense pdl) ... those should be much faster than the generic x implementation (and also correct). There's no OO wrapper for them (yet), but the call would look something like:

ccs_matmult2d_zdd($m->_whichND, $m->_nzvals, $u, $m_x_u=zeroes(1,$u->dim(1)));

... you have to pre-allocate the output pdl $m_x_u in order to get the right dimensions out (nb @mohawk2 ... now that I think of it, this might be the ultimate reason for all of the zeroes(...) klunkiness in the PMCode ... if we only look at whichND aka ix to find the dimensions, we can end up truncating whole blocks of trailing rows/columns if they don't contain any non-missing values).

@moocow-the-bovine
Copy link
Owner

quick to sort stuff out

... and thank you @mohawk2 for obviating the need for my klunky workarounds by your work on the PDL core 😉

(also: not always that quick ... it took me over a month to change the last indx type from #6 ... sorry about that)

@mohawk2
Copy link
Collaborator

mohawk2 commented Dec 20, 2024

the expression $v * ($m x $u) is applied to 10000-d vectors $u and $v and a 10000x10000 matrix $m.

🤔 .... I think the CCS::Nd x operator ought to work correctly for that ... at least if your $missing value is always zero ... but I'm not 100% sure anymore. I do remember having issues at one point with x because it can wind up producing more non-zeros in its return value than in either operand. I believe it should work correctly with the current implementation (using dimension-twiddling, *, and sumover), but you might want to spend a bit more time with those 3x3 matrices to ensure correctness 🫣.

I'd advocate adding some tests with Matrix Market sparse stuff, that is what it's for!

Also, for the $m x $u expression, you might want to have a look at ccs_matmult2d_zdd() (sparse missing-is-zero 2d matrix x dense pdl) and/or ccs_matmult2d_sdd (sparse missing-is-nonzero 2d matrix x dense pdl) ... those should be much faster than the generic x implementation (and also correct). There's no OO wrapper for them (yet), but the call would look something like:

ccs_matmult2d_zdd($m->_whichND, $m->_nzvals, $u, $m_x_u=zeroes(1,$u->dim(1)));

... you have to pre-allocate the output pdl $m_x_u in order to get the right dimensions out (nb @mohawk2 ... now that I think of it, this might be the ultimate reason for all of the zeroes(...) klunkiness in the PMCode ... if we only look at whichND aka ix to find the dimensions, we can end up truncating whole blocks of trailing rows/columns if they don't contain any non-missing values).

If RedoDimsCode needs to scan, then that's what it has to do. But allocating something a bit bigger, and slice-ing it down, is still better than allocating zeroes. An alternative to writing your own RedoDimsCode is to pass an OtherPars=>'PDL_Indx nvar=>n', which is just a shorthand. See https://github.com/PDLPorters/pdl/blob/master/lib/PDL/Slices.pd#L1191-L1225 for how I did that for rldvec.

@moocow-the-bovine
Copy link
Owner

you might want to spend a bit more time with those 3x3 matrices to ensure correctness 🫣.

I'd advocate adding some tests with Matrix Market sparse stuff, that is what it's for!

yeah, more tests are definitely needed. I'm grateful that this issue provided a couple of very clear, ultra-specific tests for stuff with surprisingly far-reaching consequence. Three cheers for 3x3 matrices!

OtherPars=>'PDL_Indx nvar=>n'

... that won't quite cut the butter here, I'm afraid. Among other reasons, because PDL::CCS is supposed to emulate PDL even with respect to "thread"-loops, broadcasting, etc. So we don't usually even know how many size-like-parameters (~ "logical dimensions") we need to be passing in. We could pass in a perl array (I know it's possible, but I don't remember the syntax) ... but then we'd have to deal with that in a painfully explicit and hard-to-generalize way. I'm currently thinking that it would make more sense to always pass a PDL of "logical" dimensions (~ CCS::Nd $PDIMS) into the C side of things ... but that's quite daunting, because it entails a complete overhaul.

@mohawk2
Copy link
Collaborator

mohawk2 commented Dec 20, 2024

See broadcastI in Slices for passing an array of dims (spoiler: OtherPars=>'PDL_Indx dimlist[]'). Can you give me an example sparse-dealing function that's currently constructing zeroes so I can ponder?

@jo-37
Copy link
Author

jo-37 commented Dec 20, 2024

@jo-37 A new version is on CPAN with the fix. If that now works for you, do you feel like closing this?

@mohawk2 Sure. My issues are fixed.

@moocow-the-bovine
Copy link
Owner

moocow-the-bovine commented Dec 20, 2024

sparse-dealing function that's currently constructing zeroes

it's late and my brain is full. the tough stuff (where I believei I've actually tested the threadloop-emulation) is the "usual suspects", i.e. binary operations like mult, which get wrapped with _ccsnd_binary_op_mia ... which is not fun to read (because it also does some block-wise buffering shenanigans).

Maybe the easiest place to start thinking about this would be the ufuncs - sumover etc. So random($n)->sumover is pretty trivial, but random($m,$n)->sumover and random($l,$m,$n)->sumover snould work too.

[EDIT: the ufuncs have painful-to-read wrappers too: _ufuncsub and _ufunc_ind_sub ... the difference has to do with our previous "type variables" discussion]

@jo-37
Copy link
Author

jo-37 commented Dec 20, 2024

Also, for the $m x $u expression, you might want to have a look at ccs_matmult2d_zdd() (sparse missing-is-zero 2d matrix x dense pdl) and/or ccs_matmult2d_sdd (sparse missing-is-nonzero 2d matrix x dense pdl) ... those should be much faster than the generic x implementation (and also correct).

@moocow-the-bovine Thank you for pointing to these. I'll certainly give them a try!

@mohawk2
Copy link
Collaborator

mohawk2 commented Dec 20, 2024

Maybe the easiest place to start thinking about this would be the ufuncs - sumover etc. So random($n)->sumover is pretty trivial, but random($m,$n)->sumover and random($l,$m,$n)->sumover snould work too.

[EDIT: the ufuncs have painful-to-read wrappers too: _ufuncsub and _ufunc_ind_sub ... the difference has to do with our previous "type variables" discussion]

I'm only seeing one zeroes in each of those wrappers, being for the special case of <= 1 dims. It doesn't seem like that really needs to be there, and it's not helping me understand where the use of zeroes is truly needed.

However, this isn't urgent, and tomorrow (etc) will do!

@jo-37
Copy link
Author

jo-37 commented Dec 20, 2024

@moocow-the-bovine Thank you for pointing to these. I'll certainly give them a try!

Looks like ccs_matmult_zdd() is almost twice as fast. Wow! 😲
Had to convert BAD to zero in the result - didn't see this in the docs.

@moocow-the-bovine
Copy link
Owner

moocow-the-bovine commented Dec 22, 2024

Had to convert BAD to zero in the result

yeah, I think that none of the matmult2d functions have any BAD-handling. Those haven't progressed far from the "quick & dirty" stage (a.k.a. "I really need this particular case to run fast right now so I'll just hammer out a quick PP implementation") ... which is also why there are no OO wrappers for CCS::Nd objects yet.

@mohawk2
Copy link
Collaborator

mohawk2 commented Dec 22, 2024

@moocow-the-bovine Could you steer me to stuff I can look at to better understand the need for zeroes? The normal way of things in PDL is that supplying a null, or in fact not having to do so because it'll get generated and returned, so I'm keen to find and fix any cases where that's not possible.

@moocow-the-bovine
Copy link
Owner

moocow-the-bovine commented Dec 26, 2024

Could you steer me to stuff I can look at to better understand the need for zeroes?

@mohawk2 I wish I could ... I spent some time going over the "usual suspects", but I can't see anywhere that's actually required (... except for laziness: I'm more comfortable expressing that logic in perl than in C).

Suggestions:

  • I'll add BAD value support & OO wrappers for ccs_matmult2d_zdd to close out this issue
    • EDIT: OO wrappers already existed; my mistake
  • let's move the zeroes & RedoDimsCode discussion to overhaul output dimension initialization #16
    • I'll start trying re-implement the CCS ufuncs for null-friendliness on a dedicated branch ... for simplicity, I'll just start with a hard-coded sumover
    • I will very likely need to ask for your advice & assistance with the null-handling

@mohawk2
Copy link
Collaborator

mohawk2 commented Dec 26, 2024

Great! I'll be pleased to give help where I can. I'd suggest IRC as an additional channel for quicker Q&A, but here works too because it causes notifications on IRC.

@moocow-the-bovine
Copy link
Owner

  • added non-missing BAD support for ccs_matmult2d_zdd (but not to *_sdd) in 29993bf
  • (re-)discovered that OO wrappers PDL::CCS::Nd::matmult2d_zdd and PDL::CCS::Nd::matmult2d_sdd already exist
  • released as PDL-CCS-1.23.29

closing this issue as completed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants