matmult fails on empty row #14

jo-37 · 2024-12-19T11:11:53Z

When matmulting a PDL::CCS::Nd matrix with a vector where all matrix rows corresponding to non-zero vector values are empty, this results in a runtime error:

pdl> $m = PDL::CCS::Nd->newFromDense(identity(3))

pdl> $m->dice_axis(0, 2) .= 0

pdl> $m->inplace->recode

pdl> p $m->decode

[
 [1 0 0]
 [0 1 0]
 [0 0 0]
]
 

pdl> $v = zeroes(1, 3)->set(0, 2, 1)

pdl> p $v

[
 [0]
 [0]
 [1]
]
 

pdl> $m x $v
Runtime error: $vals1 is empty: $which="PDL: Indx D [3,0] -   0.00KB", $vals1="PDL: Double D [0] -C   0.00KB", $vals="PDL: Double D [0] -C   0.00KB" at /usr/local/lib/x86_64-linux-gnu/perl/5.32.1/PDL/CCS/Nd.pm line 1071.
	PDL::CCS::Nd::__ANON__(PDL::CCS::Nd=ARRAY(0x5585d13c5fd8)) called at /usr/local/lib/x86_64-linux-gnu/perl/5.32.1/PDL/CCS/Nd.pm line 1775
	PDL::CCS::Nd::inner(PDL::CCS::Nd=ARRAY(0x5585d13c5fd8), PDL::CCS::Nd=ARRAY(0x5585d179d128)) called at /usr/local/lib/x86_64-linux-gnu/perl/5.32.1/PDL/CCS/Nd.pm line 1801
	PDL::CCS::Nd::matmult(PDL::CCS::Nd=ARRAY(0x5585d17a9958), PDL=SCALAR(0x5585d1142d78), "") called at (eval 587) line 5

Can work around this with an ugly pre-check:

$m->xsubset1d(which $v)->[$WHICH]->isempty

The text was updated successfully, but these errors were encountered:

moocow-the-bovine · 2024-12-19T16:10:41Z

oops ... thanks for the report! hopefully I'll have some time to look into it in the next few days.

moocow-the-bovine · 2024-12-20T10:03:02Z

@jo-37 I believe this is fixed by cdd34dc, and ought to work in current master (which should be appearing soon on CPAN as PDL-CCS v1.23.26).

jo-37 · 2024-12-20T11:59:00Z

Thank you for the quick response!

Removing the pre-check now reveals another problem:

pdl> $u = PDL::CCS::Nd->newFromDense(zeroes(1, 3))

pdl> $v = PDL::CCS::Nd->newFromDense(ones(1, 3))

pdl> $u * $v
Runtime error: Tried to convert(null) at /usr/local/lib/x86_64-linux-gnu/perl/5.32.1/PDL/Core.pm line 3083.
	PDL::convert(PDL=SCALAR(0x55d13cda6278), PDL::Type=ARRAY(0x55d13bd846e8)) called at /usr/local/lib/x86_64-linux-gnu/perl/5.32.1/PDL/Core.pm line 2496
	PDL::Core::alltopdl("PDL", PDL=SCALAR(0x55d13cda6278), PDL::Type=ARRAY(0x55d13bd846e8)) called at /usr/local/lib/x86_64-linux-gnu/perl/5.32.1/PDL/Core.pm line 71
	PDL::Core::__ANON__(PDL=SCALAR(0x55d13cda6278)) called at /usr/local/lib/x86_64-linux-gnu/perl/5.32.1/PDL/CCS/Nd.pm line 1410
	PDL::CCS::Nd::__ANON__(PDL::CCS::Nd=ARRAY(0x55d13c9aefd8), PDL::CCS::Nd=ARRAY(0x55d13cda1740), "") called at (eval 419) line 5

pdl> $v * $u
Runtime error: Tried to convert(null) at /usr/local/lib/x86_64-linux-gnu/perl/5.32.1/PDL/Core.pm line 3083.
	PDL::convert(PDL=SCALAR(0x55d13cdad238), PDL::Type=ARRAY(0x55d13bd846e8)) called at /usr/local/lib/x86_64-linux-gnu/perl/5.32.1/PDL/Core.pm line 2496
	PDL::Core::alltopdl("PDL", PDL=SCALAR(0x55d13cdad238), PDL::Type=ARRAY(0x55d13bd846e8)) called at /usr/local/lib/x86_64-linux-gnu/perl/5.32.1/PDL/Core.pm line 71
	PDL::Core::__ANON__(PDL=SCALAR(0x55d13cdad238)) called at /usr/local/lib/x86_64-linux-gnu/perl/5.32.1/PDL/CCS/Nd.pm line 1431
	PDL::CCS::Nd::__ANON__(PDL::CCS::Nd=ARRAY(0x55d13cda1740), PDL::CCS::Nd=ARRAY(0x55d13c9aefd8), "") called at (eval 420) line 5

Looks like PDL->null->ccs_indx() is the common cause.

moocow-the-bovine · 2024-12-20T15:53:17Z

@jo-37

Looks like PDL->null->ccs_indx() is the common cause.

indeed ... that's a klunky idiom I sometimes used to create and empty pdl of a particular type ... looks like PDL core doesn't support type-conversion on nulls any more ... sigh.

Happily, it turns out that other changes to PDL core have made that klunkiness unnecessary. This case should work now in 6972a9f (current master).

In general, I think you can expect more weirdness if your CCS::Nd objects have only "missing" values (like $u in the previous example or the result of $m x $v in the original report). The library should be able to deal with that sort of thing, but it's been not heavily tested. (The above is meant as a helpful heads-up -- these are definitely real bugs you've identified and I'm grateful for the reports & insights, but I also feel obliged to let you know that you appear to be venturing into unexplored territory).

Let me know if this works for you & I can put out another release soon.

jo-37 · 2024-12-20T17:06:39Z

Once again: Thank you very much!
With the current master I do not see any problems anymore.

Just one thing I want you to know:
I do not spend my time multiplying 3x3 null-matrices with 3-d null vectors 😃
Actually, I'm experimenting with an algorithm where the expression $v * ($m x $u) is applied to 10000-d vectors $u and $v and a 10000x10000 matrix $m. It just happened that $m x $u became zero and I'm glad I was able to track down the problem to the size of 3 😎

mohawk2 · 2024-12-20T17:13:00Z

Just one thing I want you to know: I do not spend my time multiplying 3x3 null-matrices with 3-d null vectors

We only have your word for that 🤔 (😉)

Thank you @moocow-the-bovine for being so quick to sort stuff out with all your libraries!

mohawk2 · 2024-12-20T19:55:13Z

@jo-37 A new version is on CPAN with the fix. If that now works for you, do you feel like closing this?

moocow-the-bovine · 2024-12-20T19:55:30Z

Just one thing I want you to know: I do not spend my time multiplying 3x3 null-matrices with 3-d null vectors

We only have your word for that 🤔 (😉)

😆²

fwiw, I didn't really suspect that you were ... and I do appreciate the minimal broken examples 🙂. There's always a chance something will "bottom out".

the expression $v * ($m x $u) is applied to 10000-d vectors $u and $v and a 10000x10000 matrix $m.

🤔 .... I think the CCS::Nd x operator ought to work correctly for that ... at least if your $missing value is always zero ... but I'm not 100% sure anymore. I do remember having issues at one point with x because it can wind up producing more non-zeros in its return value than in either operand. I believe it should work correctly with the current implementation (using dimension-twiddling, *, and sumover), but you might want to spend a bit more time with those 3x3 matrices to ensure correctness 🫣.

Also, for the $m x $u expression, you might want to have a look at ccs_matmult2d_zdd() (sparse missing-is-zero 2d matrix x dense pdl) and/or ccs_matmult2d_sdd (sparse missing-is-nonzero 2d matrix x dense pdl) ... those should be much faster than the generic x implementation (and also correct). There's no OO wrapper for them (yet), but the call would look something like:

ccs_matmult2d_zdd($m->_whichND, $m->_nzvals, $u, $m_x_u=zeroes(1,$u->dim(1)));

... you have to pre-allocate the output pdl $m_x_u in order to get the right dimensions out (nb @mohawk2 ... now that I think of it, this might be the ultimate reason for all of the zeroes(...) klunkiness in the PMCode ... if we only look at whichND aka ix to find the dimensions, we can end up truncating whole blocks of trailing rows/columns if they don't contain any non-missing values).

moocow-the-bovine · 2024-12-20T19:57:52Z

quick to sort stuff out

... and thank you @mohawk2 for obviating the need for my klunky workarounds by your work on the PDL core 😉

(also: not always that quick ... it took me over a month to change the last indx type from #6 ... sorry about that)

mohawk2 · 2024-12-20T20:35:01Z

the expression $v * ($m x $u) is applied to 10000-d vectors $u and $v and a 10000x10000 matrix $m.

🤔 .... I think the CCS::Nd x operator ought to work correctly for that ... at least if your $missing value is always zero ... but I'm not 100% sure anymore. I do remember having issues at one point with x because it can wind up producing more non-zeros in its return value than in either operand. I believe it should work correctly with the current implementation (using dimension-twiddling, *, and sumover), but you might want to spend a bit more time with those 3x3 matrices to ensure correctness 🫣.

I'd advocate adding some tests with Matrix Market sparse stuff, that is what it's for!

Also, for the $m x $u expression, you might want to have a look at ccs_matmult2d_zdd() (sparse missing-is-zero 2d matrix x dense pdl) and/or ccs_matmult2d_sdd (sparse missing-is-nonzero 2d matrix x dense pdl) ... those should be much faster than the generic x implementation (and also correct). There's no OO wrapper for them (yet), but the call would look something like:
ccs_matmult2d_zdd($m->_whichND, $m->_nzvals, $u, $m_x_u=zeroes(1,$u->dim(1)));
... you have to pre-allocate the output pdl $m_x_u in order to get the right dimensions out (nb @mohawk2 ... now that I think of it, this might be the ultimate reason for all of the zeroes(...) klunkiness in the PMCode ... if we only look at whichND aka ix to find the dimensions, we can end up truncating whole blocks of trailing rows/columns if they don't contain any non-missing values).

If RedoDimsCode needs to scan, then that's what it has to do. But allocating something a bit bigger, and slice-ing it down, is still better than allocating zeroes. An alternative to writing your own RedoDimsCode is to pass an OtherPars=>'PDL_Indx nvar=>n', which is just a shorthand. See https://github.com/PDLPorters/pdl/blob/master/lib/PDL/Slices.pd#L1191-L1225 for how I did that for rldvec.

moocow-the-bovine · 2024-12-20T21:11:38Z

you might want to spend a bit more time with those 3x3 matrices to ensure correctness 🫣.

I'd advocate adding some tests with Matrix Market sparse stuff, that is what it's for!

yeah, more tests are definitely needed. I'm grateful that this issue provided a couple of very clear, ultra-specific tests for stuff with surprisingly far-reaching consequence. Three cheers for 3x3 matrices!

OtherPars=>'PDL_Indx nvar=>n'

... that won't quite cut the butter here, I'm afraid. Among other reasons, because PDL::CCS is supposed to emulate PDL even with respect to "thread"-loops, broadcasting, etc. So we don't usually even know how many size-like-parameters (~ "logical dimensions") we need to be passing in. We could pass in a perl array (I know it's possible, but I don't remember the syntax) ... but then we'd have to deal with that in a painfully explicit and hard-to-generalize way. I'm currently thinking that it would make more sense to always pass a PDL of "logical" dimensions (~ CCS::Nd $PDIMS) into the C side of things ... but that's quite daunting, because it entails a complete overhaul.

mohawk2 · 2024-12-20T21:15:11Z

See broadcastI in Slices for passing an array of dims (spoiler: OtherPars=>'PDL_Indx dimlist[]'). Can you give me an example sparse-dealing function that's currently constructing zeroes so I can ponder?

jo-37 · 2024-12-20T21:20:23Z

@jo-37 A new version is on CPAN with the fix. If that now works for you, do you feel like closing this?

@mohawk2 Sure. My issues are fixed.

moocow-the-bovine · 2024-12-20T21:22:49Z

sparse-dealing function that's currently constructing zeroes

it's late and my brain is full. the tough stuff (where I believei I've actually tested the threadloop-emulation) is the "usual suspects", i.e. binary operations like mult, which get wrapped with _ccsnd_binary_op_mia ... which is not fun to read (because it also does some block-wise buffering shenanigans).

Maybe the easiest place to start thinking about this would be the ufuncs - sumover etc. So random($n)->sumover is pretty trivial, but random($m,$n)->sumover and random($l,$m,$n)->sumover snould work too.

[EDIT: the ufuncs have painful-to-read wrappers too: _ufuncsub and _ufunc_ind_sub ... the difference has to do with our previous "type variables" discussion]

jo-37 · 2024-12-20T21:25:02Z

Also, for the $m x $u expression, you might want to have a look at ccs_matmult2d_zdd() (sparse missing-is-zero 2d matrix x dense pdl) and/or ccs_matmult2d_sdd (sparse missing-is-nonzero 2d matrix x dense pdl) ... those should be much faster than the generic x implementation (and also correct).

@moocow-the-bovine Thank you for pointing to these. I'll certainly give them a try!

mohawk2 · 2024-12-20T21:53:47Z

Maybe the easiest place to start thinking about this would be the ufuncs - sumover etc. So random($n)->sumover is pretty trivial, but random($m,$n)->sumover and random($l,$m,$n)->sumover snould work too.

[EDIT: the ufuncs have painful-to-read wrappers too: _ufuncsub and _ufunc_ind_sub ... the difference has to do with our previous "type variables" discussion]

I'm only seeing one zeroes in each of those wrappers, being for the special case of <= 1 dims. It doesn't seem like that really needs to be there, and it's not helping me understand where the use of zeroes is truly needed.

However, this isn't urgent, and tomorrow (etc) will do!

jo-37 · 2024-12-20T22:34:39Z

@moocow-the-bovine Thank you for pointing to these. I'll certainly give them a try!

Looks like ccs_matmult_zdd() is almost twice as fast. Wow! 😲
Had to convert BAD to zero in the result - didn't see this in the docs.

moocow-the-bovine · 2024-12-22T20:17:27Z

Had to convert BAD to zero in the result

yeah, I think that none of the matmult2d functions have any BAD-handling. Those haven't progressed far from the "quick & dirty" stage (a.k.a. "I really need this particular case to run fast right now so I'll just hammer out a quick PP implementation") ... which is also why there are no OO wrappers for CCS::Nd objects yet.

mohawk2 · 2024-12-22T20:28:15Z

@moocow-the-bovine Could you steer me to stuff I can look at to better understand the need for zeroes? The normal way of things in PDL is that supplying a null, or in fact not having to do so because it'll get generated and returned, so I'm keen to find and fix any cases where that's not possible.

moocow-the-bovine · 2024-12-26T08:46:02Z

Could you steer me to stuff I can look at to better understand the need for zeroes?

@mohawk2 I wish I could ... I spent some time going over the "usual suspects", but I can't see anywhere that's actually required (... except for laziness: I'm more comfortable expressing that logic in perl than in C).

Suggestions:

I'll add BAD value support ~~& OO wrappers~~ for ccs_matmult2d_zdd to close out this issue
- EDIT: OO wrappers already existed; my mistake
let's move the zeroes & RedoDimsCode discussion to overhaul output dimension initialization #16
- I'll start trying re-implement the CCS ufuncs for null-friendliness on a dedicated branch ... for simplicity, I'll just start with a hard-coded sumover
- I will very likely need to ask for your advice & assistance with the null-handling

mohawk2 · 2024-12-26T13:35:36Z

Great! I'll be pleased to give help where I can. I'd suggest IRC as an additional channel for quicker Q&A, but here works too because it causes notifications on IRC.

moocow-the-bovine · 2024-12-28T12:21:31Z

added non-missing BAD support for ccs_matmult2d_zdd (but not to *_sdd) in 29993bf
(re-)discovered that OO wrappers PDL::CCS::Nd::matmult2d_zdd and PDL::CCS::Nd::matmult2d_sdd already exist
released as PDL-CCS-1.23.29

closing this issue as completed

moocow-the-bovine added a commit that referenced this issue Dec 20, 2024

add tests for #14

bc32da7

moocow-the-bovine mentioned this issue Dec 20, 2024

overhaul output dimension initialization #16

Closed

3 tasks

moocow-the-bovine closed this as completed Dec 28, 2024

jo-37 mentioned this issue Jan 9, 2025

New solution to challenge 299 task 2 manwar/perlweeklychallenge-club#11420

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

matmult fails on empty row #14

matmult fails on empty row #14

jo-37 commented Dec 19, 2024

moocow-the-bovine commented Dec 19, 2024

moocow-the-bovine commented Dec 20, 2024

jo-37 commented Dec 20, 2024

moocow-the-bovine commented Dec 20, 2024

jo-37 commented Dec 20, 2024

mohawk2 commented Dec 20, 2024

mohawk2 commented Dec 20, 2024

moocow-the-bovine commented Dec 20, 2024

moocow-the-bovine commented Dec 20, 2024

mohawk2 commented Dec 20, 2024

moocow-the-bovine commented Dec 20, 2024

mohawk2 commented Dec 20, 2024

jo-37 commented Dec 20, 2024

moocow-the-bovine commented Dec 20, 2024 •

edited

Loading

jo-37 commented Dec 20, 2024

mohawk2 commented Dec 20, 2024

jo-37 commented Dec 20, 2024

moocow-the-bovine commented Dec 22, 2024 •

edited

Loading

mohawk2 commented Dec 22, 2024

moocow-the-bovine commented Dec 26, 2024 •

edited

Loading

mohawk2 commented Dec 26, 2024

moocow-the-bovine commented Dec 28, 2024

matmult fails on empty row #14

matmult fails on empty row #14

Comments

jo-37 commented Dec 19, 2024

moocow-the-bovine commented Dec 19, 2024

moocow-the-bovine commented Dec 20, 2024

jo-37 commented Dec 20, 2024

moocow-the-bovine commented Dec 20, 2024

jo-37 commented Dec 20, 2024

mohawk2 commented Dec 20, 2024

mohawk2 commented Dec 20, 2024

moocow-the-bovine commented Dec 20, 2024

moocow-the-bovine commented Dec 20, 2024

mohawk2 commented Dec 20, 2024

moocow-the-bovine commented Dec 20, 2024

mohawk2 commented Dec 20, 2024

jo-37 commented Dec 20, 2024

moocow-the-bovine commented Dec 20, 2024 • edited Loading

jo-37 commented Dec 20, 2024

mohawk2 commented Dec 20, 2024

jo-37 commented Dec 20, 2024

moocow-the-bovine commented Dec 22, 2024 • edited Loading

mohawk2 commented Dec 22, 2024

moocow-the-bovine commented Dec 26, 2024 • edited Loading

mohawk2 commented Dec 26, 2024

moocow-the-bovine commented Dec 28, 2024

moocow-the-bovine commented Dec 20, 2024 •

edited

Loading

moocow-the-bovine commented Dec 22, 2024 •

edited

Loading

moocow-the-bovine commented Dec 26, 2024 •

edited

Loading