Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: more ufuncs to API std #124

Merged
merged 4 commits into from
Jan 3, 2023

Conversation

tylerjereddy
Copy link
Contributor

for j in range(view.extent(2)):
out[tid][i][j] = floor(view[tid][i][j])

def floor(view):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lack of docstring is just the result of me not worrying about the docstrings in the auto-templating code I linked... we could add some into the template

There's also the question of whether it makes sense to commit auto-generated code vs. the stuff that generates it; conversely, committing to a templating engine could be annoying as well. Certainly, NumPy used its own custom templating language for years for generating the ufunc inner loops. I think they're starting to transition to C++ these days, at least in some limited areas.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should add the docstring in this PR. Regarding committing the templating engine, I would say not to do that for now. Since we're planning on getting rid of type annotations in the future, writing these kernels should be less tedious in the future.

def ceil_impl_3d_float(tid: int, view: pk.View3D[pk.float], out: pk.View3D[pk.float]):
for i in range(view.extent(1)):
for j in range(view.extent(2)):
out[tid][i][j] = ceil(view[tid][i][j])
Copy link
Contributor Author

@tylerjereddy tylerjereddy Nov 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this right? I'm afraid we may have to add some custom tests to supplement b/c I'm not seeing pk_cpp generated by test suite for ceil, floor, round, trunc.. (they're just hitting the integer special case I think)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

round should also work once we add it to the set of allowed math functions

Copy link
Contributor

@NaderAlAwar NaderAlAwar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we just need to add round to the set of math functions and some short docstrings to the ufuncs

def round_impl_3d_float(tid: int, view: pk.View3D[pk.float], out: pk.View3D[pk.float]):
for i in range(view.extent(1)):
for j in range(view.extent(2)):
out[tid][i][j] = round(view[tid][i][j])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this will work, see

math_functions: Set = {
for a list of functions that get mapped to their cmath counterparts

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually there is a round in cmath, it's just not part of this set. Could you add it here as well?

for j in range(view.extent(2)):
out[tid][i][j] = floor(view[tid][i][j])

def floor(view):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should add the docstring in this PR. Regarding committing the templating engine, I would say not to do that for now. Since we're planning on getting rid of type annotations in the future, writing these kernels should be less tedious in the future.

def ceil_impl_3d_float(tid: int, view: pk.View3D[pk.float], out: pk.View3D[pk.float]):
for i in range(view.extent(1)):
for j in range(view.extent(2)):
out[tid][i][j] = ceil(view[tid][i][j])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

round should also work once we add it to the set of allowed math functions

@tylerjereddy
Copy link
Contributor Author

Not quite sure I follow the concern with round here, isn't that already dealt with in my diff?

tylerjereddy added a commit to tylerjereddy/pykokkos that referenced this pull request Dec 12, 2022
* use a tagged release of the array API conformance test
suite to follow up on: kokkos#63 (comment)

* this seems safe when I test locally at least--we'll see
if the CI agrees

* this splits off the similar change from kokkosgh-124, to
simplify that PR a little
@tylerjereddy
Copy link
Contributor Author

Ok, I pushed in some revisions here now--from the commit message:

* added docstrings for the new ufuncs:
  * `round()`
  * `trunc()`
  * `ceil()`
  * `floor()`

* added a regression test to probe the above
rounding functions with floating types because I was concerned that
the array API conformance suite only checked
integer view types for these functions (which don't require compiled
kernels because integers round to themselves)

if len(view.shape) > 3:
raise NotImplementedError("only up to 3D views currently supported for round() ufunc.")

if "double" in str(view.dtype):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than having three blocks for three types, can we have one if statement and then call a function to which we pass three functions, something along these lines

if "double" in str(view.dtype):
   _parallel_for(round_impl_1d_double, round_impl_2d_double, round_impl_3d_double...)
elif "float64" in str(view.dtype): 
   ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does _parallel_for know how to dispatch to the individual (1D vs. 2D vs. 3D) functions? How do you write a single _parallel_for that is shared between different ufuncs? I don't think we can easily template to the workunits without basically writing a bunch of nested logic again. And we need a separate _parallel_for for each type? I dunno, it seems like the abstraction is risky on making things simpler because the workunits don't really have metadata we can hook into for dispatching.

Copy link
Contributor Author

@tylerjereddy tylerjereddy Dec 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose we could write a dictionary with the workunits as values, and dispatch through well-named keys (types, and dimensions) perhaps. We might want to agree on how that would be designed first if we adopt something like that.

That's basically early stages of writing your own type-templating system though?

elif len(view.shape) == 3:
pk.parallel_for(view.shape[0], round_impl_3d_double, view=view, out=out)

elif "float64" in str(view.dtype):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of having str(view.dtype) each time, can we save this into a variable and use it when needed

if len(view.shape) > 3:
raise NotImplementedError("only up to 3D views currently supported for trunc() ufunc.")

if "double" in str(view.dtype):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as earlier: if we have _parallel_for we can save a lot of lines as it will be used here as well

if len(view.shape) > 3:
raise NotImplementedError("only up to 3D views currently supported for ceil() ufunc.")

if "double" in str(view.dtype):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

if len(view.shape) > 3:
raise NotImplementedError("only up to 3D views currently supported for floor() ufunc.")

if "double" in str(view.dtype):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

Copy link
Contributor

@gliga gliga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall, but it would be good we reduce some of those if blocks as suggested.

@tylerjereddy
Copy link
Contributor Author

We could reduce some code duplication, but the code is all generated from a template engine anyway, as noted above.

I suppose we could adjust the template engine, though I don't know how much easier things will be to read with more abstraction.

@tylerjereddy
Copy link
Contributor Author

I tried to write up a short proposal for improving our concision for kernel dispatch at gh-135. @gliga does that sound reasonable? I was thinking we might delay that slightly as a separate effort, since it would affect the dispatching behavior of all ufuncs I suspect.

@gliga
Copy link
Contributor

gliga commented Dec 15, 2022

@tylerjereddy this PR is OK to go then. I will check that separately and that can be revised in a diff PR. Thanks

* this uses some templating code to write 4 new API conformant
ufuncs: `round`, `trunc`, `ceil`, and `floor`. (see:
https://github.com/tylerjereddy/pykokkos_ufunc_generator )

* note that because these are automatically generated
they have some verbosity/lack of docstring quirks, bit of an experiment..

* switch to the latest tagged release of the array API test
suite rather than some random hash

* strangely, the API tests added here don't actually seem to need to compile
the workunits--they may focus on the special use case for intergers I
suppose; we may want to supplement with a few floating point examples
perhaps; see also related discussion in
data-apis/array-api-tests#154
* added docstrings for the new ufuncs:
  * `round()`
  * `trunc()`
  * `ceil()`
  * `floor()`

* added a regression test to probe the above
rounding functions with floating types because I was concerned that
the array API conformance suite only checked
integer view types for these functions (which don't require compiled
kernels because integers round to themselves)
* switch to the new ufunc dispatcher system
for concision
@tylerjereddy
Copy link
Contributor Author

I rebased this on latest develop, and adjusted to use the new ufunc dispatching/templating system, so it is a bit more concise now. We can probably abstract even more I suppose.

* adjust `pre_compile_ufuncs` module to exclude the
"private" functions in the ufuncs module
@NaderAlAwar NaderAlAwar merged commit ffd607c into kokkos:develop Jan 3, 2023
@tylerjereddy tylerjereddy deleted the treddy_ufunc_templs branch January 3, 2023 16:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants