Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clLinkProgram return value when link fails #798

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

SunSerega
Copy link
Contributor

@SunSerega SunSerega commented May 14, 2022

The last paragraph of clLinkProgram sais:

Otherwise clLinkProgram returns a NULL program object with an appropriate error in errcode_ret. The application should query the linker status of this program object to check if the link was successful or not. The list of errors that can be returned are:

But I see the contradiction in it because the application cannot query the linker status of the NULL program object.

It was probably meant that NULL program object is returned only if errcode_ret is neither CL_SUCCESS nor CL_LINK_PROGRAM_FAILURE. And such behavior can be implied by the text above this paragraph, but...

I tested what is returned when the linking begins but fails with this program:

void p(); // Never implemented
// Some implementations will ignore "p" if it is not used by kernel
kernel void k(int x) { p(); }

On 3 different PCs, each with 1 platform of a different vendor:

  • On NVidia with up-to-date OpenCL3.0 driver a NULL program object is returned, making it impossible to get linking logs.
  • On Intel with up-to-date OpenCL1.2 driver a valid program object is returned, as long as the link has started.
  • On AMD with derelict OpenCL2.0 driver clLinkProgram behaves either way, depending on which devices I add to the context.

(program objects returned by clLinkProgram and passed to pfn_notify (when it was not NULL) were equal to each other in all tested cases)

That is all to say, I think the current description of clLinkProgram is not strict/clear enough.


This pull is a draft because it is only an example of how the description of clLinkProgram can be improved, and I can see some problems with these changes:

  • These changes cannot be applied to OpenCL3.0 specs, because they invalidate some implementations - like one by NVidia on my PC. I guess this kind of change should go to OpenCL3.1 or something... Though I'm not sure where it would be appropriate to suggest that.

  • I made sure not to change the meaning of the spec section (except for what is returned when the link fails), but I'm unsure about some things. For instance:

    If the linking operation can begin, clLinkProgram returns a valid non-zero program object.

    This seems to imply that clLinkProgram can return a non-zero program object even when pfn_notify is not NULL. But this program object would not be usable before pfn_notify is called, and would be redundant after.

  • English is not my native language, so while I have checked my changes with Grammarly, there are probably some not obvious to me issues with my changes.

@CLAassistant
Copy link

CLAassistant commented May 14, 2022

CLA assistant check
All committers have signed the CLA.

@SunSerega
Copy link
Contributor Author

SunSerega commented Sep 6, 2022

I'm sorry if that's impatient, but it's been almost 4 months and no one even commented. Can it be because I made this as a draft? I didn't think that could be a problem because there is another draft... Though now that I look, that one also seems abandoned.

It seems pulls are actively being worked upon right now. If no one is getting notifications of this because it's a draft, I'll turn this into a proper PR in about a day...

@bashbaug bashbaug added OpenCL API Spec Issues related to the OpenCL API specification. needs-cts-coverage The CTS needs to be extended labels Sep 6, 2022
@alycm
Copy link
Contributor

alycm commented Sep 6, 2022

Hi @SunSerega, apologies that no one got to this. Using draft is okay.

Yes, I see the problem you're describing. The last three sentences in the existing spec are pretty confusing. Logically they say:

If pfn_notify is NULL then succeed or set link error.

If pfn_notify is not NULL then return quickly and provide any link errors via callback.

Otherwise clLinkProgram returns a NULL program object with an appropriate error in errcode_ret.
And also a logically impossible statement about querying the NULL program.

But "otherwise" is a confusing word because it's not immediately clear what situations weren't covered by the previous two sentences. Based on the error codes I think it's clear that it means all the listed situations except for the ones resulting in CL_LINK_PROGRAM_FAILURE.

Both clBuildProgram and clCompileProgram also take a pfn_notify parameter without having this confusion, so clLinkProgram should likely be written more like them. They describe the logic around deferred errors in the pfn_notify parameter description.

You list different behaviour for existing implementations, on a first reading of your proposed wording changes it's not clear to me whether you restrict that, but if you do that would only be possible in a new specification version.

@SunSerega
Copy link
Contributor Author

SunSerega commented Sep 6, 2022

The difference with clBuildProgram and clCompileProgram is that clLinkProgram creates a new program object.

I put the most effort into not changing the current behavior of clLinkProgram when straightening it out.
And so I think it should be a better starting point, than trying to apply descriptions of clBuildProgram/clCompileProgram.

@SunSerega
Copy link
Contributor Author

SunSerega commented Sep 6, 2022

if you do that would only be possible in a new specification version

Yes, probably. But it can also be treated as a bug in some implementations, caused by misinterpreted spec because my changes restrict only nonsensical behavior (not returning program object on CL_LINK_PROGRAM_FAILURE).

But that would only be feasible if fixing it wouldn't be too hard. In this sense, the biggest concern is NVidia's implementation, because it didn't return the program object in any of my tests, so they may not have any ready code for this.

Can you please ping someone who makes the default OpenCL layer for CUDA?

@SunSerega
Copy link
Contributor Author

SunSerega commented Sep 6, 2022

on a first reading of your proposed wording changes it's not clear to me

Well, I guess it's not directly said. Instead:

  1. The linking operation can begin if (...)

  2. Once the linking operation can begin, a valid non-zero program object is created.

  3. Any state changes of the program object created by clLinkProgram (e.g. link status or log) will be observable from either return value (...) or passed to pfn_notify (...)

So, returning NULL if inputs for clLinkProgram are valid for linking to begin is wrong according to spec with my changes.

But, while not ambiguous, I guess it can be made more direct. For instance:

Once the linking operation can begin, a valid non-zero program object is created and will be made accessible by the caller of clLinkProgram as described below.

@bashbaug
Copy link
Contributor

bashbaug commented Mar 3, 2024

This is a tricky issue!

I created #1075 for discussion. Maybe we can decide what we want to clarify first, then we can figure out what we need to change in the spec?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-cts-coverage The CTS needs to be extended OpenCL API Spec Issues related to the OpenCL API specification.
Projects
Development

Successfully merging this pull request may close these issues.

4 participants