Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Surprising semantic model and IOp behavior around nullable. #75964

Open
CyrusNajmabadi opened this issue Nov 18, 2024 · 6 comments
Open

Surprising semantic model and IOp behavior around nullable. #75964

CyrusNajmabadi opened this issue Nov 18, 2024 · 6 comments
Assignees
Labels
Area-Compilers untriaged Issues and PRs which have not yet been triaged by a lead

Comments

@CyrusNajmabadi
Copy link
Member

See following test for a template of code demonstrating surprising behavior:

        [Fact, WorkItem(651624, "https://github.com/dotnet/roslyn/issues/60552")]
        public void TestSemanticModelConversionFromNullableToNonNullable()
        {
            var src =
                """
                #nullable enable 

                using System;

                class C
                {
                    public string Main(string? x)
                    {
                        return x;
                    }

                    public (int a, int b) Main((int, int) x)
                    {
                        return x;
                    }
                
                    public dynamic Main(object x)
                    {
                        return x;
                    }
                }
                """;

            var comp = CreateCompilation(src);

            var syntaxTree = comp.SyntaxTrees.Single();
            var semanticModel = comp.GetSemanticModel(syntaxTree);

            var root = syntaxTree.GetRoot();
            var returnStatements = root.DescendantNodesAndSelf().OfType<ReturnStatementSyntax>().ToArray();

            var conversionInfo1 = semanticModel.GetConversion(returnStatements[0].Expression);
            var conversionInfo2 = semanticModel.GetConversion(returnStatements[1].Expression);
            var conversionInfo3 = semanticModel.GetConversion(returnStatements[2].Expression);

            var typeInfo1 = semanticModel.GetTypeInfo(returnStatements[0].Expression);
            var typeInfo2 = semanticModel.GetTypeInfo(returnStatements[1].Expression);
            var typeInfo3 = semanticModel.GetTypeInfo(returnStatements[2].Expression);
        }

Here we have 3 methods, which take in a value, and return that same value, all of which go through 'identity' conversions. And, intuitively, at runtime we're not going to actually do any actual conversions here.

However, for both tuples and dynamic, you at least get a TypeInfo telling you accurately what your converted type is. And the converted type is expectedly different from the starting type. In other words, the semantic model lets you know you're goin gfrom (int,int) to (int a, int b) or from (object) to (dynamic).

However, for nullable it does not tell you this. This seems odd to me. Effectively, in the semantic model there is no way to say "the starting type is X, but it has changed to Y" despite X and Y being different types.

It's unclear to IDE team why nullable behaves this way. Other identity conversions do not, and the semantic model gives us the info we need to know that the lang is considering converting the types here. This feels like a bug. And it's a bug that definitely makes our lives harder as we cannot figure out when these changes are happening, despite them causing the compiler itself ot issue warnings.

This is highly relevant to us in terms of determining which casts we both need to add, or want to remove.

We would ask taht the semantic model indicate this information (and include in the iop tree) As it does for tehse other identity conversions with disparate types.

@dotnet-issue-labeler dotnet-issue-labeler bot added Area-Compilers untriaged Issues and PRs which have not yet been triaged by a lead labels Nov 18, 2024
@jaredpar
Copy link
Member

@333fred, @AlekseyTs PTAL

@AlekseyTs AlekseyTs self-assigned this Nov 21, 2024
@AlekseyTs
Copy link
Contributor

Investigating

@AlekseyTs
Copy link
Contributor

Here is the unit-test that reflects the behavior as it stands today:

    [Fact, WorkItem(651624, "https://github.com/dotnet/roslyn/issues/60552")]
    public void TestSemanticModelConversionFromNullableToNonNullable()
    {
        var src =
            """
            #nullable enable 

            class C
            {
                public string Main(string? x)
                {
                    return x;
                }

                public (int a, int b) Main((int, int) x)
                {
                    return x;
                }
                
                public dynamic Main(object x)
                {
                    return x;
                }

                public string Main_(string? x)
                {
                    return (string)x;
                }
            
                public string Main__(string x)
                {
                    return (string?)x;
                }
            }
            """;

        var comp = CreateCompilation(src);

        comp.VerifyDiagnostics(
            // (7,16): warning CS8603: Possible null reference return.
            //         return x;
            Diagnostic(ErrorCode.WRN_NullReferenceReturn, "x").WithLocation(7, 16),
            // (22,16): warning CS8600: Converting null literal or possible null value to non-nullable type.
            //         return (string)x;
            Diagnostic(ErrorCode.WRN_ConvertingNullableToNonNullable, "(string)x").WithLocation(22, 16),
            // (22,16): warning CS8603: Possible null reference return.
            //         return (string)x;
            Diagnostic(ErrorCode.WRN_NullReferenceReturn, "(string)x").WithLocation(22, 16),
            // (27,16): warning CS8603: Possible null reference return.
            //         return (string?)x;
            Diagnostic(ErrorCode.WRN_NullReferenceReturn, "(string?)x").WithLocation(27, 16)
            );

        var syntaxTree = comp.SyntaxTrees.Single();
        var semanticModel = comp.GetSemanticModel(syntaxTree);

        var root = syntaxTree.GetRoot();
        var returnStatements = root.DescendantNodesAndSelf().OfType<ReturnStatementSyntax>().ToArray();

        Assert.Equal(ConversionKind.Identity, semanticModel.GetConversion(returnStatements[0].Expression!).Kind);
        Assert.Equal(ConversionKind.Identity, semanticModel.GetConversion(returnStatements[1].Expression!).Kind);
        Assert.Equal(ConversionKind.Identity, semanticModel.GetConversion(returnStatements[2].Expression!).Kind);

        AssertEx.Equal("System.String?", semanticModel.GetTypeInfo(returnStatements[0].Expression!).Type.ToTestDisplayString());
        AssertEx.Equal("(System.Int32, System.Int32)", semanticModel.GetTypeInfo(returnStatements[1].Expression!).Type.ToTestDisplayString());
        AssertEx.Equal("System.Object", semanticModel.GetTypeInfo(returnStatements[2].Expression!).Type.ToTestDisplayString());
        AssertEx.Equal("System.String?", semanticModel.GetTypeInfo(returnStatements[3].Expression!).Type.ToTestDisplayString());
        AssertEx.Equal("System.String?", semanticModel.GetTypeInfo(returnStatements[4].Expression!).Type.ToTestDisplayString());

        AssertEx.Equal("System.String?", semanticModel.GetTypeInfo(returnStatements[0].Expression!).ConvertedType.ToTestDisplayString());
        AssertEx.Equal("(System.Int32 a, System.Int32 b)", semanticModel.GetTypeInfo(returnStatements[1].Expression!).ConvertedType.ToTestDisplayString());
        AssertEx.Equal("dynamic", semanticModel.GetTypeInfo(returnStatements[2].Expression!).ConvertedType.ToTestDisplayString());

        Assert.Equal(CodeAnalysis.NullableFlowState.MaybeNull, semanticModel.GetTypeInfo(returnStatements[0].Expression!).Nullability.FlowState);
        Assert.Equal(CodeAnalysis.NullableFlowState.NotNull, semanticModel.GetTypeInfo(returnStatements[1].Expression!).Nullability.FlowState);
        Assert.Equal(CodeAnalysis.NullableFlowState.NotNull, semanticModel.GetTypeInfo(returnStatements[2].Expression!).Nullability.FlowState);

        Assert.Equal(CodeAnalysis.NullableFlowState.MaybeNull, semanticModel.GetTypeInfo(returnStatements[0].Expression!).ConvertedNullability.FlowState);
        Assert.Equal(CodeAnalysis.NullableFlowState.NotNull, semanticModel.GetTypeInfo(returnStatements[1].Expression!).ConvertedNullability.FlowState);
        Assert.Equal(CodeAnalysis.NullableFlowState.NotNull, semanticModel.GetTypeInfo(returnStatements[2].Expression!).ConvertedNullability.FlowState);
    }

If I understand the issue correctly, the source of confusion is the fact that the Type and ConvertedType for returned expression in public string Main(string? x) are both string? even though the return type is string.

It looks to me that this has a reasonable explanation. Specifically, top level nullability reflects flow state and, therefore, a cast cannot remove it. Behavior for Main_ demonstrates that. But a cast can add nullable flow state, behavior for Main__ demonstrates that.

@AlekseyTs
Copy link
Contributor

Given the above, I think the behavior is expected. I'll let @333fred to make the final call.

@AlekseyTs AlekseyTs assigned 333fred and unassigned AlekseyTs Nov 21, 2024
@333fred
Copy link
Member

333fred commented Nov 21, 2024

@AlekseyTs I believe Cyrus's concern is slightly different than your examples, though I also do come to the same conclusion that we don't have a great way to represent the "conversion". I think we can restate the issue as follows:

The IDE would like to be able to tell when a null flow state is being converted at a boundary, whether that boundary is a return position, an argument position, or an assignment position to something that won't carry that flow state (like arrays or indexers). Today, they would have to enumerate all of these positions and check against whatever that boundary type is in order to determine the "conversion" exists: comparing the flow state of the returned expression against the type of the containing method, or an argument's flow state against the parameter type. I believe this test is a better reflection of what is being asked for:

        [Fact, WorkItem(651624, "https://github.com/dotnet/roslyn/issues/60552")]
        public void TestSemanticModelConversionFromNullableToNonNullable()
        {
            var src =
                """
            #nullable enable 

            class C
            {
                public string Main_(string? x)
                {
                    return x;
                }
            
                public string? Main__(string x)
                {
                    return x;
                }
            }
            """;

            var comp = CreateCompilation(src);

            comp.VerifyDiagnostics(
                // (7,16): warning CS8603: Possible null reference return.
                //         return x;
                Diagnostic(ErrorCode.WRN_NullReferenceReturn, "x").WithLocation(7, 16)
                );

            var syntaxTree = comp.SyntaxTrees.Single();
            var semanticModel = comp.GetSemanticModel(syntaxTree);

            var root = syntaxTree.GetRoot();
            var returnStatements = root.DescendantNodesAndSelf().OfType<ReturnStatementSyntax>().ToArray();

            Assert.Equal(ConversionKind.Identity, semanticModel.GetConversion(returnStatements[0].Expression!).Kind);
            Assert.Equal(ConversionKind.Identity, semanticModel.GetConversion(returnStatements[1].Expression!).Kind);

            AssertEx.Equal("System.String?", semanticModel.GetTypeInfo(returnStatements[0].Expression!).Type.ToTestDisplayString());
            AssertEx.Equal("System.String", semanticModel.GetTypeInfo(returnStatements[1].Expression!).Type.ToTestDisplayString());

            AssertEx.Equal("System.String?", semanticModel.GetTypeInfo(returnStatements[0].Expression!).ConvertedType.ToTestDisplayString());
            AssertEx.Equal("System.String", semanticModel.GetTypeInfo(returnStatements[1].Expression!).ConvertedType.ToTestDisplayString());

            Assert.Equal(CodeAnalysis.NullableFlowState.MaybeNull, semanticModel.GetTypeInfo(returnStatements[0].Expression!).Nullability.FlowState);
            Assert.Equal(CodeAnalysis.NullableFlowState.NotNull, semanticModel.GetTypeInfo(returnStatements[1].Expression!).Nullability.FlowState);

            // Would like this to be `NotNull`
            Assert.Equal(CodeAnalysis.NullableFlowState.MaybeNull, semanticModel.GetTypeInfo(returnStatements[0].Expression!).ConvertedNullability.FlowState);
            // Would like this to be `MaybeNull`
            Assert.Equal(CodeAnalysis.NullableFlowState.NotNull, semanticModel.GetTypeInfo(returnStatements[1].Expression!).ConvertedNullability.FlowState);
        }

I don't really have a good idea for how we might approach this problem. One possibility is having the converted type of these scenarios reflect the flow state that they are getting "converted" to, but I'll be the first to point out that this isn't a conversion by C# semantics, so it would be a little odd to have it reflected as such in our API.

@CyrusNajmabadi
Copy link
Member Author

If a nullable api could be provided for us to answer this, that would also be fine. The primary issue here is firmly that we just don't even know what we can ask here. And we've found trying to enumerate all cases, and reimplement all the compiler rules is just never workable. We end up reimplementing things like flow-analysis. We have no clue what needs to be updated when the lang changes. And we're likely to get our impls wrong. This is a case where there is deep complexity in the compiler (that's why we did nullable in the compielr in the first place), but very little visibility.

GetTypeInfo was always good for letting us know when a conversion between types happened (as does IOp). This is the first place we've encountered where there are two distinct ITypeSymbols for a value, but nothing in the apis whatsoever to find out that the compiler has transitioned between them. And if we do not understand this, then the user gets warnings, which is not good. So the compiler has the info somewhere to drive the diagnostics off of, but it isn't exposed through any system we can see so far. :)

A new API could help us here. like with GetTypeInfo, it could be GetNullabilityInfo, and if there's a transition, the API could reveal that. my preference is that this is in GetTypeInfo though. Perhaps a new property. .Type, .ConvertedType and maybe .NullableWhoozyWhatsitType. :) But i defer to you guys as to what data you have and how you think it could best be exposed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Compilers untriaged Issues and PRs which have not yet been triaged by a lead
Projects
None yet
Development

No branches or pull requests

4 participants