Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: textDocument/documentSymbol (Outline) #75

Closed
wants to merge 14 commits into from

Conversation

Bochlin
Copy link
Member

@Bochlin Bochlin commented Apr 26, 2020

Implements support for the "textDocument/documentSymbol" request.

The current implementation generate the outline of design units.

Major points:

  • Support for "textDocument/documentSymbol" request.
  • Adds source_range: SrcPos field to AST design units denoting the range from the beginning of the context clause to the semi colon at the end of the design unit.
  • New HasDocumentSymbol trait in vhdl_ls/document_symbol for generating a lsp_types::DocumentSymbol. Implemented for design units.
  • New HasSymbolKind trait vhdl_ls/document_symbol for getting a lsp_types::SymbolKind. Implemented for design units.

A couple of questions @kraigher :

  1. The mapping between AST-object types and SymbolKinds could be discussed. Intention is to try and get a unique mapping between EntityClass enum and SymbolKinds as far as possible while still "making sense".
  2. Text in the detail field could be discussed. Currently it is used to present the object kind as in most cases there is no SymbolKind which is a direct match for the VHDL types (e.g. entity is mapped to interface).
  3. I plan to implement the hierarchy down to declarations and concurrent statements, would you like incremental pull requests or one single for the full hierarchy?
  4. General comments on the implementation would be most welcome.

@Bochlin
Copy link
Member Author

Bochlin commented Apr 26, 2020

Clippy seems to have found some new things to be upset about in 1.43.

@@ -375,7 +377,8 @@ end;
block_spec: code.s1("rtl(0)").name(),
use_clauses: vec![],
items: vec![],
}
},
source_range: source_range(&code, (0, 0), (3, 4))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using hard coded line numbers and colums makes the tests fragile and harder to maintain. I have avoided it so far by using test helper to compute source ranges from textual searches such as finding the nth substring. In this case it would be nice to create a test helper that finds a source range between two substrings instead of hardcoding it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense.

@Bochlin
Copy link
Member Author

Bochlin commented May 1, 2020

When looking further into how to divide the the hierarchy I realized that there are other ranges that are required to keep track of. E.g., in order to present the ports/generics as a sub-hierarchy I need to track the range between port/generic and the closing ;. Similar goes for dividing the declarative region and the statements.

I think that a solution would be to change the source_range field to a new struct:

pub struct SourceRanges {
    pub unit: SrcPos, // Position of unit start token (e.g. entity) to semicolon at the end of design unit
    pub generic_clause: Option<SrcPos>, // Range from 'generic' to ';' (inclusive) 
    pub port_clause: Option<SrcPos>, // Range from 'port' to ';' (inclusive)
    pub begin: Option<SrcPos>, // Range of 'begin' token
    pub end: SrcPos, // Range of 'end' token
}

This would allow the creation of the following hierarchy (each level fold-able at least in VSCode)

designunitname
    context clause (start of first context item to start of SourceRanges::unit)
        <context items>
    generics
        <interface declarations>
    ports
        <interface declarations>
    declarations
        <declarations>
    statements (SourceRanges::begin to SourceRanges::end)
        <statements>

The alternative would be to to either change the port/generic clauses to WithPos<Option<Vec<InterfaceDeclaration>>> or create a new struct

struct InterfaceList {
    items: Vec<InterfaceDeclaration>,
    source_range: SrcPos
}

and similar for vectors of declarations/statements.

@kraigher
Copy link
Member

kraigher commented May 1, 2020

I like the last option the best and the first option the least. It is better to attach the range close to where it associated and tie it together with a struct like the last option. The first option seems wierd to collect all ranges by themselves in a struct even though they are associated with different parts of the AST.

@Bochlin
Copy link
Member Author

Bochlin commented May 9, 2020

Another alternative would be to strictly store the ranges of the tokens or the tokens themselves dividing the different parts , e.g. for an entity, is, begin, end, and ; and let downstream functions decide how they want to generate the ranges (or make other use of the token locations). The port list would get the new InterfaceList struct which would store the port/generic/parameter token as well as the closing ;.

A benefit of storing the token (or a subset of the token since the value and next_state are of little use) would also be that downstream function get access to the comments related to the token. It would not capture all comments, but enough to enable "documentation" comments in e.g. Hover.

This way I would claim that the information is kept with the appropriate AST object. I also think that it would introduce less logic to the parser and defer range calculation and combining of ranges to when the range is actually needed.

An updated Entity declaration would then look something like:

pub struct KeyWordToken { // Probably needs a better name. Used for keywords and separators like `;`
    pub kind: Kind,
    pub pos: SrcPos,
    pub comments: Option<Box<TokenComments>>,
}

impl From<Token> for KeyWordToken {
    ...
}

struct InterfaceList {
    items: Vec<InterfaceDeclaration>,

    // Tokens
    source_range: SrcPos,
}

pub struct EntityDeclaration {
    pub context_clause: ContextClause,
    pub ident: Ident,
    pub generic_clause: Option<InterfaceList>,
    pub port_clause: Option<InterfaceList>,
    pub decl: DeclarativeRegion,
    pub statements: ConcurrentStatementPart,

    // Tokens
    pub entity_token: KeyWordToken,
    pub is_token: KeyWordToken,
    pub begin_token: Option<KeyWordToken>,
    pub end_token: KeyWordToken,
    pub semi_token: KeyWordToken
}

@kraigher
Copy link
Member

kraigher commented May 9, 2020

I think your latest idea is good.

@Bochlin
Copy link
Member Author

Bochlin commented May 17, 2020

Updated select AST objects with tokens and the hierarchy down to declarations/statements.

The range for declarations and statements is just the range of the label or identifier, tests are not fully implemented.

@kraigher
Copy link
Member

Yes as you say, a bit of testing is missing. Otherwise looks ok.

@kraigher
Copy link
Member

kraigher commented Mar 4, 2021

Any update on this?

@Bochlin
Copy link
Member Author

Bochlin commented Mar 6, 2021

It should be close to complete and I can free some time to finalize it.

@kraigher
Copy link
Member

kraigher commented Mar 6, 2021

That would be great.

@kraigher
Copy link
Member

After looking at this again I think it would be best if the AST contained higher level information rather than tokens. I like the idea of attaching a source range directly to the InterfaceLists and DeclarativeRegion using something like a WithRange<T> wrapper struct. This also makes it easier for the parser to recover from a missing semi colon while still producing output for analysis. If the range is based on actually storing the semi colon in the struct it cannot be expressed if the semi colon is missing.

@Bochlin
Copy link
Member Author

Bochlin commented Mar 21, 2021

I would like to include the following in the outline:

  • Context clause (I think that in the outline I did it was folded into the design unit but still needs a range)
  • Primary and Secondary design units
  • Declarations, both in port/generic/parameter lists and in the declarative region of the design units.
  • Concurrent statements (concurrent signal assignments would be folded into a group).

What would the difference be between a WithPos and a WithRange? I like the concept of being able to use the range of the wrapped object to find the source. Semantically, the name would perhaps not be indicative of it describing a range but functionally it would be identical.

To clarify the basic principle:

  • Structs for AST objects (such as EntityDeclaration) should be extended with a SrcPos field
  • Collections of AST objects using "standard" types such as vector should be wrapped in a new WithRange or the existing WithPos.

@kraigher
Copy link
Member

You are right that WithPos is already there and a WithRange is redundant. A SrcPos is already a range.

@Bochlin
Copy link
Member Author

Bochlin commented Jul 26, 2021

Closing in favor of PR #114

@Bochlin Bochlin closed this Jul 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants