This file helps you to navigate through and make use of the JSON output. In addition, the structure described here also reflects the structure of the Python dictionary returned if this tool is used as a Python library:
# a simple Python example
import ccindex
res = ccindex.get("example-1.cc", ["."])  # the target file, the user include path list
for symbol in res["symbols"]:
    if symbol["kind"] == "class_declaration" and symbol["base_clause"]:
        base_locs = ["%s defined at %s" % (base["spelling"], base["definition_location"])
                     for base in symbol["base_clause"]]
        print("class %s inherits %s" % (symbol["spelling"], ", ".join(base_locs)))
In the following context, "compiler" refers to libclang, the compiler tooling library implemented by the LLVM/Clang project.
A string, either an absolute path, or a relative path with respect to your CWD - current working directory (not to the target file, unless it is in your CWD).
A string that indicates the location of the first character of an entity, in the format of FilePath:Line:Column, e.g. ~/dev/whatever.cc:1:5.
For a non-user-defined entity, e.g. primitive type float and pointer type int (*)(int, int), their source location strings are empty "".
A string that uniquely identifies a symbol, consisting of the file path and a serial number (starting from 1), in the format of FilePath#Number, e.g. example-1.cc#1.
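The two string formats above can be split back into their parts. The following is a minimal sketch; the helper names parse_source_location and parse_symbol_id are hypothetical and not part of this tool. It splits from the right so that a file path containing a colon is left intact.

```python
def parse_source_location(loc):
    """Split 'FilePath:Line:Column' into (path, line, column).

    Returns None for the empty string used by non-user-defined entities.
    """
    if not loc:
        return None
    # Split from the right: the path itself may contain ':' characters.
    path, line, column = loc.rsplit(":", 2)
    return path, int(line), int(column)


def parse_symbol_id(symbol_id):
    """Split 'FilePath#Number' into (path, serial_number)."""
    path, serial = symbol_id.rsplit("#", 1)
    return path, int(serial)


print(parse_source_location("~/dev/whatever.cc:1:5"))
print(parse_symbol_id("example-1.cc#1"))
```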
Each target source file produces one JSON output (see the README's usage section). The JSON contains one top-level object that has 5 fields: errors, time_parsing, time_traversing, includes, and most importantly, symbols.
field | type |
---|---|
errors | array of strings |
time_parsing | number (floating point) |
time_traversing | number (floating point) |
includes | array of Header objects |
symbols | array of Symbol objects |
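As an illustration of the top-level layout, the sketch below loads a JSON output with Python's standard json module and checks that the five fields are present. The function name load_result and the sample values are made up for this example.

```python
import json

TOP_LEVEL_FIELDS = ("errors", "time_parsing", "time_traversing", "includes", "symbols")


def load_result(text):
    """Parse the JSON text and verify that all five top-level fields exist."""
    result = json.loads(text)
    missing = [name for name in TOP_LEVEL_FIELDS if name not in result]
    if missing:
        raise ValueError("missing top-level fields: %s" % ", ".join(missing))
    return result


# A made-up, minimal output used only for demonstration.
sample = ('{"errors": [], "time_parsing": 0.12, "time_traversing": 0.03, '
          '"includes": [], "symbols": []}')
result = load_result(sample)
print("%d symbols, %d errors" % (len(result["symbols"]), len(result["errors"])))
```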
Type: array of strings
An array of error strings. In the current compiler's implementation, the format of an error string is SourceLocation: Severity: Explanation, where Severity is one of fatal, error, warning. Example: example-1.cc:101:1: error: unknown type name 'Another'.
The errors are produced by the compiler, not by this tool.
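Because the error string format is fixed by the compiler, it can be split on the first two ": " separators. This is a sketch only; parse_error is a hypothetical helper, and it returns None for any string that does not have three parts.

```python
def parse_error(error_string):
    """Split 'SourceLocation: Severity: Explanation' into a dict, or None."""
    # maxsplit=2 keeps any further ': ' inside the explanation untouched.
    parts = error_string.split(": ", 2)
    if len(parts) != 3:
        return None
    location, severity, explanation = parts
    return {"location": location, "severity": severity, "explanation": explanation}


print(parse_error("example-1.cc:101:1: error: unknown type name 'Another'"))
```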
Type: number (floating point)
Time taken, in seconds, for the compiler to parse the translation unit and build an AST.
Type: number (floating point)
Time taken, in seconds, for this tool to traverse the AST structure and produce results.
Type: array of Header objects
An array of Header objects. A Header object includes:
Header field | type | meaning |
---|---|---|
file | file path | the path of the file being included |
depth | number (integer) | its depth on the include stack |
included_at | source location | the location of the responsible #include directive |
The include stack structure is produced by the compiler, not by this tool.
If a header is not found, the error will be added to the errors array above. The root cause is normally that (1) the file or its directory does not exist, or (2) it is a user header but no user include paths were provided when invoking this tool, or the header is not in the provided paths.
If a header file is included by a system header, e.g. some internal C++ standard library header included by a user-facing C++ standard library header, then this header is deliberately omitted in the array, because it is often of no interest to our purpose.
If a header is directly included by the target file, then its depth field has value 1.
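The depth field can be used to print the include stack as an indented tree. The sketch below assumes that the includes array lists headers in inclusion order (each header followed by the headers it includes), which this document does not state. The sample data and the function name format_include_tree are hypothetical.

```python
def format_include_tree(includes):
    """Render Header objects as indented lines, two spaces per extra depth level."""
    lines = []
    for header in includes:
        indent = "  " * (header["depth"] - 1)  # depth 1 means no indentation
        lines.append("%s%s (included at %s)" % (indent, header["file"], header["included_at"]))
    return "\n".join(lines)


# Made-up Header objects for demonstration only.
includes = [
    {"file": "a.h", "depth": 1, "included_at": "example-1.cc:1:1"},
    {"file": "b.h", "depth": 2, "included_at": "a.h:3:1"},
    {"file": "c.h", "depth": 1, "included_at": "example-1.cc:2:1"},
]
print(format_include_tree(includes))
```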
Type: array of Symbol objects
An array of Symbol objects, produced in the course of AST traversal. The following fields are present in every Symbol object: id, spelling, kind, hierarchy, parent_kind, location, comment, usage, and is_member; other fields depend on the kind field.
Only the symbols inside the target file are stored in the array.
A Symbol object stores the meta information of a symbol on the AST.
These fields are present in every Symbol object.
Symbol field | type | meaning |
---|---|---|
id | symbol ID | uniquely identifies the symbol |
spelling | string | the symbol's literal spelling |
kind | string | the syntax kind of this symbol, e.g. class_declaration, constructor |
parent_kind | string | the syntax kind of the immediate parent context, or (global) if the symbol is in global context |
location | source location | the location of the symbol in source |
comment | string | documentary comment for that symbol |
usage | string | the usage_block inside the documentary comment |
is_member | boolean | whether the symbol is a member of a class-like |
hierarchy | array of Context objects | the contexts that semantically contain this symbol; order: top-down to the immediate parent |
others.. | | depending on the kind field |
A symbol ID string that uniquely identifies the symbol. It consists of the file path and a serial number starting from 1.
An ID is justified because C++ allows name overloading, and using Itanium ABI's name mangling scheme results in IDs that are almost unreadable, especially if templates are involved. The source location is not suitable, either, because if two function declarations are the result of one macro instantiation, they will have the same source location.
The literal spelling of the symbol's name (and name only). No parentheses, arguments, or template notations are present.
If the symbol is for a constructor, the spelling is the same as the class name; for a destructor, the spelling starts with a tilde, e.g. ~Class; for an operator overload function or a conversion function, the spelling starts with operator, e.g. operator+ and operator int.
The syntax kind of the symbol.
Possible values are (sorted alphabetically):
class_declaration | class_template | constructor
conversion_function | destructor | enum_constant_declaration
enum_declaration | field_declaration | function_declaration
function_template | method | namespace
struct_declaration | type_alias_declaration | typedef_declaration
variable_declaration
For a function template, its kind is function_template, regardless of whether it is an independent template or a member template of some class-like. There is no such value as method_template.
For static member data, the kind is variable_declaration; for non-static member data, the kind is field_declaration.
type_alias_declaration and typedef_declaration are different:
using Int = int; // type_alias_declaration
typedef float Float; // typedef_declaration
The syntax kind of the symbol's semantic immediate parent. For example, if a symbol is a class constructor, then the value is class_declaration; if a symbol is an enumeration constant, then the value is enum_declaration. If a symbol is in the global scope, then the value is (global).
In addition to (global), the set of other possible values is a subset of the possible values of the kind field above.
● location: source location (string)
The source location of that symbol.
See footnote 1.
See footnote 2.
● hierarchy: array of Context objects
An array of Context objects, which represent the declaration contexts that semantically include this symbol, ordered top-down to the immediate parent. If the symbol is in the global scope, then the array is empty.
A Context object:
Context field | type | meaning |
---|---|---|
kind | string | syntax kind of this context |
location | source location | where the context symbol is declared |
spelling | string | the context's symbol spelling |
transparent | boolean | whether the context is transparent |
For example, in the hierarchy array of symbol foo (same case for bar), there are two Context objects: the first is for class declaration A, and the second is for enum declaration E. One can also reason that the hierarchy array of symbol E has one object (for class A), and the hierarchy array for symbol A is empty.
// example.cc
class A {
    enum E { foo, bar };
};
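One use of the hierarchy array is to rebuild the qualified name of a symbol. The sketch below joins the spellings of the non-transparent contexts with ::, so that foo from the example above becomes A::foo (the unscoped enum E is transparent and needs no qualifier). The function name qualified_name is hypothetical, and the locations in the sample data are made up.

```python
def qualified_name(symbol):
    """Join the spellings of all opaque contexts and the symbol itself with '::'."""
    parts = [ctx["spelling"] for ctx in symbol["hierarchy"] if not ctx["transparent"]]
    parts.append(symbol["spelling"])
    return "::".join(parts)


# Hand-written stand-in for the Symbol object of foo; only the fields used here.
foo = {
    "spelling": "foo",
    "hierarchy": [
        {"kind": "class_declaration", "spelling": "A",
         "location": "example.cc:2:7", "transparent": False},
        {"kind": "enum_declaration", "spelling": "E",
         "location": "example.cc:3:10", "transparent": True},
    ],
}
print(qualified_name(foo))
```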
The presence of the following fields depends on the kind field. To see which kinds have which fields, see the per-kind field tables later in this document.
If the symbol is created from a macro instantiation, then from_macro is that macro's name spelling, otherwise null. For example, in the silly example below, the from_macro fields of foo and bar are CREATE_FUNC, but that of baz is null.
#define CREATE_FUNC(func, op) int func(int a, int b) { return a op b; }
CREATE_FUNC(foo, +)
CREATE_FUNC(bar, -)
int baz(int a, int b) { return a * b; }
If the symbol's from_macro is null, then this field is the declaration text of the symbol. Otherwise, this is the macro instantiation text.
Same as above, but formatted so that a long line is broken into multiple lines.
If the symbol is a class template or function template, then this field is true, otherwise false.
An array of TemplateArg objects, representing the template argument list in order. If the symbol is not a template (is_template is false), then this is an empty array.
A TemplateArg object:
TemplateArg field | type | meaning |
---|---|---|
arg_spelling | string | the literal spelling of the template argument |
default_expr | string or null | the default argument expression, null if not given |
type | Type object | the type of this template argument |
An array of Arg objects, representing the argument list (the formal arguments inside the parentheses) in order. A symbol might have an empty formal argument list.
An Arg object:
Arg field | type | meaning |
---|---|---|
arg_spelling | string | the literal spelling of the argument |
default_expr | string or null | the default argument expression, null if not given |
type | Type object | the type of this argument |
● (optional) return_type: Type object or null
The return type of the function-like symbol. However, if this symbol is function-like but does not have a return type, i.e. a constructor or destructor, then this field is null (note that "no return type" is different from "has void return type").
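The args_list and return_type fields together are enough to print a readable signature. The following is a rough sketch, not how this tool formats declarations: format_signature is a hypothetical helper, the sample Type objects carry only their spelling, and simply placing the type spelling before the argument name does not produce valid C++ for function-pointer or array arguments.

```python
def format_signature(symbol):
    """Build 'ret name(type arg = default, ...)' from a function-like Symbol."""
    args = []
    for arg in symbol["args_list"]:
        # strip() handles unnamed arguments, whose spelling is assumed to be empty
        text = ("%s %s" % (arg["type"]["spelling"], arg["arg_spelling"])).strip()
        if arg["default_expr"] is not None:
            text += " = " + arg["default_expr"]
        args.append(text)
    signature = "%s(%s)" % (symbol["spelling"], ", ".join(args))
    # Constructors and destructors have return_type null (None in Python).
    if symbol["return_type"] is not None:
        signature = symbol["return_type"]["spelling"] + " " + signature
    return signature


# Hand-written stand-ins for Symbol objects; only the fields used here.
foo = {
    "spelling": "foo",
    "return_type": {"spelling": "int"},
    "args_list": [
        {"arg_spelling": "a", "default_expr": None, "type": {"spelling": "int"}},
        {"arg_spelling": "b", "default_expr": "0", "type": {"spelling": "int"}},
    ],
}
ctor = {"spelling": "A", "return_type": None, "args_list": []}
print(format_signature(foo))
print(format_signature(ctor))
```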
An array of specifiers added to the function-like or class-like symbol. If no specifiers are present, it is an empty array. The possible values are
final | override | = 0 | noexcept
According to C++, for a class-like, it is either an array of one element final or empty.
Note that throw() basically has the same meaning as noexcept, and is strongly discouraged since C++17. Therefore, when listing specifiers, this tool replaces all throw() with noexcept.
A string indicating if no-throw guarantee is provided to a function-like. The possible values are:
guaranteed | not_guaranteed | unevaluated
A function-like is either guaranteed not to throw any exception, or is not provided with such a guarantee. Generally speaking, this guarantee is provided only if noexcept or throw() exists; however, some complicated special rules stipulate that the guarantee is also provided to some function-likes that have neither of these two specifiers. If the compiler deems the special rules potentially valid, it will report the unevaluated value.
This field is present if the is_member field is true, i.e. the symbol belongs to a class-like context; in other words, its parent_kind field holds the value class_declaration, struct_declaration, or class_template. It represents that symbol's access specifier. The possible values are:
public | protected | private
If the symbol represents a method, including constructor and destructor, then this field is present. This field is true if the symbol is marked with = delete in the source; otherwise it is false.
If the symbol represents a method, constructor or destructor, then this field is present, representing some interesting properties. The possible array elements are:
static | const | default | delete | virtual | pure_virtual
If it is pure virtual, then both virtual and pure_virtual are present in the array. If delete is present, the symbol's is_deleted field is set to true, otherwise false. The array can be empty.
If the symbol represents a constructor, then this field is present, representing some interesting properties. The possible array elements are:
default | delete | copy | move | converting
The array can be empty. default is present only for constructors explicitly marked by = default, not for every zero-argument constructor. If delete is present, the symbol's is_deleted field is set to true, otherwise false. converting is present for constructors that can be called with exactly one actual argument; note that such a constructor may have more than one formal argument, provided the others have default expressions.
If the symbol represents a destructor, then this field is present, representing some interesting properties. The possible array elements are:
default | delete | virtual | pure_virtual
The array can be empty. default is present only for destructors explicitly marked by = default. If delete is present, the symbol's is_deleted field is set to true, otherwise false.
An array of Base objects, each element representing a base class that the class-like inherits, in order.
A Base object includes:
Base field | type | meaning |
---|---|---|
spelling | string | the literal spelling of the base class |
access | string | the access specifier for the inheritance |
virtual_inheritance | boolean | whether the inheritance is virtual |
definition_location | source location | where the base is defined |
If the base is a class template, then the spelling contains the template argument list, e.g. TemplateName<int> (instead of only TemplateName). definition_location is the location where the base class or base class template is defined.
The field is true if the class-like is abstract, i.e. has at least one pure virtual method. Otherwise it is false.
Note that a class-like does not need to have a method explicitly marked with virtual and = 0: if the class-like inherits from an abstract class-like and does not implement all of the inherited pure virtual methods, then this class-like is abstract, too.
The size, in number of bytes, of the class-like or value-like.
If the symbol is for a class template that has at least one type parameter, then this field is null, as its size is undefined. If the symbol's type is a reference, then this field is the size of the referenced type. If the symbol's type is an array, then this field is the number of bytes in the array, not the size of each element nor the number of elements.
Indicates if the class-like or value-like is plain old data. Note: this concept is to be deprecated by C++.
● (optional) type: Type object
A Type object, indicating the type of a class-like or value-like symbol.
The immediate underlying type's literal spelling for a type-alias-like.
This field is the spelling of the underlying type one-step resolved. For example, in the code below, this field of symbol MyInt is Int, not int.
typedef int Int;
typedef Int MyInt;
The completely-resolved underlying type's literal spelling for a type-alias-like.
Any type-alias chain is resolved. For example, in the code below, this field of symbol MyInt is int, not Int.
typedef int Int;
typedef Int MyInt;
If the completely-resolved underlying type is a type parameter, this field is (type_parameter). In this case, to retrieve the type parameter's name, one needs to inspect the type_alias_chain field's last element.
An array of TypeAlias objects, representing the type-alias chain from the type itself, then the one-step resolved type, all the way to the completely-resolved canonical type, in order.
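The rule for (type_parameter) can be wrapped in a small helper. This is a sketch under the assumption that the three fields sit directly on a type-alias-like Symbol object; resolved_type_name is a hypothetical name, and the sample objects (including their locations) are hand-written for illustration.

```python
def resolved_type_name(symbol):
    """Return the fully resolved type name of a type-alias-like symbol.

    Falls back to the last element of type_alias_chain when the canonical
    type is a template type parameter.
    """
    if symbol["canonical_type"] == "(type_parameter)":
        return symbol["type_alias_chain"][-1]["spelling"]
    return symbol["canonical_type"]


# typedef int Int; typedef Int MyInt;
my_int = {
    "type_alias_underlying_type": "Int",
    "canonical_type": "int",
    "type_alias_chain": [
        {"spelling": "MyInt", "location": "example.cc:2:13"},
        {"spelling": "Int", "location": "example.cc:1:13"},
        {"spelling": "int", "location": ""},
    ],
}
# template <typename T> struct Box { typedef T value_type; };
value_type = {
    "type_alias_underlying_type": "T",
    "canonical_type": "(type_parameter)",
    "type_alias_chain": [
        {"spelling": "value_type", "location": "example.cc:4:46"},
        {"spelling": "T", "location": ""},
    ],
}
print(resolved_type_name(my_int))
print(resolved_type_name(value_type))
```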
This field is present if (1) the symbol belongs to a class-like context, in other words, its parent_kind field holds value class_declaration or class_template, and (2) its kind field holds value variable_declaration or field_declaration. Under these conditions:
- if kind is variable_declaration, then this field is true, meaning this symbol represents a static member;
- if kind is field_declaration, then this field is false, meaning this symbol represents a non-static member.
This field is present if kind is enum_declaration and indicates whether this enum is a scoped enum.
● (optional) enum_underlying_type: Type object
This field is present if kind is enum_declaration or enum_constant_declaration:
- if kind is enum_declaration, it indicates the underlying integer type used by the enumeration constants in this enum;
- if kind is enum_constant_declaration, it indicates the integer type of this enumeration constant.
The integer value of an enumeration constant.
C++ has a fairly complex type system, so it deserves to have a fairly complex representation, instead of merely the type name string.
At the top level, a Type object includes:
Type field | type | meaning |
---|---|---|
spelling | string | the literal spelling of the type name |
type_info | TypeInfo object | miscellaneous information about this type |
Members of the TypeInfo object in the type_info field may recursively contain Type objects, e.g. a pointer type's type_info will contain the type of the pointee.
In a TypeInfo object, the following fields are always present:
TypeInfo field | type | meaning |
---|---|---|
type_size | number (integer) | type size, in number of bytes |
is_type_alias | boolean | whether this is a type-alias-like |
is_type_param | boolean | whether this is a type parameter |
is_array | boolean | whether this is an array |
is_pointer | boolean | whether this is a pointer |
is_function | boolean | whether this is a function type |
others.. | | depends on is_* fields |
Those is_* fields are mutually exclusive: at most one of them is true.
Depending on which one of them is true, there are these additional fields:
- if is_type_alias is true:
TypeInfo field | type | meaning |
---|---|---|
always-present fields.. | | |
type_alias_underlying_type | string | one-step underlying type's literal spelling |
canonical_type | string | completely-resolved underlying type's literal spelling |
type_alias_chain | array of TypeAlias objects | the type-alias chain from the type itself, then the one-step resolved type, all the way to the completely-resolved canonical type, in order |
If the completely-resolved underlying type is a type parameter, canonical_type is (type_parameter). In this case, to retrieve the type parameter's name, one needs to inspect the type_alias_chain field's last element.
- if is_type_param is true:
TypeInfo field | type | meaning |
---|---|---|
always-present fields.. | | |
type_param_decl_location | TypeParamDeclLoc object | location information on this type parameter |
A TypeParamDeclLoc object includes:
TypeParamDeclLoc field | type | meaning |
---|---|---|
always-present fields.. | | |
template_spelling | string | the literal spelling of the template declaration |
template_location | source location | source location of the template declaration |
param_index | number (integer) | the type parameter's position in the declaration's template parameter list (starts from 0) |
- if is_array is true:
TypeInfo field | type | meaning |
---|---|---|
always-present fields.. | | |
array_size | number (integer) | number of elements in the array |
array_element_type | Type object | the type of array element |
- if is_pointer is true:
TypeInfo field | type | meaning |
---|---|---|
always-present fields.. | | |
pointee_type | Type object | the type the pointer points to |
- if is_function is true:
No additional fields. Only the always-present fields.
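Since Type objects nest through pointee_type and array_element_type, they are naturally processed with a recursive function. The sketch below turns a Type object into a short English description. Both describe_type and the make_type helper are hypothetical, and the sizes in the sample data are merely typical values for a 64-bit platform.

```python
def describe_type(type_obj):
    """Describe a Type object in words, recursing into nested Type objects."""
    info = type_obj["type_info"]
    if info["is_pointer"]:
        return "pointer to " + describe_type(info["pointee_type"])
    if info["is_array"]:
        return "array of %d x %s" % (info["array_size"],
                                     describe_type(info["array_element_type"]))
    if info["is_type_alias"]:
        return "%s (alias of %s)" % (type_obj["spelling"], info["canonical_type"])
    if info["is_type_param"]:
        return "type parameter " + type_obj["spelling"]
    if info["is_function"]:
        return "function " + type_obj["spelling"]
    return type_obj["spelling"]  # none of the is_* flags is set


def make_type(spelling, size, **extra):
    """Build a hand-written Type object for the demonstration below."""
    info = {"type_size": size, "is_type_alias": False, "is_type_param": False,
            "is_array": False, "is_pointer": False, "is_function": False}
    info.update(extra)
    return {"spelling": spelling, "type_info": info}


int_type = make_type("int", 4)
array_type = make_type("int [3]", 12, is_array=True, array_size=3,
                       array_element_type=int_type)
pointer_type = make_type("int (*)[3]", 8, is_pointer=True, pointee_type=array_type)
print(describe_type(pointer_type))
```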
A TypeAlias object includes:
TypeAlias field | type | meaning |
---|---|---|
spelling | string | the literal spelling of the type at the current level |
location | source location | definition of this type at the current level |
ARTIFACT (FIXME):
At any level, if the type is a type parameter of some template, the location field is an empty string.
For a Symbol object, in addition to the always-present fields, depending on the kind field's value, it has the following fields as well. For what values those fields hold, see the field description above.
kind: function_declaration | constructor | destructor | conversion_function | method | function_template
For a function template, its kind is function_template, regardless of whether it is an independent template or a member template of some class-like. There is no such value as method_template. To check if a function template is a member, one needs to (1) check the boolean field is_member, or (2) go through the kind field of the objects in the hierarchy array, or (3) check if the access field exists.
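The three membership checks listed above can be written side by side. In this sketch, member_checks is a hypothetical helper, the sample Symbol objects are hand-written, and check (2) looks only at the immediate parent (the last element of the hierarchy array).

```python
CLASS_LIKE_KINDS = ("class_declaration", "struct_declaration", "class_template")


def member_checks(symbol):
    """Return the three membership indicators for one Symbol object."""
    hierarchy = symbol["hierarchy"]
    return {
        # (1) the always-present boolean field
        "is_member": symbol["is_member"],
        # (2) the immediate parent context is a class-like
        "parent_is_class_like": bool(hierarchy) and hierarchy[-1]["kind"] in CLASS_LIKE_KINDS,
        # (3) the access field only exists for members
        "has_access": "access" in symbol,
    }


# Hand-written stand-ins: a member template of class Box, and a free template.
member_tpl = {
    "kind": "function_template", "spelling": "get", "is_member": True,
    "access": "public",
    "hierarchy": [{"kind": "class_declaration", "spelling": "Box",
                   "location": "example.cc:1:7", "transparent": False}],
}
free_tpl = {"kind": "function_template", "spelling": "max_of",
            "is_member": False, "hierarchy": []}
print(member_checks(member_tpl))
print(member_checks(free_tpl))
```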
In addition to those always-present fields..
Symbol fields | (non-class) function or template | constructor | destructor | conversion function | (ordinary) method or template |
---|---|---|---|---|---|
from_macro | ✓ | ✓ | ✓ | ✓ | ✓ |
declaration and declaration_pretty | ✓ | ✓ | ✓ | ✓ | ✓ |
is_template | ✓ | ✓ | ✓ | ✓ | ✓ |
template_args_list | ✓ | ✓ | ✓ | ✓ | ✓ |
args_list | ✓ | ✓ | ✓ | ✓ | ✓ |
return_type | ✓ | ✓ | ✓ | ✓ | ✓ |
specifier | ✓ | ✓ | ✓ | ✓ | ✓ |
no_throw_guarantee | ✓ | ✓ | ✓ | ✓ | ✓ |
is_deleted | | ✓ | ✓ | ✓ | ✓ |
method_property | | ✓ | ✓ | ✓ | ✓ |
constructor_property | | ✓ | | | |
destructor_property | | | ✓ | | |
access | | ✓ | ✓ | ✓ | ✓ |
kind: class_declaration | struct_declaration | class_template
In addition to those always-present fields..
Symbol fields | class or struct | template |
---|---|---|
declaration and declaration_pretty | ✓ | ✓ |
is_template | ✓ | ✓ |
template_args_list | ✓ | ✓ |
specifier | ✓ | ✓ |
base_clause | ✓ | ✓ |
is_abstract | ✓ | ✓ |
size | ✓ | |
POD | ✓ | |
type | ✓ | |
access | conditional | conditional |
access: this field is present if parent_kind holds the value of a class-like.
kind: variable_declaration | field_declaration | enum_constant_declaration
Symbol fields | variable | field | enum constant |
---|---|---|---|
size | ✓ | ✓ | ✓ |
POD | ✓ | ✓ | ✓ |
type | ✓ | ✓ | ✓ |
static_member | conditional | ✓ | |
enum_underlying_type | | | ✓ |
enum_value | | | ✓ |
access | conditional | conditional | conditional |
static_member: for a variable, this field is present only if parent_kind holds the value of a class-like. In that case, the symbol represents a static member, and its kind is variable_declaration instead of field_declaration.
A non-static member's kind is field_declaration.
access: this field is present if parent_kind holds the value of a class-like.
In addition to those always-present fields..
kind: enum_declaration
Note that enum_declaration is different from enum_constant_declaration:
enum E {  // E: enum_declaration
    e1,   // e1: enum_constant_declaration, whose parent context is E
    e2    // e2: enum_constant_declaration, whose parent context is E
};
In addition to those always-present fields..
Symbol fields | enum |
---|---|
scoped_enum | ✓ |
enum_underlying_type | ✓ |
access | conditional |
access: this field is present if parent_kind holds the value of a class-like.
kind: typedef_declaration | type_alias_declaration
typedef float Float; // typedef_declaration
using Int = int; // type_alias_declaration
In addition to those always-present fields..
Symbol fields | typedef | type alias |
---|---|---|
type_alias_underlying_type | ✓ | ✓ |
type_alias_chain | ✓ | ✓ |
canonical_type | ✓ | ✓ |
access | conditional | conditional |
access: this field is present if parent_kind holds the value of a class-like.
- In Python, you can use the standard library module json to convert between a JSON file and a Python structure. The get() function also returns the same Python structure, as if it had been converted from the JSON file.
- Documentary comment (implemented by the compiler, not this tool):
    /**
     * This is a documentary comment for the symbol on the next line (var1).
     * This comment style can span multiple lines.
     */
    int var1;
    /** This is a documentary comment for the symbol on the next line (var2). */
    int var2;
    int var3; /**< This is a documentary comment for the symbol on the same line (var3) */
    /* This is NOT a documentary comment, hence it will NOT be picked up by the compiler */
    int var4;
    int var5; // This is NOT a documentary comment, either
- Usage block: a special set of contiguous lines inside the documentary comment block.
It starts from the comment line that begins with Usage:, and ends with the comment line whose next comment line starts with at least 5 dashes in the comment block. If no such ending line is found, it reaches to the end of the comment block. In each line picked up as usage, the whitespace at both ends is stripped, except for the newline \n character.
In the example below, the usage block picked up by this tool is int n = foo(1); // use default arg\nint n = foo(1, 2);.
    /**
     * This is a documentary comment for the symbol on the next line (foo).
     * Usage: int n = foo(1); // use default arg
     *        int n = foo(1, 2);
     * ------------
     * This is not part of usage.
     */
    int foo(int a, int b = 0);
- Context transparency: if a context is transparent, then you don't need a scope resolution operator :: to refer to the symbols defined in the context from outside. Except for the global scope and unscoped enums (i.e. the vanilla enum), all other contexts are opaque. For example, to refer to class C's method foo outside the class, you have to use C::foo, because a class context is opaque.
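The usage block rule from the notes above can be approximated in a few lines of Python. This is a sketch of the described behavior, not this tool's actual implementation: extract_usage is a hypothetical name, and the handling of the comment decoration (/**, leading *, */), the removal of the Usage: prefix, the dropping of trailing blank lines, and the empty-string result when no usage block exists are all assumptions.

```python
def extract_usage(comment):
    """Extract the usage block from the raw text of a documentary comment."""
    lines = []
    for raw in comment.splitlines():
        line = raw.strip()
        # Assumption: strip the comment decoration before looking at the text.
        if line.startswith("/**"):
            line = line[3:]
        if line.endswith("*/"):
            line = line[:-2]
        line = line.strip()
        if line.startswith("*"):
            line = line[1:]
        lines.append(line.strip())

    usage = None
    for line in lines:
        if usage is None:
            if line.startswith("Usage:"):
                usage = [line[len("Usage:"):].strip()]
        elif line.startswith("-----"):  # at least 5 dashes end the block
            break
        else:
            usage.append(line)
    if usage is None:
        return ""
    # Assumption: trailing blank lines (e.g. the closing line) are dropped.
    while usage and not usage[-1]:
        usage.pop()
    return "\n".join(usage)


comment = """/**
 * This is a documentary comment for the symbol on the next line (foo).
 * Usage: int n = foo(1); // use default arg
 *        int n = foo(1, 2);
 * ------------
 * This is not part of usage.
 */"""
print(extract_usage(comment))
```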