Extractor exiting with code 1 ("Warning[extractor-c++]: In index_expr_node: Unknown expr kind 30.") #16854

Closed
flowerhack opened this issue Jun 27, 2024 · 6 comments
Labels
C++ question Further information is requested

Comments

@flowerhack

Hi hello,

I'm a committer for the Chromium project & we've been experimenting with building CodeQL databases of Chromium.

Context

While building the Chromium CodeQL database, in addition to the previously reported "catastrophic" errors ([1], [2]), we get many thousands of errors that do not seem to cross the threshold to be logged as "catastrophic" but nonetheless cause the extractor to terminate with exit code 1, leading to incomplete Chromium databases.

I've investigated these errors and have classed them into nine unique bug types. I intend to report all nine (this report is bug 4 of 9), with a reproducing test case for each.

The hope is that, if these bugs + the catastrophic errors are fixed, we will be able to have a complete build of a Chromium CodeQL database (barring, of course, the scenario where fixing these bugs serves to unmask new ones...!).

The Bug

When building the Chromium CodeQL database, we see ~120,000 errors of the following type:

Warning[extractor-c++]: In index_expr_node: Unknown expr kind 30.

Unfortunately the logs don't seem to point to a specific code point from which the error arose.
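
To get a rough sense of how such warnings are distributed across a log, the lines can be tallied by message and kind number. The following is a minimal Python sketch, not part of CodeQL; it assumes every warning follows the shape of the single line quoted above, and the name `tally_kind_warnings` is hypothetical.

```python
import re
from collections import Counter

# Matches lines shaped like the warning quoted above, e.g.
#   Warning[extractor-c++]: In index_expr_node: Unknown expr kind 30.
# The shape is an assumption inferred from that one sample line.
KIND_WARNING = re.compile(
    r"Warning\[extractor-c\+\+\]: In (?P<where>\w+): "
    r"(?P<message>[A-Za-z ]+ kind) (?P<kind>\d+)\."
)


def tally_kind_warnings(lines):
    """Count warnings per (reporting function, message, kind number)."""
    counts = Counter()
    for line in lines:
        match = KIND_WARNING.search(line)
        if match:
            key = (match["where"], match["message"], int(match["kind"]))
            counts[key] += 1
    return counts
```

Passing an open log file (for example one of the files under logs/extractor/) as `lines` and calling `.most_common()` on the result would list the most frequent warnings first.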

I suspect this may be due to CodeQL's current lack of C++20 support: Chromium has been adding C++20 features recently, those commits roughly coincide with when these errors started occurring in great volume during our builds, and missing C++20 support would explain why certain types of expressions are simply unknown.

If that is the case, I understand if C++20 support is not top-of-mind for your team, but we'd be curious to hear if adding that support is anywhere on your future roadmap.

Reproducing The Bug

I have created a standalone file that reproduces this bug, attached here as GrGlAttachment_ii.cpp.txt (please remove the .txt extension; this was to make the GitHub attachment uploader happy).

Reproduction steps (assumes that GrGlAttachment_ii.cpp is in /YOUR/ROOT/HERE; assumes Clang 19 (Chrome uses the latest upstream Clang, generally speaking); assumes Linux):

(1) codeql database init --language=cpp --source-root=/YOUR/ROOT/HERE/SOME-EMPTY-DIRECTORY /YOUR/ROOT/HERE/repro-bug1-db --overwrite

(2) codeql database trace-command /YOUR/ROOT/HERE/repro-bug1-db --working-dir=/YOUR/ROOT/HERE -- clang++ -DSK_CODEC_DECODES_JPEG_GAINMAPS -DSK_SHAPER_PRIMITIVE_AVAILABLE -DSKOTTIE_TRIVIAL_FONTRUN_ITER -DDCHECK_ALWAYS_ON=1 -DUSE_UDEV -DUSE_AURA=1 -DUSE_GLIB=1 -DUSE_OZONE=1 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_GNU_SOURCE -D_LIBCPP_HARDENING_MODE=_LIBCPP_HARDENING_MODE_EXTENSIVE -DCR_CLANG_REVISION=\"llvmorg-19-init-14561-gecea8371-1\" -DCOMPONENT_BUILD -DCR_LIBCXX_REVISION=09b99fd8ab300c93ff7b8df6688cafb27bd3db28 -DCR_SYSROOT_KEY=20230611T210420Z-2 -D_DEBUG -DDYNAMIC_ANNOTATIONS_ENABLED=1 -DSK_ENABLE_SKSL -DSK_UNTIL_CRBUG_1187654_IS_FIXED -DSK_WIN_FONTMGR_NO_SIMULATIONS -DSK_DISABLE_LEGACY_INIT_DECODERS -DSK_SLUG_DISABLE_LEGACY_DESERIALIZE -DSK_DISABLE_LEGACY_VULKAN_BACKENDSEMAPHORE -DSK_DISABLE_LEGACY_CREATE_CHARACTERIZATION -DSK_DISABLE_LEGACY_VULKAN_MUTABLE_TEXTURE_STATE -DSK_CODEC_DECODES_JPEG -DSK_ENCODE_JPEG -DSK_ENCODE_PNG -DSK_ENCODE_WEBP -DSKIA_DLL -DSKCMS_API=__attribute__\(\(visibility\(\"default\"\)\)\) -DSK_GANESH -DSK_GPU_WORKAROUNDS_HEADER=\"gpu/config/gpu_driver_bug_workaround_autogen.h\" -DSK_GL -DSK_VULKAN=1 -DSK_GRAPHITE -DSK_DAWN -DVK_USE_PLATFORM_XCB_KHR -DVK_USE_PLATFORM_WAYLAND_KHR -DIS_SKIA_IMPL=1 -DSKIA_IMPLEMENTATION=1 -DSK_FREETYPE_MINIMUM_RUNTIME_VERSION_IS_BUILD_VERSION -DSK_TYPEFACE_FACTORY_FREETYPE -DSK_FONTMGR_FREETYPE_EMPTY_AVAILABLE -DSK_GAMMA_EXPONENT=1.2 -DSK_GAMMA_CONTRAST=0.2 -DSK_DEFAULT_FONT_CACHE_LIMIT=20971520 -DGLIB_VERSION_MAX_ALLOWED=GLIB_VERSION_2_56 -DGLIB_VERSION_MIN_REQUIRED=GLIB_VERSION_2_56 -DWGPU_SHARED_LIBRARY -DABSL_CONSUME_DLL -DABSL_FLAGS_STRIP_NAMES=0 -DBORINGSSL_SHARED_LIBRARY -DWEBP_EXTERN=extern -DFT_CONFIG_MODULES_H=\"freetype-custom/freetype/config/ftmodule.h\" -DFT_CONFIG_OPTIONS_H=\"freetype-custom/freetype/config/ftoption.h\" -DPDFIUM_REQUIRED_MODULES -DUSE_LIBJPEG_TURBO=1 -DMANGLE_JPEG_NAMES -DU_USING_ICU_NAMESPACE=0 -DU_ENABLE_DYLOAD=0 -DUSE_CHROMIUM_ICU=1
-DU_ENABLE_TRACING=1 -DU_ENABLE_RESOURCE_TRACING=0 -DICU_UTIL_DATA_IMPL=ICU_UTIL_DATA_FILE -fno-delete-null-pointer-checks -fno-ident -fno-strict-aliasing -fstack-protector -funwind-tables -fPIC -pthread -fcolor-diagnostics -fmerge-all-constants -fno-sized-deallocation -mllvm -instcombine-lower-dbg-declare=0 -mllvm -split-threshold-for-reg-with-hint=0 -ffp-contract=off -fcomplete-member-pointers -m64 -msse3 -Wno-builtin-macro-redefined -D__DATE__= -D__TIME__= -D__TIMESTAMP__= -ffile-compilation-dir=. -no-canonical-prefixes -ftrivial-auto-var-init=pattern -O0 -fno-omit-frame-pointer -gdwarf-4 -g2 -gdwarf-aranges -gsplit-dwarf -ggnu-pubnames -fvisibility=hidden -Wheader-hygiene -Wstring-conversion -Wtautological-overlap-compare -DUNSAFE_BUFFERS_BUILD -Wno-redundant-parens -Wall -Wno-unused-variable -Wno-c++11-narrowing -Wno-unused-but-set-variable -Wno-misleading-indentation -Wno-missing-field-initializers -Wno-unused-parameter -Wno-psabi -Wloop-analysis -Wno-unneeded-internal-declaration -Wno-cast-function-type -Wno-ignored-pragma-optimize -Wno-deprecated-builtins -Wno-bitfield-constant-conversion -Wno-deprecated-this-capture -Wno-invalid-offsetof -Wno-vla-extension -Wno-thread-safety-reference-return -Werror -DPROTOBUF_ALLOW_DEPRECATED=1 -Wno-undefined-bool-conversion -Wno-tautological-undefined-compare -std=c++20 -Wno-trigraphs -gsimple-template-names -fno-exceptions -fno-rtti -nostdinc++ -fvisibility-inlines-hidden -Wenum-compare-conditional -Wno-c++11-narrowing-const-reference -Wno-missing-template-arg-list-after-template-kw -c ~/GrGlAttachment_ii.cpp -o ~/GrGLAttachment_ii.o

(3) codeql database finalize -j=-1 /YOUR/ROOT/HERE/repro-bug1-db

At the conclusion of these steps there should be logs in build-tracer.log and logs/extractor/ indicating the failure.

In addition to (1) GrGlAttachment_ii.cpp.txt (the reproducer file), please find attached (2) the build-tracer.log and (3) the relevant extractor logfile (10d17.log) from running this on my own machine, which will hopefully be useful for debugging/triage.

I do have the logs for the entire Chromium build available upon request, but as you might imagine, those files are very large and may not be as useful to you as this standalone reproducer.

A fix for this bug (or, guidance on how we might be holding it wrong!) would be extremely helpful for us here in Chromium. Please let me know if you need any more information. Thank you!

10d17.log
build-tracer.log
GrGlAttachment_ii.cpp.txt

@flowerhack flowerhack added the question Further information is requested label Jun 27, 2024
@jketema
Contributor

jketema commented Jul 4, 2024

Hi,

Thanks for the report.

Warning[extractor-c++]: In index_expr_node: Unknown expr kind 30.

These are just warnings that can be safely ignored. They are effectively a symptom of us not extracting concepts. We know we have to fix this, but as concepts are not really relevant for detecting security issues, this has not taken priority. If the extractor exits with exit code 1 here, it is due to one of the parse errors from earlier in the log:

"../../third_party/libc++/src/include/__atomic/atomic_ref.h", line 108: error: expression must have a constant value
        __atomic_always_lock_free(sizeof(_Tp), reinterpret_cast<void*>(-required_alignment));
                                                                       ^

[E 02:01:07 95483] Warning[extractor-c++]: In construct_text_message: "../../third_party/libc++/src/include/__atomic/atomic_ref.h", line 108: error: expression must have a constant value
        __atomic_always_lock_free(sizeof(_Tp), reinterpret_cast<void*>(-required_alignment));
                                                                       ^


"../../third_party/libc++/src/include/__type_traits/is_trivially_relocatable.h", line 29: error: type name is not allowed
  struct __libcpp_is_trivially_relocatable : integral_constant<bool, __is_trivially_relocatable(_Tp)> {};
...

Assuming you have or will report those parse errors separately, I'd like to close this issue if that's ok with you.
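
To separate such parse errors from the ignorable warnings, the log can be filtered for diagnostics of the `"<path>", line <N>: error: <message>` form quoted above. This is a minimal Python sketch under the assumption that all parse errors follow that form (it is inferred from the excerpts, not from a documented log format); `find_parse_errors` is a hypothetical helper name.

```python
import re

# Matches the front-end diagnostics quoted above, e.g.
#   "path/to/header.h", line 108: error: expression must have a constant value
# The pattern is inferred from those excerpts only.
PARSE_ERROR = re.compile(
    r'"(?P<path>[^"]+)", line (?P<line>\d+): error: (?P<text>.+)'
)


def find_parse_errors(lines):
    """Return unique (path, line number, message) tuples in first-seen order."""
    seen = {}  # dict keys keep insertion order and drop duplicates
    for line in lines:
        match = PARSE_ERROR.search(line)
        if match:
            key = (match["path"], int(match["line"]), match["text"].strip())
            seen.setdefault(key, None)
    return list(seen)
```

De-duplication matters because, as the excerpt shows, the same diagnostic can appear both bare and wrapped in a `construct_text_message` warning.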

@jketema jketema added the C++ label Jul 5, 2024
@flowerhack
Author

Oh, that's good to know, thank you. Please feel free to close this issue.

(Also, for my edification: does "Unknown expr kind 30" refer to concepts specifically? Or do all log messages of the form "Unknown expr kind $SOME_NUMBER" correspond to concepts? I've seen a variety of errors of this sort, as well as "Unknown routine kind $NUMBER" and "Unexpected dynamic init kind $NUMBER", and it'd be nice to know whether those are all the same thing, or if they're likely different things that may be worth reporting.)

@flowerhack
Author

flowerhack commented Jul 9, 2024

In particular, the specific "unknown/unexpected" things I've been seeing are:

  • Unexpected template kind 9.
  • Unknown routine kind 7.
  • Unexpected dynamic init kind 7.
  • Unknown routine kind 6.
  • Unknown routine kind 4.
  • Unexpected dynamic init kind 1.
  • Unexpected dynamic init kind 2.
  • Unexpected dynamic init kind 3.
  • Unexpected dynamic init kind 6.
  • Unknown expr kind 31.
  • Unknown expr kind 34.
  • Unknown kind 5.
  • Unrecognized builtin operation kind 60.
  • Unrecognized builtin operation kind 98.
  • Unrecognized builtin operation kind 102.

If any of these are already known to be ignorable, or known to be safe bugs, do let me know. If these are new to you I will plan to file bugs for them. Thanks!

@jketema
Contributor

jketema commented Jul 9, 2024

They're due to a variety of reasons.

These are all related to concepts:

  • Unexpected template kind 9.
  • Unknown expr kind 31.
  • Unknown expr kind 34.

These are related to compiler-generated initialisations, most (if not all) in compiler-generated constructors, which we don't do anything with in queries, so they are not problematic:

  • Unexpected dynamic init kind 1.
  • Unexpected dynamic init kind 2.
  • Unexpected dynamic init kind 3.
  • Unexpected dynamic init kind 6.
  • Unexpected dynamic init kind 7.

These are a couple of cases where we don't identify the routine type correctly (the routine will still end up in the database). I believe that some of these should be fixed with the latest CodeQL version.

  • Unknown routine kind 7.
  • Unknown routine kind 6.
  • Unknown routine kind 4.

This one is related to some synthetic attribute that our frontend generates, and we should silence it:

  • Unknown kind 5.

For the following, could you open a new issue (a single one that mentions all three suffices)? Just providing log output in which they occur should be enough (I know how to reproduce them).

  • Unrecognized builtin operation kind 60.
  • Unrecognized builtin operation kind 98.
  • Unrecognized builtin operation kind 102.
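
The triage above can be captured as a small lookup table for filtering future logs. This is a hedged Python sketch: the bucket names and the `triage` helper are hypothetical shorthand, "Unknown expr kind 30" is filed under concepts based on the earlier comment, and the table covers only the messages discussed in this thread.

```python
# Buckets mirror the groupings in the comment above; names are illustrative.
TRIAGE = {
    "concepts": [
        "Unknown expr kind 30",
        "Unexpected template kind 9",
        "Unknown expr kind 31",
        "Unknown expr kind 34",
    ],
    "compiler-generated init": [
        "Unexpected dynamic init kind 1",
        "Unexpected dynamic init kind 2",
        "Unexpected dynamic init kind 3",
        "Unexpected dynamic init kind 6",
        "Unexpected dynamic init kind 7",
    ],
    "routine kind": [
        "Unknown routine kind 7",
        "Unknown routine kind 6",
        "Unknown routine kind 4",
    ],
    "synthetic attribute": ["Unknown kind 5"],
    "builtin operation": [
        "Unrecognized builtin operation kind 60",
        "Unrecognized builtin operation kind 98",
        "Unrecognized builtin operation kind 102",
    ],
}

# Invert the table so each message maps directly to its bucket.
BUCKET_BY_MESSAGE = {
    message: bucket for bucket, messages in TRIAGE.items() for message in messages
}


def triage(message):
    """Return the bucket for a warning message, or 'unclassified'."""
    return BUCKET_BY_MESSAGE.get(message.strip().rstrip("."), "unclassified")
```

Anything that comes back as "unclassified" is a message this thread did not cover and may be worth reporting separately.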

@jketema
Contributor

jketema commented Jul 9, 2024

Closing this as discussed above.

@jketema jketema closed this as completed Jul 9, 2024
@jketema
Contributor

jketema commented Jul 11, 2024

For the following, could you open a new issue (a single one that mentions all three suffices)? Just providing log output in which they occur should be enough (I know how to reproduce them).

  • Unrecognized builtin operation kind 60.
  • Unrecognized builtin operation kind 98.
  • Unrecognized builtin operation kind 102.

These 3 will be fixed in CodeQL 2.18.1. The public facing part of this is #16951.
