[Clang] Implement labelled type filtering for overflow/truncation sanitizers w/ SSCLs #107332

JustinStitt · 2024-09-04T23:56:10Z

Summary

Implement type-based filtering via Sanitizer Special Case Lists for the arithmetic overflow and truncation sanitizers.

Currently, using the type: prefix with these sanitizers does nothing. I've hooked up the SSCL parsing with Clang codegen so that we don't emit the overflow/truncation checks if the arithmetic contains an ignored type.

Usefulness

You can craft ignorelists that ignore specific types that are expected to overflow or wrap-around. For example, to ignore my_type from unsigned-integer-overflow instrumentation:

$ cat ignorelist.txt
[unsigned-integer-overflow]
type:my_type=no_sanitize

$ cat foo.c
typedef unsigned long my_type;
void foo() {
  my_type a = ULONG_MAX;
  ++a;
}

$ clang foo.c -fsanitize=unsigned-integer-overflow -fsanitize-ignorelist=ignorelist.txt ; ./a.out
// --> no sanitizer error

If a type is functionally intended to overflow, like refcount_t and its associated APIs in the Linux kernel, then this type filtering would prove useful for reducing sanitizer noise. Currently, the Linux kernel dealt with this by littering __attribute__((no_sanitize("signed-integer-overflow"))) annotations on all the refcount_t APIs. I think this serves as an example of how a codebase could be made cleaner. We could make custom types that are filtered out in an ignorelist, allowing for types to be more expressive -- without the need for annotations. This accomplishes a similar goal to #86618.

Yet another use case for this type filtering is whitelisting. We could ignore all types, save a few.

$ cat ignorelist.txt
[implicit-signed-integer-truncation]
type:*=no_sanitize # ignore literally all types
type:short=sanitize # except `short`

$ cat bar.c
// compile with -fsanitize=implicit-signed-integer-truncation
void bar(int toobig) {
  char a = toobig;  // not instrumented
  short b = toobig; // instrumented
}

Other ways to accomplish the goal of sanitizer allowlisting/whitelisting

ignore list SSCL type support (this PR that you're reading)
my sanitize-allowlist branch - this just implements a sibling flag -fsanitize-allowlist=, removing some of the double negative logic present with skip/ignore when trying to whitelist something.
Glob Negation - Implement a negation operator to the GlobPattern class so the ignorelist query can use them to simulate allowlisting

Please let me know which of the three options we like best. They are not necessarily mutually exclusive.

Here's another related PR which implements a wraps attribute. This can accomplish a similar goal to this PR but requires in-source changes to codebases and also covers a wider variety of integer definedness problems.

CCs

@kees @vitalybuka @bwendling

llvmbot · 2024-09-04T23:56:39Z

@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-clang

Author: Justin Stitt (JustinStitt)

Changes

Related RFC

Summary

Implement type-based filtering via Sanitizer Special Case Lists for the arithmetic overflow and truncation sanitizers.

Currently, using the type: prefix with these sanitizers does nothing. I've hooked up the SSCL parsing with Clang codegen so that we don't emit the overflow/truncation checks if the arithmetic contains an ignored type.

Usefulness

You can craft ignorelists that ignore specific types that are expected to overflow or wrap-around. For example, to ignore my_type from unsigned-integer-overflow instrumentation:

$ cat ignorelist.txt
[unsigned-integer-overflow]
type:my_type

$ cat foo.c
typedef int my_type;
void foo() {
  my_type a = 2147483647;
  ++a;
}

$ clang foo.c -fsanitize=signed-integer-overflow -fsanitize-ignorelist=ignorelist.txt ; ./a.out
// --&gt; no sanitizer error

If a type is functionally intended to overflow, like refcount_t and its associated APIs in the Linux kernel, then this type filtering would prove useful for reducing sanitizer noise. Currently, the Linux kernel dealt with this by littering __attribute__((no_sanitize("signed-integer-overflow"))) annotations on all the refcount_t APIs. I think this serves as an example of how a codebase could be made cleaner. We could make custom types that are filtered out in an ignorelist, allowing for types to be more expressive -- without the need for annotations. This accomplishes a similar goal to #86618.

Yet another use case for this type filtering is whitelisting. We could ignore all types, save a few.

$ cat ignorelist.txt
[implicit-signed-integer-truncation]
type:* # ignore literally all types
type:short=skip # except `short`

$ cat bar.c
// compile with -fsanitize=implicit-signed-integer-truncation
void bar(int toobig) {
  char a = toobig;  // not instrumented
  short b = toobig; // instrumented
}

Other ways to accomplish the goal of sanitizer allowlisting/whitelisting

ignore list SSCL type support (this PR that you're reading)
my sanitize-allowlist branch - this just implements a sibling flag -fsanitize-allowlist=, removing some of the double negative logic present with skip/ignore when trying to whitelist something.
Glob Negation - Implement a negation operator to the GlobPattern class so the ignorelist query can use them to simulate allowlisting

Please let me know which of the three options we like best. They are not necessarily mutually exclusive.

Here's another related PR which implements a wraps attribute. This can accomplish a similar goal to this PR but requires in-source changes to codebases and also covers a wider variety of integer definedness problems.

CCs

@kees @vitalybuka @bwendling

Full diff: https://github.com/llvm/llvm-project/pull/107332.diff

6 Files Affected:

(modified) clang/docs/ReleaseNotes.rst (+8)
(modified) clang/docs/SanitizerSpecialCaseList.rst (+60-2)
(modified) clang/include/clang/AST/ASTContext.h (+3)
(modified) clang/lib/AST/ASTContext.cpp (+31)
(modified) clang/lib/CodeGen/CGExprScalar.cpp (+32-4)
(modified) clang/test/CodeGen/ubsan-type-ignorelist.cpp (+76-8)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 0eee71d00a2c5f..0f036e90e06c5e 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -540,6 +540,14 @@ Sanitizers
   This new flag should allow those projects to enable integer sanitizers with
   less noise.
 
+- Arithmetic overflow sanitizers ``-fsanitize=signed-integer-overflow`` and
+  ``-fsanitize=unsigned-integer-overflow`` as well as the implicit integer
+  truncation sanitizers ``-fsanitize=implicit-signed-integer-truncation`` and
+  ``-fsanitize=implicit-unsigned-integer-truncation`` now properly support the
+  "type" prefix within `Sanitizer Special Case Lists (SSCL)
+  <https://clang.llvm.org/docs/SanitizerSpecialCaseList.html>`_. See that link
+  for examples.
+
 Python Binding Changes
 ----------------------
 - Fixed an issue that led to crashes when calling ``Type.get_exception_specification_kind``.
diff --git a/clang/docs/SanitizerSpecialCaseList.rst b/clang/docs/SanitizerSpecialCaseList.rst
index c7fb0fa3f8a828..13f71b878d4cc8 100644
--- a/clang/docs/SanitizerSpecialCaseList.rst
+++ b/clang/docs/SanitizerSpecialCaseList.rst
@@ -15,8 +15,9 @@ file at compile-time.
 Goal and usage
 ==============
 
-Users of sanitizer tools, such as :doc:`AddressSanitizer`, :doc:`ThreadSanitizer`
-or :doc:`MemorySanitizer` may want to disable or alter some checks for
+Users of sanitizer tools, such as :doc:`AddressSanitizer`,
+:doc:`ThreadSanitizer`, :doc:`MemorySanitizer` or :doc:
+`UndefinedBehaviorSanitizer` may want to disable or alter some checks for
 certain source-level entities to:
 
 * speedup hot function, which is known to be correct;
@@ -48,6 +49,63 @@ Example
   $ clang -fsanitize=address -fsanitize-ignorelist=ignorelist.txt foo.c ; ./a.out
   # No error report here.
 
+Usage with UndefinedBehaviorSanitizer
+=====================================
+
+The arithmetic overflow sanitizers ``unsigned-integer-overflow`` and
+``signed-integer-overflow`` as well as the implicit integer truncation
+sanitizers ``implicit-signed-integer-truncation`` and
+``implicit-unsigned-integer-truncation`` support the ability to adjust
+instrumentation based on type.
+
+.. code-block:: bash
+
+  $ cat foo.c
+  void foo() {
+    int a = 2147483647; // INT_MAX
+    ++a;                // Normally, an overflow with -fsanitize=signed-integer-overflow
+  }
+  $ cat ignorelist.txt
+  [signed-integer-overflow]
+  type:int
+  $ clang -fsanitize=signed-integer-overflow -fsanitize-ignorelist=ignorelist.txt foo.c ; ./a.out
+  # no signed-integer-overflow error
+
+Supplying ``ignorelist.txt`` with ``-fsanitize-ignorelist=ignorelist.txt``
+disables overflow sanitizer instrumentation for arithmetic operations
+containing values of type ``int``, for example. Custom types may be used.
+
+The following SCL categories are supported: ``=allow``, ``=skip`` and
+``=forbid``. The ``allow`` category is the default for any entry and specifies
+that the query, if matched, will have its sanitizer instrumentation ignored.
+Conversely, both ``skip`` and ``forbid`` cause their queries, if matched, to be
+left out of the ignorelist -- essentially ensuring sanitizer instrumentation
+remains for those types. This is useful for whitelisting specific types.
+
+With this, one may disable instrumentation for all types and specifically allow
+instrumentation for one or many types.
+
+.. code-block:: bash
+
+  $ cat ignorelist.txt
+  [implicit-signed-integer-truncation]
+  type:*=allow
+  type:T=skip
+  $ cat foo.c
+  typedef char T;
+  typedef char U;
+  void foo(int toobig) {
+    T a = toobig;    // instrumented
+    U b = toobig;    // not instrumented
+    char c = toobig; // also not instrumented
+  }
+
+Note that ``skip`` and ``forbid`` operate exactly the same in this context and
+both options exist simply for conformity with the `-fprofile-list
+<https://clang.llvm.org/docs/UsersManual.html#instrumenting-only-selected-files-or-functions>`_
+syntax for adjusting profile instrumentation. You do not need to specify any
+`default:<type>` for ``-fsanitize-ignorelist`` SSCLs, though.
+
 Format
 ======
 
diff --git a/clang/include/clang/AST/ASTContext.h b/clang/include/clang/AST/ASTContext.h
index 89bb5768dbd40d..a9373c6102c010 100644
--- a/clang/include/clang/AST/ASTContext.h
+++ b/clang/include/clang/AST/ASTContext.h
@@ -807,6 +807,9 @@ class ASTContext : public RefCountedBase<ASTContext> {
 
   const NoSanitizeList &getNoSanitizeList() const { return *NoSanitizeL; }
 
+  bool isTypeIgnoredBySanitizer(const SanitizerMask &Mask,
+                                const QualType &Ty) const;
+
   const XRayFunctionFilter &getXRayFilter() const {
     return *XRayFilter;
   }
diff --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp
index c61234aa4d1af1..bcda5fad585d67 100644
--- a/clang/lib/AST/ASTContext.cpp
+++ b/clang/lib/AST/ASTContext.cpp
@@ -817,6 +817,37 @@ ASTContext::getCanonicalTemplateTemplateParmDecl(
   return CanonTTP;
 }
 
+/// Check if a type can have its sanitizer instrumentation elided.
+/// Determine this by its presence in a SCL alongside its specified categories.
+/// For example:
+/// ignorelist.txt>
+/// [{unsigned-integer-overflow,signed-integer-overflow}]
+/// type:*
+/// type:size_t=skip
+/// <ignorelist.txt
+/// Supplying the above ignorelist.txt will disable overflow sanitizer
+/// instrumentation for all types except "size_t".
+bool ASTContext::isTypeIgnoredBySanitizer(const SanitizerMask &Mask,
+                                          const QualType &Ty) const {
+  // One may specifically allow a type "type:foo=allow"
+  bool isAllowedBySCL =
+      NoSanitizeL->containsType(Mask, Ty.getAsString(), "allow");
+
+  // There could also be no category present "type:foo", which is the same as
+  // "allow"
+  isAllowedBySCL |= NoSanitizeL->containsType(Mask, Ty.getAsString());
+
+  // Explicitly specifying "skip" is also possible "type:foo=skip"
+  bool isSkippedBySCL =
+      NoSanitizeL->containsType(Mask, Ty.getAsString(), "skip");
+
+  // Or "forbid", as there is currently no distinction between "skip" and
+  // "forbid" for the purposes of the overflow/truncation sanitizer ignorelist.
+  isSkippedBySCL |= NoSanitizeL->containsType(Mask, Ty.getAsString(), "forbid");
+
+  return isAllowedBySCL && !isSkippedBySCL;
+}
+
 TargetCXXABI::Kind ASTContext::getCXXABIKind() const {
   auto Kind = getTargetInfo().getCXXABI().getKind();
   return getLangOpts().CXXABI.value_or(Kind);
diff --git a/clang/lib/CodeGen/CGExprScalar.cpp b/clang/lib/CodeGen/CGExprScalar.cpp
index af11bc20a3b639..21e932a3639f73 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -196,6 +196,18 @@ static bool CanElideOverflowCheck(const ASTContext &Ctx, const BinOpInfo &Op) {
   if (!Op.mayHaveIntegerOverflow())
     return true;
 
+  if (Op.Ty->isSignedIntegerType() &&
+      Ctx.isTypeIgnoredBySanitizer(SanitizerKind::SignedIntegerOverflow,
+                                   Op.Ty)) {
+    return true;
+  }
+
+  if (Op.Ty->isUnsignedIntegerType() &&
+      Ctx.isTypeIgnoredBySanitizer(SanitizerKind::UnsignedIntegerOverflow,
+                                   Op.Ty)) {
+    return true;
+  }
+
   const UnaryOperator *UO = dyn_cast<UnaryOperator>(Op.E);
 
   if (UO && UO->getOpcode() == UO_Minus &&
@@ -1120,6 +1132,10 @@ void ScalarExprEmitter::EmitIntegerTruncationCheck(Value *Src, QualType SrcType,
   if (!CGF.SanOpts.has(Check.second.second))
     return;
 
+  // Does some SSCL ignore this type?
+  if (CGF.getContext().isTypeIgnoredBySanitizer(Check.second.second, DstType))
+    return;
+
   llvm::Constant *StaticArgs[] = {
       CGF.EmitCheckSourceLocation(Loc), CGF.EmitCheckTypeDescriptor(SrcType),
       CGF.EmitCheckTypeDescriptor(DstType),
@@ -1230,6 +1246,15 @@ void ScalarExprEmitter::EmitIntegerSignChangeCheck(Value *Src, QualType SrcType,
     // Because here sign change check is interchangeable with truncation check.
     return;
   }
+  // Does an SSCL have an entry for the DstType under its respective sanitizer
+  // section?
+  if (DstSigned && CGF.getContext().isTypeIgnoredBySanitizer(
+                       SanitizerKind::ImplicitSignedIntegerTruncation, DstType))
+    return;
+  if (!DstSigned &&
+      CGF.getContext().isTypeIgnoredBySanitizer(
+          SanitizerKind::ImplicitUnsignedIntegerTruncation, DstType))
+    return;
   // That's it. We can't rule out any more cases with the data we have.
 
   CodeGenFunction::SanitizerScope SanScope(&CGF);
@@ -2770,10 +2795,11 @@ llvm::Value *ScalarExprEmitter::EmitIncDecConsiderOverflowBehavior(
       return Builder.CreateNSWAdd(InVal, Amount, Name);
     [[fallthrough]];
   case LangOptions::SOB_Trapping:
-    if (!E->canOverflow())
+    BinOpInfo Info = createBinOpInfoFromIncDec(
+        E, InVal, IsInc, E->getFPFeaturesInEffect(CGF.getLangOpts()));
+    if (!E->canOverflow() || CanElideOverflowCheck(CGF.getContext(), Info))
       return Builder.CreateNSWAdd(InVal, Amount, Name);
-    return EmitOverflowCheckedBinOp(createBinOpInfoFromIncDec(
-        E, InVal, IsInc, E->getFPFeaturesInEffect(CGF.getLangOpts())));
+    return EmitOverflowCheckedBinOp(Info);
   }
   llvm_unreachable("Unknown SignedOverflowBehaviorTy");
 }
@@ -2973,7 +2999,9 @@ ScalarExprEmitter::EmitScalarPrePostIncDec(const UnaryOperator *E, LValue LV,
       value = EmitIncDecConsiderOverflowBehavior(E, value, isInc);
     } else if (E->canOverflow() && type->isUnsignedIntegerType() &&
                CGF.SanOpts.has(SanitizerKind::UnsignedIntegerOverflow) &&
-               !excludeOverflowPattern) {
+               !excludeOverflowPattern &&
+               !CGF.getContext().isTypeIgnoredBySanitizer(
+                   SanitizerKind::UnsignedIntegerOverflow, E->getType())) {
       value = EmitOverflowCheckedBinOp(createBinOpInfoFromIncDec(
           E, value, isInc, E->getFPFeaturesInEffect(CGF.getLangOpts())));
     } else {
diff --git a/clang/test/CodeGen/ubsan-type-ignorelist.cpp b/clang/test/CodeGen/ubsan-type-ignorelist.cpp
index 7bea84f5fabb93..e8b14d5c415222 100644
--- a/clang/test/CodeGen/ubsan-type-ignorelist.cpp
+++ b/clang/test/CodeGen/ubsan-type-ignorelist.cpp
@@ -1,7 +1,28 @@
+// Verify ubsan doesn't emit checks for ignorelisted types
+// RUN: echo "[{unsigned-integer-overflow,signed-integer-overflow}]" > %t-int.ignorelist
+// RUN: echo "type:int" >> %t-int.ignorelist
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow,unsigned-integer-overflow -fsanitize-ignorelist=%t-int.ignorelist -emit-llvm %s -o - | FileCheck %s --check-prefix=INT
+
+// RUN: echo "type:int" > %t-nosection.ignorelist
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow,unsigned-integer-overflow -fsanitize-ignorelist=%t-nosection.ignorelist -emit-llvm %s -o - | FileCheck %s --check-prefix=INT
+
+// RUN: echo "type:int=allow" > %t-allow-same-as-no-category.ignorelist
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow,unsigned-integer-overflow -fsanitize-ignorelist=%t-allow-same-as-no-category.ignorelist -emit-llvm %s -o - | FileCheck %s --check-prefix=INT
+
+// RUN: echo "[{unsigned-integer-overflow,signed-integer-overflow}]" > %t-myty.ignorelist
+// RUN: echo "type:*" >> %t-myty.ignorelist
+// RUN: echo "type:myty=skip" >> %t-myty.ignorelist
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow,unsigned-integer-overflow -fsanitize-ignorelist=%t-myty.ignorelist -emit-llvm %s -o - | FileCheck %s --check-prefix=MYTY
+
+// RUN: echo "[{implicit-signed-integer-truncation,implicit-unsigned-integer-truncation}]" > %t-trunc.ignorelist
+// RUN: echo "type:char" >> %t-trunc.ignorelist
+// RUN: echo "type:unsigned char" >> %t-trunc.ignorelist
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=implicit-signed-integer-truncation,implicit-unsigned-integer-truncation -fsanitize-ignorelist=%t-trunc.ignorelist -emit-llvm %s -o - | FileCheck %s --check-prefix=TRUNC
+
 // Verify ubsan vptr does not check down-casts on ignorelisted types.
 // RUN: echo "type:_ZTI3Foo" > %t-type.ignorelist
-// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=vptr -fsanitize-recover=vptr -emit-llvm %s -o - | FileCheck %s --check-prefix=DEFAULT
-// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=vptr -fsanitize-recover=vptr -fsanitize-ignorelist=%t-type.ignorelist -emit-llvm %s -o - | FileCheck %s --check-prefix=TYPE
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=vptr -fsanitize-recover=vptr -emit-llvm %s -o - | FileCheck %s --check-prefix=VPTR
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=vptr -fsanitize-recover=vptr -fsanitize-ignorelist=%t-type.ignorelist -emit-llvm %s -o - | FileCheck %s --check-prefix=VPTR-TYPE
 
 class Bar {
 public:
@@ -11,13 +32,60 @@ class Foo : public Bar {};
 
 Bar bar;
 
-// DEFAULT: @_Z7checkmev
-// TYPE: @_Z7checkmev
+// VPTR: @_Z7checkmev
+// VPTR-TYPE: @_Z7checkmev
 void checkme() {
-// DEFAULT: call void @__ubsan_handle_dynamic_type_cache_miss({{.*}} (ptr @bar to
-// TYPE-NOT: @__ubsan_handle_dynamic_type_cache_miss
+// VPTR: call void @__ubsan_handle_dynamic_type_cache_miss({{.*}} (ptr @bar to
+// VPTR-TYPE-NOT: @__ubsan_handle_dynamic_type_cache_miss
   Foo* foo = static_cast<Foo*>(&bar); // down-casting
-// DEFAULT: ret void
-// TYPE: ret void
+// VPTR: ret void
+// VPTR-TYPE: ret void
   return;
 }
+
+// INT-LABEL: ignore_int
+void ignore_int(int A, int B, unsigned C, unsigned D, long E) {
+// INT: llvm.uadd.with.overflow.i32
+  (void)(C+D);
+// INT-NOT: llvm.sadd.with.overflow.i32
+  (void)(A+B);
+// INT: llvm.sadd.with.overflow.i64
+  (void)(++E);
+}
+
+
+typedef unsigned long myty;
+typedef myty derivative;
+// INT-LABEL: ignore_all_except_myty
+// MYTY-LABEL: ignore_all_except_myty
+void ignore_all_except_myty(myty A, myty B, int C, unsigned D, derivative E) {
+// MYTY-NOT: llvm.sadd.with.overflow.i32
+  (void)(++C);
+
+// MYTY-NOT: llvm.uadd.with.overflow.i32
+  (void)(D+D);
+
+// MYTY-NOT: llvm.umul.with.overflow.i64
+  (void)(E*2);
+
+// MYTY: llvm.uadd.with.overflow.i64
+  (void)(A+B);
+}
+
+// INT-LABEL: truncation
+// MYTY-LABEL: truncation
+// TRUNC-LABEL: truncation
+void truncation(char A, int B, unsigned char C, short D) {
+// TRUNC-NOT: %handler.implicit_conversion
+  A = B;
+// TRUNC-NOT: %handler.implicit_conversion
+  A = C;
+// TRUNC-NOT: %handler.implicit_conversion
+  C = B;
+
+// TRUNC: %handler.implicit_conversion
+  D = B;
+
+  (void)A;
+  (void)D;
+}

clang/docs/SanitizerSpecialCaseList.rst

clang/lib/AST/ASTContext.cpp

clang/test/CodeGen/ubsan-type-ignorelist.cpp

Signed-off-by: Justin Stitt <[email protected]>

melver · 2024-10-30T21:15:44Z

We could make custom types that are filtered out in an ignorelist, allowing for types to be more expressive -- without the need for annotations

Having this sort of semantic information detached from the code may cause some maintenance headaches. For example, if the type is renamed, developers have to remember to adjust the filter list as well.

I wished that we could just attach attributes to type, e.g. typedef int __attribute__((no_sanitize("signed-integer-overflow")) wrapping_int or something. One thing here is that this should not be a type modifier (like volatile and such), so it does not change the type, but that means nothing enforces it's propagated through conversions. But I suppose that the filter list has the same problem as you depend on specific type names.

Just something to consider - I suspect it's much more complex to implement in Clang though. So that's a +1 for the filterlist in terms of complexity.

clang/docs/SanitizerSpecialCaseList.rst

JustinStitt · 2024-10-30T21:53:25Z

I wished that we could just attach attributes to type, e.g. typedef int __attribute__((no_sanitize("signed-integer-overflow")) wrapping_int or something. One thing here is that this should not be a type modifier (like volatile and such), so it does not change the type, but that means nothing enforces it's propagated through conversions. But I suppose that the filter list has the same problem as you depend on specific type names.

I have the implementation done for this and I should get the PR up soon.

Then we'll have something like this:

typedef int __attribute__((wraps)) wrapping_int

I'll be sure to CC you 😄

melver · 2024-10-30T21:58:19Z

I wished that we could just attach attributes to type, e.g. typedef int __attribute__((no_sanitize("signed-integer-overflow")) wrapping_int or something. One thing here is that this should not be a type modifier (like volatile and such), so it does not change the type, but that means nothing enforces it's propagated through conversions. But I suppose that the filter list has the same problem as you depend on specific type names.

I have the implementation done for this and I should get the PR up soon.

Then we'll have something like this:
typedef int __attribute__((wraps)) wrapping_int
I'll be sure to CC you 😄

Oh nice. If you have that almost done, why do you still want the filterlist? For the Linux kernel usecase, wouldn't the attribute be preferrable? My intuition tells me that maintaining a filterlist detached from the code will cause headaches, esp. during refactors or code moves (and you don't want to be the one chasing after the breakages).

vitalybuka · 2024-10-30T21:58:22Z

@vitalybuka Can we move forward with this? The most recent commit uses the =sanitize =no_sanitize wording that you prefer.

yes, we can
you should press "re-request review" after update, or your PR does not show up in "review-requested:@me" searches.

clang/lib/CodeGen/CGExprScalar.cpp

vitalybuka · 2024-10-30T22:07:24Z

Is "Test documentation build" error related?

clang/test/CodeGen/ubsan-type-ignorelist.test

clang/docs/SanitizerSpecialCaseList.rst

clang/test/CodeGen/ubsan-type-ignorelist.cpp

vitalybuka

Please don't forget to "re-request" after update, fell free to ping me in chat as well.

vitalybuka

Actually LGTM,
Request for doc and test is quite straightforward. No need to block on me.

clang/lib/AST/ASTContext.cpp

Signed-off-by: Justin Stitt <[email protected]>

clang/lib/AST/ASTContext.cpp

Signed-off-by: Justin Stitt <[email protected]>

JustinStitt · 2024-10-31T05:04:38Z

@melver

Oh nice. If you have that almost done, why do you still want the filterlist? For the Linux kernel usecase, wouldn't the attribute be preferrable? My intuition tells me that maintaining a filterlist detached from the code will cause headaches, esp. during refactors or code moves (and you don't want to be the one chasing after the breakages).

We need both the filterlist and attribute support because of the agreed upon ideology about how integer overflow is reasoned about in the Linux kernel.

Basically, the compromise that was reached involves marking all types as wrapping so that the sanitizers won't warn for just any arithmetic. Following this, specific types (like size_t) will be marked as no_wraps which means the sanitizers will do their thing. The best way to "mark all types" is with filterlists; and the best way to mark one (or a few) types is with in-source attributes.

I know @kees feels strongly that this should be the other way around: Instrument everything by default (which would eliminate the need for filterlist integration entirely) and exclude things from instrumentation with wraps. Either way, my patch set implements both wraps and no_wraps and folks can use whichever they please. Using wraps/no_wraps will not force users to interact with or use ignorelists at all -- those attributes will simply integrate nicely with ignorelists if supplied.

Signed-off-by: Justin Stitt <[email protected]>

vitalybuka · 2024-11-01T21:49:04Z

I know @kees feels strongly that this should be the other way around: Instrument everything by default (which would eliminate the need for filterlist integration entirely) and exclude things from instrumentation with wraps.

My intuition is that @kees approach could be more efficient to get things done :)
Maintaining a list of types can be annoying.

JustinStitt · 2024-11-01T22:24:14Z

My intuition is that @kees approach could be more efficient to get things done :) Maintaining a list of types can be annoying.

We discussed this at Linux Plumbers '24 and also on lkml where an eventual lwn.net article was published summarizing the discussion.

I recommend reading that article for the most succinct overview of Linus' thoughts on the matter. From our talk at Plumbers, other maintainers agree with Linus: we should limit sanitizer instrumentation across all types and simply whitelist a few.

Thankfully, this PR and my other PR (WIP) leave the door open. You can ignore all types and then mark a few as no_wraps or you can sanitize all types (the default) and mark a few as wraps.

kees · 2024-11-01T22:39:49Z

Thankfully, this PR and my other PR (WIP) leave the door open. You can ignore all types and then mark a few as no_wraps or you can sanitize all types (the default) and mark a few as wraps.

Right, while I want to go full instrumentation, it's just not going to happen in the Linux kernel. We're going to need both options so that we can slowly overlap coverage until we've squeezed out all the unexpected wrap-around.

clang/docs/SanitizerSpecialCaseList.rst

vitalybuka · 2024-11-01T22:55:52Z

Right, while I want to go full instrumentation, it's just not going to happen in the Linux kernel. We're going to need both options so that we can slowly overlap coverage until we've squeezed out all the unexpected wrap-around.

That was just my 2c. Linux decision about is need needed to proceed here.
The patch LGTM.

I believe it could be better without no_sanitize. Please share if some one else have opinion on that?
Also let's ensure consistently between no_wrap/no_sanitize.

BTW, what is the plan for wrap? Is this going to be this list or another one?
Does Kernel instrument in recovery mode or always crash?
In recoverable mode sanitizer/no_sanitize is independent from wrap/no_wrap.
In non-recovery mode wrap is not needed for sanitized types

Signed-off-by: Justin Stitt <[email protected]>

vitalybuka · 2024-11-04T07:57:41Z

Thanks!

vitalybuka · 2024-11-04T08:24:41Z

@JustinStitt Can't assign you as reviewer, but please take a look #114754

llvm-ci · 2024-11-04T08:33:51Z

LLVM Buildbot has detected a new failure on builder lldb-remote-linux-ubuntu running on as-builder-9 while building clang at step 16 "test-check-lldb-api".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/195/builds/556

Here is the relevant piece of the build log for the reference

Step 16 (test-check-lldb-api) failure: Test just built components: check-lldb-api completed (failure)
...
PASS: lldb-api :: types/TestIntegerType.py (1199 of 1208)
PASS: lldb-api :: types/TestIntegerTypeExpr.py (1200 of 1208)
PASS: lldb-api :: types/TestRecursiveTypes.py (1201 of 1208)
PASS: lldb-api :: types/TestShortType.py (1202 of 1208)
PASS: lldb-api :: types/TestShortTypeExpr.py (1203 of 1208)
PASS: lldb-api :: types/TestLongTypes.py (1204 of 1208)
PASS: lldb-api :: types/TestLongTypesExpr.py (1205 of 1208)
PASS: lldb-api :: tools/lldb-server/TestNonStop.py (1206 of 1208)
PASS: lldb-api :: tools/lldb-server/TestLldbGdbServer.py (1207 of 1208)
UNRESOLVED: lldb-api :: tools/lldb-server/TestGdbRemoteProcessInfo.py (1208 of 1208)
******************** TEST 'lldb-api :: tools/lldb-server/TestGdbRemoteProcessInfo.py' FAILED ********************
Script:
--
/usr/bin/python3.12 /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/llvm-project/lldb/test/API/dotest.py -u CXXFLAGS -u CFLAGS --env LLVM_LIBS_DIR=/home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/./lib --env LLVM_INCLUDE_DIR=/home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/include --env LLVM_TOOLS_DIR=/home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/./bin --libcxx-include-dir /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/include/c++/v1 --libcxx-include-target-dir /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/include/aarch64-unknown-linux-gnu/c++/v1 --libcxx-library-dir /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/./lib/aarch64-unknown-linux-gnu --arch aarch64 --build-dir /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/lldb-test-build.noindex --lldb-module-cache-dir /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/lldb-test-build.noindex/module-cache-lldb/lldb-api --clang-module-cache-dir /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/lldb-test-build.noindex/module-cache-clang/lldb-api --executable /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/./bin/lldb --compiler /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/bin/clang --dsymutil /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/./bin/dsymutil --make /usr/bin/make --llvm-tools-dir /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/./bin --lldb-obj-root /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/tools/lldb --lldb-libs-dir /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/./lib --platform-url connect://jetson-agx-2198.lab.llvm.org:1234 --platform-working-dir /home/ubuntu/lldb-tests --sysroot /mnt/fs/jetson-agx-ubuntu --env ARCH_CFLAGS=-mcpu=cortex-a78 --platform-name remote-linux /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/llvm-project/lldb/test/API/tools/lldb-server -p TestGdbRemoteProcessInfo.py
--
Exit Code: 1

Command Output (stdout):
--
lldb version 20.0.0git (https://github.com/llvm/llvm-project.git revision 3dd1d888fb18b40859504e207e57abdc6c43d838)
  clang revision 3dd1d888fb18b40859504e207e57abdc6c43d838
  llvm revision 3dd1d888fb18b40859504e207e57abdc6c43d838
Setting up remote platform 'remote-linux'
Connecting to remote platform 'remote-linux' at 'connect://jetson-agx-2198.lab.llvm.org:1234'...
Connected.
Setting remote platform working directory to '/home/ubuntu/lldb-tests'...
Skipping the following test categories: ['dsym', 'gmodules', 'debugserver', 'objc', 'lldb-dap']
connect to debug monitor on port 12867 failed, attempt #1 of 20
connect to debug monitor on port 12657 failed, attempt #2 of 20
connect to debug monitor on port 15682 failed, attempt #3 of 20
connect to debug monitor on port 12047 failed, attempt #4 of 20
connect to debug monitor on port 13012 failed, attempt #5 of 20
connect to debug monitor on port 12726 failed, attempt #6 of 20
connect to debug monitor on port 13327 failed, attempt #7 of 20
connect to debug monitor on port 12066 failed, attempt #8 of 20
connect to debug monitor on port 12055 failed, attempt #9 of 20
connect to debug monitor on port 14880 failed, attempt #10 of 20
connect to debug monitor on port 14710 failed, attempt #11 of 20
connect to debug monitor on port 14229 failed, attempt #12 of 20
connect to debug monitor on port 12882 failed, attempt #13 of 20
connect to debug monitor on port 14594 failed, attempt #14 of 20
connect to debug monitor on port 13140 failed, attempt #15 of 20
connect to debug monitor on port 13353 failed, attempt #16 of 20
connect to debug monitor on port 14585 failed, attempt #17 of 20
connect to debug monitor on port 12057 failed, attempt #18 of 20
connect to debug monitor on port 13687 failed, attempt #19 of 20
connect to debug monitor on port 12328 failed, attempt #20 of 20

--

JustinStitt · 2024-11-04T20:33:33Z

@vitalybuka any idea how lldb-server is failing with UNRESOLVED? How in the world are any of the changes here affecting lldb-server?

Similar to #107332.

@kees

…itizers w/ SSCLs (llvm#107332) [Related RFC](https://discourse.llvm.org/t/rfc-support-globpattern-add-operator-to-invert-matches/80683/5?u=justinstitt) ### Summary Implement type-based filtering via [Sanitizer Special Case Lists](https://clang.llvm.org/docs/SanitizerSpecialCaseList.html) for the arithmetic overflow and truncation sanitizers. Currently, using the `type:` prefix with these sanitizers does nothing. I've hooked up the SSCL parsing with Clang codegen so that we don't emit the overflow/truncation checks if the arithmetic contains an ignored type. ### Usefulness You can craft ignorelists that ignore specific types that are expected to overflow or wrap-around. For example, to ignore `my_type` from `unsigned-integer-overflow` instrumentation: ```bash $ cat ignorelist.txt [unsigned-integer-overflow] type:my_type=no_sanitize $ cat foo.c typedef unsigned long my_type; void foo() { my_type a = ULONG_MAX; ++a; } $ clang foo.c -fsanitize=unsigned-integer-overflow -fsanitize-ignorelist=ignorelist.txt ; ./a.out // --> no sanitizer error ``` If a type is functionally intended to overflow, like [refcount_t](https://kernsec.org/wiki/index.php/Kernel_Protections/refcount_t) and its associated APIs in the Linux kernel, then this type filtering would prove useful for reducing sanitizer noise. Currently, the Linux kernel dealt with this by [littering](https://elixir.bootlin.com/linux/v6.10.8/source/include/linux/refcount.h#L139 ) `__attribute__((no_sanitize("signed-integer-overflow")))` annotations on all the `refcount_t` APIs. I think this serves as an example of how a codebase could be made cleaner. We could make custom types that are filtered out in an ignorelist, allowing for types to be more expressive -- without the need for annotations. This accomplishes a similar goal to llvm#86618. Yet another use case for this type filtering is whitelisting. We could ignore _all_ types, save a few. ```bash $ cat ignorelist.txt [implicit-signed-integer-truncation] type:*=no_sanitize # ignore literally all types type:short=sanitize # except `short` $ cat bar.c // compile with -fsanitize=implicit-signed-integer-truncation void bar(int toobig) { char a = toobig; // not instrumented short b = toobig; // instrumented } ``` ### Other ways to accomplish the goal of sanitizer allowlisting/whitelisting * ignore list SSCL type support (this PR that you're reading) * [my sanitize-allowlist branch](llvm/llvm-project@main...JustinStitt:llvm-project:sanitize-allowlist) - this just implements a sibling flag `-fsanitize-allowlist=`, removing some of the double negative logic present with `skip`/`ignore` when trying to whitelist something. * [Glob Negation](https://discourse.llvm.org/t/rfc-support-globpattern-add-operator-to-invert-matches/80683) - Implement a negation operator to the GlobPattern class so the ignorelist query can use them to simulate allowlisting Please let me know which of the three options we like best. They are not necessarily mutually exclusive. Here's [another related PR](llvm#86618) which implements a `wraps` attribute. This can accomplish a similar goal to this PR but requires in-source changes to codebases and also covers a wider variety of integer definedness problems. ### CCs @kees @vitalybuka @bwendling --------- Signed-off-by: Justin Stitt <[email protected]>

Similar to llvm#107332.

llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen labels Sep 4, 2024

vitalybuka requested review from vitalybuka and MaskRay September 5, 2024 22:17

JustinStitt changed the title ~~[Clang] Implement type filtering for overflow/truncation sanitizers w/ SSCLs~~ [Clang] Implement labelled type filtering for overflow/truncation sanitizers w/ SSCLs Sep 16, 2024

vitalybuka reviewed Sep 18, 2024

View reviewed changes

clang/docs/SanitizerSpecialCaseList.rst Outdated Show resolved Hide resolved

clang/lib/AST/ASTContext.cpp Outdated Show resolved Hide resolved

clang/test/CodeGen/ubsan-type-ignorelist.cpp Outdated Show resolved Hide resolved

JustinStitt added 4 commits September 25, 2024 11:47

hook up sscl categories with overflow/truncation sanitizers

548efc6

Signed-off-by: Justin Stitt <[email protected]>

add docs

6efbd7b

Signed-off-by: Justin Stitt <[email protected]>

fix formatting on docs

21be60a

Signed-off-by: Justin Stitt <[email protected]>

use split-file for tests

ce2a808

Signed-off-by: Justin Stitt <[email protected]>

JustinStitt force-pushed the scl-overflow-trunc branch from c272836 to ce2a808 Compare September 25, 2024 19:41

use no_sanitize and sanitize over allow and skip/forbid

7f92e75

Signed-off-by: Justin Stitt <[email protected]>

This comment was marked as outdated.

Sign in to view

kees requested a review from melver October 30, 2024 17:43

rephrase some documentation for readability's sake

f119326

Signed-off-by: Justin Stitt <[email protected]>

melver approved these changes Oct 30, 2024

View reviewed changes

clang/docs/SanitizerSpecialCaseList.rst Outdated Show resolved Hide resolved

bwendling reviewed Oct 30, 2024

View reviewed changes

clang/lib/CodeGen/CGExprScalar.cpp Show resolved Hide resolved

vitalybuka reviewed Oct 30, 2024

View reviewed changes

clang/test/CodeGen/ubsan-type-ignorelist.test Outdated Show resolved Hide resolved

vitalybuka reviewed Oct 30, 2024

View reviewed changes

clang/docs/SanitizerSpecialCaseList.rst Outdated Show resolved Hide resolved

vitalybuka reviewed Oct 30, 2024

View reviewed changes

clang/test/CodeGen/ubsan-type-ignorelist.cpp Outdated Show resolved Hide resolved

vitalybuka requested changes Oct 30, 2024

View reviewed changes

vitalybuka approved these changes Oct 30, 2024

View reviewed changes

vitalybuka reviewed Oct 31, 2024

View reviewed changes

clang/lib/AST/ASTContext.cpp Outdated Show resolved Hide resolved

fix improper link formatting

d670482

Signed-off-by: Justin Stitt <[email protected]>

vitalybuka reviewed Oct 31, 2024

View reviewed changes

clang/lib/AST/ASTContext.cpp Outdated Show resolved Hide resolved

JustinStitt added 4 commits October 30, 2024 21:21

readd removed test

8e47307

Signed-off-by: Justin Stitt <[email protected]>

add new test for category precedence, rename pre-existing test

33aa461

Signed-off-by: Justin Stitt <[email protected]>

add clarification about category reordering to docs

4eba983

Signed-off-by: Justin Stitt <[email protected]>

use better comparison logic

651bc7d

Signed-off-by: Justin Stitt <[email protected]>

use printing policy for type name

7156afb

Signed-off-by: Justin Stitt <[email protected]>

JustinStitt requested a review from vitalybuka October 31, 2024 05:05

This comment was marked as resolved.

Sign in to view

vitalybuka reviewed Nov 1, 2024

View reviewed changes

clang/docs/SanitizerSpecialCaseList.rst Outdated Show resolved Hide resolved

remove no_sanitize category

b4a9448

Signed-off-by: Justin Stitt <[email protected]>

vitalybuka merged commit 3dd1d88 into llvm:main Nov 4, 2024
9 checks passed

vitalybuka mentioned this pull request Nov 4, 2024

[ubsan] Suppression by type for -fsanitize=enum #114754

Merged

vitalybuka added a commit that referenced this pull request Nov 5, 2024

[ubsan] Suppression by type for -fsanitize=enum (#114754)

21d25d2

Similar to #107332.

JustinStitt mentioned this pull request Nov 5, 2024

[Clang] add wraps and no_wraps attributes #115094

Open

PhilippRados pushed a commit to PhilippRados/llvm-project that referenced this pull request Nov 6, 2024

[ubsan] Suppression by type for -fsanitize=enum (llvm#114754)

83b6abc

Similar to llvm#107332.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Clang] Implement labelled type filtering for overflow/truncation sanitizers w/ SSCLs #107332

[Clang] Implement labelled type filtering for overflow/truncation sanitizers w/ SSCLs #107332

JustinStitt commented Sep 4, 2024 •

edited

Loading

llvmbot commented Sep 4, 2024 •

edited

Loading

Summary

Usefulness

Other ways to accomplish the goal of sanitizer allowlisting/whitelisting

CCs

This comment was marked as outdated.

melver commented Oct 30, 2024

JustinStitt commented Oct 30, 2024

melver commented Oct 30, 2024

vitalybuka commented Oct 30, 2024 •

edited

Loading

vitalybuka commented Oct 30, 2024

vitalybuka left a comment

vitalybuka left a comment

JustinStitt commented Oct 31, 2024

This comment was marked as resolved.

This comment was marked as resolved.

vitalybuka commented Nov 1, 2024

JustinStitt commented Nov 1, 2024

kees commented Nov 1, 2024

vitalybuka commented Nov 1, 2024 •

edited

Loading

vitalybuka commented Nov 4, 2024

vitalybuka commented Nov 4, 2024

llvm-ci commented Nov 4, 2024

JustinStitt commented Nov 4, 2024

[Clang] Implement labelled type filtering for overflow/truncation sanitizers w/ SSCLs #107332

[Clang] Implement labelled type filtering for overflow/truncation sanitizers w/ SSCLs #107332

Conversation

JustinStitt commented Sep 4, 2024 • edited Loading

Summary

Usefulness

Other ways to accomplish the goal of sanitizer allowlisting/whitelisting

CCs

llvmbot commented Sep 4, 2024 • edited Loading

Summary

Usefulness

Other ways to accomplish the goal of sanitizer allowlisting/whitelisting

CCs

This comment was marked as outdated.

melver commented Oct 30, 2024

JustinStitt commented Oct 30, 2024

melver commented Oct 30, 2024

vitalybuka commented Oct 30, 2024 • edited Loading

vitalybuka commented Oct 30, 2024

vitalybuka left a comment

Choose a reason for hiding this comment

vitalybuka left a comment

Choose a reason for hiding this comment

JustinStitt commented Oct 31, 2024

This comment was marked as resolved.

This comment was marked as resolved.

vitalybuka commented Nov 1, 2024

JustinStitt commented Nov 1, 2024

kees commented Nov 1, 2024

vitalybuka commented Nov 1, 2024 • edited Loading

vitalybuka commented Nov 4, 2024

vitalybuka commented Nov 4, 2024

llvm-ci commented Nov 4, 2024

JustinStitt commented Nov 4, 2024

JustinStitt commented Sep 4, 2024 •

edited

Loading

llvmbot commented Sep 4, 2024 •

edited

Loading

vitalybuka commented Oct 30, 2024 •

edited

Loading

vitalybuka commented Nov 1, 2024 •

edited

Loading