71 changes: 71 additions & 0 deletions address-sanitizer.txt
@@ -0,0 +1,71 @@
This document describes how Address Sanitizer is integrated with ucblogo.

It has become best practice to use such tools to verify C/C++ programs, to help
ensure memory safety, security, stability, and rapid dev/test/debug cycles.
That or don't use C.

Address Sanitizer (ASAN) is a runtime library developed by Google and integrated
into GCC (and clang) which can detect various types of subtle C memory
management bugs.

ASAN instruments code by rewriting it and managing variable allocation itself.
In order to check for stack-related errors, it creates its own "fake" stack on
the heap and allocates most (but not all) stack variables on the fake stack.

This is transparent to pure ANSI C code that makes no platform-specific
assumptions. The C standard does not define a stack at all, so portable code
cannot depend on its layout.

Unfortunately every practical mark and sweep memory manager for C makes
platform-specific assumptions about the stack, as it must scan the stack for
reachable memory objects. Fortunately this is a reasonable assumption for modern
desktop systems, which invariably feature a contiguously addressed stack on
which automatic variables are allocated, and, with some hacks, we can get the
extents and iterate over it.

Unfortunately ASAN breaks this assumption by allocating stack frames on the heap
and putting pointers to them on the real stack, which adds a level of
indirection.

Of course Google would like to use ASAN to check Chrome, probably why they wrote
it, and Chrome uses a typical M&S GC with stack inspection. So they added an API
to ASAN to smooth over the differences with minor code changes. As the stack is
marked, an ASAN function checks each word for a pointer into a fake stack frame
and, if one is found, returns the extents of that fake frame so it can be marked
too.
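
The marking step described above can be sketched roughly as follows. This is
an illustration only, not the code in mem.c: scan_words, count_non_null,
visit_fn, SKETCH_ASAN and NO_ASAN are hypothetical names. The two __asan_*
calls are the ones declared in <sanitizer/asan_interface.h>. The scanner is
marked no_sanitize_address on the assumption that it will read redzones while
walking a fake frame.

```c
#include <assert.h>
#include <stddef.h>

/* Clang reports ASAN through __has_feature; GCC defines __SANITIZE_ADDRESS__. */
#if defined(__has_feature)
#  if __has_feature(address_sanitizer)
#    define SKETCH_ASAN 1
#  endif
#endif
#if defined(__SANITIZE_ADDRESS__) && !defined(SKETCH_ASAN)
#  define SKETCH_ASAN 1
#endif

#ifdef SKETCH_ASAN
#  include <sanitizer/asan_interface.h>
#  define NO_ASAN __attribute__((no_sanitize_address))
#else
#  define NO_ASAN
#endif

/* Hypothetical callback type: receives every candidate pointer found. */
typedef void (*visit_fn)(void *candidate, void *ctx);

/* Hypothetical visitor: counts the non-NULL candidate words. */
static void count_non_null(void *candidate, void *ctx)
{
    if (candidate != NULL)
        ++*(size_t *)ctx;
}

/* Visit every word in [lo, hi). Under ASAN, a word that points into a fake
 * stack frame also causes that frame's extents to be visited, one level deep.
 * Without ASAN the follow_fake flag has no effect. */
NO_ASAN static void scan_words(void **lo, void **hi, visit_fn visit,
                               void *ctx, int follow_fake)
{
    for (void **p = lo; p < hi; p++) {
        void *word = *p;
        visit(word, ctx);
#ifdef SKETCH_ASAN
        if (follow_fake && word != NULL) {
            void *beg = NULL, *end = NULL;
            void *fake = __asan_get_current_fake_stack();
            /* Non-NULL result: word lies inside a fake frame [beg, end). */
            if (fake && __asan_addr_is_in_fake_stack(fake, word, &beg, &end))
                scan_words((void **)beg, (void **)end, visit, ctx, 0);
        }
#else
        (void)follow_fake;
#endif
    }
}
```

A real collector would pass its own marking routine as the visitor and the
stack extents as lo and hi. The same fake frame may be visited more than once
here, which is harmless only if marking is idempotent.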

ASAN also provides an API for manually poisoning memory regions, like free NODE
objects.
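
A minimal sketch of manual poisoning on a free list, assuming a hypothetical
fixed pool of cells rather than the NODE segments ucblogo actually uses. The
ASAN_POISON_MEMORY_REGION and ASAN_UNPOISON_MEMORY_REGION macros come from
<sanitizer/asan_interface.h>; the no-op fallbacks and every other name
(struct cell, pool_init, pool_alloc, pool_free, SKETCH_ASAN) are made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Clang reports ASAN through __has_feature; GCC defines __SANITIZE_ADDRESS__. */
#if defined(__has_feature)
#  if __has_feature(address_sanitizer)
#    define SKETCH_ASAN 1
#  endif
#endif
#if defined(__SANITIZE_ADDRESS__) && !defined(SKETCH_ASAN)
#  define SKETCH_ASAN 1
#endif

#ifdef SKETCH_ASAN
#  include <sanitizer/asan_interface.h>
#else
/* Without ASAN the poisoning calls compile away to nothing. */
#  define ASAN_POISON_MEMORY_REGION(addr, size)   ((void)(addr), (void)(size))
#  define ASAN_UNPOISON_MEMORY_REGION(addr, size) ((void)(addr), (void)(size))
#endif

/* Hypothetical pooled object. */
struct cell {
    struct cell *next;
    long payload;
};

#define POOL_CELLS 4
static struct cell pool[POOL_CELLS];
static struct cell *free_list;

/* Put every cell on the free list and poison it. */
static void pool_init(void)
{
    ASAN_UNPOISON_MEMORY_REGION(pool, sizeof pool);
    free_list = NULL;
    for (size_t i = 0; i < POOL_CELLS; i++) {
        pool[i].next = free_list;
        free_list = &pool[i];
        ASAN_POISON_MEMORY_REGION(&pool[i], sizeof pool[i]);
    }
}

/* Unpoison a cell before touching it, then hand it out. */
static struct cell *pool_alloc(void)
{
    struct cell *c = free_list;
    if (c == NULL)
        return NULL;
    ASAN_UNPOISON_MEMORY_REGION(c, sizeof *c);
    free_list = c->next;
    c->next = NULL;
    return c;
}

/* Link the cell back in first, then poison it so any stale use is reported. */
static void pool_free(struct cell *c)
{
    c->next = free_list;
    free_list = c;
    ASAN_POISON_MEMORY_REGION(c, sizeof *c);
}
```

The ordering matters: the link field is written before the cell is poisoned
and read only after it is unpoisoned, so the allocator itself never trips the
check.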

In our project most of this happens in mem.c, except for getting the address of
the bottom of the stack which is in main.c as always.

ASAN adds overhead resulting in about a 2x slowdown so it's not suitable for
production builds. Better performance can be obtained by enabling compiler
optimizations.

To build ucblogo with ASAN, pass the sanitizer flags in CFLAGS and CXXFLAGS to
the configure script, e.g.

CFLAGS="-O2 -g3 -fsanitize=address -static-libasan -fno-omit-frame-pointer" CXXFLAGS="-O2 -g3 -fsanitize=address -static-libasan -fno-omit-frame-pointer" ./configure --prefix=$HOME --enable-objects

then make and test as normal.

To make GDB stop before exiting when ASAN hits a bug, set a breakpoint on the
ASAN Die function:

break __sanitizer::Die

You may also find the following GDB breakpoints useful

break err_logo
dprintf mem.c:872,"free %V %V\n",nd,*nd
dprintf mem.c:299,"newnode %V %V\n",newnd,*newnd
break mem.c:299
commands
bt
c
end

This will report where and when nodes are allocated and freed. It is especially
useful in conjunction with -DSERIALIZE_OBJECTS so unique objects can be tracked
through their complete lifecycle.
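
The idea behind serial numbers can be sketched as below. This is a simplified
illustration with hypothetical names (struct tracked, tracked_new, next_id),
not the NODE allocator itself: each allocation takes the next value of a
global counter, so a recycled address can still be told apart from the object
that previously lived there.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object carrying a serial number. */
struct tracked {
    unsigned long long id;  /* unique per allocation, never reused */
    int value;
};

/* Hypothetical global counter, bumped on every allocation. */
static unsigned long long next_id = 1;

/* Allocate an object and stamp it with the next serial number. */
static struct tracked *tracked_new(int value)
{
    struct tracked *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->id = next_id++;
    t->value = value;
    return t;
}
```

Printing the id alongside the address in the dprintf output is what lets a
single object be followed from allocation to free.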

More useful debugging tools are given in gdb.rc
52 changes: 52 additions & 0 deletions gdb.rc
@@ -0,0 +1,52 @@
# Dump a list of all nodes in all segments.
# Highly recommended to pipe output to a file
# pipe dumpnodes | cat > nodes.txt
define dumpnodes
set $seg=segment_list
while $seg
set $noden=0
while $noden < $seg->size
set $node = $seg->nodes + $noden
printf "node %V %V %V %V\n",$node->id, $node->node_type, $node, *$node
set $noden=$noden+1
end
set $seg=$seg->next
end
end

# Given an address, if it is a pointer to a fake stack variable, returns the
# pointer to the real stack frame.
# Otherwise returns NULL.
# Mostly useful for checking if a variable is on the ASAN fake stack or not.
define realstack
p (void*)__asan_addr_is_in_fake_stack( \
(void*)__asan_get_current_fake_stack(), \
$arg0, \
NULL, NULL \
)
end

set $NT_LIST=010000
set $NT_TREE=0100000
set $NT_AGGR=020000
set $NT_EMPTY=040000
set $NT_CASEOBJ=000001
Comment on lines +29 to +33
Owner


Hm, these might be redundant since this is now part of the NodeTypes enum.

Contributor Author


I cut and pasted directly from the enum. They are redundant, but because these symbols are not defined in every scope GDB might break in, if the program is even still running (eg after a segfault), I came up with this.

There could be a better way, IDK. Probably not worth more effort unless you mean to use them.


# Walk and print a NODE tree.
define pcons
if $arg0
printf "node %14.d %V %V %V\n",$arg0->id,$arg0->node_type, $arg0,*$arg0
if $arg0->node_type != -1 && !($arg0->node_type & $NT_EMPTY)
if $arg0->node_type & $NT_TREE
pcons $arg0->nunion->ncons->nobj
end
if $arg0->node_type & $NT_LIST || $arg0->node_type & $NT_AGGR
pcons $arg0->nunion->ncons->ncar
pcons $arg0->nunion->ncons->ncdr
end
if $arg0->node_type & $NT_CASEOBJ
pcons $arg0->nunion->ncons->ncar
end
end
end
end
5 changes: 4 additions & 1 deletion globals.h
@@ -30,6 +30,9 @@ extern int main(int, char *[]);
extern void unblock_input(void);
extern void delayed_int(void);
extern NODE *command_line;
#ifdef SERIALIZE_OBJECTS
extern unsigned long long int next_node_id;
#endif

#if defined(SIG_TAKES_ARG)
void logo_stop(int);
@@ -673,4 +676,4 @@ extern NODE *parent_list(NODE *);

extern void dbUsual(const char*);

#endif
#endif
7 changes: 5 additions & 2 deletions intern.c
@@ -55,11 +55,14 @@ FIXNUM hash(char *s, int len) {
}

NODE *make_case(NODE *casestrnd, NODE *obj) {
NODE *new_caseobj, *clistptr;
NODE *new_caseobj, *clistptr, *tmp;

tmp = cons(NIL, NIL);
clistptr = caselistptr__object(obj);
new_caseobj = make_caseobj(casestrnd, obj);
setcdr(clistptr, cons(new_caseobj, cdr(clistptr)));
setcar(tmp, new_caseobj);
setcdr(tmp, cdr(clistptr));
setcdr(clistptr, tmp);
return(new_caseobj);
}

3 changes: 1 addition & 2 deletions lists.c
@@ -421,11 +421,10 @@ NODE *larrayp(NODE *arg) {
}

NODE *memberp_help(NODE *args, BOOLEAN notp, BOOLEAN substr) {
NODE *obj1, *obj2, *val;
NODE *obj1, *obj2;
int leng;
int caseig = varTrue(Caseignoredp);

val = FalseName();
obj1 = car(args);
obj2 = cadr(args);
if (is_list(obj2)) {
14 changes: 14 additions & 0 deletions logo.h
@@ -26,6 +26,13 @@
#include "config.h"
#endif

// Address Sanitizer check
#if defined(__has_feature)
# if __has_feature(address_sanitizer) // for clang
# define __SANITIZE_ADDRESS__ // GCC already sets this
# endif
#endif

/* #define OBJECTS */

/* #define MEM_DEBUG */
@@ -237,9 +244,16 @@ struct string_block {
#define incstrrefcnt(sh) (((sh)->str_refcnt)++)
#define decstrrefcnt(sh) (--((sh)->str_refcnt))

// Assign a unique serial number to each object allocated so that it can be
// tracked through its complete lifecycle.
// #define SERIALIZE_OBJECTS

typedef struct logo_node NODE;
typedef struct logo_node {
NODETYPES node_type;
#ifdef SERIALIZE_OBJECTS
unsigned long long int id;
#endif
int my_gen; /* Nodes's Generation */ /*GC*/
int gen_age; /* How many times to GC at this generation */
long int mark_gc; /* when marked */
2 changes: 1 addition & 1 deletion logodata.c
@@ -68,7 +68,7 @@ int ecma_get(int ch) {
#endif

char *strnzcpy(char *s1, char *s2, int n) {
strncpy(s1, s2, n);
strncpy(s1, s2, n + 1);
s1[n] = '\0';
return(s1);
}
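
For comparison, a bounded copy that always terminates can also be written
without strncpy. This is a sketch under stated assumptions, not a proposed
replacement for strnzcpy: copy_n_terminated is a hypothetical name, and the
caller is assumed to provide a destination of at least n + 1 bytes. Unlike
strncpy it does not zero-pad the remainder of the buffer, and it never reads
past the first n characters of the source.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copy at most n characters of src into dst and always
 * write a terminating NUL. dst must have room for n + 1 bytes. */
static char *copy_n_terminated(char *dst, const char *src, size_t n)
{
    size_t len = 0;

    /* Measure the source, but never look beyond n characters. */
    while (len < n && src[len] != '\0')
        len++;

    memcpy(dst, src, len);
    dst[len] = '\0';
    return dst;
}
```

With a four-byte buffer and n set to 3, a longer source is cut to its first
three characters and a shorter one is copied whole.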
33 changes: 32 additions & 1 deletion main.c
@@ -30,6 +30,10 @@
#include "logo.h"
#include "globals.h"

#ifdef __SANITIZE_ADDRESS__
#include <sanitizer/asan_interface.h>
#endif

#ifdef HAVE_TERMIO_H
#ifdef HAVE_WX
#include <termios.h>
@@ -193,6 +197,31 @@ void delayed_int() {
#endif
}

void set_bottom_stack( NODE** bottom) {
#ifdef __SANITIZE_ADDRESS__
// ASAN does unholy things
void** real_ptr;
// void* fake_stack =
// Theoretically ASAN can be configured not to do stack checks, so check
// if we're using a fake stack right now.
// if (fake_stack) {
// If the stack variable is in the fake stack, real_ptr will contain
// the real stack address of the fake stack frame pointer.
// That's the address of the bottom of the real stack.
real_ptr = __asan_addr_is_in_fake_stack(
__asan_get_current_fake_stack(),
bottom,
NULL, NULL
);
// Otherwise the variable is on the real stack so treat it normally.
// }
// bottom_stack = fake_stack && real_ptr ? real_ptr : &bottom;
bottom_stack = real_ptr ? real_ptr : bottom;
#else
bottom_stack = bottom;
#endif
}

#ifdef HAVE_WX
extern char * wx_get_original_dir_name(void);
extern char * wx_get_current_dir_name(void);
@@ -204,11 +233,13 @@ int start (int argc,char ** argv) {
int main(int argc, char *argv[]) {
#endif
NODE *exec_list = NIL;

set_bottom_stack(&exec_list); /* GC */

NODE *cl_tail = NIL;
int argc2;
char **argv2;

bottom_stack = &exec_list; /*GC*/

#ifndef HAVE_WX
#ifdef x_window
10 changes: 6 additions & 4 deletions makehelp.c
@@ -8,10 +8,12 @@
#include <stdlib.h>
#include <string.h>

char line[100], line2[100], line3[100];
char name[30] = "helpfiles/";
char name2[30] = "helpfiles/";
char tocname[20], tocname2[20];
#define BUFSIZE (1024)

char line[BUFSIZE], line2[BUFSIZE], line3[BUFSIZE];
char name[BUFSIZE] = "helpfiles/";
char name2[BUFSIZE] = "helpfiles/";
char tocname[BUFSIZE], tocname2[BUFSIZE];

int main(int argc, char **argv) {
FILE *in=fopen("usermanual", "r");