Design

This section presents the high-level design of the Static Code Analyzer program and its plug-in architecture.

Architecture

The Static Code Analyzer engine is built around the concept of an Evaluator, which performs an analysis and returns a list of Report objects describing its findings (see the image below). Reports are grouped per input file in a FileReports object, and the engine returns a list of these objects to the caller.

Class diagram

An Evaluator can take one of two forms:

A Rule, which operates independently on some source;
A Proxy, which groups Rule objects together to operate sequentially on the same source.

Proxies are useful for two reasons:

They avoid the expense of re-opening a source file that is to be evaluated by multiple rules; and
They enable multi-stage evaluations where one rule depends on the results or state of a previous rule (an example is provided below).
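The relationship between these classes can be sketched roughly as follows. This is a minimal illustration and not the package's actual code: the evaluate method name, the Report fields, and the two example rules (LineLengthRule and TabRule) are assumptions made for this example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List


@dataclass
class Report:
    """One finding. These fields are illustrative assumptions."""
    rule_code: str
    line: int
    message: str


class Evaluator(ABC):
    @abstractmethod
    def evaluate(self, source: str) -> List[Report]:
        """Analyze the source text and return the findings."""


class Rule(Evaluator):
    """An Evaluator that operates independently on some source."""


class Proxy(Evaluator):
    """Groups rules so that they run sequentially on the same source."""

    def __init__(self, rules: List[Rule]):
        self.rules = rules

    def evaluate(self, source: str) -> List[Report]:
        reports: List[Report] = []
        # The source is obtained once and handed to every rule in order.
        for rule in self.rules:
            reports.extend(rule.evaluate(source))
        return reports


class LineLengthRule(Rule):
    """Hypothetical rule: flags lines longer than max_length."""

    def __init__(self, max_length: int = 80):
        self.max_length = max_length

    def evaluate(self, source: str) -> List[Report]:
        return [
            Report("LEN", n, f"line exceeds {self.max_length} characters")
            for n, text in enumerate(source.splitlines(), start=1)
            if len(text) > self.max_length
        ]


class TabRule(Rule):
    """Hypothetical rule: flags lines containing a tab character."""

    def evaluate(self, source: str) -> List[Report]:
        return [
            Report("TAB", n, "line contains a tab character")
            for n, text in enumerate(source.splitlines(), start=1)
            if "\t" in text
        ]


proxy = Proxy([LineLengthRule(max_length=10), TabRule()])
reports = proxy.evaluate("short\n\tthis line is long\n")
for report in reports:
    print(report.rule_code, report.line, report.message)
```

Because a Proxy is itself an Evaluator, the caller does not need to distinguish between a single rule and a group of rules.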

The base pscodeanalyzer package includes a single concrete Evaluator, RegexRule, which allows for the dynamic definition of rules based on regular expressions in the configuration file.
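As a rough sketch of the idea behind a regex-driven rule, a rule can be built at run time from a configuration entry. The configuration keys (pattern, message) and the build_regex_rule helper below are hypothetical and do not reflect the actual RegexRule settings.

```python
import re
from typing import Callable, Dict, List, Tuple


def build_regex_rule(config: Dict[str, str]) -> Callable[[str], List[Tuple[int, str]]]:
    """Build a line-based checker from a config entry (hypothetical keys)."""
    pattern = re.compile(config["pattern"])
    message = config["message"]

    def evaluate(source: str) -> List[Tuple[int, str]]:
        # Report (line number, message) for every line matching the pattern.
        return [
            (n, message)
            for n, line in enumerate(source.splitlines(), start=1)
            if pattern.search(line)
        ]

    return evaluate


todo_rule = build_regex_rule({"pattern": r"\bTODO\b", "message": "Unresolved TODO"})
findings = todo_rule("x = 1\n# TODO: fix this\n")
print(findings)
```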

This architecture enables the use of plug-ins which extend the Evaluator class and its Rule and Proxy subclasses, in order to perform analyses for specific languages or use-cases. The Static Code Analyzer, as delivered, includes a set of such plug-ins specific to PeopleCode analysis, as described in the following section.

Delivered PeopleCode Plug-Ins

The image below shows the architecture of the pscodeanalyzer.rules.peoplesoft module. Note its dependency on the peoplecodeparser package, as well as the ANTLR 4 Python 3 runtime.

Class diagram

The classes within the pscodeanalyzer.rules.peoplesoft module have been designed to perform the following two evaluations:

Ensure that no SQLExec statements use a string literal or compound expression as the first argument (i.e., SQL definition references and isolated variables are allowed); and
Ensure that no PeopleCode programs (other than Application Classes) reference undeclared variables.
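The acceptance criterion of the first evaluation can be approximated with a small text-only check. This is an illustration under stated assumptions, not how the delivered rule works: the delivered rule operates on the parse tree produced by peoplecodeparser, whereas the hypothetical is_allowed_first_argument function below inspects the argument's text alone.

```python
import re

# Allowed forms: a SQL definition reference (e.g. SQL.MY_QUERY)
# or a single isolated variable (e.g. &sql).
_ALLOWED_ARGUMENT = re.compile(r"(?:SQL\.\w+|&\w+)", re.IGNORECASE)


def is_allowed_first_argument(arg_text: str) -> bool:
    """Return True if the text is a SQL definition reference or a lone variable."""
    return _ALLOWED_ARGUMENT.fullmatch(arg_text.strip()) is not None


examples = [
    "SQL.MY_QUERY",                         # SQL definition reference
    "&sql",                                 # isolated variable
    '"SELECT 1 FROM PS_INSTALLATION"',      # string literal
    '"SELECT " | &field | " FROM PS_JOB"',  # compound expression
]
results = [is_allowed_first_argument(text) for text in examples]
print(results)
```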

The first evaluation is implemented by the SQLExecRule class, and is relatively straightforward. The second evaluation, however, illustrates the use of proxies as a mechanism to enable multi-stage evaluations. This is explained in the following section.

These classes, along with the facilities provided by the peoplecodeparser package, can be used to define any number of rules that analyze quality-related issues in PeopleCode programs and Application Classes. A few examples might include:

Enforcing the use of GetSQL with a SQL definition as opposed to CreateSQL with a string literal;
Enforcing variable/property and function/method naming conventions;
Ensuring that Application Classes have adequate and accurate API comments;
Warning about legibility issues stemming from excessive nesting of function calls;
Warning about excessively long code blocks which could benefit from modularization.

Multi-Stage Undeclared Variable Evaluation

Symbol definition and referencing is a well-defined process in compiler design, a simplified version of which has been implemented in the pscodeanalyzer.rules.peoplesoft module. This simplification was possible for the following reasons:

It is assumed that only valid PeopleCode programs (i.e., those which compile) will be analyzed;
The PeopleCode compiler only allows for the implicit declaration of variables in event handler programs (as opposed to Application Classes);
We are therefore only concerned with the declaration of variable names and function arguments, but not with function names.

The process works in two phases:

The definition phase, implemented in the SymbolDefinitionPhaseRule class, analyzes the entire PeopleCode program and annotates itself with a structure of logically nested Scope objects, each containing the Symbol objects declared within its scope;
The reference phase, implemented in the SymbolReferencePhaseRule class, analyzes the entire PeopleCode program once again and validates that:
1. Each referenced variable has been defined within the current scope or one of its enclosing scopes; and
2. That definition occurs before the reference.

These two phases cannot operate independently, so they must be configured as consecutive rules within the same PeopleCodeParserProxy object, so that the second phase inherits the annotations of the first.
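The two phases can be sketched on a toy program representation. Everything below is an assumption made for illustration: the real rules walk a PeopleCode parse tree, whereas this sketch uses nested lists as blocks, tuples as statements, and line numbers as a coarse stand-in for token positions. The define_symbols and check_references functions are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Symbol:
    name: str
    line: int


@dataclass
class Scope:
    parent: Optional["Scope"] = None
    symbols: Dict[str, Symbol] = field(default_factory=dict)

    def resolve(self, name: str, line: int) -> Optional[Symbol]:
        """Find a symbol declared at or before `line`, searching outward."""
        scope: Optional[Scope] = self
        while scope is not None:
            symbol = scope.symbols.get(name)
            if symbol is not None and symbol.line <= line:
                return symbol
            scope = scope.parent
        return None


# A block is a list of statements. A statement is either a nested block
# (a list) or a tuple: ("declare", name, line) or ("reference", name, line).

def define_symbols(block: list, parent: Optional[Scope] = None,
                   scopes: Optional[Dict[int, Scope]] = None) -> Dict[int, Scope]:
    """Phase 1: annotate every block with a Scope holding its declarations."""
    scopes = {} if scopes is None else scopes
    scope = Scope(parent)
    scopes[id(block)] = scope
    for stmt in block:
        if isinstance(stmt, list):
            define_symbols(stmt, scope, scopes)
        elif stmt[0] == "declare":
            # Keep the first declaration of a name within a scope.
            scope.symbols.setdefault(stmt[1], Symbol(stmt[1], stmt[2]))
    return scopes


def check_references(block: list, scopes: Dict[int, Scope]) -> List[Tuple[str, int]]:
    """Phase 2: report references with no prior declaration in scope."""
    errors: List[Tuple[str, int]] = []
    scope = scopes[id(block)]
    for stmt in block:
        if isinstance(stmt, list):
            errors.extend(check_references(stmt, scopes))
        elif stmt[0] == "reference" and scope.resolve(stmt[1], stmt[2]) is None:
            errors.append((stmt[1], stmt[2]))
    return errors


program = [
    ("declare", "&total", 1),
    ("reference", "&total", 2),
    [
        ("reference", "&count", 4),  # referenced before its declaration
        ("declare", "&count", 5),
        ("reference", "&total", 6),  # resolved in the enclosing scope
    ],
    ("reference", "&missing", 8),    # never declared
]
scopes = define_symbols(program)
errors = check_references(program, scopes)
print(errors)
```

The second function is only meaningful when given the scopes produced by the first, which mirrors why both phases need to share state within a single proxy.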