This proposal introduces a new build action called "Scalable Analysis" in swift-build to perform whole-program analysis and subsequent code transformation.
Context
Currently, the static analysis provided by clang is restricted to individual translation units (TUs). In swift-build, this is exposed through a build action called "Analyze", which invokes clang with additional flags to perform the requested analyses and produce warnings and errors per TU. For many analyses, being constrained to a single TU makes them imprecise and incomplete, because the compilation boundary introduces unknowns that usually lead to either false positives or false negatives.
For example, we are working on a tool, clang-reforge, that replaces unsafe C++ raw pointer accesses with safe pointer abstractions that include bounds checking. To do this correctly, the tool needs to understand how a pointer defined in one TU is used across other TUs, and that information is unavailable to a single-TU analysis. As a result, the unsafe buffer usage analysis can neither be precise nor suggest meaningful code changes when restricted to a single TU.
Whole-program analysis is a capability that most static analysis tool developers could benefit from. We are creating a new framework in clang called the Scalable Static Analysis Framework (SSAF) to support summary-based whole-program analysis. In this approach, analysis proceeds in four steps. First, each TU is compiled to produce an analysis summary. Second, the per-TU summaries are linked together into a single Linked Unit (LU) summary. Third, whole-program analysis is performed on the LU summary. Fourth, the results of the whole-program analysis are propagated back to the individual TUs, enabling precise, actionable diagnostics and code transformations that account for the full program.
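The four steps above can be sketched on toy data. This is only an illustration of the summary-based shape of the approach: the summary layout, the fact names ("defined", "indexed"), and the analysis rule are all invented for the example and are not SSAF's actual formats or analyses.

```python
from collections import defaultdict

# Hypothetical summary layout, invented for illustration only.

def extract_tu_summary(tu_name, facts):
    """Step 1: build a per-TU summary from (entity, fact) pairs seen in that TU."""
    entities = defaultdict(set)
    for entity, fact in facts:
        entities[entity].add(fact)
    return {"tu": tu_name, "entities": dict(entities)}

def link_summaries(tu_summaries):
    """Step 2: merge per-TU summaries into one Linked Unit (LU) summary."""
    lu = defaultdict(lambda: {"facts": set(), "tus": set()})
    for summary in tu_summaries:
        for entity, facts in summary["entities"].items():
            lu[entity]["facts"] |= facts
            lu[entity]["tus"].add(summary["tu"])
    return dict(lu)

def analyze(lu_summary):
    """Step 3: whole-program analysis over the LU summary.
    Toy rule: an entity needs attention if any TU indexes it."""
    return {e: "indexed" in info["facts"] for e, info in lu_summary.items()}

def propagate(result, lu_summary):
    """Step 4: map whole-program results back to every TU that mentions the entity."""
    per_tu = defaultdict(list)
    for entity, flagged in result.items():
        if flagged:
            for tu in sorted(lu_summary[entity]["tus"]):
                per_tu[tu].append(entity)
    return {tu: sorted(entities) for tu, entities in per_tu.items()}

summaries = [
    extract_tu_summary("a.cpp", [("buf", "defined"), ("len", "defined")]),
    extract_tu_summary("b.cpp", [("buf", "indexed")]),
]
lu = link_summaries(summaries)
per_tu_actions = propagate(analyze(lu), lu)
print(per_tu_actions)
```

In this toy input, a.cpp on its own contains nothing that flags `buf`; the indexing happens in b.cpp. Only the linked summary reveals that both TUs are affected, which is the kind of cross-TU fact a single-TU analysis cannot see.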
Correct analysis results require a precise understanding of how the code is built, so that entities can be related correctly across all input source code. Since the build system itself is the source of truth for this, build system support would let multiple analyses invoke whole-program analysis easily. We propose a new build action ("Scalable Analysis") that will invoke these tools with the right inputs.
Please refer to the SSAF RFC ("RFC: Scalable Static Analysis Framework", Clang Frontend category, LLVM Discussion Forums) for a deeper dive into SSAF. This write-up focuses on the details of integrating SSAF with the Swift Build system.
SSAF Architecture
SSAF will provide four basic operations: TU summary extraction, TU summary linking, global summary analysis, and propagation of the global analysis results back to TUs. The implementation will be divided among multiple command-line tools, each implementing one of these operations. The resulting analysis result can then be consumed by another tool, such as a code-rewriting tool or a fix-it suggestion tool.
The framework implements a two-pass workflow:
[Diagram: SSAF two-pass workflow]
The diagram above illustrates the two-pass workflow. In the first pass, each source file is compiled by clang to produce a per-TU summary. These summaries are then linked together by ssaf-linker into a single LU summary, on which ssaf-analyzer performs whole-program analysis to produce an analysis result. In the second pass, clang processes each source file again, this time informed by the analysis result, to generate a set of file edits. The src-edit-merge tool consolidates these per-file edits into a single merged edit.
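The final consolidation step can be pictured with a small sketch. It assumes, purely for illustration, that an edit is a (path, offset, length, replacement) record and that merging means dropping exact duplicates (for instance, the same header edit produced by several TUs) and rejecting overlapping edits. The `Edit` type, `merge_edits`, and the `SAFE_PTR` placeholder text are hypothetical; the proposal does not specify how src-edit-merge represents or reconciles edits.

```python
from typing import NamedTuple

class Edit(NamedTuple):
    """Hypothetical edit record; not the real src-edit-merge format."""
    path: str         # file the edit applies to
    offset: int       # start of the replaced range
    length: int       # length of the replaced range
    replacement: str  # new text for that range

def merge_edits(per_tu_edits):
    """Consolidate edits from all TUs: drop exact duplicates, reject overlaps."""
    # Sorting NamedTuples orders by path, then offset, so overlapping
    # edits within one file end up adjacent.
    unique = sorted({edit for edits in per_tu_edits for edit in edits})
    merged = []
    for edit in unique:
        prev = merged[-1] if merged else None
        if prev and prev.path == edit.path and edit.offset < prev.offset + prev.length:
            raise ValueError(f"conflicting edits in {edit.path} at offset {edit.offset}")
        merged.append(edit)
    return merged

# Two TUs include the same header, so both propose the same header edit.
tu_a = [Edit("shared.h", 120, 5, "SAFE_PTR"), Edit("a.cpp", 40, 5, "SAFE_PTR")]
tu_b = [Edit("shared.h", 120, 5, "SAFE_PTR"), Edit("b.cpp", 10, 5, "SAFE_PTR")]
merged = merge_edits([tu_a, tu_b])
for edit in merged:
    print(edit)
```

Raising on overlap is one possible policy, chosen here to keep the sketch short; a real tool might instead report the conflict and keep the non-conflicting edits.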
Overview of the "Scalable Analysis" Build Action
The Swift Build system will orchestrate the invocation of the individual tools with the right inputs. The "Scalable Analysis" build action will invoke these tools at the appropriate stages (compilation, linking) of the build. This covers nearly the entire two-pass analysis, ending with the source edit merge and the proposed source file changes. The build system will be invoked by the clang-reforge tool (or by other whole-program analysis tools based on SSAF).
Our initial milestone is to implement the build action that is capable of invoking these tools for a single build target (e.g., a project that compiles to a single shared library or an executable). We plan to develop support for complex build configurations later.
Implementation Details
The new build action will consist of a series of build tasks invoked at various stages of compiling an executable:
| Build Task | Description | Invokes tool | Input | Output |
|---|---|---|---|---|
| TU Summary Extractor | Invokes the build command with additional flags to clang for TU summary extraction. Depends on the build task that invokes clang. | clang | Build command, output summary file path and name, list of summary analyses. | TU summary file |
| TU Summary Linker | Invokes ssaf-linker to generate a Linked Unit (LU) summary per analysis. Depends on the linker to determine which TU summaries to link. | ssaf-linker | TU summary files from the previous build task, path for output LU summary file. | LU summary file |
| Whole-Program Analyzer | Invokes ssaf-analyzer to perform whole-program analysis on the LU summary file. | ssaf-analyzer | LU summary file | Whole-program analysis result file |
| Per-TU Result Propagator | Invokes clang for each TU with additional flags to propagate whole-program analysis results back to individual TUs. Depends on the original clang build task. | clang | Whole-program analysis result file | Set of proposed file edits |
| Merged Edit Generator | Invokes src-edit-merge to consolidate per-file edits. Depends on the linker to determine which TU summaries were linked. | src-edit-merge | Set of proposed file edits | Merged proposed file edits in a single file |
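
One way to read the table is as a dependency graph over build tasks for a single target. The sketch below models that reading with Python's standard `graphlib` module. The task names (`compile:`, `extract:`, `link-summaries`, and so on) and the exact edges are assumptions made for illustration; they are not swift-build task identifiers, and the real ordering constraints may differ.

```python
from graphlib import TopologicalSorter

def build_task_graph(sources):
    """Return {task: set of tasks it depends on} for one build target.
    Task names and edges are hypothetical, derived from the table above."""
    graph = {}
    for src in sources:
        graph[f"compile:{src}"] = set()
        # TU Summary Extractor depends on the task that invokes clang.
        graph[f"extract:{src}"] = {f"compile:{src}"}
    # The regular link step determines which TUs belong to the target.
    graph["link"] = {f"compile:{src}" for src in sources}
    # TU Summary Linker needs every TU summary plus the linker's TU set.
    graph["link-summaries"] = {f"extract:{src}" for src in sources} | {"link"}
    # Whole-Program Analyzer consumes the LU summary.
    graph["analyze"] = {"link-summaries"}
    for src in sources:
        # Per-TU Result Propagator re-runs clang using the analysis result.
        graph[f"propagate:{src}"] = {"analyze", f"compile:{src}"}
    # Merged Edit Generator consolidates the per-TU edits.
    graph["merge-edits"] = {f"propagate:{src}" for src in sources} | {"link"}
    return graph

graph = build_task_graph(["a.cpp", "b.cpp"])
order = list(TopologicalSorter(graph).static_order())
print(order)
```

Any order printed respects the edges above: every `extract:` task runs before `link-summaries`, `analyze` runs before every `propagate:` task, and `merge-edits` runs last.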
