OOPSLA 2020

A Structural Model for Contextual Code Changes

Shaked Brody, Uri Alon, Eran Yahav

TL;DR — When developers make an edit in one location, related edits often need to happen elsewhere. The structural model learns to predict these correlated edits from the AST structure, auto-completing multi-location changes.

The Problem

Code edits rarely happen in isolation. Renaming a method requires updating all call sites. Changing a parameter type requires updating all callers. Adding a field to a class means every constructor and serialization routine must be updated too.

Developers frequently miss some of these correlated edits, introducing bugs that can be subtle and hard to track down. An IDE might catch a compilation error if you change a type, but many correlated edits involve semantically related changes that the compiler cannot flag — for example, updating documentation, adjusting default values, or modifying related logic in a different function.

The core challenge: given an initial edit, can we automatically predict where else in the code additional edits are needed, and what those edits should be?

The Key Idea

Rather than treating code as flat text, the structural model represents edits as tree transformations on the Abstract Syntax Tree (AST). The AST captures the structural relationships between code elements — a method declaration is connected to its parameters, its body, and the call sites that invoke it.

Given an initial edit at one AST node, the model uses AST paths between the edited location and every other node in the tree to predict which other locations need to change and what each of those changes should be.

The AST paths encode the structural relationship between two locations. For example, the path from a method declaration to one of its call sites traverses up through the class, across to another method, and down to the invocation. These structural paths carry far more predictive signal than token distance in the raw text.
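To make the notion of an AST path concrete, here is a minimal sketch that computes the path between two nodes by walking up from the source to their lowest common ancestor and then down to the target. The `Node` class, the `ast_path` function, and the node labels are assumptions made for illustration; they are not taken from the paper's implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False, repr=False)
class Node:
    """A minimal, hypothetical AST node: a label, a parent pointer, children."""
    label: str
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)

    def add(self, label: str) -> "Node":
        """Create a child with the given label and return it."""
        child = Node(label, parent=self)
        self.children.append(child)
        return child


def ancestors(node: Optional[Node]) -> List[Node]:
    """Return the chain [node, parent, grandparent, ..., root]."""
    chain = []
    while node is not None:
        chain.append(node)
        node = node.parent
    return chain


def ast_path(src: Node, dst: Node) -> str:
    """Path from src up to the lowest common ancestor, then down to dst."""
    up = ancestors(src)
    down = ancestors(dst)
    up_index = {id(n): i for i, n in enumerate(up)}
    # The first ancestor of dst that is also an ancestor of src is the LCA.
    for j, n in enumerate(down):
        if id(n) in up_index:
            i = up_index[id(n)]
            break
    else:
        raise ValueError("nodes are not in the same tree")
    parts = [src.label]
    parts += ["↑ " + n.label for n in up[1:i + 1]]
    parts += ["↓ " + n.label for n in reversed(down[:j])]
    return " ".join(parts)


# A toy class with two methods; the second one calls the first.
cls = Node("ClassDecl")
body = cls.add("ClassBody")
edited = body.add("MethodDecl")   # the method that was edited
other = body.add("MethodDecl")    # another method in the same class
call = other.add("MethodBody").add("ExprStmt").add("MethodCall")

print(ast_path(edited, call))
# MethodDecl ↑ ClassBody ↓ MethodDecl ↓ MethodBody ↓ ExprStmt ↓ MethodCall
```

The path depends only on tree shape and node labels, so two call sites that sit in similar positions produce similar paths regardless of how many lines of text separate them.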

How It Works

The model operates in three stages:

1. Edit Encoding

The initial edit is represented as a change to an AST node — a subtree deletion, insertion, or replacement. The encoder captures both the content of the edit (what changed) and the context (the surrounding tree structure).
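One way to picture this representation is as a small record holding the kind of change, where it happened, and the subtrees involved. The sketch below is an assumption-laden illustration: `Tree`, `Edit`, `EditKind`, and `context_labels` are hypothetical names, and the real encoder learns vector representations rather than storing labels.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class EditKind(Enum):
    DELETE = "delete"
    INSERT = "insert"
    REPLACE = "replace"


@dataclass(frozen=True)
class Tree:
    """A hypothetical immutable AST node."""
    label: str
    children: Tuple["Tree", ...] = ()


@dataclass(frozen=True)
class Edit:
    """The content of an edit: what changed, and where."""
    kind: EditKind
    location: Tuple[int, ...]    # child indices from the root to the edited node
    old: Optional[Tree] = None   # subtree removed (absent for INSERT)
    new: Optional[Tree] = None   # subtree added (absent for DELETE)

    def __post_init__(self) -> None:
        needs_old = self.kind in (EditKind.DELETE, EditKind.REPLACE)
        needs_new = self.kind in (EditKind.INSERT, EditKind.REPLACE)
        if (self.old is not None) != needs_old or (self.new is not None) != needs_new:
            raise ValueError(f"old/new subtrees do not match edit kind {self.kind}")


def context_labels(root: Tree, location: Tuple[int, ...]) -> List[str]:
    """The context of an edit: labels of the ancestors of the edited node."""
    labels = []
    node = root
    for idx in location[:-1]:
        labels.append(node.label)
        node = node.children[idx]
    if location:
        labels.append(node.label)
    return labels


# Changing a parameter type from int to long inside a method declaration.
method = Tree("MethodDecl", (
    Tree("Name"),
    Tree("Params", (Tree("Param:int"),)),
    Tree("MethodBody"),
))
edit = Edit(EditKind.REPLACE, (1, 0), old=Tree("Param:int"), new=Tree("Param:long"))

print(edit.kind.value, context_labels(method, edit.location))
# replace ['MethodDecl', 'Params']
```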

2. Path-based Attention

For every other node in the AST, the model computes the structural path from the initial edit location. These paths are decomposed into sequences of up, down, and sibling moves through the tree. The model attends over these paths to determine which nodes are likely to require correlated edits.

Path: MethodDecl ↑ ClassBody ↑ ClassDecl ↓ ClassBody ↓ MethodDecl ↓ MethodBody ↓ ExprStmt ↓ MethodCall

Interpretation: From the edited method, go up to the class, across to another method, and down to a call site of the original method.
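The mechanics of attending over candidate paths can be sketched as dot-product attention followed by a softmax. Everything specific here is an assumption for illustration: `encode_path` is a toy stand-in that just counts up and down moves, and the hand-picked `query` vector stands in for a learned encoding of the initial edit. A trained model would learn both.

```python
import math
from typing import Dict, List


def encode_path(path: str) -> List[float]:
    """Toy path encoding: [number of up moves, number of down moves].

    A learned path encoder would replace this in a real model.
    """
    return [float(path.count("↑")), float(path.count("↓"))]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query: List[float], paths: Dict[str, str]) -> Dict[str, float]:
    """Score each candidate node by query . key, then normalize."""
    names = list(paths)
    scores = [
        sum(q * k for q, k in zip(query, encode_path(paths[name])))
        for name in names
    ]
    return dict(zip(names, softmax(scores)))


# Candidate nodes, keyed by a descriptive name, with their path from the edit.
paths = {
    "call_site": "MethodDecl ↑ ClassBody ↓ MethodDecl ↓ MethodBody ↓ ExprStmt ↓ MethodCall",
    "sibling_field": "MethodDecl ↑ ClassBody ↓ FieldDecl",
    "own_name": "MethodDecl ↓ Name",
}
query = [0.5, 1.0]  # hypothetical weights, chosen by hand for this example

weights = attend(query, paths)
for name, w in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {w:.3f}")
```

The output is a probability distribution over candidate nodes. With these hand-picked weights the call site receives the most attention, but that ranking reflects the toy encoding and not anything the model has learned.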

3. Edit Prediction

For each candidate location identified by the attention mechanism, the decoder generates the predicted edit. It produces a new subtree to replace the existing node, conditioned on both the original edit and the structural path.
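The final step, swapping a predicted subtree into the tree at a chosen location, can be sketched as follows. The decoder itself is not shown: the `predicted` subtree below is written by hand, and `Tree` and `replace_subtree` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Tree:
    """A hypothetical immutable AST node."""
    label: str
    children: Tuple["Tree", ...] = ()


def replace_subtree(root: Tree, location: Tuple[int, ...], new: Tree) -> Tree:
    """Return a copy of root with the node at `location` replaced by `new`.

    `location` is a sequence of child indices from the root to the target.
    """
    if not location:
        return new
    idx, rest = location[0], location[1:]
    kids = list(root.children)
    kids[idx] = replace_subtree(kids[idx], rest, new)
    return replace(root, children=tuple(kids))


# A call site that still uses the old method name after a rename.
call = Tree("MethodCall", (Tree("Name:getSize"), Tree("Args")))

# Stand-in for the decoder's output at this location.
predicted = Tree("Name:size")

updated = replace_subtree(call, (0,), predicted)
print(updated.children[0].label)  # Name:size
print(call.children[0].label)     # Name:getSize (original tree is unchanged)
```

Keeping the tree immutable means each predicted edit yields a new tree, so several candidate edits can be compared against the same original without interfering with one another.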

Key insight: the AST path between two locations is a compact, expressive representation of their structural relationship. Two call sites of the same method will share similar path patterns, even if they are far apart in the source file.

Results

By leveraging the tree structure of the AST rather than treating code as a flat token sequence, the structural model consistently outperforms sequence-based baselines on the task of predicting correlated edits. The approach was evaluated on large-scale datasets of real code changes.

@inproceedings{brody2020neural,
  title={A Structural Model for Contextual Code Changes},
  author={Brody, Shaked and Alon, Uri and Yahav, Eran},
  booktitle={Proceedings of the ACM on Programming Languages (OOPSLA)},
  year={2020},
  publisher={ACM}
}