DeepFix: Automating C Code Repair with Deep Learning
Picture wrestling with a C program, only to have the compiler throw a vague error message pointing to the wrong line. Infuriating, isn't it? Compilers catch mistakes but rarely help fix them. That's where DeepFix comes in. This deep learning solution automatically detects and corrects common C programming errors, acting like a coding mentor. Let's explore how DeepFix operates, its technical core, and why it's a breakthrough for coders.
Coding Slip-Ups: The Challenge
Writing C code is like assembling a puzzle: one missing piece, such as a semicolon or brace, can break everything. These "common programming errors" - think undeclared variables, stray commas, or mismatched delimiters - are the programming equivalent of typos. They often arise from inexperience or momentary lapses. While compilers flag these issues, their error messages are often cryptic and can point to the wrong location entirely. DeepFix offers an end-to-end solution, using a single neural network to pinpoint and patch these errors without relying on any external tools.
Inside DeepFix: How It Fixes Code
DeepFix approaches code repair as a sequence-to-sequence learning task, akin to translating text. It employs a multi-layered sequence-to-sequence neural network with attention to analyze a faulty C program and suggest fixes. Let's break it down using an example of a program with a missing brace triggering a compiler error.
Code as Sequences
DeepFix converts a C program into a token sequence, capturing keywords (e.g., int), special characters (e.g., ;), and identifiers. To streamline learning, it maps variable and function names to a fixed pool of identifier names, preserving program semantics. Literals become type-specific tokens (e.g., 123 becomes NUM, "hello" becomes STR). Each line is represented as a pair of its line number and tokenized statement, and the full sequence is terminated with <eos>. A k-line program becomes: (l₁, s₁), (l₂, s₂), ..., (lₖ, sₖ), <eos>. This setup allows targeting fixes to specific lines.
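To make this concrete, here is a minimal Python sketch of that kind of tokenization. The regex lexer, the pool size, and all names are our own simplifications for illustration, not the paper's actual implementation; note the sample program is missing its closing brace, like our running example.

```python
import re

# Illustrative tokenizer (our simplification, not DeepFix's actual lexer).
KEYWORDS = {"int", "return", "for", "if", "while", "printf"}
ID_POOL = ["ID1", "ID2", "ID3", "ID4"]  # fixed pool of identifier names

def tokenize_program(source: str):
    """Turn a C program into [(line_number, token_list), ...]."""
    id_map = {}  # maps each identifier to a pool name, consistently
    tokenized = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        tokens = []
        for tok in re.findall(r'"[^"]*"|\d+|\w+|\S', line):
            if tok.startswith('"'):
                tokens.append("STR")          # string literal -> STR
            elif tok.isdigit():
                tokens.append("NUM")          # numeric literal -> NUM
            elif tok.isidentifier() and tok not in KEYWORDS:
                if tok not in id_map:         # assign the next pool name
                    id_map[tok] = ID_POOL[len(id_map) % len(ID_POOL)]
                tokens.append(id_map[tok])
            else:
                tokens.append(tok)            # keyword or special character
        tokenized.append((line_no, tokens))
    return tokenized  # the network additionally sees a trailing <eos>

buggy = 'int main() {\n  int count = 123;\n  printf("hello");\n  return 0;'
for line_no, toks in tokenize_program(buggy):
    print(line_no, toks)
```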
Neural Network Design
DeepFix's core is a sequence-to-sequence model with attention, drawing from neural machine translation. Its components are:
Encoder: N stacked layers of gated recurrent units (GRUs) process the tokenized program. For input token xₜ, the hidden states are computed as:
hₜ⁽¹⁾ = GRU(hₜ₋₁⁽¹⁾, xₜ)
hₜ⁽ⁿ⁾ = GRU(hₜ₋₁⁽ⁿ⁾, hₜ⁽ⁿ⁻¹⁾), for n = 2, ..., N
These generate annotations encoding program context.
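Here is what such a stacked encoder might look like in PyTorch. This is our own illustration: the class and argument names are assumptions, though the sizes match the paper's 4 layers of 300 cells, 129-token vocabulary, and 50-dimensional embeddings.

```python
import torch.nn as nn

class Encoder(nn.Module):
    """N stacked GRUs over embedded tokens (a sketch, not the paper's code)."""
    def __init__(self, vocab_size=129, embed_dim=50, hidden_dim=300, num_layers=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Layer 1 reads token embeddings x_t; layer n reads layer n-1's states.
        self.gru = nn.GRU(embed_dim, hidden_dim, num_layers=num_layers,
                          batch_first=True, dropout=0.2)

    def forward(self, token_ids):
        x = self.embed(token_ids)               # (batch, seq_len, embed_dim)
        annotations, final_state = self.gru(x)  # top-layer states h_j^(N)
        return annotations, final_state         # annotations feed the attention
```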
Decoder with Attention: A stack of N GRUs generates the fix. At each step, a context vector cₜ weights the encoder annotations:
cₜ = Σⱼ (aₜⱼ · hⱼ⁽ᴺ⁾)
aₜⱼ = exp(eₜⱼ) / Σₖ exp(eₜₖ)
eₜⱼ = φ(dₜ₋₁, hⱼ⁽ᴺ⁾)
Here, φ is a feed-forward network that performs soft alignment, letting the decoder focus on the most relevant parts of the input. The decoder outputs a fix as a line number lᵢ and a corrected statement sᵢ'.
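The three equations above translate almost line for line into code. Below is a sketch in the same assumed PyTorch style, with φ realized as a small two-layer scorer (a standard choice for this kind of soft alignment; the exact architecture of φ is our assumption):

```python
import torch
import torch.nn as nn

class Attention(nn.Module):
    """Scores each encoder annotation against the current decoder state."""
    def __init__(self, hidden_dim=300):
        super().__init__()
        self.phi = nn.Sequential(              # feed-forward alignment network φ
            nn.Linear(2 * hidden_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, decoder_state, annotations):
        # decoder_state: (batch, hidden); annotations: (batch, seq_len, hidden)
        seq_len = annotations.size(1)
        d = decoder_state.unsqueeze(1).expand(-1, seq_len, -1)
        e = self.phi(torch.cat([d, annotations], dim=-1)).squeeze(-1)  # e_tj
        a = torch.softmax(e, dim=-1)                                   # a_tj
        c = torch.bmm(a.unsqueeze(1), annotations).squeeze(1)          # c_t
        return c, a
```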
Training: The networks are trained on error-fix pairs using the Adam optimizer and cross-entropy loss. Dropout (0.2) guards against overfitting, while gradient clipping keeps training stable. Two networks are used: one for undeclared variables (using the ID token) and another for the remaining errors, boosting accuracy by up to 5%.
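A conventional training step with these ingredients might look like the sketch below. It is generic sequence-to-sequence boilerplate under our assumptions, not the paper's code; the model's call signature and the clipping threshold are placeholders.

```python
import torch
import torch.nn as nn

def train_step(model, optimizer, src_tokens, tgt_tokens, clip_norm=1.0):
    """One Adam update with cross-entropy loss and gradient clipping."""
    model.train()  # enables the 0.2 dropout inside the model
    optimizer.zero_grad()
    logits = model(src_tokens, tgt_tokens[:, :-1])       # teacher forcing
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)),             # predicted fix tokens
        tgt_tokens[:, 1:].reshape(-1),                   # ground-truth fix
    )
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), clip_norm)
    optimizer.step()
    return loss.item()

# optimizer = torch.optim.Adam(model.parameters())
```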
Fixing Errors Step-by-Step
Multiple errors in a program? DeepFix iterates. It predicts a fix (lᵢ, sᵢ'), updates that line, and recompiles. If errors persist, it repeats the process until the program is error-free or no further fixes are suggested, up to five iterations. When a program needs no change, the decoder emits a special "fixed" token to halt. This approach resolved 4,447 errors in the first iteration and 919 more in later iterations.
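In pseudocode, the iterative repair loop looks roughly like this. `predict_fix`, `apply_fix`, and `compiles` are hypothetical helpers standing in for the trained network, the line replacement, and a GCC invocation:

```python
MAX_ITERATIONS = 5  # the paper allows up to five rounds of fixing

def iterative_repair(program, predict_fix, apply_fix, compiles):
    """Apply one predicted fix per iteration until the program compiles."""
    for _ in range(MAX_ITERATIONS):
        if compiles(program):
            return program, True
        fix = predict_fix(program)       # (line_number, corrected_statement)
        if fix is None or fix == "fixed":
            break                        # network has no further suggestion
        program = apply_fix(program, fix)
    return program, compiles(program)
```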
Validation Check
DeepFix invokes GCC only to verify fixes, never to generate them. An oracle accepts a fix only if it resolves error messages without creating new ones, blocking invalid fixes such as simply deleting the problematic line.
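A GCC-backed oracle of this kind can be sketched with a subprocess call, as below. Counting `error:` lines in the compiler's output is our own simplification of the check:

```python
import os
import subprocess
import tempfile

def count_errors(source: str) -> int:
    """Compile with GCC and count 'error:' lines in the diagnostics."""
    with tempfile.NamedTemporaryFile(suffix=".c", mode="w", delete=False) as f:
        f.write(source)
        path = f.name
    try:
        result = subprocess.run(["gcc", "-c", path, "-o", os.devnull],
                                capture_output=True, text=True)
        return sum("error:" in line for line in result.stderr.splitlines())
    finally:
        os.unlink(path)

def oracle_accepts(before: str, after: str) -> bool:
    """Accept a candidate fix only if it strictly reduces compiler errors."""
    return count_errors(after) < count_errors(before)
```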
DeepFix's Impact
DeepFix redefines automated code repair. Unlike methods tied to specific tasks or reliant on brute-force search, it is task-agnostic, tackling errors across 93 varied C programming tasks, from basic arithmetic to dynamic programming. On 6,971 erroneous student programs, it resolved 32% of 16,743 error messages, fixing 27% of the programs completely and another 19% partially, with each fix taking only milliseconds.
In a seeded dataset (correct programs with up to five injected errors), DeepFix fully fixed 56% of 9,230 programs and resolved 63% of 31,783 errors. It correctly localized 78.68% of errors in the first iteration, with 87.5% in the top-5 predictions via beam search. The attention mechanism captures long-term dependencies, outperforming methods limited to local context.
Technical Foundations
DeepFix's key elements include:
- Code Representation: Tokenizes programs with line numbers, standardizing identifiers and literals.
- Neural Model: 4-layer GRUs with 300 cells each, attention for long-distance dependencies, and a 129-token vocabulary in 50D embeddings (collected into the configuration sketch after this list).
- Iterative Strategy: Addresses one error per iteration, stopping when fixed or stalled.
- Training Data: Generated by mutating correct programs, creating error-fix pairs for up to five errors.
- Speed: Fixes errors in tens of milliseconds, ideal for real-time use.
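Gathered in one place, those hyperparameters might be expressed as a configuration like this. The values come from the summary above; the key names are our own:

```python
# DeepFix hyperparameters as reported in the paper (key names are ours).
DEEPFIX_CONFIG = {
    "num_layers": 4,        # stacked GRU layers in encoder and decoder
    "hidden_dim": 300,      # GRU cells per layer
    "vocab_size": 129,      # tokens after identifier/literal normalization
    "embed_dim": 50,        # token embedding dimension
    "dropout": 0.2,
    "max_iterations": 5,    # rounds of iterative fixing
    "optimizer": "adam",
}
```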
Limitations exist: complex fixes (e.g., array assignments that need loops) are out of reach, and synthetic training data misses some real-world error patterns, which lowers performance on the raw dataset of real student programs (27% vs. 56% complete fixes).
Looking Ahead
DeepFix highlights deep learning's potential to streamline coding by automating error repair. Its sequence-to-sequence approach with attention offers a scalable, language-agnostic framework. Future improvements could address complex fixes or longer sequences, but DeepFix already raises the bar. Next time a compiler error stumps you, picture DeepFix stepping in with a precise, automated solution.
Paper: Read the full paper here