Programming Not Only By Example

TL;DR: Programming by example (PBE) struggles when examples alone are ambiguous — many programs match the same examples. This paper adds user interaction: the synthesizer asks clarifying questions (showing candidate behaviors on new inputs) so the user can disambiguate without writing more examples.

The Problem

Programming by example is an appealing idea: you show a system a few input-output examples, and it writes a program for you. But there is a fundamental problem — examples are ambiguous.

Suppose you give the synthesizer 3 examples. It might find 100 programs that are all consistent with those examples, but most of them do the wrong thing on other inputs. The synthesizer picks one, and you have no idea if it picked the right one. You might not even notice the mistake until the program fails on new data.

The root cause is that input-output examples are a weak specification. They tell you what should happen on specific inputs, but they say nothing about the intent behind the transformation. Two very different programs can agree on your examples and disagree everywhere else.

The Key Idea

Instead of asking the user for more examples (which is hard — you would need to know which examples distinguish the candidates), the system takes the initiative. After synthesizing a set of candidate programs that all satisfy the user's examples, it:

Finds a distinguishing input — an input where the candidate programs produce different outputs.
Asks the user: "What should the output be for THIS input?"
Prunes candidates that disagree with the user's answer.
Repeats until only one correct program remains (or the candidates all agree).

This transforms synthesis from a one-shot guessing game into an interactive conversation. The system asks smart questions; the user just answers them. It typically takes only 2–3 questions to resolve all ambiguity.

Interactive Demo

Try the disambiguation process yourself. You have provided two examples for a string transformation task. The system has found 12 candidate programs that all match your examples — but they disagree on other inputs. Help the system find the right program by answering its questions.

Interactive Disambiguation

Your Examples

John Smith → J.S.

Alice Bob → A.B.

Candidate programs remaining: 12

Program A

First letter of each word, followed by a period

"Mary Jane Watson" → "M.J.W."

Program B

First letter of first and last word only, followed by a period

"Mary Jane Watson" → "M.W."

Program C

First letter of each word, period-separated, uppercase

"Mary Jane Watson" → "M.J.W."

Program D

First letter of first two words only, followed by a period

"Mary Jane Watson" → "M.J."

Program E

First letter of each capitalized word, followed by a period

"Mary Jane Watson" → "M.J.W."

+ 7 more candidates not shown (12 total)

Click an answer below to begin disambiguation.

How It Works

The disambiguation process has three main stages that form a loop:

Candidate Generation

Given the user's input-output examples, the synthesizer explores a space of programs (defined by a domain-specific language) and collects all programs that are consistent with the examples. This might yield dozens or hundreds of candidates.

Distinguishing Input Synthesis

The system searches for an input on which the candidate programs disagree — that is, they produce different outputs. A good distinguishing input maximizes the number of candidates it can eliminate. The paper formulates this as a search problem over the input space.

Interactive Pruning

The distinguishing input and the candidate outputs are shown to the user. The user selects the correct output, and all candidates that disagree are pruned. This loop repeats until the candidate set converges (all remaining candidates agree on all inputs, or a single candidate remains).

The key insight is that the system chooses which questions to ask, rather than burdening the user with inventing new examples. This makes the interaction lightweight: users only need to recognize the correct output, not construct it from scratch.

Results

The approach was evaluated on string-manipulation benchmarks commonly used in PBE research. Key findings:

Ambiguity is resolved with an average of 2–3 questions per task.
Each question typically eliminates a large fraction of the remaining candidates, leading to rapid convergence.
The interaction model is natural: users find it easier to pick the right answer from options than to come up with new examples.
The approach is complementary to existing PBE synthesizers — it wraps around them as a disambiguation layer, improving any underlying synthesis engine.

@inproceedings{peleg2018pnoe, title={Programming Not Only by Example}, author={Peleg, Hila and Shoham, Sharon and Yahav, Eran}, booktitle={Proceedings of the 40th International Conference on Software Engineering (ICSE)}, year={2018} }