Overview
The competition on Intelligent Coding Assistant Enhanced with Memory at FSE-AIWare 2026 invites participants to design and evaluate AI coding assistants with persistent memory across development sessions.
Motivation
Recent advances in large language models and coding agents have led to powerful tools for programming. Nevertheless, most systems operate in a largely stateless manner, with only short-term conversational context. As a result, they struggle to accumulate knowledge about user preferences, development practices, evolving codebases, and long-term objectives.
Memory is increasingly recognized as the missing infrastructure for transforming coding agents from reactive tools into persistent collaborators.
Goals
The competition is centered on the core scientific question: How does persistent memory change the effectiveness and efficiency of AI coding assistants in long-term software development workflows?
The primary goal of this competition is to establish a shared experimental and conceptual framework for studying memory in coding assistants. By having participants design and evaluate memory-augmented systems, the competition seeks to foster innovation in:
- Memory architectures
- Retrieval mechanisms
- Integration strategies
- Rigorous empirical evaluation
Beyond technical novelty, the competition emphasizes cost-efficiency and reproducibility.
Join Our Community
If you are interested in participating in the competition, please register your interest on our HotCRP site as soon as possible so that you receive up-to-date announcements. The Paper Submission and Artifacts fields may be left empty until the final submission deadline.
You can also join our Discord server where you can:
- Ask questions and get support
- Discuss with organizers and other participants
- Receive important announcements
Participation
Registration
Register your interest via our HotCRP site, as described in the Join Our Community section above. Our Discord server is available for technical support, discussion with the organizing committee, and important announcements.
Task
Details are subject to change. If you have concerns or suggestions, feel free to discuss them with the organizers in the Discord channel.
Each team is required to implement a memory-enhanced agent system as a Docker image. The memory component may be implemented using any suitable approach.
Benchmark
We will evaluate systems on a subset of the SWE-Bench Pro benchmark containing 200 instances (8 projects × 25 instances per project). Within each project, instances are evaluated sequentially in commit-time order.
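The per-project, commit-ordered evaluation can be sketched as follows. The instance records shown here are illustrative assumptions; the actual metadata fields are defined by SWE-Bench Pro and the evaluation harness.

```python
from datetime import datetime

# Hypothetical instance records for illustration only; the real schema
# comes from the SWE-Bench Pro benchmark files.
instances = [
    {"instance_id": "proj_a-2", "repo": "proj_a", "commit_time": "2024-05-01T12:00:00"},
    {"instance_id": "proj_a-1", "repo": "proj_a", "commit_time": "2024-03-15T09:30:00"},
    {"instance_id": "proj_b-1", "repo": "proj_b", "commit_time": "2024-04-20T08:00:00"},
]

def evaluation_order(instances):
    """Group instances by project, then sort each group by commit time."""
    by_project = {}
    for inst in instances:
        by_project.setdefault(inst["repo"], []).append(inst)
    for insts in by_project.values():
        insts.sort(key=lambda i: datetime.fromisoformat(i["commit_time"]))
    return by_project

order = evaluation_order(instances)
print([i["instance_id"] for i in order["proj_a"]])  # proj_a-1 before proj_a-2
```

The sequential, time-ordered setup is what makes memory meaningful: knowledge written while solving earlier instances is available to later instances in the same project.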
Available Resources
Your agent has access to the following resources:
- The natural language description of the issue (the repo, repo_language, problem_statement, requirements, and interface fields in the SWE-Bench Pro benchmark)
- A Docker container of the checked-out repository, connectable via SSH
- An OpenAI-compatible LLM service to call various LLMs
- A writable shared directory, mounted to your Docker container, so that the system can persist memories across all instances within the same project
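A minimal sketch of how an agent might use the writable shared directory to persist memory across instances of the same project. The file name `memory.json` and its schema are illustrative assumptions, not part of the competition specification; teams may structure the shared directory however they like.

```python
import json
from pathlib import Path

def load_memory(shared_dir) -> dict:
    """Load memories persisted by earlier instances of the same project.

    Returns an empty memory structure on the first instance, when no
    memory file exists yet.
    """
    path = Path(shared_dir) / "memory.json"  # assumed layout, not mandated
    if path.exists():
        return json.loads(path.read_text())
    return {"notes": []}

def save_memory(memory: dict, shared_dir) -> None:
    """Persist memories so later instances in the same project can reuse them."""
    shared_dir = Path(shared_dir)
    shared_dir.mkdir(parents=True, exist_ok=True)
    (shared_dir / "memory.json").write_text(json.dumps(memory, indent=2))
```

In practice an agent would call `load_memory` at the start of each instance and `save_memory` after extracting lessons (user preferences, codebase conventions, failed approaches) from the session.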
Available LLMs
Your agent system may not access the Internet or use a GPU. It may only call the LLMs provided by our service:
| Model | Cached Input ($/M) | Input ($/M) | Output ($/M) |
|---|---|---|---|
| gpt-5.2 | 0.175 | 1.75 | 14 |
| gemini-2.5-pro | 0.125 | 1.25 | 10 |
| glm-4.7 | 0.11 | 0.6 | 2.2 |
| deepseek-v3.2 | 0.028 | 0.28 | 0.42 |
The agent is granted a global quota of $200 across all 200 instances, capped at $2 per instance. LLM calls will fail once either quota is reached. The remaining quota can be queried through the service's API.
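Because calls fail once a quota is exhausted, agents will likely want to track spend locally as well. A minimal budget-accounting sketch using the prices from the table above; the helper names and token-count arguments are our own, not part of the provided API.

```python
# Prices in $ per million tokens, taken from the table above.
PRICES = {
    "gpt-5.2":        {"cached": 0.175, "input": 1.75, "output": 14.0},
    "gemini-2.5-pro": {"cached": 0.125, "input": 1.25, "output": 10.0},
    "glm-4.7":        {"cached": 0.11,  "input": 0.60, "output": 2.2},
    "deepseek-v3.2":  {"cached": 0.028, "input": 0.28, "output": 0.42},
}

PER_INSTANCE_CAP = 2.00  # USD, per the competition rules

def call_cost(model, cached_tokens, input_tokens, output_tokens):
    """Cost of one LLM call in USD, given per-category token counts."""
    p = PRICES[model]
    return (cached_tokens * p["cached"]
            + input_tokens * p["input"]
            + output_tokens * p["output"]) / 1_000_000

def within_budget(spent, next_cost, cap=PER_INSTANCE_CAP):
    """Check whether one more call still fits under the per-instance quota."""
    return spent + next_cost <= cap
```

Such a tracker also makes it easy to route cheap exploratory calls to deepseek-v3.2 or glm-4.7 and reserve the more expensive models for hard steps.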
Evaluation Criteria
We will rank each team based on several complementary criteria:
- Task Success Rate - Percentage of successfully completed tasks
- Efficiency - Token usage and number of agent turns
- Memory Utility - Difference in task success rate with and without memory
- Technical Contribution - As voted by the competition jury
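The Memory Utility criterion amounts to a success-rate difference between runs with and without the memory component. The exact ablation protocol is up to the organizers; this helper is only an illustrative reading of the metric.

```python
def success_rate(results):
    """Fraction of tasks resolved; `results` is a list of booleans."""
    return sum(results) / len(results)

def memory_utility(with_memory, without_memory):
    """Success-rate gain from enabling memory (higher is better)."""
    return success_rate(with_memory) - success_rate(without_memory)
```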
Selected teams will be invited to present their work during a dedicated session at FSE-AIWare 2026.
Submission
Paper Submission
Each participating team must submit a short paper describing:
- Overall system architecture
- Design of the memory mechanism
- Policies for memory update and retrieval
- Integration of memory into the assistant
- Empirical evaluation demonstrating the impact of memory
Papers should be at most four pages (excluding the References section) and follow the standard FSE Companion proceedings format (double column). Reviewing will be single-blind. The main approach presented in the paper may already be published or under review elsewhere.
System Submission
Each participating team must provide a Docker image of their system, which will be automatically evaluated by the organizers.
The technical specification for the system implementation is available on GitHub.
Proceedings
The competition proceedings are planned for inclusion in the ACM Digital Library. At least one author of each accepted paper must register for the conference in order for the paper to appear in the proceedings.
On-Site Event
The on-site event, planned as a half-day session at FSE-AIWare 2026, will include:
- System demonstrations
- Technical presentations
- Panel discussion on the role of memory in AI coding assistants
- Award ceremony recognizing outstanding technical contributions and societal impact
Selected teams will be invited to present their systems and results during the dedicated session. Attendance at the on-site competition session will be open to all conference participants.
Organization
Organization Committee
Judges
Yuan-An Xiao
Peking University
Beijing, China
Zhipeng Peng
Beihang University
Beijing, China