FullStack-Agent: Enhancing Agentic Full-Stack Web Coding via Development-Oriented Testing and Repository Back-Translation
Abstract
A unified agent system called FullStack-Agent is introduced to assist non-expert users in developing complex interactive websites by addressing full-stack development challenges through enhanced planning, code editing, and self-improving capabilities.
Helping non-expert users develop complex interactive websites has become a popular task for LLM-powered code agents. However, existing code agents tend to generate only frontend web pages, masking the lack of real full-stack data processing and storage with fancy visual effects. Constructing production-level full-stack web applications is far more challenging than generating frontend web pages alone: it demands careful control of data flow, a comprehensive understanding of constantly updating packages and dependencies, and accurate localization of obscure bugs in the codebase. To address these difficulties, we introduce FullStack-Agent, a unified agent system for full-stack agentic coding that consists of three parts: (1) FullStack-Dev, a multi-agent framework with strong planning, code editing, codebase navigation, and bug localization abilities; (2) FullStack-Learn, a data-scaling and self-improving method that back-translates crawled and synthesized website repositories to improve the backbone LLM of FullStack-Dev; and (3) FullStack-Bench, a comprehensive benchmark that systematically tests the frontend, backend, and database functionalities of the generated website. FullStack-Dev outperforms the previous state-of-the-art method by 8.7%, 38.2%, and 15.9% on the frontend, backend, and database test cases, respectively. Additionally, FullStack-Learn raises the performance of a 30B model by 9.7%, 9.5%, and 2.8% on the three sets of test cases through self-improvement, demonstrating the effectiveness of our approach. The code is released at https://github.com/mnluzimu/FullStack-Agent.
Community
In this paper, we introduce FullStack-Agent, a unified system that combines a multi-agent full-stack development framework equipped with efficient coding and debugging tools (FullStack-Dev), an iterative self-improvement method that improves the abilities of LLMs through repository augmentation and back-translation (FullStack-Learn), and a full-stack development benchmark that comprehensively evaluates frontend, backend, and database functionalities (FullStack-Bench).
Extensive experiments demonstrate the effectiveness of our method. With Qwen3-Coder-480B-A35B-Instruct as the backbone LLM, FullStack-Dev reaches accuracies of 64.7%, 77.8%, and 77.9% on the frontend, backend, and database test cases of FullStack-Bench, outperforming the previous state-of-the-art method by 8.7%, 38.2%, and 15.9%, respectively. Training Qwen3-Coder-30B-Instruct with FullStack-Learn improves its accuracies on the same three sets of test cases by 9.7%, 9.5%, and 2.8%, respectively.
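As a rough illustration of how per-category accuracies like those above could be tallied, here is a minimal sketch. It is not the FullStack-Bench implementation: the `CaseResult` record, the category labels, and the `accuracy_by_category` helper are all hypothetical names introduced for this example, and it assumes each test case belongs to exactly one category and either passes or fails.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class CaseResult:
    """Hypothetical record for one benchmark test case (not from the paper)."""
    category: str  # assumed labels: "frontend", "backend", or "database"
    passed: bool


def accuracy_by_category(results):
    """Return the pass rate in percent for each category present in results."""
    totals = defaultdict(int)
    passes = defaultdict(int)
    for r in results:
        totals[r.category] += 1
        passes[r.category] += int(r.passed)
    # Only categories with at least one case appear, so no division by zero.
    return {c: 100.0 * passes[c] / totals[c] for c in totals}


if __name__ == "__main__":
    demo = [
        CaseResult("frontend", True),
        CaseResult("frontend", False),
        CaseResult("backend", True),
        CaseResult("database", False),
    ]
    for category, acc in accuracy_by_category(demo).items():
        print(f"{category}: {acc:.1f}%")
```

Reporting the three categories separately, rather than as one pooled score, keeps a strong frontend result from hiding weak backend or database behavior.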
