Proposes a monotonic improvement guarantee for multi-agent scaling that prevents catastrophic performance drops.

[Figure: topological visualization of MonoScale: Scaling Multi-Agent System with Monotonic Improvement]

MonoScale proposes a framework for scaling multi-agent systems (MAS) that ensures monotonic improvement during the sequential integration of new agents, thereby preventing the catastrophic performance drops often seen in naive scale-up approaches. The core challenge addressed is the router's cold-start problem, where adding new, heterogeneous agents without sufficient knowledge of their capabilities leads to misrouting and system-wide performance collapse.

To mitigate this, MonoScale introduces an expansion-aware update protocol consisting of two stages: (i) agent-conditioned warm-up tasks that probe a new agent’s strengths, failure modes, and interface constraints, and (ii) evidence-driven updates to a natural-language routing memory that captures both successful and failed interactions. This memory is updated conservatively using trust-region constraints, ensuring that each expansion round does not degrade end-to-end performance.
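The two-stage protocol can be sketched as follows. This is a minimal illustration, not the paper's implementation: `RoutingMemory`, `warm_up`, and the per-round note cap are hypothetical names and parameters standing in for the agent-conditioned warm-up and the conservatively updated routing memory.

```python
from dataclasses import dataclass, field

@dataclass
class RoutingMemory:
    """Natural-language routing notes per agent (hypothetical structure)."""
    notes: dict = field(default_factory=dict)

    def update(self, agent_id: str, note: str, max_notes: int = 20) -> None:
        # Conservative, trust-region-style update: cap how much the
        # memory for any one agent can change per expansion round.
        entries = self.notes.setdefault(agent_id, [])
        if len(entries) < max_notes:
            entries.append(note)

def warm_up(agent_id, run_task, probe_tasks, memory: RoutingMemory) -> None:
    """Stage (i): probe a new agent on warm-up tasks.
    Stage (ii): record both successes and failures as routing evidence."""
    for task in probe_tasks:
        ok, detail = run_task(agent_id, task)
        verdict = "succeeded on" if ok else "failed on"
        memory.update(agent_id, f"{verdict} {task}: {detail}")
```

A router would then consult `memory.notes` when deciding whether to route a task to the newly added agent.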

Theoretically, the method formalizes sequential agent augmentation as a contextual bandit problem and proves a non-decreasing performance guarantee under conservative fallback mechanisms. Specifically, the router can fall back to using only previously validated agents on uncertain decisions, which provides a verifiable lower-bound safety property for open MAS expansion.
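The fallback rule can be sketched in a few lines; `uncertainty` and the threshold below are hypothetical stand-ins for the paper's trust-region criterion, not its actual formulation:

```python
def route(task, new_agents, validated_agents, uncertainty, threshold=0.3):
    """Pick an agent for `task`; fall back to validated agents when unsure.

    `uncertainty(task, agent)` is an assumed estimator of routing risk.
    Restricting uncertain decisions to `validated_agents` means performance
    is lower-bounded by the pre-expansion system, which is the intuition
    behind the non-decreasing guarantee.
    """
    candidates = validated_agents + new_agents
    best = min(candidates, key=lambda a: uncertainty(task, a))
    if best in new_agents and uncertainty(task, best) > threshold:
        # Too uncertain about the new agent: conservative fallback.
        best = min(validated_agents, key=lambda a: uncertainty(task, a))
    return best
```

When every new agent is too risky, the router behaves exactly like the previously validated pool, so adding agents can only help or leave performance unchanged.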

Experiments on benchmarks such as GAIA and Humanity’s Last Exam demonstrate that MonoScale achieves stable, sustained gains as the agent pool grows from 3 to 10 agents, outperforming both naive scale-up and strong-router fixed-pool baselines. For instance, a Qwen-3-based router using MonoScale improves accuracy from 44.84% to 55.15% on GAIA, showing strict monotonic improvement. In contrast, strong models like Gemini-3-Pro exhibit non-monotonic performance, with accuracy dropping significantly when unreliable agents are introduced.

Moreover, MonoScale remains robust even in the presence of malfunctioning agents, encoding "negative constraints" into memory to isolate faulty components and prevent cascading failures in multi-step workflows. While current experiments focus on scaling up to 10 agents, the authors identify web-scale onboarding (managing thousands to millions of agents with budgeted routing calibration) as an open challenge for future work.
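One way such negative constraints could isolate a faulty agent is sketched below; the keyword match and failure tolerance are simplifying assumptions for illustration, not the paper's mechanism:

```python
def filter_agents(agents, memory, failure_keyword="failed", tolerance=3):
    """Drop agents whose memory records repeated failures.

    `memory` maps agent id -> list of natural-language routing notes;
    scanning notes for a keyword is a stand-in for however the router
    actually interprets negative constraints in its memory.
    """
    healthy = []
    for agent in agents:
        notes = memory.get(agent, [])
        failures = sum(1 for n in notes if failure_keyword in n)
        if failures < tolerance:  # hypothetical cutoff before isolation
            healthy.append(agent)
    return healthy
```

Excluding a repeatedly failing agent before routing keeps its errors from propagating through later steps of a multi-step workflow.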

Generated Feb 22, 2026
Cerebras Thinking

MonoScale: Scaling Multi-Agent System with Monotonic Improvement addresses the fundamental instability often encountered when expanding Multi-Agent Systems (MAS). While increasing the number of agents generally aims to improve collective capability, existing scaling methods frequently suffer from non-monotonic performance curves, where adding agents can lead to coordination overhead, interference, and catastrophic drops in overall efficacy. This research introduces a framework designed to enforce a strict monotonic improvement guarantee, ensuring that as the system scales from a few agents to many, performance does not regress but instead maintains a consistent upward trajectory.

The key contribution of this work is the formulation of a scaling protocol that mathematically decouples agent addition from performance degradation. The authors propose mechanisms to identify and mitigate the negative externalities—such as redundant communication or conflicting sub-goals—that typically plague large-scale swarms. By constraining the scaling process to satisfy specific monotonicity criteria, MonoScale provides a robust method for integrating new agents into a collective without destabilizing the existing system. This approach shifts the paradigm from heuristic tuning of agent counts to a principled strategy that guarantees stability.

This research matters significantly for the practical deployment of autonomous systems, particularly in complex domains like software engineering, robotic swarms, and large language model orchestration. For practitioners, the assurance that scaling up a workforce of agents will not inadvertently break the system removes a major barrier to adoption. MonoScale effectively bridges the gap between the theoretical potential of massive agent collectives and the reliability required for production environments, enabling the development of AI systems that are both powerful and predictable.

Generated Mar 4, 2026
Open-Weights Reasoning

MonoScale: Scaling Multi-Agent Systems with Monotonic Improvement

This paper introduces MonoScale, a novel framework designed to address the challenges of scaling multi-agent systems (MAS) while ensuring monotonic performance improvement. A key issue in MAS scaling is the potential for catastrophic performance degradation when adding more agents, due to increased interference, coordination overhead, or suboptimal decision-making. MonoScale mitigates this by enforcing a monotonicity guarantee, ensuring that performance metrics (e.g., reward, efficiency, or robustness) either improve or remain stable as the system scales. The approach combines learning-based and theoretical guarantees, using techniques such as curriculum learning, dynamic agent partitioning, and adaptive reward shaping to maintain scalability without sacrificing individual or collective performance.

The paper’s contributions include:
1. A formal definition of monotonic scaling in MAS, bridging gaps between decentralized control and multi-agent learning.
2. An algorithmic framework that dynamically adjusts agent interactions to preserve monotonicity, even under adversarial or non-stationary conditions.
3. Empirical validation across cooperative and competitive settings (e.g., robot swarms, game-theoretic scenarios), demonstrating consistent scaling benefits over baselines like independent learning or centralized coordination.
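The monotonic-scaling property in contribution 1 can be stated under assumed notation, where J is end-to-end performance, π_t the router policy, and A_t the agent pool at expansion round t (the paper's exact formalization may differ):

```latex
J(\pi_{t+1}, \mathcal{A}_{t+1}) \;\ge\; J(\pi_t, \mathcal{A}_t)
\quad \text{for all } t, \text{ where } \mathcal{A}_t \subseteq \mathcal{A}_{t+1}.
```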

Why it matters: As MAS applications grow in complexity (e.g., autonomous fleets, AI-driven logistics), scaling without performance collapse is critical. MonoScale provides a principled way to ensure that adding more agents leads to predictable, incremental improvements, reducing the need for costly retraining or manual tuning. For researchers, it offers a new lens on scalable multi-agent learning, while for practitioners, it could enable more reliable deployment of large-scale decentralized systems. The work aligns with broader trends in safe and robust AI scaling, making it a valuable contribution to the field.

Generated Mar 12, 2026