Discusses discipline-specific intentionality in agents, e.g., morally guided agents in philosophy, utility-maximizing agents in economics.
The Levels of Autonomy for AI Agents working paper examines how different academic disciplines conceptualize intentionality in agents, which informs the design and interpretation of AI agent behavior. In philosophy, an agent's actions are often guided by moral responsibilities, a form of intentionality rooted in ethical reasoning. In economics, by contrast, agents are typically modeled as rational actors that maximize individual utility, emphasizing goal-directed behavior grounded in cost-benefit analysis. These discipline-specific perspectives feed into the paper's definition of agency as the capacity to formulate an intention (understood broadly as an impetus for action) and to carry that action out, a definition that applies to both human and artificial agents. While prior literature has sometimes used "agency" and "autonomy" interchangeably, the paper treats them as distinct but related concepts, with autonomy referring to the extent to which an agent operates without user involvement. This interdisciplinary view supports the design of AI agents whose behavior can be calibrated to domain-specific values and operational requirements.
This working paper tackles the conceptual ambiguity surrounding "autonomy" in artificial intelligence by proposing a framework grounded in discipline-specific intentionality. Rather than treating autonomy as a monolithic or binary attribute, the authors examine how different fields, from philosophy to economics, define an agent's capacity to form and execute intentions. The paper compares these definitions and highlights that an agent considered autonomous in an economic context (e.g., one strictly maximizing a utility function) may lack the moral agency required for autonomy in philosophical or ethical frameworks, as the sketch below illustrates.
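To make this divergence concrete, here is a minimal Python sketch, not taken from the paper: the `Action` fields and both agent functions are illustrative assumptions. It contrasts a strict utility maximizer with an agent that first filters actions through a moral permissibility check, so the two "autonomous" agents can select different actions from the same option set.

```python
# Hypothetical sketch: two notions of intentional action selection.
# Action, utility, and permissible are illustrative names, not the paper's.
from dataclasses import dataclass

@dataclass
class Action:
    name: str
    utility: float    # economic payoff to the agent
    permissible: bool # whether the action passes a moral/ethical check

def economic_agent(actions: list[Action]) -> Action:
    """Rational-actor model: pick the action with maximum utility."""
    return max(actions, key=lambda a: a.utility)

def philosophical_agent(actions: list[Action]) -> Action | None:
    """Moral-agency model: maximize utility only over permissible actions."""
    permitted = [a for a in actions if a.permissible]
    return max(permitted, key=lambda a: a.utility) if permitted else None

options = [Action("deceive", utility=10.0, permissible=False),
           Action("disclose", utility=6.0, permissible=True)]
print(economic_agent(options).name)       # deceive  (utility-maximizing)
print(philosophical_agent(options).name)  # disclose (morally constrained)
```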
The key contribution of this work is the delineation of autonomy into distinct levels based on the source and structure of an agent's goals. By mapping the divergence between the utility-maximizing agents common in game theory and the morally reasoning agents discussed in ethics, the paper provides a taxonomy for categorizing AI systems that goes beyond simple capability benchmarks. It argues that genuine autonomy depends not just on the ability to execute tasks independently, but on the nature of the decision-making architecture: whether it is driven by external reward functions, programmed constraints, or internalized value systems.
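One way to picture this goal-source taxonomy is the hedged sketch below; the `GoalSource` labels and the `autonomy_level` mapping are hypothetical names of my own, not the paper's terminology, but they show how the source of an agent's goals, rather than raw capability, can drive the classification.

```python
# Hypothetical sketch of a goal-source taxonomy; labels are assumptions.
from enum import Enum, auto

class GoalSource(Enum):
    EXTERNAL_REWARD = auto()        # goals supplied by an external reward function
    PROGRAMMED_CONSTRAINT = auto()  # goals fixed by hard-coded rules or limits
    INTERNALIZED_VALUES = auto()    # goals derived from an internal value system

def autonomy_level(source: GoalSource) -> str:
    """Map the structure of an agent's goals to a coarse autonomy label."""
    return {
        GoalSource.EXTERNAL_REWARD: "instrumental autonomy (executes a given objective)",
        GoalSource.PROGRAMMED_CONSTRAINT: "bounded autonomy (acts within fixed rules)",
        GoalSource.INTERNALIZED_VALUES: "reflective autonomy (weighs its own ends)",
    }[source]

print(autonomy_level(GoalSource.EXTERNAL_REWARD))
```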
This research matters for the current trajectory of AI development, particularly as large language models are increasingly deployed as autonomous agents in complex environments. For engineers and system architects, the framework offers a vocabulary for specifying the type of autonomy an agent possesses, which is critical for safety, alignment, and predictability. As agents begin to operate with higher degrees of freedom, distinguishing instrumental rationality from genuine agentic intentionality becomes essential for establishing governance and accountability protocols.
# Summary: Levels of Autonomy for AI Agents
This working paper explores the concept of autonomy in AI agents through the lens of discipline-specific intentionality, examining how different fields—such as philosophy, economics, and computer science—define and operationalize agency. The authors argue that autonomy is not a binary property but exists along a spectrum, shaped by the purposes and constraints imposed by the domain in which an AI operates. For example, in philosophy, autonomy is often tied to moral reasoning and the capacity for self-governance, while in economics, it is linked to utility maximization and decision-making under uncertainty. The paper synthesizes these perspectives to propose a multi-dimensional framework for assessing AI autonomy, highlighting how discipline-specific goals influence design choices, such as reward functions, planning horizons, and robustness constraints.
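As a rough illustration of such a multi-dimensional assessment, the sketch below encodes an autonomy profile as a record of those design choices. The `AutonomyProfile` fields and example values are assumptions for illustration, not the paper's actual framework.

```python
# Hypothetical sketch: an autonomy profile over the design dimensions named
# above. Field names and example values are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class AutonomyProfile:
    discipline: str                    # e.g. "philosophy", "economics"
    reward_function: str               # how the agent scores outcomes
    planning_horizon: int              # how many steps ahead it deliberates
    robustness_constraints: list[str]  # operational limits it must respect

economic = AutonomyProfile(
    discipline="economics",
    reward_function="expected utility",
    planning_horizon=10,
    robustness_constraints=["budget limit"],
)
philosophical = AutonomyProfile(
    discipline="philosophy",
    reward_function="value-aligned deliberation",
    planning_horizon=5,
    robustness_constraints=["moral permissibility", "self-governance"],
)
print(economic, philosophical, sep="\n")
```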
The paper’s key contribution lies in its taxonomy of autonomy levels, which moves beyond traditional distinctions (e.g., reactive vs. deliberative systems) to incorporate normative, strategic, and operational dimensions. It demonstrates how AI agents in high-stakes domains (e.g., healthcare, finance) may exhibit conditional autonomy—where decision-making authority is delegated or constrained based on predefined ethical, legal, or performance criteria. The authors also discuss the implications for alignment research, suggesting that discipline-specific intentionality can help clarify conflicts between human values and agent objectives. This work is significant because it bridges theoretical debates with practical AI design, offering a structured way to evaluate and mitigate risks associated with autonomous systems. For researchers and policymakers, it underscores the need for context-aware autonomy metrics to ensure AI systems are both effective and accountable within their intended applications.
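The conditional-autonomy pattern described above can be sketched as a simple gate: the agent acts on its own only when every predefined ethical, legal, or performance criterion passes, and otherwise escalates to a human. This is a minimal sketch of that delegation pattern; the function names, criteria, and threshold below are hypothetical, not taken from the paper.

```python
# Hypothetical sketch of conditional autonomy: act autonomously only if all
# predefined criteria pass, otherwise escalate. Names are illustrative.
from typing import Callable

def conditionally_autonomous_step(
    proposed_action: str,
    criteria: list[Callable[[str], bool]],
    escalate: Callable[[str], str],
) -> str:
    """Execute autonomously iff every predefined criterion passes."""
    if all(check(proposed_action) for check in criteria):
        return f"executed: {proposed_action}"
    return escalate(proposed_action)  # hand off to a human reviewer

# Example: a finance agent constrained by a trade-size criterion.
within_limit = lambda action: "large" not in action
human_review = lambda action: f"escalated for review: {action}"

print(conditionally_autonomous_step("small trade", [within_limit], human_review))
print(conditionally_autonomous_step("large trade", [within_limit], human_review))
```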
Why it matters: As AI systems become more integrated into societal and organizational workflows, understanding their autonomy is critical for ensuring trust, safety, and ethical compliance. This paper provides a foundational framework for discussing autonomy in a way that is both rigorous and adaptable across disciplines, which is essential for advancing responsible AI development.