Research Index

Papers on Agent Architectures

A categorized index of 227 research papers in 12 categories, covering agent architectures, execution loops, multi-agent coordination, communication protocols, and deployment patterns. Last updated March 2026.
A. Agent Architecture Surveys (17 papers)

AI Agent Systems: Architectures, Applications, and Evaluation (2026)

Unified taxonomy: policy core, memory, planners, tool routers, critics. "Agent transformer" abstraction.

Agentic AI: Architectures, Taxonomies, and Evaluation of LLM Agents (2026)

Taxonomy: Perception, Brain, Planning, Action, Tool Use, Collaboration.

Agentic AI: Comprehensive Survey of Architectures, Applications, Future Directions (2025)

Dual paradigm: Symbolic/Classical vs Neural/Generative. PRISMA-based review of 90 studies.

The Landscape of Emerging AI Agent Architectures for Reasoning, Planning, and Tool Calling (2024)

Single vs multi-agent. Vertical vs horizontal. Planning/execution/reflection.

Agentic AI Frameworks: Architectures, Protocols, and Design Challenges (2025)

Protocol-focused: MCP, A2A, ACP, ANP, Agora.

B. Agent Execution Loops & Reasoning (19 papers)

ReAct: Synergizing Reasoning and Acting in Language Models (2022)

The canonical reason-then-act pattern. 34% absolute success-rate gain on ALFWorld over imitation and reinforcement learning baselines.

Reflexion: Language Agents with Verbal Reinforcement Learning (2023)

Self-reflection as verbal reinforcement.

Tree of Thoughts: Deliberate Problem Solving with Large Language Models (2023)

Tree-based search over reasoning paths.

Language Agent Tree Search (LATS) (2023)

MCTS for LLM agents.

Self-Refine: Iterative Refinement with Self-Feedback (2023)

Iterative feedback-and-refine loop run by a single LLM, with no additional training.
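
The reason-act-observe cycle that the ReAct entry above refers to can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: `scripted_policy` is a hypothetical stand-in for a language model, and the `lookup` tool and its data are invented for the example.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool registry; the tool name and its data are illustrative only.
TOOLS: Dict[str, Callable[[str], str]] = {
    "lookup": lambda key: {"capital_of_france": "Paris"}.get(key, "unknown"),
}

Policy = Callable[[List[str]], Tuple[str, str, str]]


def scripted_policy(history: List[str]) -> Tuple[str, str, str]:
    """Stand-in for an LLM call: returns (thought, action, argument)."""
    observations = [h for h in history if h.startswith("Observation: ")]
    if not observations:
        return ("I need to look this up.", "lookup", "capital_of_france")
    answer = observations[-1][len("Observation: "):]
    return ("I have what I need.", "finish", answer)


def react_loop(question: str, policy: Policy, max_steps: int = 5) -> Tuple[str, List[str]]:
    """Alternate thought, action, and observation until the policy finishes."""
    history = [f"Question: {question}"]
    for _ in range(max_steps):
        thought, action, arg = policy(history)
        history.append(f"Thought: {thought}")
        history.append(f"Action: {action}[{arg}]")
        if action == "finish":
            return arg, history
        # Run the chosen tool and feed its output back as an observation.
        history.append(f"Observation: {TOOLS[action](arg)}")
    return "no answer within step budget", history


answer, trace = react_loop("What is the capital of France?", scripted_policy)
print("\n".join(trace))
```

In a real agent the policy would prompt a model with the accumulated history; the step budget is one simple way to bound the loop.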

C. Multi-Agent Coordination & Orchestration (18 papers)

The Orchestration of Multi-Agent Systems: Architectures, Protocols, and Enterprise Adoption (2026)

Unified framework: planning, policy, state, quality. MCP + A2A.

Multi-Agent Collaboration via Evolving Orchestration (2025)

Centralized puppeteer orchestrator. Dynamic agent selection.

Towards a Science of Scaling Agent Systems (2025)

Quantitative scaling. Independent/decentralized/centralized/hybrid compared.

AdaptOrch: Task-Adaptive Multi-Agent Orchestration in the Era of LLM Performance Convergence (2026)

Topology selection as function of task dependency. References Claude Code Agent Teams.

AgentOrchestra: Hierarchical Multi-Agent Framework for General-Purpose Task Solving (2025)

Central planning agent + specialized sub-agents.

D. Agent Communication Protocols (7 papers)

Model Context Protocol (MCP) (2024)

Agent-to-tool standard. Client-host-server. JSON-RPC 2.0.

Agent-to-Agent (A2A) Protocol (2025)

Peer coordination, negotiation, delegation. Agent cards.

A Survey of AI Agent Protocols (2025)

Classification of MCP, A2A, ACP, ANP.

Agent Interoperability Protocols Survey (2025)

MCP vs ACP vs A2A vs ANP.

Agora Protocol (2025)

Meta-coordination layer. Protocol Documents for protocol selection.
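
As a rough illustration of the JSON-RPC 2.0 framing mentioned in the MCP entry above, the sketch below builds a tool-call request and checks that a response belongs to it. The `jsonrpc`, `id`, `method`, `params`, and `result` fields come from JSON-RPC 2.0; the helper functions, the `get_weather` tool, and the result payload are assumptions made up for this example, not part of any SDK.

```python
import json
from typing import Any, Dict


def make_tool_call(request_id: int, tool_name: str, arguments: Dict[str, Any]) -> str:
    """Serialize a JSON-RPC 2.0 request asking a server to run a tool."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


def make_result(request_id: int, result: Dict[str, Any]) -> str:
    """Serialize a JSON-RPC 2.0 success response for the given request id."""
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "result": result})


def matches(request_json: str, response_json: str) -> bool:
    """A response answers a request when both carry the same id."""
    return json.loads(request_json)["id"] == json.loads(response_json)["id"]


# Hypothetical tool name and payload, for illustration only.
request = make_tool_call(7, "get_weather", {"city": "Oslo"})
response = make_result(7, {"content": [{"type": "text", "text": "4 C"}]})
print(request)
print(matches(request, response))
```

Matching on `id` is what lets a client keep several requests in flight over one connection.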

E. Self-Improving & Self-Evolving Agents (14 papers)

EvolveR: Self-Evolving LLM Agents through an Experience-Driven Lifecycle (2025)

Offline Self-Distillation -> Online Interaction.

SAGE: RL for Self-Improving Agent with Skill Library (2025)

Sequential Rollout. Skill-integrated Reward. 8.9% improvement.

Truly Self-Improving Agents Require Intrinsic Metacognitive Learning (2025)

Extrinsic vs intrinsic metacognition.

Building Self-Evolving Agents via Experience-Driven Lifelong Learning (2025)

Four principles: exploration, memory, skill transfer, planning.

Self-Evolving AI Agents Survey: Path to ASI (2025)

Comprehensive taxonomy. Intra vs inter-test-time learning.

F. Agent Memory Systems (21 papers)

Hindsight: Building Agent Memory that Retains, Recalls, and Reflects (2026)

Four networks, three operations. 91.4% on LongMemEval.

MAGMA: Multi-Graph Agentic Memory Architecture (2026)

Semantic, temporal, causal, entity graphs. Policy-guided traversal.

Multi-Agent Memory as Computer Architecture (2026)

Shared vs distributed. Three-layer hierarchy. Two protocol gaps.

Anatomy of Agentic Memory: Taxonomy and Empirical Analysis (2026)

Comprehensive taxonomy. Evaluation limitations.

D-MEM: Biologically Inspired Architecture (2026)

Critic Router. Reward Prediction Error gating. SKIP/CONSTRUCT/EVOLVE.

G. Agent Tool Use (30 papers)

Augmented Language Models: A Survey (2023)

Comprehensive tool-augmented LLM survey.

ToolLLM: Facilitating LLMs to Master 16000+ Real-World APIs (2023)

Large-scale tool use benchmark.

Tool Learning with LLMs: A Survey (2025)

Updated tool learning survey.

MCP-Bench: Benchmarking Tool-Using LLM Agents (2025)

MCP-based tool use benchmark.

BFCL v3: Multi-Turn API Workflows (2025)

AST-analysis for function calling validation.

H. Agent Evaluation & Benchmarks (24 papers)

Evaluation and Benchmarking of LLM Agents: A Survey (2025)

Taxonomy: what to evaluate + how to evaluate.

Survey on Evaluation of LLM-based Agents (2025)

Comprehensive evaluation methods survey.

Beyond Task Completion: Assessment Framework for Evaluating Agentic AI (2025)

Four pillars: LLM, Memory, Tools, Environment.

AgentArch: Comprehensive Benchmark for Enterprise Agent Architectures (2025)

Enterprise-focused. Orchestration + memory + tool interaction.

SWE-bench: Can Language Models Resolve Real-World GitHub Issues? (2023)

Software engineering benchmark.

I. Agent Safety & Security (27 papers)

AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents (2024)

110 harmful behaviors, 11 categories, 440 tasks.

Agent-SafetyBench: Evaluating Safety of LLM Agents (2024)

349 environments, 2000 test cases, 8 risk categories. No evaluated agent scores above 60% on safety.

Evolution of Agentic AI in Cybersecurity: From Single LLM to Autonomous Pipelines (2025)

Five-generation taxonomy.

AgentPoison: Red-Teaming LLM Agents via Poisoning Memory (2024)

Memory/knowledge base poisoning attacks.

BlockAgents: Byzantine-Robust LLM Coordination via Blockchain (2024)

Blockchain for coordination trust.

J. Enterprise & Production (14 papers)

Context Engineering: From Prompts to Corporate Multi-Agent Architecture (2026)

Four-level maturity: Prompt -> Context -> Intent -> Specification.

A Practical Guide for Designing, Developing, and Deploying Production-Grade Agentic AI Workflows (2025)

End-to-end guide. MCP, orchestration, observability.

Evaluation-Driven Development of LLM Agents: Process Model and Reference Architecture (2025)

TDD/BDD-inspired continuous evaluation.

PwC Agent OS (2025)

Enterprise multi-agent coordination switchboard.

Accenture Trusted Agent Huddle (2025)

Cross-organizational governance.

K. Domain-Specific Agents (22 papers)

From AI for Science to Agentic Science: Survey on Autonomous Scientific Discovery (2025)

Five pillars: planning, tools, memory, collaboration, evolution.

The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery (2024)

End-to-end scientific research agent.

Autonomous Chemical Research with Large Language Models (2023, Nature)

Chemistry agent with real lab integration.

SWE-Agent: Agent-Computer Interfaces Enable Automated Software Engineering (2024)

SE-specific agent design.

Agentic AI Applied to Financial Services (2025)

Financial services agent crews.

L. Context Management & Compaction (14 papers)

ACON: Natural-Language Compression Guidelines (2025)

54% context reduction without quality loss.

CISM: Compact Semantic Representations for Long-Horizon Execution (2025)

Condenses reasoning steps into compact form.

LLMLingua: Compressing Prompts for Accelerated Inference (2023)

Prompt compression preserving semantics.

Context Compression Strategies: OpenAI, Anthropic, and Factory Compared (2026)

36,000+ messages from real agentic sessions.

Context Engineering Pyramid: From Prompts to Corporate Architecture (2026)

Four-level maturity model. Context as engineered system.
