Trustworthy collaboration between multiple LLM agents

Reference number: STP25-0161
Project leader: Raza, Shahid
Start and end dates: 2026-08-01 to 2031-07-31
Amount granted: SEK 5,000,000
Administering organisation: RISE Research Institutes of Sweden, Stockholm
Research area: Information, communication and systems technology

Summary

Although multi-agent LLM systems have shown promise for complex reasoning and planning, current methods are largely heuristic, difficult to evaluate systematically, and inadequate with respect to security, robustness, and privacy. This project aims to lay a foundation for trustworthy multi-agent collaboration by designing systems in which agents can collaborate effectively while remaining secure, privacy-preserving, and accountable. The operational plan comprises five steps: a) establish a reproducible experimental framework for multi-agent collaboration, b) advance learning-based coordination, c) enable privacy-preserving collaboration without harming utility, d) incorporate defenses against poisoning and toolchain attacks, and e) build scalable, verifiable multi-agent ecosystems that support auditing, accountability, and robustness. Expected outcomes include open and reproducible benchmarks for multi-LLM collaboration, algorithms and frameworks for constraint-aware and privacy-preserving coordination, systematic methods for attack-defense evaluation, and scalable agent architectures with verifiable decision records. The project's broader impact is to define a new standard for trustworthy multi-LLM agent systems. By unifying learning, planning, security, and privacy in a single, reproducible framework, the project will contribute foundational methods applicable to advanced AI systems deployed in critical and complex environments.

Popular science description

Scientists are currently exploring how groups of smart computer programs, called multi-agent systems, can work together to tackle hard problems, e.g., autonomous traffic management in modern intelligent cities. Right now, these systems often rely on rough tricks that are hard to test, and they raise worries about safety, privacy, and reliability. This project aims to build a solid, trustworthy foundation for how multiple AI agents can cooperate safely and effectively.

The plan for this project has five key steps:
1) Create a clear, repeatable way to test and compare how agent teams collaborate.
2) Improve how agents learn to coordinate with each other, so teamwork is smoother and smarter.
3) Allow agents to work together privately without losing usefulness or performance.
4) Develop defenses against clever attacks that try to poison the system or tamper with the tools the agents use.
5) Build large, scalable systems that can be audited, checked for accountability, and verified to be robust.

What the project hopes to achieve:
a) Open, reproducible benchmarks for measuring how well multiple AI agents collaborate.
b) New methods for coordination that respect privacy and work well under constraints.
c) Standardized tests and tools to evaluate how well agents resist attacks.
d) Scalable architectures with clear records of decisions, so people can audit what happened.

Big picture impact: This work could set a new standard for trustworthy multi-AI systems. By blending learning, planning, safety, and privacy in a transparent, reproducible framework, it aims to provide reliable tools for real-world use, from critical decision-making to complex automated tasks, where safety and accountability are crucial.