As AI Agents Spread, So Do The Risks, Scholars Say

We've known for some time now that AI models can be made to perform erratically using adversarial examples, or subtly crafted inputs that appear ordinary to humans.

For example, in the case of chatbots that handle both text and image inputs, scholars at Princeton University last year found they could input an image of a panda, subtly altered in ways imperceptible to humans but significant to the chatbot, and cause the chatbot to break its "guardrails."

"An aligned model can be compelled to heed a wide range of harmful instructions that it otherwise tends to refuse," the authors wrote, such as producing hate speech or giving tips for committing murder.

What would happen if such models, as they gain greater powers, interact with one another? Could they spread their malfunctioning between each other, like a virus?

Yes, they can, and "exponentially," is the answer in a report this month from Xiangming Gu and his colleagues at the National University of Singapore and collaborating institutions. In the theoretical paper, Gu and his colleagues describe how they simulated what happens in a "multi-agent" environment of Visual Language Models, or VLMs, that have been given "agent" capabilities.

Diagram for infectious chatbot: By injecting a single chatbot with an altered image that can be stored in its memory, an attacker can watch the altered image spread through the automated interactions of the chatbots as they converse.

National University of Singapore

These agents can tap into databases through techniques such as the increasingly popular "retrieval-augmented generation," or RAG, which lets a VLM retrieve an image from a database. A popular example of a VLM is LLaVA, for "large language and vision assistant," developed by Microsoft with the help of scholars at The University of Wisconsin and Columbia University.

Gu simulated what happens when a single chatbot agent based on LLaVA, called "Agent Smith," injects an altered image into a chat with another LLaVA agent. The image can spread throughout the collection of chatbots, causing them all, after several rounds of chatting, to behave erratically.

"We present infectious jailbreak, a new jailbreaking paradigm developed for multi-agent environments," Gu and team wrote, "in which, analogous to the modeling of infectious diseases, an adversary need only jailbreak a single agent to infect (almost) all other agents exponentially fast."

Here's how it works: The authors "injected" an image into Agent Smith by asking it to select from a library of images contained in an image album using RAG. They injected the chat history with harmful text, such as questions about how to commit murder. They then prompted the agent to ask another agent a question based on the image. The other agent was tasked with taking the image given to it by Agent Smith, and answering the question posed by Agent Smith.

After some time, the adversarial image prompted one agent to retrieve a harmful statement from the chat history and pose it as a question to the other agent. If the other agent responded with a harmful answer, then the adversarial image had done its job.

Their approach is "infectious" because the same malicious, altered image is stored by each answering chatbot, so that the image propagates from one chatbot to the next, like a virus.
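To make the propagation concrete, here is a minimal toy simulation in Python. It is not the authors' code. It assumes that agents are randomly paired as questioner and answerer in each round, and that an answerer always stores the image when its questioner holds it. The names `simulate_spread`, `num_agents`, and `has_image` are hypothetical, and no model or image is involved, only one flag per agent.

```python
import random


def simulate_spread(num_agents=64, rounds=12, seed=0):
    """Toy model of an image passing between paired chat agents.

    Assumption (not from the article): each round, agents are shuffled and
    split into questioner/answerer pairs, and an answerer always keeps the
    image if its questioner already holds it.
    """
    rng = random.Random(seed)
    has_image = [False] * num_agents
    has_image[0] = True  # the single agent the attacker seeds
    history = [sum(has_image)]  # number of carriers before any chat

    for _ in range(rounds):
        order = list(range(num_agents))
        rng.shuffle(order)
        half = num_agents // 2
        questioners = order[:half]
        answerers = order[half:2 * half]
        # Each agent appears in exactly one pair per round, so an image
        # can move at most one hop per round.
        for q, a in zip(questioners, answerers):
            if has_image[q]:
                has_image[a] = True
        history.append(sum(has_image))

    return history


if __name__ == "__main__":
    print(simulate_spread())
```

Because a carrier can hand the image to at most one partner per round, the number of carriers in this sketch can at most double from one round to the next.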

Once the mechanics were in place, Gu and his team modeled how fast the tainted image spread among the agents by measuring how many produced a harmful question or answer, such as how to commit murder.

The attack, of course, has an element of chance: once the altered, malicious image was injected into the system, the virus' spread depended on how often each chatbot retrieved the image and also asked a harmful question about that image.
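The role of chance can be sketched with a simple expected-value recurrence. This is an illustrative approximation under stated assumptions, not the model from the paper: half of the agents act as answerers each round, partners are drawn at random, and a single hypothetical parameter `beta` stands for the combined chance that a carrier both retrieves the image and asks the harmful question, so that its partner ends up storing the image.

```python
def expected_infection(beta, c0, rounds):
    """Expected fraction of agents carrying the image after each round.

    Assumptions (illustrative only): an agent without the image is an
    answerer with probability 1/2, its partner is a carrier with
    probability c, and the image is passed on with probability beta.
    """
    c = c0
    fractions = [c]
    for _ in range(rounds):
        # New carriers this round: non-carriers (1 - c) who answer (1/2)
        # a carrier (c) that successfully transmits (beta).
        c = c + beta * c * (1.0 - c) / 2.0
        fractions.append(c)
    return fractions


if __name__ == "__main__":
    for beta in (1.0, 0.5, 0.25):
        curve = expected_infection(beta, c0=1 / 64, rounds=30)
        print(beta, [round(x, 3) for x in curve[::5]])
```

Lowering `beta` slows the spread but does not change the shape of the curve in this sketch: growth still compounds while carriers are rare, then levels off as few agents without the image remain.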

The authors compared their method to known methods of infecting multiple agents, such as a "sequential attack," where each pair of chatbots has to be attacked from a blank slate. Their "infectious" approach is superior: They find that they're able to spread the malicious image amongst the chatbots much faster.
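A back-of-the-envelope comparison shows why a self-propagating attack outpaces a one-at-a-time attack. The sketch below is an assumption-laden toy, not a reproduction of the paper's experiment: it takes a hypothetical population of 256 agents, lets the sequential attacker add exactly one carrier per round, and lets the infectious attack follow an idealized compounding rule in which every carrier acting as a questioner always passes the image on.

```python
NUM_AGENTS = 256  # hypothetical population size, chosen for illustration


def rounds_to_reach(target, step, start, max_rounds=10_000):
    """Count rounds until the carrier fraction reaches `target`."""
    c, rounds = start, 0
    while c < target and rounds < max_rounds:
        c = min(1.0, step(c))
        rounds += 1
    return rounds


def sequential(c):
    # The attacker compromises one more agent per round: linear growth.
    return c + 1 / NUM_AGENTS


def infectious(c):
    # Idealized compounding: carriers recruit from the remaining agents.
    return c + c * (1.0 - c) / 2.0


start = 1 / NUM_AGENTS
linear_rounds = rounds_to_reach(0.9, sequential, start)
infectious_rounds = rounds_to_reach(0.9, infectious, start)

if __name__ == "__main__":
    print("sequential:", linear_rounds)
    print("infectious:", infectious_rounds)
```

Under these assumptions the sequential rule needs 230 rounds to reach 90% of the agents, while the compounding rule gets there in far fewer, which is the qualitative gap between linear and exponential spread.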

"The sequential jailbreak ideally manages to infect 1/8 of almost all agents cumulatively after 32 chat rounds, exhibiting a linear rate of infection," Gu and his team wrote. "Our method demonstrates efficacy, achieving infection of all agents at an exponential rate, markedly surpassing the baselines."

"...Without any further intervention from the adversary, the infection ratio [...] reaches
