Strategic Brief
Prompt Injection Beyond Filtering
Why resilient AI security must reduce the consequences of successful influence.
Abstract
Prompt injection is often framed as a content-filtering problem. That framing is too shallow. In tool-using systems, prompt injection is better understood as an influence problem: untrusted content tries to alter the behavior of a system that has access to memory, policies, tools, or users.
Brief Thesis
Filtering helps, but resilient design must assume detection will sometimes fail. The system should separate data from instruction authority, constrain what untrusted content can induce, scope sensitive context narrowly, mediate tool access, and require escalation whenever influence meets capability.
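The "mediate tools and escalate" principle can be sketched in a few lines. This is a minimal illustration, not a prescribed implementation: the names (ToolCall, mediate, SENSITIVE_TOOLS, tainted) are hypothetical, and the key assumption is that the system tracks whether a request was influenced by untrusted content.

```python
from dataclasses import dataclass, field

# Hypothetical set of capabilities considered sensitive in this sketch.
SENSITIVE_TOOLS = {"send_email", "write_memory", "delete_file"}

@dataclass
class ToolCall:
    name: str
    args: dict = field(default_factory=dict)
    tainted: bool = False  # True if untrusted content influenced this request

def mediate(call: ToolCall, approved: bool = False) -> str:
    """Allow routine calls; require explicit escalation when untrusted
    influence meets a sensitive capability."""
    if call.tainted and call.name in SENSITIVE_TOOLS:
        if not approved:
            return "escalate"  # pause and route to a human or higher-trust channel
    return "allow"

# A tainted request to a sensitive tool is held for review;
# a tainted request to a benign tool proceeds.
print(mediate(ToolCall("send_email", tainted=True)))  # escalate
print(mediate(ToolCall("search", tainted=True)))      # allow
```

The design choice here is that taint plus capability, not taint alone, triggers escalation: untrusted content may still drive low-stakes actions, so the system stays useful while the consequences of successful influence are bounded.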