New attacks are hijacking the prompt flow and model output path to weaponize large language models.
LLM Jacking describes a growing set of techniques in which attackers abuse large language model prompts, system instructions, and response handling to bypass guardrails and execute malicious flows. It marks a shift in generative AI threat modeling: enterprises must treat the prompt attack surface as a security boundary.
Threat actors are no longer just targeting model endpoints. They are targeting the end-to-end inference pipeline: prompt inputs, hidden system prompts, tool calls, and post-processing code. When any of these stages is unguarded, an attacker can trick the model into issuing unauthorized actions, exposing sensitive data, or producing harmful directives.
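One way to guard the tool-call stage is to treat model output as untrusted input and validate it before anything is dispatched. The following is a minimal sketch under stated assumptions, not a definitive implementation: it assumes the model emits tool calls as JSON of the form `{"tool": ..., "args": {...}}`, and the tool names, the `ALLOWED_TOOLS` table, and the `parse_tool_call` function are all hypothetical names chosen for illustration.

```python
import json

# Hypothetical allowlist: tool name -> argument names that tool may receive.
ALLOWED_TOOLS = {
    "search_docs": {"query"},
    "get_ticket": {"ticket_id"},
}


def parse_tool_call(model_output: str) -> dict:
    """Parse a model-emitted tool call and reject anything off the allowlist.

    Raises ValueError when the output is not a JSON object, names a tool
    that is not allowlisted, or carries arguments the tool does not accept.
    """
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError as exc:
        raise ValueError("model output is not valid JSON") from exc

    if not isinstance(call, dict):
        raise ValueError("tool call must be a JSON object")

    name = call.get("tool")
    args = call.get("args", {})

    # Check the type first so an unhashable value never reaches the dict lookup.
    if not isinstance(name, str) or name not in ALLOWED_TOOLS:
        raise ValueError(f"tool not allowed: {name!r}")
    if not isinstance(args, dict):
        raise ValueError("tool arguments must be a JSON object")

    unexpected = set(args) - ALLOWED_TOOLS[name]
    if unexpected:
        raise ValueError(f"unexpected arguments: {sorted(unexpected)}")

    # Return only the validated fields; any extra keys in the output are dropped.
    return {"tool": name, "args": args}
```

Under these assumptions, a call such as `parse_tool_call('{"tool": "delete_all", "args": {}}')` raises `ValueError` instead of reaching a dispatcher. The allowlist is deliberately deny-by-default: a tool or argument that an injected prompt coaxes the model into emitting is rejected unless someone explicitly registered it.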
LLM Jacking is an assault on the model execution chain. It often involves one or more of these steps:
LLM Jacking is a broad category, but these are the most common risk vectors observed today:
Protecting against LLM Jacking requires visibility and multiple layers of guardrails:
LLM Jacking is not a theoretical threat; it is a practical exploit path for modern generative AI systems. Security teams need to harden prompt pipelines, review every integration that acts on model output, and treat AI workflows as part of the application attack surface.
Start by inventorying every model endpoint and every integration that consumes model output. Then put tight validation around input sources, lock down assistant behavior, and monitor every action the model takes on behalf of the business.
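The inventory and monitoring steps could start as simply as the sketch below. It is an illustration under stated assumptions rather than a prescribed design: the `Integration` record, the `record_action` helper, and the example endpoint and action names are all hypothetical, and a real deployment would ship the audit records to whatever logging or SIEM system the team already uses.

```python
import logging
from dataclasses import dataclass, field

logger = logging.getLogger("llm_audit")


@dataclass
class Integration:
    """One inventory entry: a system that calls a model and may act on its output."""
    name: str
    endpoint: str                 # model endpoint this integration calls
    consumes_output: bool         # True if downstream code acts on model output
    allowed_actions: set = field(default_factory=set)


def record_action(integration: Integration, action: str, audit_log: list) -> bool:
    """Record a model-requested action and report whether it is permitted.

    Every request is appended to audit_log, allowed or not, so that blocked
    attempts stay visible to whoever reviews the log.
    """
    allowed = action in integration.allowed_actions
    audit_log.append(
        {"integration": integration.name, "action": action, "allowed": allowed}
    )
    if not allowed:
        logger.warning("blocked action %r from %s", action, integration.name)
    return allowed


# Hypothetical inventory entry and usage.
helpdesk = Integration(
    name="helpdesk-bot",
    endpoint="internal-chat-model",
    consumes_output=True,
    allowed_actions={"create_ticket", "lookup_order"},
)
audit_log = []
record_action(helpdesk, "lookup_order", audit_log)   # permitted
record_action(helpdesk, "issue_refund", audit_log)   # blocked and logged
```

Recording denied requests alongside permitted ones is a deliberate choice here: a run of blocked actions from one integration is often the earliest sign that its prompts are being manipulated.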