Skip to content

Jailbreak

An attempt to bypass an AI model's safety filters and alignment guardrails to produce restricted content. Jailbreaks exploit weaknesses through creative prompting, role-playing, or encoded instructions. AI providers continually patch known techniques.

Related terms

Prompt InjectionGuardrailsAlignment
← Back to glossary