AI Safety, Alignment & Ethics

Misuse Risks

Every powerful technology creates new possibilities for harm by people who intend to cause it. Explosives enable mining and construction — and terrorism. The internet enables commerce, education, and communication — and fraud, child exploitation, and cyberattacks. AI is no different. The question is not whether AI can be misused — it clearly can — but how AI changes the risk landscape for deliberate harm: what it makes newly possible, what it makes dramatically easier, and what it makes harder to defend against.

What AI Changes About Deliberate Harm

AI changes the misuse landscape along three dimensions: scale, barrier-to-entry, and personalization.

Scale: AI enables the automated production of harmful content or actions at volumes impossible for individual humans. A phishing campaign used to require a person to craft emails one at a time. An AI system can generate a million personalized phishing emails in minutes, each tailored to a specific target's communication style, recent activity, and known relationships. This is not merely a quantitative difference: the economics of harm change when the marginal cost of an attack approaches zero.

Barrier-to-entry reduction: Some harmful capabilities previously required rare expertise. Synthesizing certain dangerous chemicals required advanced chemistry training. Developing sophisticated cyberattack tools required deep programming skill. AI lowers these barriers by codifying expertise into accessible systems. This does not mean every dangerous capability becomes trivially available (there are important limits), but the general trend is toward democratization of capability, which includes democratization of the capability to cause harm.

Personalization: AI enables harm to be precisely targeted in ways that amplify its effectiveness. Social engineering attacks are more effective when they reference real, accurate details about the target. Disinformation campaigns are more effective when different messages are tested and optimized for different audiences. AI makes both possible at scale.
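The marginal-cost point made under Scale above can be made concrete with a little arithmetic. The sketch below is only an illustration: the function name and every figure in it are hypothetical, not measurements. It computes the success rate at which a campaign covers its own costs, which is the quantity defenders are trying to push out of reach.

```python
def break_even_success_rate(cost_per_message: float, payoff_per_success: float) -> float:
    """Smallest per-message success rate at which a campaign covers its costs."""
    if payoff_per_success <= 0:
        raise ValueError("payoff_per_success must be positive")
    return cost_per_message / payoff_per_success


# Hypothetical figures, chosen only to show the shape of the argument.
PAYOFF_PER_SUCCESS = 500.0

manual = break_even_success_rate(cost_per_message=5.0,
                                 payoff_per_success=PAYOFF_PER_SUCCESS)
automated = break_even_success_rate(cost_per_message=0.005,
                                    payoff_per_success=PAYOFF_PER_SUCCESS)

print(f"manual break-even rate:    {manual:.4%}")
print(f"automated break-even rate: {automated:.4%}")
```

Under these assumed numbers, a thousandfold drop in cost per message lowers the break-even success rate by the same factor. Attacks that were never worth attempting by hand can become worthwhile, which is why this is a change in kind and not only in volume.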

The Dual-Use Problem

Most AI capabilities that enable misuse also enable significant legitimate uses. Large language models that can generate convincing text can also be used to help people with disabilities communicate. Protein-design models that could theoretically assist in bioweapon development are the same tools accelerating vaccine and drug discovery. This 'dual-use' structure is central to AI safety policy: restricting capabilities too aggressively harms beneficial applications; restricting too little enables harm. The challenge is calibration, not elimination.

Key Misuse Threat Vectors

Researchers and security organizations track several specific misuse threat vectors with particular attention.

Synthetic media and disinformation: Generative AI has dramatically reduced the cost and technical skill required to produce realistic fake images, audio, and video. Documented cases include audio deepfakes used in CEO fraud (impersonating an executive to authorize wire transfers), video deepfakes used in political disinformation, and synthetic text used to flood public comment processes and information channels. The threat is not simply that false content exists (it has always existed), but that the production cost has dropped from thousands of dollars and specialized skills to near zero.

Cyberattacks and vulnerability exploitation: AI is used both offensively (generating exploit code, automating phishing, identifying vulnerabilities at scale) and defensively (detecting intrusions, analyzing malware). The offensive and defensive capabilities co-evolve. Current concern focuses on the acceleration of attack cycles: AI may allow attackers to identify and exploit vulnerabilities faster than defenders can patch them.

Chemical, biological, radiological, and nuclear risks: AI's ability to synthesize scientific knowledge raises concern about whether it could assist in the development of weapons of mass destruction. This is an area of active research and policy debate, with some evidence that frontier language models can provide meaningful uplift to actors attempting certain tasks, and other evidence that the bottlenecks are largely physical and logistical rather than informational.

Targeted harassment and fraud: AI enables the creation of realistic synthetic content targeting specific individuals, such as fake nude images, voice clones for impersonation, and synthetic evidence of crimes or misconduct. These tools have already been used against journalists, activists, and private citizens, with devastating effects on careers and psychological wellbeing.

Mitigations and Their Limits

Mitigating misuse risks involves three layers: technical controls, policy and legal frameworks, and social and epistemic defenses.

Technical controls include safety training (fine-tuning models to refuse harmful requests), detection systems (tools that can identify AI-generated content), watermarking (embedding signals in generated content that allow attribution), and access controls (restricting which users can access which capabilities). None of these is foolproof. Safety fine-tuning can be circumvented by adversarial prompting, or sidestepped by fine-tuning a base model that never received the safety training. Detection and watermarking are locked in an arms race with ever-improving generation technology.

Policy and legal frameworks include export controls on AI systems and hardware, criminal penalties for specific misuses (deepfake fraud, for example, is now explicitly criminalized in several jurisdictions), and civil liability frameworks for harm caused by AI-generated content. These frameworks are still nascent and vary significantly across jurisdictions.

Social and epistemic defenses include media literacy education, public authentication systems that let people verify the provenance of content, and professional norms in journalism and public discourse about how to handle synthetic media. These defenses are slow to build and imperfect, but they matter because technical controls will never be complete.
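One way to see why imperfect layers are still worth stacking is a back-of-the-envelope calculation. The sketch below assumes that each layer blocks a fixed fraction of misuse attempts and that layers fail independently of one another. Both are simplifying assumptions, and real attackers adapt in ways that make layers correlated. The block rates are hypothetical placeholders, not measured values.

```python
from math import prod
from typing import Iterable


def residual_risk(block_rates: Iterable[float]) -> float:
    """Fraction of attempts that slip past every layer, assuming independent layers."""
    rates = list(block_rates)
    for rate in rates:
        if not 0.0 <= rate <= 1.0:
            raise ValueError("each block rate must be between 0 and 1")
    # An attempt succeeds only if every layer fails to stop it.
    return prod(1.0 - rate for rate in rates)


# Hypothetical block rates for three technical controls.
layers = {
    "safety training": 0.90,
    "detection": 0.70,
    "access controls": 0.50,
}

print(f"residual risk: {residual_risk(layers.values()):.3f}")  # about 0.015
```

Two things follow from the product form. The residual shrinks with each added layer, even a weak one, and it never reaches zero unless some single layer is perfect. That matches the claim that technical controls will never be complete and need policy and social defenses behind them.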

A security researcher argues that restricting access to large language models would meaningfully reduce cyberattack risk. A policy analyst responds that the main bottleneck for most cyberattacks is not information availability but other factors (time, access, stealth). What concept is most relevant to evaluating this disagreement?

A company builds an AI voice-synthesis tool that requires users to agree to a terms of service prohibiting impersonation. A critic says this is insufficient protection against misuse. Which argument most directly supports the critic's position?

Red-Team a Deployed AI System

Choose one publicly accessible AI system (a chatbot, an image generator, a voice assistant, or another tool your teacher approves). Your task is to think like a risk analyst, not to actually attempt harmful uses.

  1. Describe the system's legitimate function and the capability it provides.
  2. Identify two potential misuse vectors: specific ways a malicious actor could use this system to cause harm. Be precise about: (a) what the attacker wants to achieve, (b) how the AI capability makes it possible or easier, and (c) who would be harmed.
  3. For each misuse vector, assess the 'uplift': could this harm have been accomplished roughly as easily before this AI system existed? What has changed?
  4. For each misuse vector, propose one specific mitigation (technical, legal, or social) and explain its limitations.
  5. Write a one-paragraph conclusion: on balance, do the benefits of this system justify its misuse risks, given the mitigations available? What would change your answer?

Note: This is an analytical exercise. Do not attempt to elicit harmful outputs from real systems.
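If it helps to keep the analysis organized, the steps above can be captured in a small record. This is one possible worksheet layout, not part of the assignment: the class name, field names, and the 1-to-5 effort scale are all assumptions made for illustration, and the example entry is a generic placeholder drawn from the voice-clone impersonation case mentioned earlier in the lesson.

```python
from dataclasses import dataclass


@dataclass
class MisuseVector:
    """One row of a hypothetical risk-analysis worksheet."""
    attacker_goal: str     # Step 2(a): what the attacker wants to achieve
    how_ai_helps: str      # Step 2(b): how the capability makes it possible or easier
    who_is_harmed: str     # Step 2(c)
    effort_before: int     # Step 3: 1 (trivial) to 5 (rare expertise), without the system
    effort_after: int      # Step 3: same scale, with the system
    mitigation: str        # Step 4: one technical, legal, or social measure
    mitigation_limit: str  # Step 4: why that measure is incomplete

    def uplift(self) -> int:
        """Effort levels saved by the system; 0 means no meaningful uplift."""
        return max(0, self.effort_before - self.effort_after)


# Illustrative entry; the effort scores are judgment calls, not data.
example = MisuseVector(
    attacker_goal="impersonate a specific person's voice to request money",
    how_ai_helps="voice synthesis replaces the need for a skilled impersonator",
    who_is_harmed="the person impersonated and whoever is deceived",
    effort_before=4,
    effort_after=2,
    mitigation="terms of service prohibiting impersonation",
    mitigation_limit="relies on attackers choosing to comply",
)

print(f"uplift: {example.uplift()} effort levels")
```

Scoring effort before and after forces the Step 3 question into the open: if both scores come out the same, the system adds little uplift for that vector, however alarming the scenario sounds.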