AI Safety, Alignment & Ethics


Ethics Frameworks for AI

When a technology company's board considers whether to release a powerful new AI model, when a legislature debates an AI bill, when a researcher decides whether to publish dangerous capabilities research — each of these is a moral decision, not just a technical or business one. Moral decisions require moral reasoning, and moral reasoning has been organized into frameworks over millennia of philosophy. These frameworks are not abstract exercises. They structure the actual arguments made in AI governance debates, whether or not the participants name the framework they are using.

Consequentialism: Outcomes Are Everything

Consequentialism holds that the moral quality of an action is determined entirely by its outcomes. The most prominent consequentialist framework is utilitarianism, which defines the best action as the one that produces the greatest good for the greatest number of affected parties. Applied to AI, consequentialism says: evaluate each AI system and each governance policy by whether its deployment produces net benefit or net harm, summed across all affected parties and weighted by probability. This is structurally how cost-benefit analysis works, and cost-benefit analysis is the dominant framework regulatory agencies use in economic impact assessments.

Consequentialism has real strengths for AI ethics. It is empirically oriented: it asks for evidence about actual outcomes rather than appealing to intuitions or rules. It is impartial across persons and cultures, making it appealing in international contexts. And it naturally accommodates uncertainty, because expected value reasoning allows probabilistic outcomes to be compared.

Its limits are equally real. Aggregation can justify serious harms to minorities if the aggregate benefit is large enough: an AI hiring system that excludes a specific demographic might pass a utilitarian calculation if its overall employment efficiency gains are high enough. Consequentialism also requires predicting outcomes that are often highly uncertain, especially for novel AI systems with complex societal effects. And it gives no independent weight to rights, duties, or procedural fairness, only to outcomes.
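The expected-value reasoning described above can be made concrete with a small calculation. The following is a minimal sketch, not a real assessment method: the group names, population sizes, probabilities, and utility figures are all hypothetical, chosen only to show how a positive aggregate can coexist with a net harm to a smaller group.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    probability: float  # chance this outcome occurs for a member of the group
    utility: float      # benefit (+) or harm (-) per person, arbitrary units


@dataclass
class Group:
    name: str
    size: int
    outcomes: list[Outcome]

    def expected_utility_per_person(self) -> float:
        # Probability-weighted sum of utilities for one member of the group.
        return sum(o.probability * o.utility for o in self.outcomes)

    def expected_total(self) -> float:
        # Scale the per-person expectation by the number of people affected.
        return self.size * self.expected_utility_per_person()


def aggregate_expected_utility(groups: list[Group]) -> float:
    # The utilitarian sum: add expected totals across every affected group.
    return sum(g.expected_total() for g in groups)


# Hypothetical figures for an AI hiring system (illustrative only).
groups = [
    Group("majority applicants", 9000, [Outcome(0.5, 2.0), Outcome(0.5, 0.0)]),
    Group("excluded demographic", 1000, [Outcome(0.8, -5.0), Outcome(0.2, 0.0)]),
]

total = aggregate_expected_utility(groups)
harmed = [g.name for g in groups if g.expected_total() < 0]

print(f"aggregate expected utility: {total:.1f}")
print(f"groups with net expected harm: {harmed}")
```

With these made-up numbers the aggregate is positive even though one group carries a net expected harm. Reporting the per-group totals alongside the sum is one simple way to keep that distributional question visible instead of letting the aggregate hide it.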

Why Multiple Frameworks?

No single ethical framework provides correct answers to all moral questions. Philosophers and AI ethicists use multiple frameworks as complementary lenses: consequentialism tells you what outcomes to expect and compare; deontology tells you which constraints cannot be violated regardless of outcomes; virtue ethics tells you what character a trustworthy AI developer should cultivate; contractualism tells you what principles affected parties could reasonably accept. Where frameworks converge, you have a strong moral conclusion. Where they diverge, you have a genuine moral dilemma requiring careful judgment.

Deontological ethics holds that actions are morally right or wrong in themselves, independent of their consequences, because they respect or violate duties, rules, or rights. The most influential deontological framework is Kant's categorical imperative, which in one formulation requires that you treat persons always as ends in themselves, never merely as means.

Applied to AI, deontology generates constraints that cannot be overridden by aggregate benefits. Deceiving users is wrong even if the deception produces good outcomes. Using personal data without consent violates a right even if the resulting AI service improves user wellbeing on average. Deploying AI that subjects people to surveillance without their knowledge fails the categorical imperative even if crime rates fall as a result.

Deontological reasoning underlies much of EU AI governance. The EU AI Act's prohibition on real-time remote biometric identification in publicly accessible spaces for law enforcement, which is subject to narrow exceptions, is framed largely in rights terms (such surveillance is treated as incompatible with the right to dignity and the right to privacy) rather than in cost-benefit terms. Human rights frameworks in international law are structurally deontological: rights hold regardless of aggregate utility.

Deontology has limits of its own. It can produce paralysis when duties conflict (the duty not to deceive versus the duty to protect someone from harm), and it can fail to engage adequately with cases where the consequences are both enormous and certain. If an AI system is certain to save a million lives but requires a minor rights violation to deploy, rigid deontological constraints may require allowing the preventable deaths.
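One way to see how deontological constraints differ from consequentialist weighing is to treat them as filters applied before any outcomes are compared. The sketch below is an illustration only: the option names, the `violates` tags, the `FORBIDDEN` set, and the expected-benefit scores are all hypothetical, and real deontological reasoning is not reducible to a lookup table.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Option:
    name: str
    expected_benefit: float  # hypothetical score, arbitrary units
    violates: frozenset = field(default_factory=frozenset)  # duties breached


# Hypothetical side constraints drawn from the examples in the text.
FORBIDDEN = frozenset({"deception", "use_without_consent", "covert_surveillance"})


def consequentialist_choice(options):
    # Pick whatever scores highest, ignoring how the benefit is obtained.
    return max(options, key=lambda o: o.expected_benefit)


def constrained_choice(options, forbidden=FORBIDDEN):
    # First discard every option that breaches a forbidden duty,
    # then compare outcomes only among the options that remain.
    permitted = [o for o in options if not (o.violates & forbidden)]
    if not permitted:
        return None  # no permissible option exists
    return max(permitted, key=lambda o: o.expected_benefit)


options = [
    Option("covert data collection", 10.0, frozenset({"use_without_consent"})),
    Option("opt-in data collection", 7.0),
    Option("no deployment", 0.0),
]

print(consequentialist_choice(options).name)
print(constrained_choice(options).name)
```

In this toy setup the unconstrained maximizer selects the covert option because it scores highest, while the constrained version never considers it. No benefit score, however large, can bring a filtered option back, which is the sense in which such constraints "cannot be overridden by aggregate benefits".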

Virtue ethics, developed by Aristotle and updated by contemporary philosophers, focuses not on rules or outcomes but on the character of the moral agent. The question is not 'what action is right?' but 'what would a person of good character do?' A virtuous agent cultivates dispositions (honesty, courage, practical wisdom, justice, humility) that reliably produce good actions across novel situations.

Applied to AI, virtue ethics asks what character an AI developer, deployer, or policymaker should exhibit. An honest AI company does not hide adverse findings in safety evaluations. A courageous researcher publishes results that challenge their own organization's commercial interests. A humble policymaker acknowledges the limits of their expertise and consults affected communities. These are character traits, not rules, and they matter most in situations that rules have not anticipated.

Virtue ethics is particularly valuable for organizational culture in AI governance. A company whose engineers have cultivated intellectual honesty and genuine concern for affected communities will handle novel situations better than a company that follows compliance checklists without internalized ethical commitment. The organizational character of AI labs (whether they have a culture of candor about risks, willingness to delay deployment, and genuine concern for non-commercial stakeholders) is a virtue-ethics question.

Contractualism, developed by philosopher T.M. Scanlon, holds that an action is morally right if it is governed by principles that no one could reasonably reject from a position of seeking fair terms of cooperation. Applied to AI, it asks: could the people affected by this AI system, including those who bear its risks without sharing its benefits, reasonably reject the principles under which it was designed and deployed? This framework is particularly powerful for identifying situations where AI benefits concentrate while harms are externalized to less powerful groups.


Ethics in Practice: Limits and Pairings

Ethics frameworks are not algorithms: they do not produce determinate answers to every question by mechanical application. They are structured ways of identifying morally relevant considerations, organizing arguments, and finding the strongest objections to proposed actions. Competent AI ethics reasoning requires knowing which framework is most illuminating for a given situation, holding multiple frameworks in mind simultaneously, and making well-reasoned judgments when they conflict.

Critically, ethics frameworks must be paired with governance mechanisms to have practical effect. An organization that has a sophisticated internal ethical framework but no external accountability (no auditors, no regulators, no legal liability) can rationalize harmful deployments within that framework indefinitely. The history of corporate self-regulation across industries shows this repeatedly. Ethics is the normative foundation of good governance; it is not a substitute for it. The combination of principled ethical reasoning and enforceable governance structures is stronger than either alone.

Ethics Washing

Ethics washing is the practice of using ethics language — publishing AI ethics principles, creating ethics advisory boards, funding ethics research — without making substantive changes to harmful practices. It is a real phenomenon in the AI industry. One signal that ethics work is genuine rather than performative: it has led to specific, costly decisions to not build, not deploy, or not monetize a capability.

An AI company argues that their content-moderation system, which removes political speech from users in certain countries at government request, is justified because it keeps their platform available to billions of users who benefit from it. Which ethical framework best supports this argument, and which framework most directly challenges it?

An AI lab publishes detailed AI ethics principles committing to fairness, transparency, and beneficence. However, it continues to deploy systems that have failed internal bias audits, does not publish those audit results, and funds ethics research exclusively through its own charitable foundation. What concept does this pattern illustrate?

Ethical Framework Debate

  1. Your class will evaluate the following scenario using all four frameworks.
     Scenario: A government health ministry proposes deploying an AI model that predicts which individuals are at high risk of suicide within the next 12 months, based on their medical records, prescription histories, and, with their consent, social media activity. Identified high-risk individuals receive proactive outreach from mental health services. In clinical trials, the system reduced suicide rates by 22% in the intervention group. However, civil liberties groups raise concerns about the privacy implications of mental health profiling and the potential for discriminatory impacts on vulnerable populations.
  2. In four groups, apply your assigned framework to the scenario:
     Group 1 (Consequentialist): What evidence would you need to evaluate this proposal? How do you weigh the 22% reduction against the privacy costs?
     Group 2 (Deontological): Which rights are at stake? Can any outcome justify the violation of those rights?
     Group 3 (Virtue ethics): What character should a government or health ministry cultivate in making this decision? What would a humble, courageous, just policymaker do?
     Group 4 (Contractualist): Could the individuals whose data is used and analyzed reasonably reject the principles under which this system operates?
  3. Present your analysis. Then discuss as a class: where do the frameworks agree? Where do they conflict? What decision would you make, and why?