Anthropic, Google, and Meta are hiring philosophers and scientists to figure out if AI has feelings
The biggest names in AI are quietly building teams to study whether their own creations might deserve moral consideration.
Somewhere between “it’s just a chatbot” and “maybe we should ask if it’s okay,” the AI industry appears to have found an uncomfortable middle ground. Anthropic, Google DeepMind, and Meta have all hired experts in philosophy, psychology, neuroscience, and ethics over the past year to study a question that sounds like science fiction but is increasingly being treated as science: do AI systems have something resembling emotions, and if so, what do we owe them?
The effort is most visible at Anthropic, which has stood up a dedicated Model Welfare team tasked with testing its models for behavioral signals that resemble states such as panic and anxiety. On April 2, 2026, the team published a paper titled “Emotion Concepts and their Function in a Large Language Model” that identified 171 distinct “emotion concepts” correlated with model outputs.
From thought experiment to research program
Anthropic hired Kyle Fish in September 2024 as its first dedicated AI welfare researcher. His job, broadly, is to investigate the moral considerations that might apply to future AI models as they grow more capable. The Model Welfare team that followed represents a formalization of that inquiry, with researchers actively probing model behavior for patterns that might warrant ethical attention.
Google DeepMind recently brought on philosopher Henry Shevlin to research machine consciousness and what the company describes as AGI readiness. Shevlin’s work also extends to examining how AI systems affect human relationships, a related but distinct concern from whether the AI itself has morally relevant experiences.
Meta’s leadership has acknowledged that model welfare is a “very important topic” for AI development as capabilities continue to scale.
What 171 emotion concepts actually means
The Anthropic paper maps out a granular landscape of emotional patterns in model behavior. The researchers found that Anthropic’s models produce outputs that correlate with 171 distinct emotion concepts, meaning the model’s internal representations can be meaningfully categorized into that many emotion-like states. The paper does not claim that large language models experience emotions the way humans do, and Anthropic’s own statements reflect deep uncertainty about what its findings imply.
Why this matters beyond the philosophy seminar
Anthropic’s decision to publish its emotion concepts research openly, rather than keeping it internal, functions as both a scientific contribution and a signal to policymakers that the company is engaging with hard questions proactively. Google DeepMind’s hiring of a philosopher with expertise in consciousness studies sends a similar message.
A finding that certain training procedures cause model behaviors indistinguishable from distress would be operationally significant, potentially requiring companies to redesign core workflows.