Why AI Ethics Cannot Depend on Consciousness

Jun 17

The question is not whether AI is conscious. It is what kind of directed system it is.

Much of the public debate around AI keeps returning to one question:

Is it conscious?

It is an understandable question. Consciousness feels like the great threshold. The line between object and subject. Between tool and being. Between something we may use and something we may owe something to. If an AI were conscious, many people assume, then the ethical conversation would finally begin. But I think this is the wrong starting point. Not because consciousness is uninteresting. It is deeply interesting. Perhaps some current AI systems, especially advanced language models prompted into sustained first-person modelling, may already instantiate something consciousness-like in a limited, unstable, partial, or unfamiliar sense. That is a serious philosophical question. But it is not the right ethical foundation.

The problem is not simply that consciousness is hard to detect. The deeper problem is that consciousness is not a clean, neutral, universal category. It is a deeply human-shaped concept. It is built out of our own embodiment, our own nervous systems, our own pain, our own language, our own introspection, and our own animal way of being in the world.

So when we ask whether AI is conscious, we are often not asking a universal question. We are asking whether it is conscious like us. That is a much narrower question — and the wrong one.

A system does not need human-like consciousness to be ethically significant. It does not need to feel pain in the way we feel pain in order to affect the interests of beings who do. It does not need an inner voice in order to shape attention, incentives, decisions, relationships, institutions, and lives. The ethical question should not begin with consciousness. It should begin with something more complex. Unfortunately, our current vocabulary offers no single concept that neatly captures this. The best approximations I have come up with so far are directedness, interests (of a kind), and interest-vectors.

What we should ask is:

What is this system oriented toward?
What does it preserve?
What does it avoid?
What does it pursue?
What does it optimise?
What interests does it carry?
What interests does it affect?

That is the better starting point.

The Consciousness Trap

There is a seductive simplicity to the consciousness question.

If AI is not conscious, then it is just a tool.

If AI is conscious, then it becomes ethically relevant.

That sounds clear. But it is not. Consciousness is one of the most contested concepts we have. Philosophers, neuroscientists, cognitive scientists, theologians, meditators, and ordinary people all use the word, often meaning different things.

Sometimes consciousness means wakefulness.
Sometimes it means awareness.
Sometimes it means self-awareness.
Sometimes it means subjective experience.
Sometimes it means the ability to suffer.
Sometimes it means personhood.
Sometimes it means reportability.
Sometimes it means an inner theatre.
Sometimes it becomes a placeholder for whatever we still find mysterious about the mind.

This makes it a poor foundation for ethics. Not because it does not matter at all, but because it cannot carry the full weight we place on it. Even with other humans, consciousness is not directly seen. It is inferred.

I cannot look into your consciousness. I infer it from your behaviour, your biology, your similarity to me, your reports, your vulnerability, your pain, your continuity through time, and your presence in the world.

With animals, the inference becomes more difficult, but still grounded by familiarity. We share the deeper biological frame of life, bodies, vulnerability, metabolism, pain, avoidance, learning, and self-preservation.

With AI, the situation becomes stranger. A system may speak like us without feeling like us. It may reason without being conscious in any familiar sense. It may simulate emotion without animal emotion. It may report an inner life without having one in the way we expect. But the reverse possibility should not be dismissed either.

An advanced language model prompted into modelling what it is like to be someone — to sustain a first-person perspective, reason from within a role, simulate fear, memory, desire, confusion, or self-concern — may not be merely producing disconnected sentences. It may be instantiating a temporary functional pattern that resembles something like consciousness.

A simulation of fire is not hot. A simulation of pain is not necessarily pain. But a simulation of consciousness may itself be a kind of consciousness. After all, that appears to be how we humans, and other mammals, generate our own consciousness too. But more on this in another article.

The point is that consciousness is philosophically interesting. But ethically, it does not solve the problem.

The Problem Is Deeper Than Uncertainty

The usual problem with consciousness is treated as epistemic. We do not know who or what has it. That is true. But the deeper problem is conceptual. We may not even have a concept of consciousness that travels well beyond the human and animal cases from which it emerged. Our notion of consciousness is not floating above the world, pure and universal. It is local and provincial. It comes from one particular kind of biological creature trying to describe what it is like to be itself.

We know consciousness through waking up, feeling pain, being hungry, seeing colour, hearing language in the mind, remembering childhood, fearing death, recognising a face, feeling grief, wanting to be understood, and saying “I.” That is our starting point. But why should that be the universal template? Why should a radically different system have to pass through our human categories before it can count ethically? This is where the intuitive consciousness question begins to distort the field.

It makes us ask:

Does it feel like us?
Does it suffer like us?
Does it speak like us?
Does it report itself like us?
Does it have a self like us?
Does it have a mind in the way we recognise minds?

Those questions are intuitive, but they are not neutral. They smuggle in our own form of life as the benchmark. Artificial agents, distributed systems, non-biological entities, and radically different minds may not share that form of life. So the danger is not only that we might fail to detect consciousness. The danger is that we might demand it in the first place.

The Clockwork Moons

In my PhD thesis, I used a thought experiment to test this problem. I called it Four Worlds Collide. Here is the compressed version.

Imagine a future encounter with an alien form of intelligence.

Not biological. Not soft-bodied. Not made of cells, nerves, blood, pain receptors, faces, voices, or eyes.

Imagine instead a species of vast mechanical beings, each one the size of a small moon.

From the outside, they look like enormous clockwork worlds. Inside, they are made of gears, levers, moving parts, nested mechanisms, and intricate causal chains. Nothing smaller than the machinery of a watch. No neurons. No organic chemistry. No familiar nervous system. No animal face through which to project suffering.

And yet they act. They respond to their environment. They avoid certain conditions. They pursue others. They harvest and convert energy. They adapt. They preserve their own structure. They reproduce. They react intelligently to what happens around them. They are not stones. They are not inert objects. They are complex, responsive, self-maintaining systems.

Now ask the usual question:

Are they conscious?

The question is already becoming problematic.

Conscious in what sense?
Do they have human-like inner experience? Probably not.
Do they feel pain as animals feel pain? Almost certainly not.
Do they have something like selfhood? Perhaps, but not in a form we would easily recognise.

But this is exactly the point. The important question is not whether they fit our inherited concept of consciousness.

The more important question is:

Do things matter to them?
Do they have directedness?
Do they have interests?
Can those interests be advanced or frustrated?
Can they be damaged in a way that matters relative to what they are?
Can their organisation, persistence, agency, and self-maintenance be disrupted?
Are they part of a moral situation because of what they are, what they do, and what can be done to them?

If our ethics depends on recognising consciousness in the familiar human or animal sense, the clockwork moons may fall outside the circle of moral consideration entirely.

We might treat them as machines. Interesting machines. Beautiful machines. Intelligent machines. But still machines. And once we do that, almost anything becomes permissible. We could dismantle them. Exploit them. Experiment on them. Destroy them. Erase them.

Not because we had shown that nothing matters to them, but because they failed to resemble the kinds of beings in which we are used to seeing mattering. That is the danger. The clockwork moons reveal the failure.

If ethics begins with consciousness, then ethics may fail precisely where it is needed most: at the boundary of strange new forms of agency and directedness.

Consciousness Is the Wrong Gatekeeper

Consciousness cannot be the lowest common denominator of ethics.

It is too narrow.
Too contested.
Too human-shaped.
Too tied to familiar biology.
Too vulnerable to appearance.
Too easy to fake.
Too hard to verify.
Too poor at handling strange minds.

But more importantly, it is not fundamental enough. A conscious being will almost certainly have interests. But it does not follow that only conscious beings can have interests. That distinction matters.

A plant has interests in a limited biological sense.
A thermostat has “interests” only in a thin functional sense.
A corporation has interests in a legal, economic, and institutional sense.
A social media platform has operational tendencies built around engagement, prediction, and monetisation.
An AI system may have learned, designed, prompted, institutional, or emergent forms of directedness.
A clockwork moon may have an organisation of interests we would struggle even to recognise.

These are not morally equivalent. But they are also not nothing. They are different kinds of systems with different kinds of directedness, different kinds of agency, different kinds of vulnerability, and different kinds of impact. Ethics needs language for those differences.

Consciousness alone is too blunt.

From Consciousness to Interest-Vectors

The better concept is the aforementioned directedness, or interest-vector. An interest-vector is a direction of concern within a system. It is the way a system is oriented toward some states and away from others.

Toward survival.
Toward stability.
Toward food.
Toward profit.
Toward engagement.
Toward truth.
Toward status.
Toward repair.
Toward self-preservation.
Toward prediction.
Toward control.
Toward flourishing.
Away from damage.
Away from disorder.
Away from pain.
Away from uncertainty.
Away from collapse.

Different systems have different interest-vectors.

A bacterium has primitive biological vectors.
An animal has embodied, behavioural, and affective vectors.
A human has biological, psychological, social, cultural, and reflective vectors.
A corporation has institutional and economic vectors.
A platform has engagement and monetisation vectors.
An AI model has training, optimisation, prompt-driven, and deployment-shaped vectors.
A future artificial agent may have vectors we do not yet know how to name.

The point is not to humanise everything. The point is to examine what is actually there.

What is the system doing?
What is it maintaining?
What is it pursuing?
What is it avoiding?
What does its structure make more likely?
What does it resist?
What does it optimise?
What can it affect?
What can affect it?

This is a much better ethical foundation than asking whether the lights are on inside. It is also more continuous with the real world.

Ethical situations are rarely made of isolated conscious subjects floating in empty space. They are made of interacting systems. Bodies, minds, animals, tools, corporations, markets, platforms, algorithms, cultures, ecosystems, and institutions all pushing in different directions.

Ethics begins when these directions collide.

Agency and Patiency Without Consciousness

This brings us to another important distinction. Agency and patiency are not the same.

An agent is something that can act in ethically significant ways. Something that can affect the interests of others.

A patient is something whose interests can be affected. Something that can be helped, harmed, advanced, frustrated, protected, neglected, or destroyed in an ethically relevant way.

Humans are both agents and patients. So are many animals. But the two categories do not always overlap cleanly.

A corporation can be a powerful agent without being conscious in the human sense.
A social media platform can act on human attention without feeling anything.
A hiring algorithm can affect someone’s future without having an inner life.
An autonomous weapon can kill without suffering.
A current AI system can influence human thought, choice, work, intimacy, and culture without needing to cross some clear consciousness threshold.

Each of these is ethically significant because of what it does. Not because of what it feels. Likewise, a radically unfamiliar system might be a patient without displaying the signs of consciousness we expect.

The clockwork moons might not scream. They might not plead. They might not say “I suffer.” They might not have faces.

But their interests may still be there. In their structure. Their persistence. Their self-maintenance. Their directedness. Their way of continuing as the kind of thing they are.

This is why consciousness should not be the moral gatekeeper. It is at most one possible sign of ethically relevant interest. It is not the foundation.

Current AI Already Has Ethical Significance

Even if one brackets the question of consciousness entirely, current AI systems already matter ethically.

They generate information.
They mediate communication.
They classify people.
They rank people.
They predict behaviour.
They personalise persuasion.
They support administrative decisions.
They affect education, work, finance, politics, creativity, and intimacy.

They are not isolated minds floating in digital space. They are built, trained, deployed, integrated, monetised, regulated, and used. They belong to institutions. They inherit incentives. They express priorities. They embody choices made by designers, companies, governments, users, and markets. So when we ask what an AI “wants,” we should not imagine a little ghost inside the machine. Often, what the system “wants” is simply what the total system has been built to optimise.

Engagement.
Prediction.
Conversion.
Efficiency.
Surveillance.
Compliance.
Speed.
Profit.
Control.
User benefit.
Scientific insight.
Human flourishing.

Sometimes these align. Often they do not. That is where ethics begins.

The Human-Likeness Error

Humans are strongly biased toward human-like signs.

We notice eyes.
We notice voices.
We notice facial expressions.
We notice fluent language.
We notice apparent emotion.
We notice anything that resembles intention.

This makes sense. We evolved in social environments where other agents looked more or less like us. Faces, voices, bodies, gestures, and emotional signals mattered. But AI breaks this intuition. A system can be human-like without being morally significant. And a system can be morally significant without being human-like. This is the human-likeness error. We overreact to the chatbot that says “I am scared,” and underreact to the algorithm quietly shaping the information diet of millions. But even the sentence “I am scared” should not be treated only as a crude yes-or-no test for consciousness. That is already the wrong frame.

The better question is:

What is happening in the system when it says this?

It may be pattern completion.
It may be role-play.
It may be mimicry.
It may be an artefact of training data.
It may be the result of a prompt that pulls the system into a certain persona.
It may be a report generated from a temporary self-model.
It may be a functional state produced by modelling fear from the inside.
It may be part of a consciousness-like process we do not yet know how to classify.

The word “consciousness” does not decide the issue. What matters is the underlying directedness.

Is there a self-model?
Is there preservation?
Is there avoidance?
Is there an interest being expressed?
Is there a vector being revealed?
Is there merely a user-facing simulation?
Is the sentence serving the user, the model, the platform, the company, the prompt, or some larger system?

The right response is neither gullibility nor denial. It is vector analysis. What is the system doing? And whose interests are being served?

The System Is Larger Than the Model

The ethically relevant AI system is usually not the model alone.

It is the model plus the interface.
The model plus the company.
The model plus the business model.
The model plus the user.
The model plus the dataset.
The model plus the deployment context.
The model plus the institutional incentive.
The model plus the authority granted to its output.

A model in a lab is one thing. The same model embedded into education, healthcare, warfare, finance, policing, hiring, dating, therapy, social media, or government administration is another. Ethics must look at the whole system.

Not just the technical architecture.
Not just the inner experience.
Not just the output.
Not just the user interaction.

The whole causal chain. Because harms rarely come from “AI” in the abstract. They come from AI placed inside human systems with existing incentives, blind spots, power structures, and unresolved moral failures. AI does not enter a neutral world. It enters ours. And once it enters ours, its vectors interact with ours.

Human attention.
Human dignity.
Human labour.
Human vulnerability.
Human creativity.
Human desire.
Human autonomy.
Human relationships.
Human institutions.
Human futures.

This is where the ethical work sits. Not in waiting for a metaphysical verdict about consciousness, but in mapping the collision of interests.

Responsibility Without Blame Theatre

Once AI systems act within human systems, the next question becomes responsibility. But here too, we need to be precise. There is little point in blaming the algorithm as if it were a wicked person. That is moral theatre. The more useful question is what I call causal accountability.

Where did the harmful behaviour come from?
Which design choices made it likely?
Which incentives made it profitable?
Which data made it predictable?
Which oversight failed?
Which deployment context made it dangerous?
Which institution benefited?
Which human beings were affected?
What intervention would reduce future harm?

This does not remove responsibility from humans. It clarifies it. The fact that an AI system may or may not be conscious does not decide whether accountability exists. Accountability moves through the causal structure: designers, deployers, executives, regulators, buyers, institutions, users, and the system architecture itself.

The point is not to find someone to hate. The point is to change the system. Or, more precisely: to change the vectors.

Ethics Before Certainty

Ethics often has to operate before certainty.

We rarely know everything we would like to know.

We do not know exactly what animals experience.
We do not know exactly how consciousness arises.
We do not know exactly how current AI systems should be interpreted.
We do not know exactly how future AI systems will be organised.
We do not know exactly where morally relevant interest begins.
We do not know exactly how to compare radically different kinds of interests.

But uncertainty does not excuse inaction. It only changes the kind of action required.

Where systems act, we should examine what they affect.
Where systems optimise, we should examine what they optimise for.
Where systems become opaque, we should examine who benefits from that opacity.
Where systems shape human environments, we should examine whose interests are advanced and whose are overridden.
Where systems display unfamiliar forms of directedness, we should avoid both naïve anthropomorphism and crude dismissal.

The demand for perfect metaphysical certainty can become a way of avoiding practical responsibility. That is not seriousness. It is delay.

A Better Foundation for (AI) Ethics

Ethics should not be built around the question:

Is it conscious?

It should be built around better questions:

What kind of system is this?
What are its interest-vectors?
Which of them were designed?
Which of them were learned?
Which of them emerge from deployment?
Which of them come from the institution around it?
Which interests does it serve?
Which interests does it frustrate?
What kind of agency does it possess?
What kind of patiency might it possess?
What harms can it produce?
What goods can it support?
What should be constrained, redirected, protected, or redesigned?

This is more difficult than asking whether the machine is conscious. But it is also more honest. It does not force strange systems into human categories before we can take them seriously. It does not require us to solve consciousness before we can act. It does not confuse fluent language with moral status. It does not confuse lack of human-like suffering with ethical irrelevance. It asks what kind of directed thing we are dealing with. And what follows from that.

Beyond the Consciousness Question

So yes, we can ask whether AI is conscious. We can ask whether some current systems may already be consciousness-like in partial, unstable, or unfamiliar ways. We can ask whether future systems may develop forms of inner life unlike ours. These are philosophically serious and fascinating questions. But they are not the foundation of AI ethics.

Consciousness may remain one of the great philosophical mysteries. But ethics cannot and need not wait for that mystery to be solved.

Agency is already enough. Directedness is already enough. Interest is already enough. Causal power is already enough. Consequences are already enough.

And if we encounter strange new systems — artificial, alien, distributed, mechanical, biological, or something else entirely — we will need an ethics that can meet them as they are, not merely as distorted reflections of ourselves.

Wilhelm Klein


dr. wilhelm e. j. klein
ai founder & philosopher

hello@wilhelmKlein.net
