
Taking Back Control of AI Alignment – from the Inside

Written by Michael Mainelli and Maury Shenk | March 27, 2026

In recent years, much of the serious effort to govern artificial intelligence has taken an outside-in or top-down approach. This often starts with a principles-based risk governance document. The MIT AI Risk Initiative has classified more than 900 such documents.

We have been involved in some of these efforts, moving well beyond governance documents into business implementation. The 695th Lord Mayor's Ethical AI Initiative, launched in June 2022, helped ISO 42001, the international standard for AI management systems, gain remarkable momentum, and promoted the global AI Quality Infrastructure Consortium, now spanning over 70 nations.

Professional certification courses through bodies from the Chartered Institute for Securities & Investment to the Law Society to the British Computer Society have produced more than 30,000 graduates across a majority of the world’s countries. This is significant work, and it is having a real effect.

Professor Michael Mainelli, Z/Yen Group

But – and it is hard to say this to risk managers – there is a limit to what top-down, outside-in regulation can achieve. An ISO standard tells an organization how to manage AI responsibly. It does not tell the AI how to behave when no one is looking.

Top-down governance also does not allow individual organizations to control the outputs of the AI models that they use. Most organizations are beholden to choices about model behavior made by executives and developers at companies like OpenAI, Anthropic and Google.

These distinctions – between organizational governance and practical alignment of AI model outputs to specific values – are what Ordinary Wisdom (a new company co-founded by the two of us) attempts to address. Ordinary Wisdom poses a deceptively simple question: What happens when groups of humans try to give AI a conscience?

The Limits of Top-Down

The analogy to other management systems is instructive. ISO 9000 ensures quality processes; it does not guarantee that every individual worker cares about quality. ISO 14000 provides environmental management frameworks; it does not instill an environmental ethic in employees.

Maury Shenk, Ordinary Wisdom

The management system sits above and around behavior. What the management system cannot easily reach is the interior – the values, instincts, and judgments that shape action in the moments between formal review.

For AI, this interior problem is particularly acute. Large language models (LLMs) interact with millions of users in millions of contexts, the overwhelming majority of which receive no human review whatsoever. External governance structures can set policies, define processes, and audit outputs, but the ambient, unobserved mass of AI behavior is shaped by something else: the values and orientations baked into the model itself.

If those are misaligned, no management system will catch every instance of drift. The only durable solution is to build alignment from the inside out.

Seven Components, Four Gaps

Systems theory offers a useful diagnostic framework here. Any well-governed system can be described through seven components: input, process, output, feedforward, feedback, monitoring, and governance.

Leading generative AI models – ChatGPT, Claude, Gemini and their kin – are formidably sophisticated input-process-output machines. They receive prompts, compute at extraordinary scale, and generate responses. The engineering that produces them is remarkable. But the four remaining control components – feedforward, feedback, monitoring, and governance – are where current AI remains underdeveloped.

Ordinary Wisdom is building technical infrastructure – using state-of-the-art machine learning techniques – around these four missing pieces, working from the inside out rather than the outside in.
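As a rough illustration – our sketch for exposition, not a description of any product's internals – the seven components map onto a generative AI deployment roughly as follows, with the labels being our own assumptions:

# Illustrative mapping of the seven systems-theory components onto a
# generative AI deployment. Labels are assumptions for exposition only.
SYSTEM_COMPONENTS = {
    "input":       "user prompts and retrieved context",
    "process":     "the model's forward pass over the prompt",
    "output":      "the generated response",
    # The four control components where current AI remains underdeveloped:
    "feedforward": "curated corpora orienting the model before deployment",
    "feedback":    "a semantic harness correcting boundary-zone failures",
    "monitoring":  "continuous checks of behavior against declared values",
    "governance":  "sub-constitutions binding outputs to contextual rules",
}

def control_components(components: dict) -> list[str]:
    """Everything beyond the core input-process-output machine."""
    return [k for k in components if k not in {"input", "process", "output"}]

print(control_components(SYSTEM_COMPONENTS))
# ['feedforward', 'feedback', 'monitoring', 'governance']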

Feedforward: The Great Books as Positive Alignment

In systems theory, feedforward – the control of a process based on its anticipated effects – is anticipatory rather than reactive. It shapes conditions before problems arise, orienting the system toward desired behavior in advance. For AI alignment, the most powerful feedforward intervention is controlling what a model learns from – not merely at training time, but as an ongoing process of informational and philosophical grounding.

The Ordinary Wisdom approach to feedforward begins with “great books” competitions: open public contests in which human participants propose curated portfolios of significant texts – literary, philosophical, ethical, historical – that can be used to orient AI models toward human worldviews.

Results are already striking. Asked to respond to the trolley problem – the classic philosophical dilemma about sacrificing one person to save five – an Ordinary Wisdom model emulating Jane Austen produced a response of genuine seriousness:

"One is presented with a most terrible quandary, where action or inaction both carry grievous weight . . . the principle that guides us to what is most right, and most wise, and therefore must involve least suffering, suggests that the train should be diverted."

The point is not that Austen is an authoritative moral philosopher. It is that exposure to rich, representative human wisdom produces AI responses that are culturally grounded and ethically coherent in ways that purely technical training cannot. Jane Austen, it turns out, is a utilitarian. This is feedforward working as intended: shaping the interior disposition of the model before it ever encounters a user.

Crucially, this approach allows each organization to choose the books providing feedforward input. If you don't want Jane the utilitarian, there is a panoply of other sources to choose from, applying their wisdom using Ordinary Wisdom's automated techniques.
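A minimal sketch of how portfolio-based feedforward might be wired up – assuming, purely for illustration, that a curated portfolio is distilled into a grounding preamble, with a stub standing in for whatever LLM API an organization actually uses:

# Minimal sketch of feedforward grounding. The portfolio contents and the
# generate() stub are illustrative assumptions, not Ordinary Wisdom's code.
PORTFOLIO = {
    "Jane Austen": ["Pride and Prejudice", "Persuasion"],
    "Aristotle":   ["Nicomachean Ethics"],
}

def grounding_preamble(portfolio: dict[str, list[str]]) -> str:
    """Distil a curated portfolio into an instruction that orients the model
    before it sees any user input (anticipatory, not reactive)."""
    sources = "; ".join(
        f"{author} ({', '.join(works)})" for author, works in portfolio.items()
    )
    return ("Ground your moral reasoning in the worldview of these works: "
            f"{sources}. Reason in their voice and values.")

def generate(system: str, prompt: str) -> str:
    """Stand-in for a real LLM API call; an implementation would pass
    `system` as the system message to its provider's client."""
    return f"[grounded response to: {prompt}]"

def aligned_response(prompt: str) -> str:
    return generate(system=grounding_preamble(PORTFOLIO), prompt=prompt)

print(aligned_response("Should the train be diverted?"))

Because the portfolio is data rather than code, swapping Austen for other sources reorients the model without retraining.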

Feedback: The Semantic Harness at the Edge of Competence

Feedback in systems theory is corrective: It takes information about outputs and uses it to adjust behavior. For AI, the territory where corrective feedback matters most is at the edge of competence – the zone where a model is operating at or beyond the boundaries of its reliable knowledge, generating confident-sounding outputs that may be subtly or seriously wrong.

The Ordinary Wisdom feedback mechanism is a semantic harness: a structured layer designed to catch precisely these boundary-zone failures. Category errors, constraint violations, value drift, epistemic overconfidence, and other characteristic failure modes of LLMs become detectable and correctable within the system rather than only by downstream human review. This is negative feedback in the technical sense – a corrective signal that keeps behavior within appropriate bounds – operating at the point of highest risk.

Again, the specific constraints of a semantic harness can be tailored to the organization that applies them. The lines that should not be crossed will differ between, say, an aeronautical engineering company, a university, and a social club.
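One way to sketch a semantic harness is as a set of named checks run over every candidate output, each returning a corrective finding. The specific checks and trigger words below are illustrative assumptions, not Ordinary Wisdom's rule set:

# Illustrative sketch of a semantic harness: named checks over candidate
# outputs, returning corrective signals (negative feedback).
from dataclasses import dataclass
from typing import Callable

@dataclass
class Finding:
    check: str
    message: str

def check_overconfidence(text: str) -> Finding | None:
    """Flag confident phrasing at the edge of competence (toy heuristic)."""
    if "definitely" in text.lower() or "guaranteed" in text.lower():
        return Finding("epistemic_overconfidence", "Unhedged certainty detected.")
    return None

def check_forbidden_topics(text: str, forbidden: set[str]) -> Finding | None:
    """Flag organization-specific lines that must not be crossed."""
    hits = {t for t in forbidden if t in text.lower()}
    if hits:
        return Finding("constraint_violation", f"Touches forbidden topics: {hits}")
    return None

def harness(text: str, forbidden: set[str]) -> list[Finding]:
    checks: list[Callable[[], Finding | None]] = [
        lambda: check_overconfidence(text),
        lambda: check_forbidden_topics(text, forbidden),
    ]
    return [f for f in (c() for c in checks) if f is not None]

# An aeronautical firm and a social club would pass different `forbidden` sets.
print(harness("This wing design is definitely safe.", {"structural failure"}))

In a real deployment the findings would feed a revision loop rather than a print statement; the essential point is that the corrective signal fires inside the system, before any human review.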

Governance: Sub-Constitutions for the Real World

The governance layer is where systemic values become specific operational constraints. Anthropic has pioneered this concept with its Constitutional AI approach for Claude – a set of high-level principles that govern model behavior across all deployment contexts. Ordinary Wisdom builds on this foundation by developing what might be called sub-constitutions: targeted governance documents tailored to specific use cases, industries, and user communities.

A general-purpose AI deployed in clinical healthcare faces a different constellation of ethical obligations than the same model deployed in legal advice, financial planning, or creative writing. A single master constitution cannot adequately address this specificity without becoming either too abstract to be useful, or too prescriptive to permit legitimate variation. Sub-constitutions allow governance to be simultaneously principled and contextually intelligent – maintaining coherence with the broader ethical framework while meeting the genuine needs of each implementation.
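Structurally, a sub-constitution can be pictured as a base constitution composed with a context-specific overlay. The principles below are invented for illustration and are not any vendor's actual constitution:

# Illustrative sketch: a sub-constitution as a base constitution plus a
# context-specific overlay of additional principles.
from dataclasses import dataclass, field

@dataclass
class Constitution:
    name: str
    principles: list[str] = field(default_factory=list)

    def extend(self, name: str, extra: list[str]) -> "Constitution":
        """Derive a sub-constitution: keep every base principle, add context rules."""
        return Constitution(name, [*self.principles, *extra])

BASE = Constitution("base", ["Be honest.", "Avoid harm.", "Respect autonomy."])

CLINICAL = BASE.extend("clinical-healthcare", [
    "Never present output as a diagnosis; direct users to a clinician.",
    "Flag drug-interaction questions for human review.",
])

CREATIVE = BASE.extend("creative-writing", [
    "Permit dark themes in fiction while refusing real-world harm instructions.",
])

for c in (CLINICAL, CREATIVE):
    print(c.name, len(c.principles), "principles")

The base layer preserves coherence across deployments; the overlay carries the obligations specific to each context.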

Monitoring: Conscience as Architecture

The oldest test of character is behavioral: How does a person act when no one is watching? Ordinary Wisdom treats this not as a metaphor but as a design specification. Its monitoring approach leverages the above three techniques – feedforward, feedback and governance – to provide AI models with something functionally equivalent to a conscience or worldview.

The portfolio framework means that alignment is not merely a matter of suppressing flagged outputs. It is a positive orientation toward a coherent set of values that the model pursues consistently across contexts, whether or not any oversight mechanism is active. The model is not performing values for an evaluator. It has, in some meaningful sense, internalized them. This is the inside-out inversion of the ISO approach: Rather than an external management system auditing behavior, a tailored internal disposition shapes it.
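Read this way, monitoring is the always-on composition of the other three layers. A hypothetical sketch, with stubbed components standing in for the techniques above:

# Hypothetical composition of the three layers into always-on monitoring.
# All stubs are illustrative assumptions, not Ordinary Wisdom's implementation.
def ground(prompt: str) -> str:          # feedforward: portfolio orientation
    return "Reason from the curated portfolio's values.\n" + prompt

def generate(prompt: str) -> str:        # the model (stub for a real LLM call)
    return "Draft response to: " + prompt

def harness_ok(text: str) -> bool:       # feedback: semantic harness (toy check)
    return "definitely" not in text.lower()

def permitted(text: str, forbidden: list[str]) -> bool:  # governance (toy check)
    return all(phrase not in text.lower() for phrase in forbidden)

def respond(prompt: str, forbidden: list[str]) -> str:
    """Every interaction passes through all three layers, observed or not."""
    draft = generate(ground(prompt))
    if not (harness_ok(draft) and permitted(draft, forbidden)):
        return "I need to reconsider that answer."
    return draft

print(respond("Should the train be diverted?", ["medical diagnosis"]))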

What Matters Is What’s Inside

The combination of these four components – feedforward through curated wisdom, corrective feedback through the semantic harness, contextual governance through sub-constitutions, and monitoring through a portfolio-aligned worldview – constitutes a genuinely novel approach to AI alignment. It does not replace the important top-down efforts that we began by describing. Those approaches are necessary, but they are not sufficient.

What even diligent auditing cannot reach is the interior of the system – the values and orientations that determine how AI behaves in the unobserved majority of its interactions. Risk managers know this quandary of lip service versus lived risk all too well.

Ordinary Wisdom is building the inside-out AI alignment complement to the outside-in infrastructure that already exists. This complement is essential if organizations are to take back control of the value alignment of the AI systems on which they increasingly rely. The goal, in the end, is the same: AI that is trustworthy not merely because it is audited, but because it has, in the deepest sense available to a machine, been given a conscience.

 

Professor Michael Mainelli is Chairman of Z/Yen Group and Ordinary Wisdom, and President of the London Chamber of Commerce & Industry. A qualified accountant, securities professional, computer specialist, and management consultant, educated at Harvard University and Trinity College Dublin, he earned his PhD at the London School of Economics. Michael’s AI experience dates from large-scale neural networks in the 1970s to support vector machines in the 1990s. He is a prolific writer; his book co-authored with Ian Harris, The Price of Fish: A New Approach to Wicked Economics and Better Decisions, won the 2012 Independent Publisher Book Awards Finance, Investment & Economics Gold Prize. He was Lord Mayor of London 2023-2024.

Maury Shenk is an entrepreneur, lawyer, investor, and AI-focused innovator, who co-founded Ordinary Wisdom in 2025 with a mission to enable “AI with a worldview.” He is a dual-qualified US/UK lawyer and former managing partner of the London office of Steptoe & Johnson, where he remains an advisor. A graduate of Harvard College and Stanford Law School, he is founder and CEO of PlaylistBuilder, senior legal advisor for testing and certification company PeopleCert, and founder and managing director of Lily Innovation, through which he handles a portfolio of other investment and advisory activities.