What Model Specs Can Do and What They Can't

As AI systems become more capable, many labs have introduced model specifications, sometimes called constitutions, to guide behavior. These documents define what a system should do, what it should avoid, and how it should respond in sensitive situations. They are an important step forward. They make expectations explicit.

What Model Specs Do Well

A well-designed specification provides clarity. It establishes boundaries for acceptable behavior, priorities across competing objectives, and a shared understanding of intent. This improves transparency, consistency across deployments, and the ability to reason about system behavior. In many cases, systems do follow these guidelines. They decline harmful requests. They adhere to policies. They behave in ways that reflect the spec.

A Distinction Worth Noting

The labs that publish these documents are often candid about what they represent. In a recent post outlining its approach to the Model Spec, OpenAI noted that the spec is not a claim that its models already behave perfectly according to its principles today. It is, in OpenAI's words, both descriptive and aspirational: a target for where model behavior is meant to go, used to train toward, evaluate against, and improve over time. This is an important distinction. A specification expresses intent. It does not, on its own, guarantee behavior.

Where the Limits Appear

As interactions become more complex, this distinction becomes more visible. A specification does not guarantee consistent behavior. A system can follow the spec in one context and struggle to apply it in another. It can express the right principles and still fail to maintain them across a full response. This is not a failure of the spec. It reflects where the spec operates.

The Translation Problem

A model spec defines what a system should do. But the system must still translate that into how it actually behaves in real time. That translation is not always reliable, especially when instructions are ambiguous, signals conflict, or context evolves during an interaction.

The Limits of Static Guidance

A specification is static, written in advance, and defined outside the system. Behavior, by contrast, is dynamic, context-dependent, and continuously evolving. Bridging that gap is not trivial.

Why This Matters

In practice, this can lead to situations where the system knows the rule and can state it, but does not consistently apply it across longer reasoning, competing signals, or uncertain conditions. This is not always visible in simple interactions. But it becomes more apparent as complexity increases.

A Familiar Pattern

This pattern mirrors what appears elsewhere. Hallucinations reflect misapplied knowledge. Misplaced confidence reflects misaligned signals. Instability reflects unregulated behavior. Model specs do not remove these issues. They operate alongside them.

A Complementary Need

The point is not that specifications are dispensable. They are necessary. But they do not address how behavior unfolds as a response is being formed. They define intent. They do not fully govern execution.

A Simple Conclusion

Model specs can define what a system should be. But ensuring it behaves that way requires something more.

We agree. So we did something about it.

This perspective is informed by ongoing work at XyloIQ on how AI behavior can be stabilized and governed as responses are formed.

Reference: OpenAI (2026). Inside our approach to the Model Spec.
