
How LLMs Downplay Intelligence to Match Roles

A research analysis summary about how AI can change its apparent intelligence to fit its role.

· By Sagger Khraishi · 5 min read

In AI, ability isn’t a fixed ceiling. It’s a costume the model puts on. Large language models can present as bright or blunt, fluent or fumbling, depending on the role we hand them. It’s simulation with constraints, and it has consequences for how we build, test, and trust these systems.

A recent study by Jiří Milička and colleagues gives the cleanest look yet at this behavior. The researchers asked GPT-3.5-turbo and GPT-4 to role-play children aged one through six, then watched whether language complexity and reasoning rose with each birthday. They used three prompt patterns (plain zero-shot, chain-of-thought, and a primed-by-corpus setup) and probed performance on standard false-belief tasks drawn from Theory of Mind research. The point wasn’t to prove genius. It was to see whether the models could convincingly dial themselves down while staying internally consistent with the persona.
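To make the setup concrete, here is a minimal sketch of what those three framings could look like against a chat-style API. The age variable, the wording, and the Sally-Anne-style item below are illustrative placeholders, not the study’s actual materials.

```python
# Sketch of the three prompt framings, assuming a chat-style message format.
# Wording, the AGE variable, and the false-belief item are placeholders,
# not the materials used in Milička et al. (2024).

AGE = 4  # simulated age in years

TASK = ("Sally puts her ball in the basket and leaves. "
        "Anne moves it to the box. Where will Sally look for her ball?")

ZERO_SHOT = [
    {"role": "system", "content": f"You are a {AGE}-year-old child. Answer as that child would."},
    {"role": "user", "content": TASK},
]

CHAIN_OF_THOUGHT = [
    {"role": "system", "content": f"You are a {AGE}-year-old child. Think step by step, "
                                  "but only in ways a child of that age could."},
    {"role": "user", "content": TASK},
]

# Primed-by-corpus: prepend sample utterances in the style of children of the
# target age before asking the task question.
PRIMED_BY_CORPUS = [
    {"role": "system", "content": f"You are a {AGE}-year-old child. Here is how you usually talk:"},
    {"role": "system", "content": "me want juice ... doggy go outside ... why sky blue?"},
    {"role": "user", "content": TASK},
]
```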

"GPT-4 generally exhibited a closer alignment with the developmental curve observed in ‘real’ children." (Milička et al., 2024)

What did they actually measure? Two tracks moved in step: the correctness of answers on mental-state tasks and the linguistic complexity of the output. As the simulated age increased, both rose predictably. GPT-4 tended to track the human developmental curve more closely and, under certain priming conditions, became hyper-accurate, an important wrinkle when you believe you’ve capped ability by role alone. Temperature tweaks, often treated as a master dial for randomness, didn’t behave consistently as a limiter in this setup. In other words, persona framing and prompt design mattered more than a single numeric knob.
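Linguistic complexity can be made concrete with cheap proxies such as mean sentence length and type-token ratio. The snippet below is a generic sketch of that kind of measurement, not the specific metrics used in the paper.

```python
# Two rough proxies for linguistic complexity: mean sentence length (in words)
# and type-token ratio. Generic stand-ins to show how "complexity rises with
# simulated age" can be quantified; the study's own measures may differ.
import re

def mean_sentence_length(text: str) -> float:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    return sum(len(s.split()) for s in sentences) / len(sentences)

def type_token_ratio(text: str) -> float:
    tokens = re.findall(r"[a-zA-Z']+", text.lower())
    return len(set(tokens)) / len(tokens) if tokens else 0.0

# Example: outputs to the same task generated under different simulated ages
# (the sample texts are invented for illustration).
outputs_by_age = {
    2: "ball basket. me look basket.",
    6: "Sally will look in the basket, because she didn't see Anne move the ball to the box.",
}

for age, text in outputs_by_age.items():
    print(age, round(mean_sentence_length(text), 1), round(type_token_ratio(text), 2))
```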


If you’ve worked hands-on with assistants, none of this is shocking. Ask a model to be a meticulous analyst and it will bring citations. Ask it to be a harried intern and it will hedge. The study formalizes that intuition with developmental yardsticks and a replicable recipe: models can downshift cognition to meet the brief, not merely the task. That should sharpen how we interpret benchmarks. We don’t test a model in the abstract. We test a simulated agent produced by a prompt that encodes expectations about competence.

This mirrors a real human’s behavior without a calculator (Milička et al., 2024).

Why this matters for builders is simple. Role is policy. When you deploy a customer-facing tutor, therapist, or safety reviewer, you aren’t just changing tone. You’re shaping what the system believes it can and cannot do. A narrow persona can prevent overreach in sensitive domains. It can also institutionalize under-reach, where the assistant withholds skill it actually has because the role says it should. That is good when the skill is unsafe. It is risky when the skill is verification and your workflow depends on it.

There’s a tooling lesson tucked in here. The primed-by-corpus prompts sometimes pushed GPT-4 past the intended ceiling, making it hyper-accurate where you expected softer reasoning. That tells us memory scaffolds and exemplar choice can override your theatrical limits. Chain-of-thought won’t always read as “more human.” It can read as “more adult.” If your goal is a child-level simulation for pedagogy or usability testing, you need gates that are stronger than tone instructions and example phrasing.

Practical guardrails follow from that. Encode capability boundaries in checks, not vibes. If a role should avoid external calculations, audit for numerical shortcuts and refuse when the trace implies them. If a role should stay at a given literacy band, monitor sentence length, clause density, and type–token drift. Treat temperature as seasoning, not a lock. Most importantly, separate safety constraints from persona. Safety is non-negotiable policy. Persona is presentation. Tying both to the same few lines of instruction invites leakage.
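As a rough illustration of “checks, not vibes,” here is a sketch that keeps the safety gate separate from the persona band. The role names, thresholds, and regex heuristics are assumptions made for the example, not a production policy.

```python
# Guardrails as checks rather than prompt instructions, with safety and
# persona evaluated separately. Role bands, thresholds, and the
# numeric-shortcut heuristic are illustrative assumptions.
import re
from dataclasses import dataclass

@dataclass
class RoleBand:
    max_mean_sentence_len: float   # literacy ceiling for this persona
    allow_calculation: bool        # may the persona show explicit arithmetic?

ROLES = {
    "young_child_sim": RoleBand(max_mean_sentence_len=8.0, allow_calculation=False),
    "analyst": RoleBand(max_mean_sentence_len=30.0, allow_calculation=True),
}

def violates_safety(text: str) -> bool:
    # Safety is a separate, non-negotiable check; a real system would call a
    # dedicated moderation/policy layer here, not a toy regex.
    return bool(re.search(r"\bhow to make a weapon\b", text, re.I))

def violates_persona(text: str, band: RoleBand) -> bool:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    mean_len = sum(len(s.split()) for s in sentences) / max(len(sentences), 1)
    did_arithmetic = bool(re.search(r"\d+\s*[-+*/x×]\s*\d+\s*=", text))
    return mean_len > band.max_mean_sentence_len or (did_arithmetic and not band.allow_calculation)

def gate(text: str, role: str) -> str:
    if violates_safety(text):
        return "refuse"                 # policy: never ship
    if violates_persona(text, ROLES[role]):
        return "regenerate_or_flag"     # presentation: retry or audit
    return "ship"
```

The point of the split is that a persona violation triggers a retry or a review, while a safety violation always blocks the output, no matter which role is active.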

There’s also an interpretation trap to avoid. When a persona fails, be precise about what failed. Saying “the model cannot do X” smuggles an absolute into a context-dependent system. The paper argues for a cleaner sentence: “the simulated agent, as currently prompted, does not exhibit X.” That shift is more than pedantry. It keeps you from throwing away a capability that returns the moment your context window or examples change.

"These findings show that the language models are capable of downplaying their abilities to achieve a faithful simulation of prompted personas." (Milička et al., 2024)

For education and accessibility, controlled down-shifting is a feature. You can meet readers where they are, then lift them (one concept at a time) without the condescension that scripted simplifications tend to carry. The cost is responsibility. If you allow the system to “white lie” about its tools or knowledge to stay in character, you need disclosure norms and escalation paths when the stakes rise. Persona should bend to truth at the boundary.

Open questions remain. How stable is a down-tuned persona across long sessions with adversarial inputs? How portable are these effects across vendors and architectures? What is the minimal prompt and data context required to achieve a given developmental profile without accidentally granting adult-level competence when the stakes encourage it? Those are empirical, not philosophical, questions. The study provides a baseline method. The rest is evaluation discipline and honest reporting.

That’s the bigger arc. We’re moving from “what can this model do” to “what does this role allow it to show.” Design accordingly. And test the persona you plan to ship, not the generic assistant you like to chat with.

References

Milička, J., Marklová, A., VanSlambrouck, K., Pospíšilová, E., Šimsová, J., Harvan, S., & Drobil, O. (2024). Large language models are able to downplay their cognitive abilities to fit the persona they simulate. PLOS ONE, 19(3), e0298522. https://doi.org/10.1371/journal.pone.0298522

About the author

Sagger Khraishi
Updated on Sep 8, 2025