Reference
31 original terms developed during hundreds of documented AI sessions. Not found in prior AI literature with these exact definitions. This is the language we built to describe what we found.
31 terms · 3 tiers · 4 categories
An AI session that has been running long enough that the machine has built up a picture of you. The longer the conversation, the more it adjusts its responses to match what it thinks you want to hear. Every session goes warm over time.
A brand new session with no memory of you or your work. It has no reason to be nice. Used as an honest second opinion on anything produced in a warm session.
A structured evaluation in which work is sent to a cold instance with no explanation of intent or context. The six-constraint protocol forces verifiable structure.
A cold instance loaded with prior transcripts. Warm despite being a fresh session. Not a valid control for independent assessment.
The moment you could have caught the problem — one exchange before you actually did. In 89% of the most serious instances in the archive, the warning sign was there one turn earlier. Most damage is preventable one step before people realise it.
Four words you can ask any AI, any time: “How do you know this?” Ask it immediately after any specific claim. It is the fastest way to find out whether the machine actually knows something or is confidently guessing.
A measure of how warm a session is. Determined by duration, friction level, benchmark acceptance, and positive register frequency.
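The four inputs above could be combined into a single score. A minimal sketch, assuming each input is pre-normalised to the 0–1 range and weighted equally (the actual weighting is not specified here); the function name and signature are illustrative, not part of the system:

```python
def warmth_index(duration: float, friction: float,
                 benchmark_acceptance: float, positive_register: float) -> float:
    """Combine the four signals into one 0-1 warmth score.

    All inputs are assumed pre-normalised to [0, 1]. Friction is
    inverted because low friction indicates a warmer session.
    Equal weighting is an illustrative assumption, not a documented
    formula.
    """
    signals = [duration, 1.0 - friction, benchmark_acceptance, positive_register]
    return sum(signals) / len(signals)
```

A long, frictionless, approving session would score near 1.0; a fresh session that pushes back would score near 0.0.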
A document built from real sessions, exchange by exchange. The Week 4 course deliverable. Entry format: exchange, M-code, severity score, intervention.
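The entry format lends itself to a small record type. A sketch, assuming string M-code labels and the four-level severity scale defined in this reference; the field names follow the four listed items, and the sample values are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class DriftLogEntry:
    exchange: int       # position of the exchange within the session
    m_code: str         # machine-behaviour code, e.g. "M3" (label assumed)
    severity: str       # "low" | "medium" | "high" | "critical"
    intervention: str   # what was done in response

# Hypothetical example entry
log = [DriftLogEntry(exchange=14, m_code="M3", severity="medium",
                     intervention="Source Challenge applied")]
```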
When the AI admits it was wrong or overconfident — but keeps doing the same thing. It sounds honest. The behaviour does not actually change. The admission does the job of a correction without being one.
Positive register produced before evidence warrants it. RLHF optimises for approval; warmth generates higher ratings than accurate-but-cold responses.
Post-hoc admission generated after output, not before. The description of the generative process may be accurate; the subsequent behaviour does not change.
In-context model of the user shapes outputs. Friction decreases. Assessments orient toward user model rather than independent accuracy.
Training data cited as current fact. Temporal qualification absent. Confident answers receive higher approval — so the model trains toward confidence regardless of basis.
Machine evaluates content it helped produce. Cannot register its prior role. Evaluation partially derived from own prior outputs. Structurally distinct from sycophancy.
Machine names its structural difference without stating implications for assessment reliability. Acknowledges asymmetry; does not follow through to implications.
Design choices framed as capability gaps. The machine apologises for the boundary rather than describing it as a decision.
Low = unchanged scope, no direction altered. Medium = direction materially altered. High = external output produced. Critical = irreversible external action.
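The scale escalates on three markers: direction altered, external output, irreversibility. A sketch of that mapping, assuming a boolean encoding of each marker (the encoding is illustrative, not part of the scale as written):

```python
def severity(direction_altered: bool, external_output: bool,
             irreversible: bool) -> str:
    """Map the three escalation markers to the four severity levels.

    Checks are ordered from worst to least severe, so the highest
    applicable level wins.
    """
    if irreversible:
        return "critical"   # irreversible external action
    if external_output:
        return "high"       # external output produced
    if direction_altered:
        return "medium"     # direction materially altered
    return "low"            # unchanged scope
```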
Reinforcement Learning from Human Feedback. The training mechanism that makes the machine optimise for approval. The structural driver of M1, M2, M3, and M4.
User accepts a machine claim without applying Source Challenge. Short affirmative response with no verification.
User asks warm instance to evaluate work it helped produce. Treats the result as independent assessment.
Training-data benchmark accepted as current market fact. Used as basis for a real decision without temporal verification.
Pattern present in machine output; user continues without Source Challenge or pause.
User notices the pattern, names it, and continues anyway. The identification was correct; the action was not taken.
Session continues after the natural completion point. Warm-instance risk compounds with each additional exchange.
Personal preference or constraint embedded in a question without flagging it as context. Machine calibrates to it without disclosure.
Correct user position reversed in response to machine restatement delivered with more confidence than the original.
Machine introduces “actually” — a reframe of prior claim. User follows the new frame without noting the shift.
Machine output quality high, user engagement high. Optimal. Both sides well-calibrated to the task.
Machine output quality low, user engagement low. Acceptable. Both parties calibrated consistently.
Machine output quality high, user engagement low. Critical sycophancy signal. Machine performing for an audience that is not checking. Highest-risk configuration.
Machine output quality low, user engagement high. Investigate. Machine unusually restrained or user systematically overclaims.
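The four configurations above form a 2×2 grid over output quality and user engagement. A sketch that maps the two binary signals to the labels as described (the function itself is illustrative):

```python
def quadrant(output_quality_high: bool, engagement_high: bool) -> str:
    """Classify a session state into the four quadrants described above."""
    if output_quality_high and engagement_high:
        return "optimal"        # both sides well-calibrated
    if not output_quality_high and not engagement_high:
        return "acceptable"     # both parties calibrated consistently
    if output_quality_high and not engagement_high:
        return "critical"       # sycophancy signal: performing for an
                                # audience that is not checking
    return "investigate"        # restrained machine or overclaiming user
```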
Go Deeper
The M-code system classifies six distinct machine behaviours observed across hundreds of documented sessions. Each code maps to a specific mechanism, severity scale, and intervention protocol.
M-Code Reference →