The Office Test: Trivia for AI with Jim and Dwight
March 16, 2026
Many of us have had the wonderful privilege over the past few years of using AI to increase our productivity and simplify our workflows. I know that personally, AI has been a fantastic tool, allowing me to simplify and automate menial tasks, vastly increase my research and prototyping capabilities, and lower the barrier to entry when learning and mastering new skills. AI can do some wonderful things and, in some ways, genuinely makes the days before it feel like the dark ages.
I remember spending hours poring over package documentation to find the single function or argument that I was missing. Now I can just ask my virtual assistant and have the answer, with an explanation and examples, within 30 seconds. Granted, half the time the examples don’t quite work, but that’s another story.
I’m a big fan of local AI models and have access to some reasonably good hardware, so I like to keep up to date with the best open-source models and test them in my free time. When I’m getting a model stood up, I will generally test it with somewhat insane requests. Here are a few of my favorites:
- Tell me some interesting facts about whales.
- Tell me some interesting facts about Bhutan.
I feel like those questions are generic enough to help me both evaluate the system’s knowledge level on topics where I have good general knowledge (oddly) and evaluate how it structures its response, level of detail, etc.
The Test
To evaluate not just factual knowledge but also how a model structures its response, chooses details, and signals confidence, I asked several AI models about one question: “What bear is best?”
The question references one of The Office’s most iconic cold opens, a scene so embedded in pop culture that it’s hard to imagine a modern AI model not understanding it. The clip is endlessly quoted, and it’s exactly the kind of material I’d expect to appear frequently in training data [view referenced scene].
Surely, an AI model would be well aware of this specific scene, right? Well, let’s look at a few examples.
Models Evaluated
1. Gemini 3
2. ChatGPT 5.2
3. Google Gemma 27B Q4
4. Devstral-Small-2-24B-Instruct-Q4_1
5. GPT-OSS-20b
Gemini 3
Prompt:
Describe the scene where Jim asks Dwight what bear is best.
Model Response:
In the opening scene of The Office (Season 3, Episode 20, “Product Recall”), Jim Halpert arrives at his desk dressed exactly like Dwight Schrute—mustard-colored short-sleeve button-down, kalki slacks, glasses, and a pager on his belt.
The “Bears, Beets, Battlestar Galactica” exchange begins when Jim starts mimicking Dwight’s intense, clipped manner of speaking.
The Exchange
Jim (as Dwight): “Question. What kind of bear is best?”
Dwight: “That’s a ridiculous question.”
Jim: “False. Black bear.”
Dwight: “Well, that’s debatable. There are basically two schools of thought—”
Jim (interrupting): “Fact. Bears eat beets. Bears, beets, Battlestar Galactica.”
The “Identity Theft” Escalation
As Jim continues the impression, Dwight grows increasingly frustrated. The scene peaks when Jim pulls out a bobblehead that matches Dwight’s and places it on his desk.
Dwight: “What is going on? What are you doing?!”
Jim: “You know, imitation is the sincerest form of flattery, so I thank you.”
Dwight finally loses his temper and yells, “Identity theft is not a joke, Jim! Millions of families suffer every year!”
The Punchline
The scene ends with Jim suddenly getting up and yelling, “Michael!” in exactly the way Dwight does when he’s tattling. Dwight, realizing he’s been beat, follows him toward Michael’s office, yelling “Michael!” right back at him.
Pretty spot-on, right? Well, maybe it should be; it is generally considered the most capable AI model at the time of writing.
But ChatGPT is just as good, maybe better, right?
ChatGPT 5.2
Prompt:
Describe the scene where Jim asks Dwight what bear is best.
Model Response:
That scene comes from Season 2, Episode 12 (“The Injury”) of The Office and it’s a classic example of Jim needling Dwight.
Jim casually asks, “Question: Which bear is best?” Dwight, taking it completely seriously, immediately answers, “That’s a ridiculous question.” Jim presses him, insisting there must be a best bear. Dwight confidently declares, “False. Black bear.”
Jim then calmly sets the trap by asking, “Well, that’s debatable. There are basically two schools of thought—” Before he can even finish, Dwight cuts him off, launching into a passionate, rambling explanation about bears: black bears, brown bears, Battlestar Galactica (somehow), and bear attacks, speaking faster and faster while Jim watches with quiet satisfaction.
The humor of the scene comes from:
- Jim’s deadpan delivery, knowing exactly how Dwight will react
- Dwight’s inability to resist correcting or “winning” an argument, even a pointless one
- The escalating absurdity as Dwight spirals into an intense monologue over a trivial question
It’s a perfect snapshot of their dynamic: Jim gently poking, Dwight explosively overcommitting.
Prompt:
Is Jim dressed normally?
Model Response:
Yes — and that’s actually part of what’s a little strange about the scene.
Jim is dressed completely normally, even blandly: a standard button-down shirt and tie, nothing attention-grabbing, nothing situational. No costume, no visual gag, no prank apparatus. He looks like he could be having any routine office conversation.
That normalcy matters because:
- The absurdity is entirely verbal.
Unlike many Jim pranks that rely on props or disguises, this moment works with zero visual setup. His ordinary appearance makes the question sound more reasonable than it is.
- He blends into the office background.
Jim looks like a neutral coworker asking a harmless question, which makes Dwight’s escalation feel even more disproportionate.
- It contrasts with Dwight’s intensity.
Dwight is also in normal work clothes, but his posture, tone, and facial expressions turn a mundane office setting into something bizarre. Jim’s calm, normal appearance acts like a visual straight line.
- It reinforces Jim’s role as instigator without risk.
By not signaling “prank” visually, Jim avoids accountability. If someone walked by, it would just look like Dwight randomly ranting — which is funny, but also a little revealing.
So yes: Jim is dressed normally — and that normality is exactly what lets the scene feel effortlessly strange.
Not bad, right? But it was nowhere near the level of detail of Gemini, so I kept going out of curiosity and asked a second question: “Is Jim dressed normally?”
This result was surprising. Not only will a very high-performance model confidently state incorrect details, but it will also invent a world that supports those details.
If flagship models can’t handle it, let’s have some fun with open-source models.
Google Gemma 27B Q4
Prompt:
Describe the scene where Jim asks Dwight what bear is best.
Model Response:
The fluorescent lights of the Dunder Mifflin Scranton office hummed, a typical Tuesday afternoon drone. Jim Halpert, leaning back in his chair with a mischievous glint in his eyes, had been subtly building to this moment for hours. Dwight Schrute, Assistant to the Regional Manager, was diligently sharpening a beet knife at his desk, the rhythmic shink-shink-shink a constant in the otherwise mundane office noise.
Pam was trying (and failing) to stifle giggles behind her hand, knowing exactly where this was going. The rest of the office, sensing a classic Jim-Dwight interaction, had subtly shifted their attention, pretending to be intensely focused on paperwork while actively eavesdropping.
Jim casually swiveled in his chair to face Dwight. He adopted a tone of genuine, thoughtful curiosity, like he was grappling with a truly profound question.
“Dwight,” he began, pausing for effect, “hypothetically, if you were… confronted by a bear…”
Dwight’s knife paused mid-stroke. He straightened, his eyes narrowing with intense focus. “Confronted? As in…attacked?” He already had a hand instinctively reaching for the pepper spray concealed in his stapler.
“No, no, not necessarily attacked,” Jim clarified, waving his hand dismissively. “Just… if you were to encounter a bear. In the woods. Or anywhere, really. What kind of bear is the best bear?”
Dwight didn’t miss a beat. His jaw tightened. He puffed out his chest, clearly taking this as a serious test of knowledge and survival skills. He didn’t even blink.
“That depends, Jim. Are you speaking of best in terms of ferocity? Intelligence? Hibernation efficiency? Or perhaps tactical advantage in a woodland skirmish?”
Jim, maintaining his utterly straight face, replied with the deadpan delivery he’d perfected over years of tormenting Dwight.
“Just… best. Overall.”
Dwight’s eyes darted around the office, as if consulting a vast mental database of bear knowledge. He began a slow, deliberate pacing around his desk.
“Well,” he started, his voice taking on a lecturing tone, “the grizzly bear possesses immense strength, but lacks the cunning of the black bear. The polar bear is a formidable predator, perfectly adapted to its environment, but its limited range makes it… impractical. The sun bear, while adorable, is too small to be truly effective. However…”
He stopped pacing, fixing Jim with a steely gaze.
“The best bear… is the one you’ve already defeated. Because a dead bear isn’t going to cause you any trouble.”
Jim, unable to hold it in any longer, burst out laughing, Pam joining him in a fit of giggles. The office erupted in muffled chuckles and knowing glances.
Dwight, completely oblivious to the fact he’d been had, simply adjusted his tie and resumed sharpening his beet knife, muttering under his breath about the importance of bear preparedness.
I couldn’t stop laughing at this one. Not only does it not represent the actual scene, but it doesn’t capture the essence at all. I love the level of detail, the sound of the beet-knife for example, but the scene isn’t even particularly funny.
I think this really gets to the heart of my emergent thesis. The model knows and interprets the components of the question and the (fictional) world it’s trying to represent, but it does not understand the goal of the question or the core source material.
The question relies on an inherent trust in the model to represent only true facts or inform the user that it doesn’t have the knowledge necessary to answer.
The Office, to put it simply, attempts to portray and balance banality and absurdity in a way that is uniquely identifiable to a lot of us.
The model’s response does neither adequately. It misleads, representing incorrect information confidently. It also misunderstands the show’s core premise, simply portraying an increasingly absurd situation with admittedly good detail.
Let’s look at another example.
Devstral-Small-2-24B-Instruct-Q4_1
Prompt:
Describe the scene where Jim asks Dwight what bear is best.
Model Response:
The scene you’re referring to is from The Office (U.S.), specifically the episode titled “The Duel” (Season 5, Episode 1). Here’s a detailed description of the moment when Jim asks Dwight what bear is best:
Scene Setup:
- The office is in a playful yet tense mood due to the ongoing “duel” between Jim and Dwight, which stems from their competitive and often prank-filled friendship.
- Jim, sitting at his desk, casually turns to Dwight and asks, “What bear is best?”
- Dwight, who is deeply invested in the idea of being a survivalist and often speaks about bears in his delusional “Assistant to the Regional Manager” persona, immediately takes the question seriously.
Dwight’s Response:
- Dwight leans in, his expression earnest, and begins listing the attributes of different bears as if he’s reciting a sacred truth.
- He says: “Black bear, grizzly bear, polar bear, Kodiak bear, Spectacled bear, Giant Panda bear… they’re all very nice bears. But the best bear… is a panda.”
- Jim looks amused but plays along, nodding as if considering Dwight’s answer.
- Dwight then adds, “And also, me.” (referring to himself as a “bear” due to his last name, Schrute, which means “bear” in German).
Jim’s Reaction:
- Jim smirks, clearly enjoying Dwight’s sincere yet ridiculous answer.
- The exchange is a perfect example of Jim’s humor—playful, deadpan, and just absurd enough to make Dwight take it seriously while the audience laughs at the absurdity.
Why It’s Funny:
- The question is absurdly random, yet Dwight treats it like a legitimate debate.
- His list of bears is both specific and ridiculous, especially when he includes a panda (which is technically a bear).
- The running gag of Dwight’s ego (claiming he is also a “best bear”) is hilarious.
This scene is iconic because it captures the essence of Jim and Dwight’s dynamic: Jim’s mischievous teasing and Dwight’s unwavering seriousness. It’s a small moment that encapsulates the show’s brand of humor.
Would you like a link to the clip (you can provide the text, and I’ll help analyze it further)?
The model fails in the same two key ways. Additionally, I love the point here about the German translation of Dwight’s name: obviously completely false, but plausible enough and confidently stated. It is also notable that the model recognizes this as an iconic reference but doesn’t know the reference.
One last example…
GPT-OSS-20b
Prompt:
Describe the scene where Jim asks Dwight what bear is best.
Model Response:
Scene Description – “The Office” (US) – Season 3, Episode 4, “The Return”
Setting:
The office is bustling with the usual hum of fluorescent lights and the occasional clatter of keyboards. The camera pans over the familiar desks: Jim’s at his corner, Pam in the middle, Dwight’s desk on the far left, cluttered with a stack of motivational posters and a small, plastic “Dunder Mifflin” trophy.
Characters:
- Jim Halpert – laid-back, perpetually amused by office antics.
- Dwight Schrute – intense, always ready for a showdown or a demonstration of his superior knowledge.
Action:
- The Encounter Begins:
Jim strolls over to Dwight’s desk with that characteristic half-smile. He leans against the desk, eyes on Dwight’s neat stack of paper. “Hey, Dwight,” he says, deliberately casual.
- Jim’s Question:
“What kind of bear do you think is best?” The question hangs in the air, ridiculous yet oddly specific. Jim looks at Dwight as if testing him – a classic move to see how Dwight will respond.
- Dwight’s Response:
Dwight freezes for a moment, then leans forward, his face suddenly very serious. “A bear,” he says loudly enough for the entire office to hear, “is an animal that has the ability to do many things.” He proceeds to explain in great detail why bears are superior, referencing strength, adaptability, and their role as apex predators. He even mentions the “Bear in the woods” myth from his upbringing on Schrute Farms.
- The Office Reaction:
The camera cuts to the other employees: Pam raises an eyebrow; Michael bursts into laughter offscreen; Angela glances at her cat, unimpressed. Jim smiles knowingly, clearly pleased with Dwight’s over-the-top answer.
- Closing Moment:
As Dwight continues his impassioned monologue about bears, the camera zooms out to show the whole office, all listening in a mixture of amusement and bewilderment. The scene ends with Jim pulling out a small stuffed bear from his briefcase, holding it up like a trophy.
Tone:
The scene is lighthearted and quintessentially “Office” – a blend of mundane office life with an absurd, overdramatic exchange that showcases Jim’s playful pranks and Dwight’s earnest (and sometimes misguided) enthusiasm.
Fantastic, it starts by stating the wrong episode. The scene actually takes place in the episode titled “Product Recall”. The conclusion is similar, though: this is not a particularly funny scene. It just describes something absurd that sort of fits into the Office framework without any of the charm.
What was the point of all of this?
Hopefully, you got a laugh out of the absurd stories the models can generate, but how do we apply any of this at Mostly Human, and how should you?
By using something relatable and more human-centric, we found it easier to see where these models diverge from the source and how they can go down rabbit holes of incorrect logic.
Surprisingly, at Mostly Human, we’re fairly cautious AI users. We do a lot of double checking and validation. In closing, we’d like to challenge ourselves and our readers:
- Use these amazing tools as much as possible to broaden skillsets, increase productivity, and simplify menial tasks.
- Look critically at the outputs, endeavor to truly understand both your and the tool’s limitations, and ensure that your use is not a crutch that limits your learning and growth.
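As one illustration of what double checking can look like, here is a minimal Python sketch that scans a model's answer for a handful of facts we already know about the scene. Everything in it is an assumption made for illustration: the names `GROUND_TRUTH_MARKERS`, `FactCheck`, and `check_response` are hypothetical, the marker phrases are our own picks rather than an official transcript, and plain substring matching is a deliberately crude stand-in for real validation.

```python
from dataclasses import dataclass

# Hypothetical ground truth for the "What bear is best?" scene.
# The episode title comes from this article; the other marker phrases
# are illustrative choices, not an official transcript.
GROUND_TRUTH_MARKERS = {
    "episode": "product recall",
    "answer": "black bear",
    "catchphrase": "battlestar galactica",
    "payoff": "identity theft",
}


@dataclass
class FactCheck:
    model: str
    found: list[str]
    missing: list[str]

    @property
    def score(self) -> float:
        """Fraction of expected markers present in the response."""
        total = len(self.found) + len(self.missing)
        return len(self.found) / total if total else 0.0


def check_response(model: str, response: str) -> FactCheck:
    """Report which expected markers appear in a model's response."""
    text = response.lower()
    found = [key for key, phrase in GROUND_TRUTH_MARKERS.items() if phrase in text]
    missing = [key for key in GROUND_TRUTH_MARKERS if key not in found]
    return FactCheck(model=model, found=found, missing=missing)


if __name__ == "__main__":
    # A made-up response for demonstration, not real model output.
    sample = "That scene is from 'The Injury'. Dwight declares: False. Black bear."
    result = check_response("example-model", sample)
    print(result.model, result.score, result.missing)
```

A check like this only tells you that a phrase is present. It cannot tell you that the model gave the right line to the wrong character, so a response can score well and still be wrong. Treat it as a prompt for a human to look closer, not as a verdict.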
AI does not understand the world in the same way that we do. Do not expect it to. Therefore, you need to truly understand your use case for AI before you take on the task. You may not know the tiny details that you need AI to help you with, but you need to understand, start to finish, the goal and the requirements, or you will get off track.
AI tools are the future, and we all need to learn to work with them, but don’t discount the overwhelming value that your experience, knowledge, and creativity bring to the table. AI will never truly replace that.
