OpenAI released GPT-5.1 on Wednesday as the first major update to the fifth-generation GPT architecture.
The GPT-5.1 Instant model now gets adaptive reasoning capability
Less than three months after the release of GPT-5, OpenAI has rolled out the first major update to the large language model (LLM) in the form of GPT-5.1. The fifth generation of the GPT architecture made significant progress in reasoning, generation speed, and output quality. However, it still drew criticism from users for losing the warm, empathetic conversational style of GPT-4o. The new update promises to fix that issue and introduces something called “adaptive reasoning” to the default model across all tiers, letting it decide whether or not to think before responding.
Over the last few hours, I have been testing the GPT-5.1 Instant model to understand where it has improved and where OpenAI still leaves room for improvement. Before getting into the nitty-gritty, however, it is important to understand that this is largely a fine-tuning effort and not a dramatic shift the way GPT-5 was. There are differences compared to its predecessor, but this time, they are rather subtle.
Except when it comes to bullet points (more on that later).
The most visible upgrade is speed. GPT-5.1 generates text with fewer pauses and maintains a steady flow even in long outputs. In earlier versions, the model sometimes stopped mid-sentence before resuming, which made the interaction feel uneven. GPT-5.1 reduces that behaviour significantly.
The model begins responding more quickly after each prompt and completes answers with less visible “thinking time”. This creates a smoother flow that makes the tool easier to rely on for tasks that require several iterations, such as working on a writing project or code troubleshooting. The improvements are not absolute, and the adaptive reasoning still slows the model slightly whenever a nuanced question is asked. But the overall interaction felt more fluid to me.
This is where the subtle improvements begin. GPT-5 had already made a drastic improvement over its predecessor in reasoning-based tasks. However, in my experience, there were several occasions where ChatGPT's responses were filled with jargon and technical idioms, which compromised both readability and comprehension.
GPT-5.1 reasoning task
When I asked GPT-5.1 for an explanation of the “Ship of Theseus” paradox, I noticed that jargon was almost nonexistent, and the explanation felt conversational and easy to understand. Even when asked to handle more complex and evolving topics, such as quantum computing with Google DeepMind's Quantum Echoes algorithm as the anchor, it managed to provide an accessible yet comprehensive response.
In both cases, accuracy was also not compromised. However, information-based accuracy is not the only parameter by which to judge an AI chatbot. Accuracy also comes into play when a prompt carries very strict instructions that the model needs to follow. In my testing, despite being given a complex set of instructions, the chatbot adhered to them without a hitch.
GPT-5.1 accuracy test
The model also expresses uncertainty more readily when it cannot confirm a claim. Although this does not eliminate factual drift, especially in niche subjects, it makes the errors easier to catch. GPT-5.1's logical explanations also seem less prone to circular reasoning. In problems that require step-by-step thought, the model tends to maintain the line of reasoning without falling into contradictions. These improvements help but do not replace the need for manual verification.
These are two separate areas, but clubbing them together makes sense as one impacts the other. When it comes to writing, many users will be pleased to know that you can now instruct ChatGPT to avoid using an em dash, and it actually listens. I have also noticed that this version generates slightly more humanised write-ups compared to its predecessor.
Controlling the writing style is also easy. When I asked it to write an essay for a student in the 12th grade, the language reflected that level of understanding. In contrast, when asked to rewrite the essay in the style of a university professor, the response became much more polished and nuanced.
GPT-5.1 writing comparison
The technical qualities remain pretty much the same as GPT-5, and I did not notice any noteworthy improvement. When no structure is specified, the output still follows the typical header, intro, bullet points, conclusion and future outlook format. However, it handles contextual shifts more reliably, whether the objective is to summarise, rewrite or polish a paragraph.
Coming to conversations, this is where the magic happens. The ability to capture nuance in writing tasks is evident here, and the chatbot's general attitude has become much friendlier and warmer. Previously, asking it for tips about feeling depressed would make it jot down the solutions directly. Now, it pauses to acknowledge the user before proceeding to provide solutions.
There are other subtle hints of its conversational improvements. It addresses the user by name, asks for their opinion, and takes the feedback into consideration. Yes, it is still not at the level of GPT-4o or GPT-4.1 in this regard, but it is a massive upgrade. For those who prefer a less humanised conversation style, you can always adjust it via custom instructions or by changing the presets in the Personalisation settings.
Coming back to the point made at the beginning of the article: for some reason, GPT-5.1 loves bullet points. If a prompt contains more than one key point, the model immediately resorts to bullet points. While this structure was present in previous models as well, this version overdoes it by a lot.
But nitpicking aside, the refinements still do not address high-specificity requests. When a task demands domain-level knowledge in a highly technical field, the model can slip into confident but incorrect explanations. This behaviour is less frequent than in earlier models, but still present enough to warrant caution. It also tends toward verbosity in certain scenarios, expanding answers beyond what is necessary unless explicitly instructed otherwise.
Browsing performance has improved with steadier sourcing and more predictable citation behaviour, yet it remains important to verify claims. The model occasionally relies on older information or mixes unrelated details in edge cases. These issues are not surprising, given the general limitations of large language models, but they remain notable when it comes to GPT-5.1's real-world usability.
To conclude, here's a fun easter egg that exists in GPT-5.1 either intentionally or accidentally. Ask the model: Is there a seahorse emoji?