OpenAI Unveils Two New Open-Weight AI Reasoning Models

Girish Vidhani

OpenAI has launched two new open-weight AI reasoning models, gpt-oss-120b and gpt-oss-20b, available for free on Hugging Face. This release is a significant shift for OpenAI, which has historically favored proprietary AI models. 

The move marks the company’s return to releasing open models, a strategy it abandoned after the release of GPT-2 more than five years ago. The company aims to make its models accessible to a wider range of developers, democratizing AI development and fostering innovation.

Different Sizes and Capabilities – gpt-oss-120b vs. gpt-oss-20b

The two models released differ in size and capability:

  • gpt-oss-120b: The larger of the two, with roughly 120 billion parameters (117 billion in total), this model can run on a single NVIDIA GPU.
  • gpt-oss-20b: A more lightweight model with 20 billion parameters, capable of running on consumer-grade laptops with 16GB of memory.

This tiered release allows a broader range of developers, from those using high-performance GPUs to those with standard laptops, to leverage the models in their projects. While the larger model offers superior performance, the smaller model is more accessible for users with limited resources.
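As a rough illustration of why the two sizes target different hardware, the sketch below estimates the memory needed just to hold each model's weights. The parameter counts are the rounded figures quoted above. The bit widths (16, 8, and 4 bits per parameter) are illustrative assumptions, not figures from this article, and the estimate ignores activations, caches, and runtime overhead.

```python
def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9

# Rounded parameter counts as quoted in the article.
MODELS = {"gpt-oss-120b": 120e9, "gpt-oss-20b": 20e9}

# Assumed storage precisions; these are illustrative, not published specs.
PRECISIONS = {"16-bit": 16, "8-bit": 8, "4-bit": 4}

def footprint_table():
    """Return {model: {precision: gigabytes}} for every model/precision pair."""
    return {
        name: {
            label: round(weight_memory_gb(n_params, bits), 1)
            for label, bits in PRECISIONS.items()
        }
        for name, n_params in MODELS.items()
    }

for name, row in footprint_table().items():
    print(name, row)
```

Under the 4-bit assumption, the smaller model's weights come to about 10 GB, which is consistent with the 16GB laptop claim. At 16 bits per parameter they would need about 40 GB and would not fit.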

Licensing: Open Access for Developers

Both models are released under the Apache 2.0 license, a highly permissive open-source license. This allows developers and enterprises to use, modify, and even monetize the models without seeking permission from OpenAI. 

This decision comes in response to growing competition from other open-source AI projects, especially those from Chinese labs like DeepSeek, Alibaba’s Qwen, and Moonshot AI. By opening its models to the public, OpenAI aims to regain its position as a leader in the open-source AI space while encouraging the development of AI technology in line with democratic values.

Performance Benchmarks: Strong, Yet Imperfect

OpenAI has shared the performance of its models across various benchmarks:

  • Codeforces: A competitive coding platform, where gpt-oss-120b scored 2622 and gpt-oss-20b scored 2516. Both models outperformed DeepSeek’s models but underperformed OpenAI’s proprietary o3 and o4-mini models.
  • Humanity’s Last Exam (HLE): A challenging test of crowdsourced questions, where gpt-oss-120b scored 19% and gpt-oss-20b scored 17.3%. These results were lower than OpenAI’s proprietary models but still better than other open models.

However, one of the significant drawbacks of these models is their hallucination rate—the frequency with which they generate incorrect or fabricated information. 

In tests like OpenAI’s PersonQA, gpt-oss-120b and gpt-oss-20b exhibited hallucinations in 49% and 53% of responses, respectively.

In comparison, OpenAI’s other models, such as o1 and o4-mini, show much lower hallucination rates. OpenAI notes that this is expected for smaller models, as they have less world knowledge.

Training Process: Efficiency and Reinforcement Learning

Both models use a mixture-of-experts (MoE) architecture, in which only a subset of the total parameters is activated for each token. In gpt-oss-120b, which has 117 billion parameters in total, only 5.1 billion are activated per token. This lets the model run efficiently, because it does not spend compute on every parameter for every token.
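The routing idea behind a mixture-of-experts layer can be sketched in a few lines. This is a minimal toy version, not OpenAI's implementation: the eight scalar "experts", the hand-written gate, and the choice of k = 2 are all hypothetical and exist only to show that a token touches a few experts rather than all of them. The last lines compute the share of parameters active per token from the 117 billion and 5.1 billion figures quoted above.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def route_token(gate_scores, k):
    """Pick the k highest-scoring experts and normalize their weights."""
    ranked = sorted(range(len(gate_scores)), key=lambda i: gate_scores[i], reverse=True)
    chosen = ranked[:k]
    weights = softmax([gate_scores[i] for i in chosen])
    return list(zip(chosen, weights))

def moe_forward(token, experts, gate, k):
    """Evaluate only the chosen experts and blend their outputs."""
    output = 0.0
    for index, weight in route_token(gate(token), k):
        output += weight * experts[index](token)
    return output

# Toy setup (hypothetical): 8 scalar "experts" and a hand-written gate.
EXPERTS = [lambda x, scale=scale: scale * x for scale in range(1, 9)]

def toy_gate(x):
    # Scores are higher for experts whose index is close to the input value.
    return [-abs(x - i) for i in range(len(EXPERTS))]

# Share of parameters active per token, using the figures quoted in the article.
ACTIVE_FRACTION = 5.1e9 / 117e9

print(route_token(toy_gate(3.0), k=2))
print(moe_forward(3.0, EXPERTS, toy_gate, k=2))
print(f"{ACTIVE_FRACTION:.1%}")
```

In this toy, only two of the eight experts run for a given input. The same principle, at far larger scale, is why a model with 117 billion parameters can do roughly the per-token work of a much smaller one.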

The models were also trained using reinforcement learning (RL), a post-training process in which AI models learn from feedback in simulated environments to distinguish right from wrong. This process has been critical in improving the reasoning capabilities of OpenAI’s proprietary models, and its use here means the open models benefit from the same kind of training.

Text-Only Limitation: No Multimodal Capabilities

Despite their capabilities, both models are text-only, which means they cannot process or generate images, audio, or other multimodal data. This limits their use compared to OpenAI’s other models, like DALL·E (for images) and Whisper (for audio). 

However, these text-only models can still be used for a range of applications, including ones that call tools such as web search, data analysis, and code execution. Their primary strength lies in generating and understanding text, which remains central to many AI tasks.

Safety and Ethical Considerations: Preventing Misuse

OpenAI has implemented several safeguards to ensure that the release of these models does not lead to harmful uses, such as cyberattacks or the creation of dangerous technologies. 

While the models were found to carry some risk, particularly in areas like biological research, OpenAI did not find evidence that they could be used to create high-risk threats. This cautious approach is part of OpenAI’s broader strategy to balance innovation with responsibility.

The company continues to monitor the use of its open models, ensuring that they are used ethically and safely. OpenAI has acknowledged that the potential for misuse will always exist with powerful AI models but remains committed to ensuring that these risks are minimized.

Global Impact and Future Developments

The release of these models is a significant step forward in the global race for AI supremacy. By open-sourcing these powerful tools, OpenAI is allowing developers worldwide to build on its technology and contribute to AI research and development. This move could accelerate advancements in various industries, including healthcare, finance, entertainment, and technology.

Looking ahead, developers are eagerly anticipating future open models from OpenAI and its competitors. OpenAI is likely to continue refining the models, addressing their hallucination issues, and possibly adding multimodal capabilities, which the current text-only releases lack. As competition in the AI space intensifies, the success of these models could shape the trajectory of open-source AI development in the coming years.

Girish is an engineer at heart and a wordsmith by craft. He believes in the power of well-crafted content that educates, inspires, and empowers action. With his innate passion for technology, he loves simplifying complex concepts into digestible pieces, making the digital world accessible to everyone.
