The seeds of HyperCLOVA X SEED have taken root, and now it’s time to grow a thriving forest. Today we introduce HyperCLOVA X THINK, a breakthrough model that combines deep understanding and reasoning in both Korean and English. With its expansive 128K context window, which lets it take in up to 128,000 tokens in a single pass, HyperCLOVA X THINK excels at understanding and analyzing long-form content. Our vision of creating an AI that thinks deeply in Korean has become reality.
Why “THINK”?
Sovereign AI: AI built for Korea
While global large language models have vast knowledge, they struggle with Korean language nuances and cultural context. Industries like education, government, and business require high accuracy and compliance—standards that only sovereign AI can meet by integrating local language, data, and policies.
HyperCLOVA X THINK is trained on 6 trillion tokens of Korean and English data curated with NAVER’s proprietary pipeline and AI technology. The result goes beyond simple language processing to deliver a model that understands the cultural sensibilities and linguistic sophistication Korean society requires.
Advanced reasoning: Not just answers, but thinking processes
Knowledge alone isn’t enough for real-world applications. In professional environments, AI must excel at three key areas:
- Understanding user questions precisely
- Executing tasks accurately and efficiently
- Providing clear explanations of its reasoning
Using its 128K context window, HyperCLOVA X THINK quickly processes long contexts and delivers accurate, helpful responses. What makes it special is its ability to provide supporting evidence and step-by-step reasoning alongside each answer. This transparency lets users evaluate the AI’s logic and build genuine trust in its conclusions.
HyperCLOVA X THINK at a glance
Data
- 6 trillion token corpus (primarily Korean and English)
- Enhanced with high-quality synthetic data
Architecture
- Transformer with a μP-based Peri-LN structure (a minimal sketch follows this list)
- Rotary position embedding (RoPE) extension supporting contexts up to 128K tokens
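Peri-LN ("peripheral layer normalization") places normalization at both the input and the output of every sublayer, rather than input-only as in the common Pre-LN design. The PyTorch sketch below is a minimal illustration of that idea; the dimensions, module names, and use of `nn.MultiheadAttention` are stand-ins, not the released model’s implementation.

```python
import torch
import torch.nn as nn

class PeriLNBlock(nn.Module):
    """Peri-LN Transformer block (illustrative): each sublayer is
    normalized at its periphery, i.e. at both its input and its output,
    so the residual stream always receives a normalized update."""

    def __init__(self, d_model: int = 512, n_heads: int = 8, d_ff: int = 2048):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
        )
        # One input-side and one output-side norm per sublayer.
        self.attn_in, self.attn_out = nn.LayerNorm(d_model), nn.LayerNorm(d_model)
        self.mlp_in, self.mlp_out = nn.LayerNorm(d_model), nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.attn_in(x)                              # normalize sublayer input
        a, _ = self.attn(h, h, h, need_weights=False)
        x = x + self.attn_out(a)                         # normalize output, then add
        x = x + self.mlp_out(self.mlp(self.mlp_in(x)))   # same pattern for the MLP
        return x
```

Because every residual update passes through an output-side norm, hidden-state magnitudes stay controlled as depth grows, which helps training remain stable at scale.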
Training process
- Pre-training
- Domain adaptation
- Long context training (128K; see the RoPE sketch after this list)
- Supervised fine-tuning (SFT)
- Reinforcement learning with verifiable reward (RLVR)
- Length control training
- Reinforcement learning from human feedback (RLHF)
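One common way to extend a RoPE model to longer contexts, sketched below purely for illustration (the sequence lengths, base values, and choice of method are assumptions, not the model’s actual recipe), is to enlarge the rotary base so positional angles rotate more slowly and 128K positions remain distinguishable.

```python
import torch

def rope_cache(seq_len: int, head_dim: int, base: float):
    """Precompute RoPE cos/sin tables for seq_len positions."""
    inv_freq = 1.0 / base ** (torch.arange(0, head_dim, 2).float() / head_dim)
    angles = torch.outer(torch.arange(seq_len).float(), inv_freq)
    return angles.cos(), angles.sin()  # each of shape (seq_len, head_dim // 2)

# Short-context setup vs. a long-context extension: raising the base
# stretches the rotation wavelengths across 131,072 (128K) positions.
cos_4k, sin_4k = rope_cache(seq_len=4_096, head_dim=128, base=10_000.0)
cos_128k, sin_128k = rope_cache(seq_len=131_072, head_dim=128, base=1_000_000.0)
```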
Key capabilities of HyperCLOVA X THINK
1. Real-world problem-solving ability
HyperCLOVA X THINK uses its deep knowledge of Korean society, culture, history, and language to handle real challenges like college entrance exams, civil service tests, and policy analysis at their actual difficulty levels. It also shows you the reasoning behind every answer, enhancing transparency and building user trust.
2. Structure designed for reliable and efficient training
HyperCLOVA X THINK is built on a μP-parameterized Peri-LN architecture that makes training both more stable and more efficient. This enables faster convergence and higher-quality results with the same computational resources.
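To make the μP point concrete, here is a rough sketch of the core scaling rules for Adam-style optimizers: initialization variance and per-layer learning rates are tied to layer width, so hyperparameters tuned on a small proxy model transfer to the full-scale one. The widths, learning rate, and vocabulary size below are illustrative placeholders, not HyperCLOVA X THINK’s actual settings.

```python
import torch
import torch.nn as nn

# muP in miniature (simplified Adam rules; all values illustrative):
# tune hyperparameters at a small "base" width, then scale up.
base_width, width = 256, 4096
m = width / base_width                              # width multiplier

hidden = nn.Linear(width, width)                    # matrix-like hidden layer
readout = nn.Linear(width, 32_000)                  # vocab-sized output head

nn.init.normal_(hidden.weight, std=width ** -0.5)   # variance ~ 1 / fan_in
nn.init.zeros_(readout.weight)                      # muP often zero-inits the readout

base_lr = 1e-2                                      # tuned once at base_width
optimizer = torch.optim.Adam([
    # Hidden-layer LR shrinks as 1/m so each update changes
    # pre-activations by O(1) no matter how wide the model gets.
    {"params": hidden.parameters(), "lr": base_lr / m},
    {"params": readout.parameters(), "lr": base_lr / m},
])
```

The practical payoff is hyperparameter transfer: a learning-rate sweep done on the cheap proxy stays near-optimal at full width, which is much of what makes large-scale training reliable and efficient.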
3. Reinforcement learning with verifiable reward
RLVR stands for reinforcement learning with verifiable reward: instead of relying solely on a learned reward model, the training signal comes from automatically verifying whether the model’s answers are correct. This grounds HyperCLOVA X THINK’s reasoning in checkable outcomes, enabling trustworthy reasoning even in complex, multi-step problems. During training, the system also dynamically adjusts problem difficulty as the model’s capabilities grow, which significantly improves training efficiency.
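As a deliberately simplified illustration of the verifiable-reward idea, the snippet below scores an answer by checking it against known ground truth and filters training problems to an informative difficulty band. The \boxed{} answer convention, the thresholds, and the data layout are assumptions made for the example, not the production verifier.

```python
import re

def verifiable_reward(model_output: str, gold_answer: str) -> float:
    """Binary reward from an automatic correctness check -- the defining
    ingredient of RLVR. Assumes (for illustration) the model marks its
    final answer as \\boxed{...}."""
    match = re.search(r"\\boxed\{(.+?)\}", model_output)
    if match is None:
        return 0.0
    return 1.0 if match.group(1).strip() == gold_answer.strip() else 0.0

def curriculum(problems: list[dict], solve_rate: dict[int, float],
               lo: float = 0.2, hi: float = 0.8) -> list[dict]:
    """Dynamic difficulty: drop problems the model always or never solves,
    since neither extreme yields a useful learning signal."""
    return [p for p in problems if lo <= solve_rate[p["id"]] <= hi]

print(verifiable_reward(r"... so the answer is \boxed{42}", "42"))  # 1.0
```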
4. Scalable multimodal capabilities
HyperCLOVA X THINK can process images and videos alongside text, understanding them within Korean cultural context. The model’s visual reasoning has been validated through the K-vision benchmark, demonstrating its ability to recognize traditional heritage sites and local landmarks, then provide culturally relevant explanations and recommendations.
5. Balanced training between Korean and English
HyperCLOVA X THINK trains on Korean and English data in a balanced way, maximizing knowledge transfer between the two languages. This means users can expect nearly equal understanding and response quality whether they ask questions in Korean or English. The model also delivers translation quality that surpasses global commercial services.
HyperCLOVA X THINK by the numbers
Korean-focused benchmark performance
HyperCLOVA X THINK demonstrates strong performance across three key evaluation areas:
- General aptitude
- Culture and language
- Instruction following
For comparison, we tested HyperCLOVA X THINK against Qwen3-14B, Qwen3-32B, QwQ-32B, and EXAONE-Deep-32B. HyperCLOVA X THINK outperformed all competing models across every category, with particularly strong results in culture and language—scoring up to 13.9 percentage points higher than the nearest competitor.
The demo below shows how HyperCLOVA X THINK handles a complex question about real estate market freezes. Even when faced with questions requiring deep understanding of societal context, the model systematically analyzes each multiple-choice option and arrives at the correct answer through logical reasoning.
Beyond text: Visual reasoning capabilities
How far can HyperCLOVA X THINK’s capabilities extend? Can its advanced reasoning work with images, not just text?
To find out, we created a multimodal version by integrating an image encoder into the language model backbone, then tested it on math and science questions from college entrance exams in image format. These exams include Korean text, complex tables, graphs, and equations—making them an ideal test for visual language capabilities.
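Conceptually, the grafting works roughly as sketched below: an image encoder turns the exam page into patch features, a small projector maps them into the language model’s embedding space, and the resulting image tokens are fed to the backbone alongside the text tokens. The dimensions and the two-layer projector are illustrative stand-ins, not the actual architecture.

```python
import torch
import torch.nn as nn

class VisionAdapter(nn.Module):
    """Illustrative bridge from an image encoder to an LM backbone:
    project patch features into the LM's embedding space and prepend
    them to the text embeddings so the LM attends over both."""

    def __init__(self, vision_dim: int = 1024, lm_dim: int = 4096):
        super().__init__()
        self.projector = nn.Sequential(
            nn.Linear(vision_dim, lm_dim), nn.GELU(), nn.Linear(lm_dim, lm_dim)
        )

    def forward(self,
                patch_feats: torch.Tensor,   # (batch, n_patches, vision_dim)
                text_embeds: torch.Tensor    # (batch, n_tokens, lm_dim)
                ) -> torch.Tensor:
        img_tokens = self.projector(patch_feats)            # (batch, n_patches, lm_dim)
        return torch.cat([img_tokens, text_embeds], dim=1)  # image tokens first

adapter = VisionAdapter()
fused = adapter(torch.randn(1, 576, 1024), torch.randn(1, 32, 4096))
print(fused.shape)  # torch.Size([1, 608, 4096])
```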
We compared our model’s performance against the latest multimodal models, including GPT-4.1, GPT-4o, and OpenAI o1. The results were impressive. HyperCLOVA X THINK achieved 46.4% accuracy on college entrance exam STEM questions, outperforming GPT-4.1 (40.3%) and nearly matching the best-performing model, OpenAI o1 (50.9%). When we disabled HyperCLOVA X THINK’s reasoning mode, performance dropped sharply to 21.7%, showing that its language-driven reasoning capabilities are crucial in multimodal environments as well.
Our model successfully processed information across different formats—graphs, tables, and complex text-image combinations—demonstrating strong reasoning that integrates both visual and textual information.
However, significant technical challenges remain before full multimodal expansion, particularly in matching the performance of dedicated text-only models. Even so, this experiment demonstrates that Korean multimodal AI combining high-level logical reasoning with visual processing is achievable. For implementation details and complete benchmark results, see our Technical Report.
The demo below shows how THINK with Vision* handles a biology question on a college entrance exam. You can see it understands images containing graphs, tables, and text, then provides answers through clear reasoning.
*THINK with Vision will be available soon.
<Image of the question>
Conclusion
HyperCLOVA X THINK is evolving beyond an AI that simply “knows things” to one that “understands and reasons.” For more technical details about HyperCLOVA X THINK, you can find our Technical Report on arXiv.
This July, we’ll also release a lighter version of HyperCLOVA X THINK. Using pruning and distillation techniques, we’ve maintained high performance while making the model compact enough to run on smaller devices.
HyperCLOVA X THINK is no longer just a seed—it’s growing into a knowledge forest.