How CLOVA X’s CAST system reveals AI’s role in daily life

CLOVA X, powered by NAVER’s powerful HyperCLOVA X language model, is a conversational AI system that goes beyond simple question-answering. Users can engage in natural conversations with CLOVA X while benefiting from its extensive search capabilities. Through a familiar chat interface, users can ask just about anything.

What are the most frequently asked topics on CLOVA X in Q4 2024?

Our analysis of user interactions revealed that writing and translation were the primary use cases for CLOVA X, showcasing the natural strengths of large language models. Users sought CLOVA X’s help across diverse areas of daily life, including education, career guidance, and health.

Through quarterly tracking of key user metrics, we identified notable patterns in usage behavior. Among users in their 20s, usage spiked during the academic terms (March to June and September to December). This observation prompted a deeper investigation into whether this younger demographic displayed distinct conversation patterns.

Among users in their 20s:
- Usage peaked during academic seasons (March to June and September to December)
- The most frequent interactions focused on writing, learning, programming, and personal conversations
- Popular conversation themes were #examprep, #jobsearch, #assignments, and #relationships

Compared to other age groups, users in their 20s primarily engaged with CLOVA X for tasks like crafting resumes, refining reports, preparing for exams through analysis of past questions and problem-solving, and practicing coding for job opportunities and certifications. What distinguishes this young demographic is their tendency to seek personal conversations and confide in the AI.

For these users, CLOVA X plays two essential roles: a running mate supporting their academic and career growth, and a trusted confidant helping them navigate personal relationships. This dual function allows CLOVA X to help users tackle both everyday challenges and significant life decisions while providing encouragement and support.

The questions users ask and the ongoing conversations reveal current societal needs and expectations around generative AI. Understanding these patterns and intentions is crucial for developing AI systems that genuinely enhance daily life. To gain these insights, CLOVA X leverages its proprietary CAST (CLOVA X log analysis and semantic tracking) clustering system, which analyzes conversation patterns and classifies user interactions by topic. In this post, we’ll explore how CAST works and share fascinating patterns we’ve discovered about user behavior.

Classifying infinite conversations into finite categories

A key challenge in developing CAST was categorizing vast amounts of conversational data into manageable topics and intents. This systematic classification became increasingly crucial as conversation trends evolved and our service expanded.

Our solution was to organize the questions users ask CLOVA X into a two-tier hierarchy: 40 broad topics and 280 specific intents.

Topics encompass high-level categories such as writing, travel, and health
Intents reflect specific user goals, like writing a book review

For instance, when a user asks “Write me a book review about The Vegetarian by Han Kang,” CAST classifies it as:

Topic: writing
Intent: write a book review
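
To make the two-tier structure concrete, here is a minimal sketch of how a topic-intent hierarchy could be represented. The names `TAXONOMY`, `Label`, and `make_label` are hypothetical, and the entries are a tiny illustrative excerpt built from the examples in this post, not the actual list of 40 topics and 280 intents.

```python
from dataclasses import dataclass

# Hypothetical excerpt of a two-tier taxonomy: topic -> list of intents.
# The real system described in this post has 40 topics and 280 intents.
TAXONOMY = {
    "writing": ["write a book review"],
    "travel": [
        "domestic travel plan recommendations",
        "international travel plan recommendations",
    ],
}

# Reverse index so an intent can be traced back to its parent topic.
INTENT_TO_TOPIC = {
    intent: topic for topic, intents in TAXONOMY.items() for intent in intents
}


@dataclass(frozen=True)
class Label:
    topic: str
    intent: str


def make_label(topic: str, intent: str) -> Label:
    """Build a label, rejecting intents that do not belong to the topic."""
    if intent not in TAXONOMY.get(topic, []):
        raise ValueError(f"intent {intent!r} is not defined under topic {topic!r}")
    return Label(topic, intent)


label = make_label("writing", "write a book review")
print(label)
```

Keeping intents nested under topics means every classified question carries both levels, so results can be reported broadly (by topic) or in detail (by intent).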

Our system uses the k-nearest neighbors (k-NN) algorithm to map each question to its most relevant topic and intent. Here’s an example:

Question: Can you help me plan a 3-day Tokyo itinerary focused on trying local food?
Candidate intents: [Domestic travel plan recommendations] [International travel plan recommendations]

While this question broadly relates to travel, CAST labels it as “International travel plan recommendations” because that intent is the closest match by distance. Given our extensive conversational data, we developed a hybrid approach that combines K-means clustering with k-nearest neighbors matching for more robust classification.
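
The distance-based matching described above can be sketched as a small nearest-neighbor vote. This is an illustration under stated assumptions, not the CAST implementation: the post does not say how questions are turned into vectors, which distance is used, or what value of k applies. The two-dimensional vectors, cosine distance, `k=3`, and the names `knn_intent` and `LABELED` are all hypothetical.

```python
import math
from collections import Counter


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def knn_intent(query_vec, labeled, k=3):
    """Pick the majority intent among the k labeled vectors closest to the query."""
    nearest = sorted(labeled, key=lambda item: cosine_distance(query_vec, item[0]))[:k]
    votes = Counter(intent for _, intent in nearest)
    return votes.most_common(1)[0][0]


# Made-up 2-D "embeddings" standing in for already-labeled questions.
LABELED = [
    ([0.9, 0.1], "International travel plan recommendations"),
    ([0.8, 0.3], "International travel plan recommendations"),
    ([0.1, 0.9], "Domestic travel plan recommendations"),
    ([0.2, 0.8], "Domestic travel plan recommendations"),
    ([0.3, 0.9], "Domestic travel plan recommendations"),
]

# Hypothetical vector for the Tokyo itinerary question.
tokyo_query = [0.85, 0.2]
print(knn_intent(tokyo_query, LABELED, k=3))
```

In this toy data, two of the three nearest neighbors carry the international label, so the vote resolves to that intent even though both candidates belong to the same travel topic.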

As CLOVA X grew, we encountered new conversation types that didn’t fit our existing topic-intent system. To address this, we reapplied K-means clustering to the unclassified data, computing new cluster centers and regrouping the questions by similarity. This process revealed interesting new patterns in user behavior.
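
The re-clustering step can be illustrated with a bare-bones K-means loop. This is a simplified sketch, not the production pipeline: the vectors are made-up stand-ins for unclassified questions, the function name `kmeans` is hypothetical, and the initialization (taking the first k points as centers) is chosen only to keep the example deterministic.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain K-means: alternate between assigning points and moving centers."""
    # Simplified, deterministic initialization: use the first k points.
    centers = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centers[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centers, assignments


# Made-up vectors standing in for questions that matched no existing intent.
unclassified = [[0.0, 0.1], [5.0, 5.1], [0.1, 0.0], [5.1, 5.0], [0.05, 0.05]]
centers, assignments = kmeans(unclassified, k=2)
print(centers)
print(assignments)
```

Each resulting center is a candidate for a new intent: someone still has to inspect the grouped questions and decide whether the cluster deserves a name in the taxonomy.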

* All questions used here were shared with user consent and edited for clarity.

Through this analysis, we observed that users increasingly adopted AI as a practical daily tool. Our re-clustering of unclassified data uncovered emerging patterns in how users integrated AI into their lives. By tracking the rise and fall of different question types, we continuously updated our topics and intents to strengthen the clustering framework.
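
Tracking the rise and fall of question types can be as simple as comparing each intent's share of traffic between two periods. The sketch below assumes questions have already been labeled with an intent. The function names and the sample labels are hypothetical and do not reflect real CLOVA X figures.

```python
from collections import Counter


def intent_share(labels):
    """Return each intent's fraction of all labeled questions in a period."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {intent: n / total for intent, n in counts.items()}


def share_change(before, after):
    """Return the change in share per intent between two periods."""
    b, a = intent_share(before), intent_share(after)
    return {
        intent: a.get(intent, 0.0) - b.get(intent, 0.0) for intent in set(b) | set(a)
    }


# Invented labels for two periods, for illustration only.
earlier = ["writing", "writing", "problem-solving", "translation"]
later = ["writing", "problem-solving", "problem-solving", "translation"]

for intent, delta in sorted(share_change(earlier, later).items()):
    print(f"{intent}: {delta:+.2f}")
```

Comparing shares rather than raw counts keeps the signal meaningful when overall traffic grows between periods.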

AI evolution drives user behavior transformation

The integration of vision capabilities into CLOVA X in August 2024 marked a pivotal shift in how users interacted with the system.

[To learn more, read our previous post on Introducing HyperCLOVA X Vision.]

By December 2024, problem-solving queries had surged to become the second most frequent user intent. This sharp increase suggests that image recognition made it easier for users to turn to AI for help with problem-solving tasks.

This evolution illustrates how advances in AI technology inspire users to integrate these tools more strategically into their daily lives. By understanding and adapting to these naturally emerging user patterns, AI systems are becoming increasingly effective and contextually relevant tools.

Principles of reliable CAST conversation clustering

As CAST identifies meaningful patterns in large volumes of conversational data, a crucial question arises: how does CLOVA X handle personal information throughout this analysis process? Let’s examine our comprehensive privacy safeguards.

CLOVA X prioritizes the protection of users’ personal data in its conversational analysis. While we process conversations to improve AI research and service quality, this only occurs after thorough de-identification of user conversation logs and with explicit user consent.

Personal privacy measures
- Analyze only the data that users have explicitly consented to share
- Assign encoded IDs to ensure complete user anonymity
- De-identify data to prevent the reconstruction of conversation sessions

HyperCLOVA X handles data summarization to keep the analysis secure, and strict protocols prevent data leaks or misuse. The protection of user privacy remains our fundamental principle and will never be compromised.
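
As a rough illustration of the privacy measures listed above, the sketch below filters out records without consent, replaces the user ID with a keyed hash, and drops the fields that would let someone reassemble a session. It is an assumption-heavy example rather than a description of the CLOVA X pipeline: the record fields, the use of HMAC-SHA256, and the function names are all hypothetical.

```python
import hashlib
import hmac


def encode_user_id(user_id, secret_key):
    """Replace a raw user ID with a keyed hash (assumed scheme: HMAC-SHA256)."""
    return hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def deidentify(record, secret_key):
    """Return a reduced record for analysis, or None if consent is missing."""
    if not record.get("consented"):
        return None
    # Session ID and timestamp are deliberately left out so that individual
    # utterances cannot be stitched back into a full conversation session.
    return {
        "encoded_id": encode_user_id(record["user_id"], secret_key),
        "text": record["text"],
    }


# Hypothetical log record and key, for illustration only.
key = b"example-secret-key"
record = {
    "user_id": "user-123",
    "session_id": "session-456",
    "timestamp": "2024-12-01T10:00:00",
    "text": "Write me a book review.",
    "consented": True,
}
print(deidentify(record, key))
```

A keyed hash is used here instead of a plain hash so that the encoded ID cannot be recomputed by anyone who lacks the key.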

Conclusion

The true potential of generative AI extends far beyond initial technological curiosity. Its full value emerges when it becomes an indispensable part of daily life and users can maximize its capabilities. Our conversation clustering analysis reveals the evolving relationship between users and AI. By taking users’ questions as our guiding compass, we remain committed to developing AI that better understands and adapts to their ever-changing needs.