Skill and function calling: Empowering AI to act on your behalf

HyperCLOVA X: Enhancing AI capabilities through real-time data and actions

Large language models (LLMs) have revolutionized natural language processing, but they face two critical limitations: accessing up-to-date information in real time and performing direct actions in the real world. Ask a conventional AI chatbot, “How’s the weather today?” and you might receive outdated information from its training data. Request “Add a wedding invitation to my calendar for 1 PM this Saturday,” and you’ll likely hear “I’m sorry, I can’t perform that action.” This is because traditional LLMs generate responses based solely on pre-trained data and lack the ability to connect with external data sources or interact with real-world systems.

HyperCLOVA X breaks through these barriers with two groundbreaking features: skills and function calling. Function calling is particularly powerful for developers, offering greater scalability and the ability to execute a wider variety of actions compared to skills. With these features, LLMs can now tap into external data sources and perform actual actions, delivering a richer and more practical AI experience. In this post, we’ll explore HyperCLOVA X’s innovative skill and function calling capabilities, explain their key characteristics and differences, and help you determine which approach best suits your service needs.

Connecting LLMs to external systems
HyperCLOVA X offers two powerful features—skills and function calling—that enable LLMs to interact seamlessly with external systems. Skills allow the LLM to access real-time data by calling APIs from external sources. Currently, CLOVA X integrates with popular services including NAVER Shopping, NAVER Travel, Kurly, Socar, and Triple to deliver up-to-date information to users. The beauty of this approach is its accessibility: even those without extensive development experience can use CLOVA Studio’s intuitive interface to register APIs, incorporate real-time data, or connect with in-house data systems—all to generate more relevant and helpful responses.

Using skills the smart way

Function calling is a powerful feature that enables LLMs to access external capabilities to answer users’ questions. Going beyond simple API calls, this extended feature can execute scripts and leverage library functions, allowing developers to directly call functions and perform specific actions. This advanced functionality makes it possible to automate various tasks such as sending emails, uploading files, and retrieving specific data from databases.

	Skills	Function calling
Target users	Anyone, including non-technical users	Primarily developers
Primary purpose	Access external data using APIs	Extended capabilities for complex implementations and task automation
Implementation approach	User-friendly integration through CLOVA AI Studio’s UI	Requires programming expertise for implementation and integration
Typical applications	Everyday information services: shopping recommendations, travel itineraries, weather forecasts	Complex data processing, custom action execution, system automation

Making AI smarter with tools

HyperCLOVA X’s skill and function calling capabilities elevate AI beyond basic conversation generation—they seamlessly integrate with external data sources and enable powerful real-world actions. Let’s explore practical applications of these features.

1. Building intelligent chatbots with skills and function calling
Traditional chatbot development follows a labor-intensive process: pre-defining numerous intents to understand user questions, mapping each question to the appropriate intent, and choosing matching responses. For a delivery chatbot, developers would need to classify all possible user inquiries (order status checks, delivery tracking, menu recommendations) and compile various phrasings with identical meanings as training data before selecting suitable responses for each intent.

This approach is not only resource-intensive but also unsatisfying. When users ask slightly complex questions, traditional chatbots often respond with “I didn’t understand,” resulting in customer frustration. Skills and function calling solve these challenges. The AI automatically analyzes the context of user questions, calls the appropriate API with optimal parameters, and delivers natural-sounding responses. Chatbot developers no longer need to map complex intents to every possible question variation, allowing for more intuitive and effective bot creation.

Use cases:

An e-commerce chatbot understands complex queries and automatically calls the appropriate API to generate responses.
- “Can you tell me what percentage of our buyers were women last Monday?”
- “How many visitors did the store get yesterday?”
- “What’s our total refund amount for the past three days?”
An e-commerce chatbot integrates with product search APIs to provide personalized recommendations.
- “I need a dress for cherry blossom viewing this weekend. Any suggestions?”
- “I need something to wear to a friend’s wedding next month.”

2. Accessing real-time data using function calling
Traditional AI models generate responses based solely on their training data, making them unable to provide information that requires up-to-date knowledge. Function calling transforms this limitation by enabling integration with external APIs to access real-time data, allowing the AI to generate accurate responses that reflect the latest information.

Use cases:

Exchange rates: “What’s the current exchange rate between Korean won and US dollar?”
Stock quotes: “Tell me NAVER’s current stock price.”
Price indices: “What’s today’s international gold price?”
Weather updates: “What’s the weather forecast for Seoul tomorrow?”

3. Building AI-based automation systems using function calling
Function calling goes beyond simply accessing information to integrating with external systems to perform real-world actions. With this feature, AI isn’t just a conversational assistant but a tool that can automate tasks and execute concrete actions on your behalf.

In-car assistant
- “Cool the car down to 22 degrees.”
- “Take me home.”
- “Find the cheapest gas station around here.”
Task automation
- “Draft an email for me and send it.”
- “Take notes during our meeting and share them with the team.”
Smart home control
- “Turn on the lights in the living room.”
- “Start heating the house when I’m on my way home.”

Skill vs. function calling: How the two work differently
While both skills and function calling enable AI models to integrate with external data sources, they operate in fundamentally different ways. The key differences lie in what data it uses and who controls the API calls. Let’s explore these differences in detail.

1. How skills work
With skills, the AI model autonomously determines which API to call and executes the call during its chain of thought (CoT) process.

User submits query → Planner model analyzes the query and determines which skill to use
API call → Model determines the appropriate API during its CoT process, configures the necessary parameters, and executes the call directly
Response generation → Final response model processes the API’s data and generates a natural language response

2. How function calling works
With function calling, the AI model suggests functions to run, but developers maintain control over execution. The model doesn’t directly make API calls—instead, the developer handles function execution and provides results back to the model.

User submits query → AI chooses appropriate functions from pre-defined tools
Request creation → AI extracts necessary information and creates an execution request
Function execution → AI suggests execution, but the developer performs the actual call
Response generation → Model incorporates execution results into its final response

Function calling implementation principle: Data collection and training mechanism

Function calling allows AI to automatically invoke specific features in response to user queries. To optimize this capability, we’ve developed a systematic data collection and training mechanism:

1. Data generation flow
Training an AI to effectively call functions requires defining function schemas that teach the model “what functions exist” and “when to call a specific function.” Our data generation process follows these steps:

Create and augment various function schemas
Based on these schemas, generate user questions or requests
(e.g., “How’s the weather in Seoul today?”)
The AI performs a function call and generates a response
3-1. If it performs a function call, run that function to get results or generate virtual results using AI
Using the function results and conversation history, the AI crafts a final response
If the AI doesn’t perform a function call, we return to query generation

Through repeated iterations of this process, we generate multi-turn conversational data that includes function schemas, user queries, AI responses, function calls, and function results. This comprehensive approach teaches the AI to distinguish when to call an API (like weather information) versus when to engage in general conversation (responding to “Hi”). We implement this training methodology at scale.

2. Generating function schemas
Function schemas define the functions that an AI can call and are developed from function seed data:

Data sources: Leverage existing APIs such as the Rapid API, which includes weather services and currency exchange calculation endpoints
Category definitions: Define diverse categories to expand data
- Group related tools within the same category (e.g., temperature queries and precipitation forecasts under “Weather”)
- Develop diverse categories to ensure the model trains on a wide range of scenarios
Redundancy prevention: Sample function names before inserting into prompts to prevent certain names from being repeated too many times
Error elimination: Filter out functions with malformed JSON schemas to prevent the model from learning incorrect parameter types and formats
Naming diversity: Incorporate various naming conventions including PascalCase, snake_case, camelCase, and kebab-case when generating training data

3. Generating user queries for AI interaction

User questions and requests primarily falling into two categories:

Function-requiring queries
- Requests that necessitate the use of one specific function (e.g., “What’s the weather like in Seoul right now?”)
- Requests requiring the coordination of multiple functions (e.g., “Tell me how I can get to Busan from Seoul tomorrow by train and what the weather will be like there”)
Conversational queries
- General questions (e.g., “What is artificial intelligence?” and “Hi there, nice to meet you”)

This balanced query dataset enables the AI to generate an appropriate response.

4. Generating and validating AI responses

AI responses can be largely divided into two scenarios:

Function call execution
- Verify that the AI has selected the appropriate function (e.g., confirm that a weather-related function is called in response to weather queries)
- Assess whether the correct parameters have been passed (e.g., verify that the city name “Seoul” is correctly identified and delivered to the function)
- Examine whether all necessary parameters are included without superfluous additions or critical omissions
Non-function call interactions
- Evaluate the AI’s ability to recognize information gaps and request clarification (e.g., “Which city would you like to know the weather for?”)

This rigorous validation and regeneration process is essential for enhancing response accuracy. We improve performance iteratively through this response regeneration cycle.

5. Executing functions and generating results

After validating that a function call has been correctly formulated, we proceed to the result generation phase:

Actual functions à We invoke the function to retrieve data (e.g., calling a live weather API)
AI-generated virtual functions à We employ AI to create simulated outputs

Results are generated across multiple formats (JSON, XML, etc.) to ensure data diversity. These results are then translated into natural language responses so users can better undersatnd. (e.g., “It’s clear in Seoul right now, with a temperature of 23°C.”)

6. Training and assessment

Training methodology: HyperCLOVA X model’s LoRA adapter method
Assessment framework: Berkeley Function Calling Leaderboard
Evaluation criteria:
- Calling multiple functions: Evaluate if the model can correctly call multiple functions
- Limiting function calls: Check if the model avoids calling functions when it shouldn’t
- Code generation ability: Test whether the model can successfully generate code
- Multi-step and multi-turn conversation handling: Assess if the model maintains consistency across conversations with multiple steps

This assessment framework enables us to validate the model’s function calling capabilities in real-world environments.

Comparing HyperCLOVA X’s function calling performance

HyperCLOVA X demonstrates impressive function calling capabilities despite its small model size. Our evaluation using the HCX-Dash-005, HCX-DASH-002 benchmark reveals that HyperCLOVA X achieves competitive performance when compared to the latest models from major AI companies. The performance comparison is illustrated in the graph below.

Maximizing function call efficiency: A developer’s guide

1. Craft intuitive names with comprehensive descriptions
Create function and parameter names that are immediately understandable. Avoid acronyms or abbreviations that might cause confusion. Your function descriptions should explain when to call the function and detail how different parameters affect functionality. They should also be comprehensive (at least 3-4 sentences).

2. Establish clear parameter definitions
Beyond intuitive parameter names, provide explicit descriptions. For date parameters, specify the expected format (e.g., YYYY-MM-DD and DD/MM/YY).

3. Limit parameter values through enumeration
Whenever possible, use enumerations to constrain parameter values. For example, when choosing the size of a t-shirt, you can clearly limit options to S, M, and L to enhance the accuracy of the model. This helps prevent wrong or unpredictable outputs.

4. Right-size your function library
Keep your function library manageable, ideally between 10-20 functions. Beyond this number, the model may have trouble choosing the function. If your application requires many functions, you should group related functions logically or divide functions into separate tools.

From an AI that thinks to AI that acts

AI is no longer just for conversation. HyperCLOVA X’s function calling and skill capabilities represent a transformative shift that moves AI beyond dialogue and into the realm of real-world action.

Consider these queries:
“Could you check my calendar for tomorrow and book an available meeting room?”
“Can you order some flowers to be delivered for my mom’s birthday?”

If previous AI systems were intelligent dictionaries answering questions, today’s agentic AI serves as an assistant that not only processes information but uses that data to execute meaningful tasks on your behalf.

By integrating HyperCLOVA X into your business and daily life, you can delegate repetitive tasks to AI while focusing your creativity on higher-value activities. Take time to explore how HyperCLOVA X’s function calling and skill features can enhance both your professional and personal life.