Native Tool Calling Support

This guide provides an overview of which providers and models natively support tool calling capabilities. The information is organized into two categories to help you choose the right model for your needs: regular tool calling and streaming tool calling support.

Regular Tool Calling Support

The following providers and models support regular tool calling capabilities, allowing you to integrate external tools and functions with your AI models:

Provider	Model
Anthropic	claude-3-5-haiku, claude-3-5-sonnet, claude-3-7-sonnet
Azure	gpt-4o, gpt-4o-mini (UKSouth, norwayeast, swedencentral, switzerlandnorth)
Bedrock	claude-3-5-haiku, claude-3-5-sonnet, claude-3-7-sonnet
Cerebras	Llama-3.1-8B-fp16, Llama-3.3-70B-fp16
DeepInfra	Llama-3.1-70B-fp16, Llama-3.1-8B-fp16, Llama-3.1-8B-fp8, Llama-3.3-70B-fp16, Llama-3.3-70B-fp8
Fireworks	deepseek-V3-0324-fp8, deepseek-V3-fp8, Llama-3.1-70B-fp16
Google	gemini-2.0-flash-lite-preview, gemini-2.5-pro-exp-03-25
Groq	Llama-3.3-70B-fp16
Hyperbolic	All supported models
Leptonai	Llama-3.1-70B-fp8, Llama-3.1-8B-fp8, Llama-3.3-70B-fp8
Mistral	mistral-small-24B-fp16, open-mistral-nemo
OpenAI	gpt-3.5-turbo, gpt-4.5, gpt-4o, gpt-4o-mini
Replicate	claude-3-5-sonnet
Together	Multiple Llama and other models
XAI	grok-2, grok-beta

Streaming Tool Calling Support

The following providers and models support streaming tool calling, which enables real-time interaction and response generation:

Provider	Model
Anthropic	claude-3-5-haiku, claude-3-5-sonnet, claude-3-7-sonnet
Azure	gpt-4o, gpt-4o-mini (UKSouth, norwayeast, swedencentral, switzerlandnorth)
Bedrock	claude-3-5-haiku, claude-3-5-sonnet, claude-3-7-sonnet
Cerebras	Llama-3.1-8B-fp16, Llama-3.3-70B-fp16
DeepInfra	Llama-3.1-70B-fp16, Llama-3.1-8B-fp16, Llama-3.1-8B-fp8, Llama-3.3-70B-fp16, Llama-3.3-70B-fp8
Fireworks	deepseek-V3-0324-fp8, deepseek-V3-fp8, Llama-3.1-70B-fp16
Google	gemini-2.0-flash-lite-preview, gemini-2.5-pro-exp-03-25
Groq	Llama-3.3-70B-fp16
Hyperbolic	All supported models
OpenAI	gpt-3.5-turbo, gpt-4.5, gpt-4o, gpt-4o-mini
Replicate	claude-3-5-sonnet
Together	Llama-3.1-70B-fp8, Llama-3.1-8B-fp8, Llama-3.3-70B-fp8, QWQ-32b-fp16, Qwen2.5-Coder-32B
XAI	grok-2, grok-beta

Usage Considerations

When using tool calling capabilities:

Regular tool calling is suitable for most applications where immediate response streaming isn't required
Streaming tool calling is ideal for interactive applications where real-time responses enhance user experience
Consider your specific use case requirements when choosing between regular and streaming implementations

For more information about implementing tool calling in your applications, check out our Basic Usage guide.

Native Tool Calling Support