Native Tool Calling Support
This guide provides an overview of which providers and models natively support tool calling capabilities. The information is organized into two categories to help you choose the right model for your needs: regular tool calling and streaming tool calling support.
Regular Tool Calling Support
The following providers and models support regular tool calling capabilities, allowing you to integrate external tools and functions with your AI models:
Provider | Model |
---|---|
Anthropic | claude-3-5-haiku, claude-3-5-sonnet, claude-3-7-sonnet |
Azure | gpt-4o, gpt-4o-mini (UKSouth, norwayeast, swedencentral, switzerlandnorth) |
Bedrock | claude-3-5-haiku, claude-3-5-sonnet, claude-3-7-sonnet |
Cerebras | Llama-3.1-8B-fp16, Llama-3.3-70B-fp16 |
DeepInfra | Llama-3.1-70B-fp16, Llama-3.1-8B-fp16, Llama-3.1-8B-fp8, Llama-3.3-70B-fp16, Llama-3.3-70B-fp8 |
Fireworks | deepseek-V3-0324-fp8, deepseek-V3-fp8, Llama-3.1-70B-fp16 |
gemini-2.0-flash-lite-preview, gemini-2.5-pro-exp-03-25 | |
Groq | Llama-3.3-70B-fp16 |
Hyperbolic | All supported models |
Leptonai | Llama-3.1-70B-fp8, Llama-3.1-8B-fp8, Llama-3.3-70B-fp8 |
Mistral | mistral-small-24B-fp16, open-mistral-nemo |
OpenAI | gpt-3.5-turbo, gpt-4.5, gpt-4o, gpt-4o-mini |
Replicate | claude-3-5-sonnet |
Together | Multiple Llama and other models |
XAI | grok-2, grok-beta |
Streaming Tool Calling Support
The following providers and models support streaming tool calling, which enables real-time interaction and response generation:
Provider | Model |
---|---|
Anthropic | claude-3-5-haiku, claude-3-5-sonnet, claude-3-7-sonnet |
Azure | gpt-4o, gpt-4o-mini (UKSouth, norwayeast, swedencentral, switzerlandnorth) |
Bedrock | claude-3-5-haiku, claude-3-5-sonnet, claude-3-7-sonnet |
Cerebras | Llama-3.1-8B-fp16, Llama-3.3-70B-fp16 |
DeepInfra | Llama-3.1-70B-fp16, Llama-3.1-8B-fp16, Llama-3.1-8B-fp8, Llama-3.3-70B-fp16, Llama-3.3-70B-fp8 |
Fireworks | deepseek-V3-0324-fp8, deepseek-V3-fp8, Llama-3.1-70B-fp16 |
gemini-2.0-flash-lite-preview, gemini-2.5-pro-exp-03-25 | |
Groq | Llama-3.3-70B-fp16 |
Hyperbolic | All supported models |
OpenAI | gpt-3.5-turbo, gpt-4.5, gpt-4o, gpt-4o-mini |
Replicate | claude-3-5-sonnet |
Together | Llama-3.1-70B-fp8, Llama-3.1-8B-fp8, Llama-3.3-70B-fp8, QWQ-32b-fp16, Qwen2.5-Coder-32B |
XAI | grok-2, grok-beta |
Usage Considerations
When using tool calling capabilities:
- Regular tool calling is suitable for most applications where immediate response streaming isn't required
- Streaming tool calling is ideal for interactive applications where real-time responses enhance user experience
- Consider your specific use case requirements when choosing between regular and streaming implementations
For more information about implementing tool calling in your applications, check out our Basic Usage guide.