While OpenAI is generally recommended, there are situations where you might prefer third-party models. Agency Swarm supports proprietary providers (Anthropic, Google, AWS) and self-hosted open-source models (Llama, Mistral, etc.) through LiteLLM integration:
LiteLLM Integration
Because the Agents SDK no longer uses the Assistants API, most previously available integrations became incompatible with it. One of the few libraries that has been ported to the new SDK is LiteLLM, which you can use to connect your agent to various providers (Anthropic, Google Vertex AI, AWS Bedrock, Azure) as well as self-hosted open-source models served via Ollama, vLLM, and other local serving solutions.
Install LiteLLM
Install the LiteLLM extension for the Agents SDK to get started with third-party model support:

```bash
pip install "openai-agents[litellm]"
```
Configure Agency Swarm Agent
Create an agent that uses a LiteLLM model directly. The litellm/ prefix in the model name routes the request through LiteLLM:

```python
from agency_swarm import Agent

# Requires the GEMINI_API_KEY environment variable to be set
gemini_agent = Agent(
    name="GeminiAgent",
    instructions="You are a helpful assistant",
    model="litellm/gemini/gemini-2.0-flash",
)
```
Create and Run Agency
Set up your agency and start using third-party models:

```python
from agency_swarm import Agency

agency = Agency(gemini_agent)
agency.terminal_demo()
```
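If you prefer a programmatic entry point over the interactive terminal demo, a minimal sketch (assuming Agency Swarm v1.x's async get_response API) would look like this:

```python
import asyncio

async def main():
    # Send a single message to the entry-point agent and print the reply
    response = await agency.get_response("What model are you running on?")
    print(response.final_output)

asyncio.run(main())
```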
Install LiteLLM
Alternatively, you can route all requests through a LiteLLM proxy server, which exposes every configured provider behind a single OpenAI-compatible endpoint. Install LiteLLM with proxy support:

```bash
pip install "litellm[proxy]"
```
Create LiteLLM Configuration
Create a config.yaml file to configure your models and providers:

```yaml
model_list:
  - model_name: gemini-flash
    litellm_params:
      model: gemini/gemini-2.0-flash
      api_key: os.environ/GEMINI_API_KEY # or paste your key directly here
  - model_name: claude-sonnet
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20240620
      api_key: os.environ/ANTHROPIC_API_KEY
  - model_name: llama-groq
    litellm_params:
      model: groq/llama-3.1-70b-versatile
      api_key: os.environ/GROQ_API_KEY

general_settings:
  store_prompts_in_spend_logs: true # Enable session management
```
Set Environment Variables
Add your API keys to your environment variables:

```bash
export GEMINI_API_KEY="your-gemini-api-key"
export ANTHROPIC_API_KEY="your-anthropic-api-key"
export GROQ_API_KEY="your-groq-api-key"
```
Start LiteLLM Proxy Server
Launch the LiteLLM proxy server with your configuration:

```bash
litellm --config /path/to/config.yaml
# Server will start on http://localhost:4000
```
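Optionally, verify the proxy before wiring up an agent. Since it exposes an OpenAI-compatible API, a plain chat completions request against one of the model_name aliases from config.yaml should return a response (add an Authorization: Bearer header if you configured a master key):

```bash
curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-flash",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```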
Configure Agency Swarm Agent
Create an agent that connects to your LiteLLM proxy through a custom OpenAI client. Note that the model name must match one of the model_name aliases defined in config.yaml:

```python
from openai import AsyncOpenAI

from agency_swarm import Agent, OpenAIChatCompletionsModel

custom_client = AsyncOpenAI(
    api_key="xxx",  # any value works if no proxy master key was set
    base_url="http://localhost:4000",
)

gemini_agent = Agent(
    name="GeminiAgent",
    instructions="You are a helpful assistant",
    model=OpenAIChatCompletionsModel(
        model="gemini-flash",  # model_name alias from config.yaml
        openai_client=custom_client,
    ),
)
```
Create and Run Agency
Set up your agency and start using third-party models:

```python
from agency_swarm import Agency

agency = Agency(gemini_agent)
agency.terminal_demo()
```
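Since the proxy serves every configured model behind a single endpoint, mixing providers within one agency only requires pointing each agent at a different model_name alias. A minimal sketch, reusing custom_client from above (the communication_flows wiring follows Agency Swarm v1.x conventions):

```python
# A second agent served through the same LiteLLM proxy
claude_agent = Agent(
    name="ClaudeAgent",
    instructions="You are a research assistant",
    model=OpenAIChatCompletionsModel(
        model="claude-sonnet",  # model_name alias from config.yaml
        openai_client=custom_client,
    ),
)

agency = Agency(
    gemini_agent,  # entry point for user messages
    communication_flows=[(gemini_agent, claude_agent)],
)
agency.terminal_demo()
```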
Some models, such as Gemini, Claude, or Grok, have their own built-in tools, which can be attached to an agent through the extra_body parameter in the agent's model_settings (the payloads below follow LiteLLM's provider conventions):
```python
from agency_swarm import Agent
from agents import ModelSettings

# Requires the GEMINI_API_KEY environment variable to be set
gemini_agent = Agent(
    name="GeminiAgent",
    instructions="You are a helpful assistant",
    model="litellm/gemini/gemini-2.0-flash",
    model_settings=ModelSettings(
        # Attaches Gemini's native Google Search grounding tool
        extra_body={"tools": [{"googleSearch": {}}]},
    ),
)

# Requires the XAI_API_KEY environment variable to be set
grok_agent = Agent(
    name="GrokAgent",
    instructions="You are a helpful assistant",
    model="litellm/xai/grok-4-0709",
    model_settings=ModelSettings(
        # Attaches Grok's native Live Search tool
        extra_body={"search_parameters": {"mode": "auto"}},
    ),
)
```
Here, both the Grok and Gemini agents will be able to use their native search tools, which are similar to OpenAI's WebSearchTool(). Consider checking LiteLLM's documentation for the full list of supported tools.
Limitations
Be aware of the following limitations when using third-party models:
- Hosted tools are not supported: Patched agents cannot use OpenAI's hosted tools, such as WebSearchTool, FileSearchTool, CodeInterpreterTool, and others.
- Patched and unpatched models should not communicate through handoffs: You may use the standard OpenAI client and patched agents in a single agency; however, using a handoff to transfer a chat from a patched model to an unpatched one, or vice versa, will lead to an error.
- Function calling is not supported by some third-party models: Without function calling, an agent cannot communicate with other agents in the agency, so it must be positioned at the end of the agency chart and cannot use any tools.
- RAG is typically limited: Most open-source implementations have restricted Retrieval-Augmented Generation capabilities. It is recommended to develop a custom tool backed by your own vector database, as sketched after this list.
- Potential library conflicts: The Agents SDK is still a fairly new framework that is being actively developed and improved, so there may be conflicts between the litellm and openai-agents packages on recent releases.
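To illustrate the RAG recommendation above, here is a minimal sketch of a custom retrieval tool built with the Agents SDK's function_tool decorator; search_vector_db is a hypothetical stand-in for your own vector database client:

```python
from agents import function_tool

# Hypothetical stand-in for your own vector database query; replace it
# with a real client (Chroma, Qdrant, pgvector, etc.)
def search_vector_db(query: str, top_k: int = 5) -> list[str]:
    return [f"Stub result for: {query}"]

@function_tool
def search_knowledge_base(query: str) -> str:
    """Return the most relevant documents for a query."""
    return "\n\n".join(search_vector_db(query, top_k=5))
```

The resulting tool can be attached to any agent via its tools parameter, keeping retrieval fully under your control regardless of the model provider.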
For Azure OpenAI, see Azure OpenAI.
Future Plans
Updates will be provided as new open-source assistant API implementations stabilize.
If you successfully integrate other projects with agency-swarm, please share your experience through an issue or pull request.