Deploying your agency to production typically involves the following steps:
  1. Dynamically Load Conversation Threads: Required to continue conversations from where they left off
  2. Dynamically Load Assistant Settings: Needed to make changes to your agent’s settings persist even after redeployment
  3. Deploy Agents and Tools on a Production Server: Decide whether to deploy agents and tools together or separately
This guide assumes that you have already created an agency. If you haven’t, check out the Getting Started guide.
Before deploying your agency, ensure you have thoroughly tested all tools and agents in isolation and in combination. Run the test cases in each tool file and verify the agency works end-to-end using demo methods.

Step 1: Dynamically Load Conversation Threads

By default, every time you create a new Agency(), it starts a fresh conversation thread. However, in production environments, you typically need to pick up old conversations or handle multiple users at once.
In Agency Swarm, threads are stored in a dictionary that contains all conversation thread IDs, including those between your agents.
Loading threads from a database before processing a new request lets you continue conversations from where they left off, even if you are using a stateless backend.
Chat persistence is handled through callback functions that are passed directly as parameters to the Agency constructor:
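from agency_swarm import Agency
from agents import TResponseInputItem

def save_threads(thread_dict: list[dict[str, TResponseInputItem]], chat_id: str):
    # Persist the updated threads under this chat's ID
    # (save_threads_to_db is a placeholder for your own storage layer)
    save_threads_to_db(thread_dict, chat_id)

def load_threads(chat_id: str) -> list[dict[str, TResponseInputItem]]:
    # Return the threads previously saved for this chat
    # (load_threads_from_db is a placeholder for your own storage layer)
    return load_threads_from_db(chat_id)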

agency = Agency(
    agent1,
    agent2,
    communication_flows=[(agent1, agent2)],
    load_threads_callback=lambda: load_threads(chat_id),
    save_threads_callback=lambda thread_dict: save_threads(thread_dict, chat_id),
)
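
The storage helpers called by the callbacks above are left to you. As a rough illustration only, the following sketch backs them with SQLite and JSON. It assumes the thread items are JSON-serializable, and it assumes the save helper receives chat_id as a second argument so that each chat is stored under its own key. The table name, column names, and function signatures are hypothetical and not part of Agency Swarm.

```python
import json
import sqlite3
from typing import Any

# Hypothetical storage layer: one row per chat, thread data stored as a JSON string.
# ":memory:" keeps the sketch self-contained; use a file path or a real database in production.
_conn = sqlite3.connect(":memory:")
_conn.execute(
    "CREATE TABLE IF NOT EXISTS threads (chat_id TEXT PRIMARY KEY, data TEXT NOT NULL)"
)

def save_threads_to_db(threads: list[dict[str, Any]], chat_id: str) -> None:
    """Insert or overwrite the stored threads for one chat."""
    with _conn:  # commits on success, rolls back on error
        _conn.execute(
            "INSERT OR REPLACE INTO threads (chat_id, data) VALUES (?, ?)",
            (chat_id, json.dumps(threads)),
        )

def load_threads_from_db(chat_id: str) -> list[dict[str, Any]]:
    """Return the stored threads for one chat, or an empty list if none exist."""
    row = _conn.execute(
        "SELECT data FROM threads WHERE chat_id = ?", (chat_id,)
    ).fetchone()
    return json.loads(row[0]) if row else []
```

Returning an empty list for an unknown chat_id is a design choice made for this sketch, so that a new chat simply starts with no history.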

Step 2: Deploying Agents and Tools on a Production Server

Depending on your needs, you can deploy your agents and tools together or separately:
  1. Agents Together with Tools: This is the simplest method: your agents execute the tools directly, in the same environment.
  2. Tools as Separate API Endpoints: This is the most scalable method: multiple agents can reuse the same tools, and you can scale the tools independently.

Comparison Table

| Feature | Agents with Tools | Tools as Separate API Endpoints |
| --- | --- | --- |
| Setup Complexity | "One-click" deployment | Additional setup required |
| Scalability | Combined agency scaling | Independent tool/agent scaling |
| Tool Reusability | Limited to current agency | Cross-project utilization |
| Cost Efficiency | Predictable resource allocation | Optimized resource scaling |
| Security | Internal tool access only | API authentication required |
| Best For | Small to medium projects | Large-scale or multi-project environments |
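
To make the "Tools as Separate API Endpoints" column more concrete, here is a minimal sketch of a tool exposed over HTTP behind a bearer token, using only the Python standard library. It is not how Agency Swarm or the Railway template implements this. The tool (add_numbers), the token value, the URL layout (tool name as the path), and every function name are assumptions made for illustration.

```python
import hmac
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Optional

API_TOKEN = "change-me"  # hypothetical; load from an environment variable in practice

def add_numbers(a: float, b: float) -> float:
    """Stand-in for a real tool."""
    return a + b

TOOLS = {"add_numbers": add_numbers}  # registry of tools reachable over HTTP

def handle_tool_call(tool_name: str, auth_header: Optional[str], body: bytes):
    """Return (HTTP status, JSON-serializable payload) for one tool request."""
    expected = f"Bearer {API_TOKEN}".encode()
    # Constant-time comparison avoids leaking token contents through timing.
    if auth_header is None or not hmac.compare_digest(auth_header.encode(), expected):
        return 401, {"error": "unauthorized"}
    tool = TOOLS.get(tool_name)
    if tool is None:
        return 404, {"error": f"unknown tool: {tool_name}"}
    try:
        args = json.loads(body or b"{}")
        return 200, {"result": tool(**args)}
    except (TypeError, ValueError) as exc:  # malformed JSON or wrong arguments
        return 400, {"error": str(exc)}

class ToolHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_tool_call(
            self.path.strip("/"),
            self.headers.get("Authorization"),
            self.rfile.read(length),
        )
        data = json.dumps(payload).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

def serve(port: int = 8000) -> None:
    """Start the tool server; call this from your own entry point."""
    HTTPServer(("0.0.0.0", port), ToolHandler).serve_forever()
```

Keeping the request logic in handle_tool_call, separate from the HTTP handler, lets you test authentication and argument validation without starting a server. Because the tools run in their own process, several agencies can call them and the tool server can be scaled on its own.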
Option 1: Agents and Tools Together

This is the simplest deployment method. You can use the official Railway template to get your agency up and running quickly.

Railway Deployment Template

Open the Railway deployment template and follow the instructions provided.
The template includes a Gradio interface and REST API endpoints with proper authentication.