Deployment to Production
Step-by-step guide for deploying your agency in a production environment.
Deploying your agency to production typically involves the following steps:
- Dynamically Load Conversation Threads: Required to continue conversations from where they left off
- Dynamically Load Assistant Settings: Needed to make changes to your agent’s settings persist even after redeployment
- Deploy Agents and Tools on a Production Server: Decide whether to deploy agents and tools together or separately
This guide assumes that you have already created an agency. If you haven’t, check out the Getting Started guide.
Before deploying your agency, ensure you have thoroughly tested all tools and agents in isolation and in combination. Run the test cases in each tool file and verify the agency works end-to-end using the run_demo() or demo_gradio methods.
Step 1: Dynamically Load Conversation Threads
By default, every time you create a new Agency(), it starts a fresh conversation thread. However, in production environments, you typically need to pick up old conversations or handle multiple users at once.
In Agency Swarm, threads are stored in a dictionary that contains all conversation thread IDs, including those between your agents.
Loading threads from a database before processing a new request allows you to continue conversations from where they left off, even if you are using a stateless backend.
Callbacks are functions that are called by the framework automatically when Agency is initialized.
Example threads callbacks:
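The following is a minimal sketch of what such callbacks might look like, not a definitive implementation. It assumes the callbacks are passed to the agency as a dictionary with load and save keys through the threads_callbacks parameter, and that the threads dictionary is JSON-serializable. The SQLite table, the chat_id identifier, and the helper names load_threads, save_threads, and make_threads_callbacks are hypothetical and used for illustration only.

```python
import json
import sqlite3

# Hypothetical storage layer: one SQLite table keyed by chat_id.
# An in-memory database is used here only so the sketch runs as-is;
# a production deployment would point at a persistent database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS threads (chat_id TEXT PRIMARY KEY, data TEXT)"
)


def load_threads(chat_id):
    """Return the stored threads dictionary for this chat, or {} if none exists."""
    row = conn.execute(
        "SELECT data FROM threads WHERE chat_id = ?", (chat_id,)
    ).fetchone()
    return json.loads(row[0]) if row else {}


def save_threads(chat_id, new_threads):
    """Overwrite the stored threads dictionary for this chat."""
    conn.execute(
        "INSERT OR REPLACE INTO threads (chat_id, data) VALUES (?, ?)",
        (chat_id, json.dumps(new_threads)),
    )
    conn.commit()


def make_threads_callbacks(chat_id):
    """Bind the callbacks to one chat so each user gets their own threads."""
    return {
        "load": lambda: load_threads(chat_id),
        "save": lambda new_threads: save_threads(chat_id, new_threads),
    }


# Assumed wiring (not executed in this sketch):
# agency = Agency([...], threads_callbacks=make_threads_callbacks(chat_id))
```

Binding chat_id inside a small factory function keeps the callbacks free of arguments on the load side, so a stateless request handler can build a fresh pair of callbacks for each incoming request.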
Step 2: Dynamically Load Assistant Settings
By default, agencies store assistant settings (such as name, description, instructions, tools, and model) in a local file defined in the settings_path parameter (settings.json by default). While this works well for development, in production environments, we recommend storing these settings in a database to persist changes between deployments.
The settings are a list of dictionaries containing the settings of every agent. If a change is detected in the settings, the framework will automatically save the new settings to a local file and trigger the save callback.
settings_callbacks are executed every time agent settings are loaded or saved. Just like threads_callbacks, you can use them to load or save agent configurations based on your identifier (e.g. user_id):
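As a rough sketch under the same assumptions as for threads (a dictionary with load and save keys, here passed through the settings_callbacks parameter), the callbacks could look like the example below. The module-level dictionary stands in for a database table, and the names load_settings, save_settings, and make_settings_callbacks are hypothetical. Returning an empty list when nothing has been saved yet is also an assumption about what the framework expects.

```python
import json

# Stand-in for a database table keyed by user_id; replace with a real
# storage client in production. Values are stored as JSON strings.
_settings_store = {}


def load_settings(user_id):
    """Return the saved list of agent settings, or [] if nothing is stored yet."""
    raw = _settings_store.get(user_id)
    return json.loads(raw) if raw is not None else []


def save_settings(user_id, new_settings):
    """Persist the full list of agent settings for this user."""
    _settings_store[user_id] = json.dumps(new_settings)


def make_settings_callbacks(user_id):
    """Bind both callbacks to a single user identifier."""
    return {
        "load": lambda: load_settings(user_id),
        "save": lambda new_settings: save_settings(user_id, new_settings),
    }


# Assumed wiring (not executed in this sketch):
# agency = Agency([...], settings_callbacks=make_settings_callbacks(user_id))
```

Serializing with json.dumps and reading back with json.loads returns the same list of dictionaries that was saved, as long as the settings contain only JSON-compatible values.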
Make sure you load and return settings and threads in the exact same format as they are saved.
Step 3: Deploying Agents and Tools on a Production Server
Depending on your needs, you can deploy your agents and tools together or separately:
- Agents Together with Tools: This is the simplest method: your agents execute the tools directly, in the same environment.
- Tools as Separate API Endpoints: This is the most scalable method: multiple agents can reuse the same tools, and you can scale the tools independently.
Feature | Agents with Tools | Tools as Separate API Endpoints |
---|---|---|
Setup Complexity | “One-click” deployment | Additional setup required |
Scalability | Combined agency scaling | Independent tool/agent scaling |
Tool Reusability | Limited to current agency | Cross-project utilization |
Cost Efficiency | Predictable resource allocation | Optimized resource scaling |
Security | Internal tool access only | API authentication required |
Best For | Small to medium projects | Large-scale or multi-project environments |
Deploying agents together with tools is the simplest deployment method. You can use the official Railway template to get your agency up and running quickly.
Watch the video below for a detailed walkthrough:
Railway Deployment Template
Click here to open the template and follow the instructions provided.
The template includes a Gradio interface and REST API endpoints with proper authentication.
Instead of deploying agents and tools together, you can host your tools separately as serverless functions or custom APIs, then connect them to your agents using OpenAPI schemas. This approach is useful if you want to reuse tools across different projects or scale them independently. You can also use OpenAPI schemas to connect third-party tools to your agency.
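To make the OpenAPI approach more concrete, here is an illustrative sketch of a minimal OpenAPI 3.1 schema describing one separately hosted tool. The base URL, the /get-weather path, the getWeather operation, and the build_tool_schema helper are all hypothetical. How the resulting schema is attached to an agent depends on your framework version and is not shown here.

```python
import json


def build_tool_schema(base_url):
    """Build a minimal OpenAPI 3.1 schema for one hypothetical tool endpoint."""
    return {
        "openapi": "3.1.0",
        "info": {"title": "Example Tools API", "version": "1.0.0"},
        # The server URL is where the tool is hosted, separate from the agents.
        "servers": [{"url": base_url}],
        "paths": {
            "/get-weather": {
                "post": {
                    "operationId": "getWeather",
                    "description": "Return the current weather for a city.",
                    "requestBody": {
                        "required": True,
                        "content": {
                            "application/json": {
                                "schema": {
                                    "type": "object",
                                    "properties": {
                                        "city": {
                                            "type": "string",
                                            "description": "City name to look up.",
                                        }
                                    },
                                    "required": ["city"],
                                }
                            }
                        },
                    },
                    "responses": {"200": {"description": "Weather report"}},
                }
            }
        },
    }


# Serialize the schema so it can be stored or handed to an agent as text.
schema_json = json.dumps(build_tool_schema("https://tools.example.com"), indent=2)
```

The operation description and the parameter descriptions are what an agent sees when deciding whether and how to call the tool, so they deserve the same care as the docstrings of a local tool.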
You can use our Firebase template:
Firebase Deployment Template
Click here to open the template and follow the instructions provided.
When deploying tools separately, shared state between calls will not be preserved.