Deploying your agency to production typically involves the following steps:

  1. Dynamically Load Conversation Threads: Required to continue conversations from where they left off
  2. Dynamically Load Assistant Settings: Needed to make changes to your agent’s settings persist even after redeployment
  3. Deploy Agents and Tools on a Production Server: Decide whether to deploy agents and tools together or separately

This guide assumes that you have already created an agency. If you haven’t, check out the Getting Started guide.

Before deploying your agency, ensure you have thoroughly tested all tools and agents, both in isolation and in combination. Run the test cases in each tool file and verify that the agency works end-to-end using the run_demo() or demo_gradio() methods.

Step 1: Dynamically Load Conversation Threads

By default, every time you create a new Agency(), it starts a fresh conversation thread. However, in production environments, you typically need to pick up old conversations or handle multiple users at once.

In Agency Swarm, threads are stored in a dictionary that contains all conversation thread IDs, including those between your agents.

Loading threads from a database before processing a new request allows you to continue conversations from where they left off, even if you are using a stateless backend.

Callbacks are functions that the framework calls automatically when the Agency is initialized.

Example threads callbacks:

def load_threads(chat_id):
    # Load threads from your database using the chat_id
    threads = load_threads_from_db(chat_id)
    return threads

def save_threads(new_threads):
    # Save updated threads to your database
    save_threads_to_db(new_threads)

agency = Agency(
    ...
    threads_callbacks={
        'load': lambda: load_threads(chat_id),
        'save': lambda new_threads: save_threads(new_threads)
    },
)
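
The database helpers used above are left to you. As a minimal sketch, assuming the threads dictionary is JSON-serializable and that one row per chat_id is enough, they could be backed by SQLite. The function signatures, the table layout, and the choice to return an empty dictionary for an unknown chat_id are all assumptions for illustration, not part of Agency Swarm:

```python
import json
import sqlite3

# Hypothetical helpers: names, schema, and JSON encoding are illustrative
# assumptions, not something Agency Swarm provides.

def init_db(conn):
    # One row per conversation, keyed by chat_id.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS threads ("
        "chat_id TEXT PRIMARY KEY, data TEXT NOT NULL)"
    )
    conn.commit()

def load_threads_from_db(conn, chat_id):
    # Return the stored threads dictionary, or an empty one for a new chat.
    row = conn.execute(
        "SELECT data FROM threads WHERE chat_id = ?", (chat_id,)
    ).fetchone()
    return json.loads(row[0]) if row else {}

def save_threads_to_db(conn, chat_id, threads):
    # Overwrite the stored threads for this chat with the latest state.
    conn.execute(
        "INSERT OR REPLACE INTO threads (chat_id, data) VALUES (?, ?)",
        (chat_id, json.dumps(threads)),
    )
    conn.commit()
```

Because each chat_id maps to its own row, separate users or sessions keep separate threads. In this sketch the save helper also needs the chat_id, so a 'save' callback would have to close over it, for example lambda new_threads: save_threads_to_db(conn, chat_id, new_threads).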

Step 2: Dynamically Load Assistant Settings

By default, agencies store assistant settings (such as name, description, instructions, tools, and model) in a local file defined in the settings_path parameter (settings.json by default). While this works well for development, in production environments, we recommend storing these settings in a database to persist changes between deployments.

The settings are stored as a list of dictionaries containing the settings of all agents. If a change is detected in the settings, the framework will automatically save the new settings to a local file and trigger the save callback.

settings_callbacks are executed every time agent settings are loaded or saved. Just like threads_callbacks, you can use them to load or save agent configurations based on your identifier (e.g. user_id):

def load_settings(user_id):
    # Load settings from your database using the user_id
    settings = load_settings_from_db(user_id)
    return settings

def save_settings(new_settings):
    # Save updated settings to your database
    save_settings_to_db(new_settings)

agency = Agency(
    ...
    settings_callbacks={
        'load': lambda: load_settings(user_id),
        'save': lambda new_settings: save_settings(new_settings)
    },
)

Make sure you load and return settings and threads in the exact same format as they are saved.
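
One way to check this before going live is a round-trip test on whatever encoding your storage layer uses. The sketch below assumes JSON as the storage format. The round_trips helper and the sample values are hypothetical and only illustrate the kind of silent change an encoding can introduce:

```python
import json

def round_trips(value):
    # True if the value is unchanged after a JSON encode/decode cycle.
    return json.loads(json.dumps(value)) == value

# Placeholder settings shaped as a list of dictionaries (values are made up).
sample_settings = [
    {"name": "ExampleAgent", "description": "Demo", "tools": [], "model": "example-model"}
]

print(round_trips(sample_settings))   # plain lists, dicts, and strings survive
print(round_trips({1: "thread_id"}))  # integer keys come back as strings
print(round_trips([("a", "b")]))      # tuples come back as lists
```

If a check like this fails for your real data, convert the values to a JSON-safe shape before saving, or reverse the conversion in the load callback.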

Step 3: Deploy Agents and Tools on a Production Server

Depending on your needs, you can deploy your agents and tools together or separately:

  1. Agents Together with Tools: This is the simplest method: your agents execute the tools directly, in the same environment.
  2. Tools as Separate API Endpoints: This is the most scalable method: multiple agents can reuse the same tools, and you can scale the tools independently.

To deploy agents together with tools, you can use the official Railway template to get your agency up and running quickly.

Watch the video below for a detailed walkthrough:

Railway Deployment Template

Click here to open the template and follow the instructions provided.

The template includes a Gradio interface and REST API endpoints with proper authentication.