Version: 0.22.0

Fishjam Agent Introduction

Tip: We recommend going through the steps in the Backend Quick Start before trying Fishjam agents, as you will need a working backend server to use them.

This page introduces Fishjam agents and shows how to use them. You can learn more about how agents work on the Agent Internals page.

What is an Agent?

An agent is a piece of software that allows your backend server to participate in a Fishjam room, similar to how the Fishjam client SDKs allow your client-side application to participate in a Fishjam room. Agents can be used to implement features such as real-time audio transcription, real-time content moderation, conversations with AI agents, and more.

You can simply think of an agent as a peer running within your backend application.

Writing an Agent

In this section we show how to implement an agent using the Fishjam server SDKs. If you are not using the SDKs, check out the Agent Internals page to learn how to integrate with Fishjam agents.

Prerequisites

Before we create the actual agent, we need to create a room, as agents are scoped to rooms. We will also create a peer so that the agent has someone to listen to and talk to.

import { FishjamClient } from '@fishjam-cloud/js-server-sdk';

// fishjamId and managementToken are your Fishjam credentials
const fishjamClient = new FishjamClient({ fishjamId, managementToken });

const room = await fishjamClient.createRoom();
const peer = await fishjamClient.createPeer(room.id);
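The snippet above uses `fishjamId` and `managementToken` without showing where they come from. One common option is to read them from environment variables. The following is a minimal sketch of that idea; the helper name `readFishjamConfig` and the variable names `FISHJAM_ID` and `FISHJAM_MANAGEMENT_TOKEN` are assumptions made for illustration, not names defined by Fishjam.

```typescript
// Hypothetical helper: the environment variable names below are
// assumptions for this sketch, not names required by Fishjam.
interface FishjamConfig {
  fishjamId: string;
  managementToken: string;
}

function readFishjamConfig(env: Record<string, string | undefined>): FishjamConfig {
  const fishjamId = env['FISHJAM_ID'];
  const managementToken = env['FISHJAM_MANAGEMENT_TOKEN'];

  // Fail early with a clear message instead of passing undefined to the client.
  if (!fishjamId || !managementToken) {
    throw new Error('FISHJAM_ID and FISHJAM_MANAGEMENT_TOKEN must be set');
  }

  return { fishjamId, managementToken };
}
```

In a Node.js backend you could call `readFishjamConfig(process.env)` and pass the result to `new FishjamClient(...)`.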

Creating a listening Agent

If you are using the server SDKs, creating an agent and defining its behavior is straightforward. By default, agents receive all peers' audio streams. In practice, you will likely want to use the Selective Subscriptions API for fine-grained control over which peers and tracks the agent receives audio from.

import type { AgentCallbacks, PeerOptions } from '@fishjam-cloud/js-server-sdk';

const agentOptions = {
  subscribeMode: 'auto',
  output: { audioFormat: 'pcm16', audioSampleRate: 16000 },
} satisfies PeerOptions;

const agentCallbacks = {
  onError: console.error,
  onClose: (code, reason) => console.log('Agent closed', code, reason),
} satisfies AgentCallbacks;

const { agent } = await fishjamClient.createAgent(room.id, agentOptions, agentCallbacks);

// Register a callback for incoming audio data
agent.on('trackData', ({ track, peerId, data }) => {
  // process the incoming data
});
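What "process the incoming data" means depends on your use case. As a rough illustration, the sketch below decodes a `pcm16` payload into floating-point samples and computes its loudness (RMS), which could feed a simple voice-activity check. It assumes that `data` is a `Uint8Array` of signed 16-bit little-endian mono samples; the byte order and the helper names `pcm16ToFloat32` and `rmsLevel` are assumptions of this sketch, not part of the SDK.

```typescript
// Assumption: `data` holds signed 16-bit little-endian PCM samples (mono).
function pcm16ToFloat32(data: Uint8Array): Float32Array {
  const view = new DataView(data.buffer, data.byteOffset, data.byteLength);
  const sampleCount = Math.floor(data.byteLength / 2);
  const samples = new Float32Array(sampleCount);

  for (let i = 0; i < sampleCount; i++) {
    // Scale the int16 range [-32768, 32767] to roughly [-1, 1).
    samples[i] = view.getInt16(i * 2, true) / 32768;
  }
  return samples;
}

// Root-mean-square level of a buffer; 0 for an empty buffer.
function rmsLevel(samples: Float32Array): number {
  if (samples.length === 0) return 0;

  let sumOfSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    sumOfSquares += samples[i] * samples[i];
  }
  return Math.sqrt(sumOfSquares / samples.length);
}
```

Inside the `trackData` callback you could then call `rmsLevel(pcm16ToFloat32(data))` and compare the result against a threshold of your choosing.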

Making the Agent speak

Apart from just listening, agents can also send audio data to peers.
Let's assume that in the previous section we forwarded the peer's audio to an audio chatbot. Now the chatbot returns responses, and we want to play them back to the peer.

Tip: You can interrupt the currently played audio chunk. See the example below.

import { type AudioCodecParameters } from '@fishjam-cloud/js-server-sdk';

const codecParameters = {
  encoding: 'pcm16',
  sampleRate: 16000,
  channels: 1,
} satisfies AudioCodecParameters;

const agentTrack = agent.createTrack(codecParameters);

// that's a dummy chatbot,
// you can bring your audio from anywhere
chatbot.on('response', (response: Uint8Array) => {
  agent.sendData(agentTrack.id, response);

  // you're able to interrupt the currently played audio chunk
  agent.interruptTrack(agentTrack.id);
});
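If your audio source hands you one large buffer and you would rather send it in smaller pieces, you can split it first. The sketch below cuts raw `pcm16` audio into fixed-duration chunks; `chunkPcm16` is a hypothetical helper, and whether chunking is needed at all depends on your audio source and use case.

```typescript
// Splits raw PCM16 audio into fixed-duration chunks.
// Assumption: 2 bytes per sample per channel (pcm16), interleaved channels.
function chunkPcm16(
  audio: Uint8Array,
  sampleRate: number,
  channels: number,
  chunkMs: number,
): Uint8Array[] {
  const samplesPerChunk = Math.floor((sampleRate * chunkMs) / 1000);
  const bytesPerChunk = samplesPerChunk * channels * 2;
  if (bytesPerChunk <= 0) {
    throw new Error('Chunk duration is too short for this sample rate');
  }

  const chunks: Uint8Array[] = [];
  for (let offset = 0; offset < audio.byteLength; offset += bytesPerChunk) {
    // subarray creates a view without copying; the last chunk may be shorter.
    chunks.push(audio.subarray(offset, offset + bytesPerChunk));
  }
  return chunks;
}
```

With the codec parameters used above (16000 Hz, mono), `chunkPcm16(response, 16000, 1, 20)` yields chunks of 640 bytes, each of which could be passed to `agent.sendData(agentTrack.id, chunk)`.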

Disconnecting

After you're done using an agent, you can disconnect it from the room.

agent.disconnect();

Remember to disconnect your agents!

It's important to disconnect agents, because every connected agent generates usage just like a normal peer.
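One way to make sure an agent is always disconnected, even when your processing code throws, is to wrap the work in `try`/`finally`. The following is a minimal sketch of that pattern; `withAgent` and the `Disconnectable` interface are hypothetical names introduced here, and the only thing the sketch relies on is that the agent exposes a `disconnect()` method as shown above.

```typescript
// Minimal shape this sketch relies on: anything with a disconnect() method.
interface Disconnectable {
  disconnect(): void | Promise<void>;
}

// Runs `work` with the agent and disconnects it afterwards,
// whether `work` resolves or throws.
async function withAgent<A extends Disconnectable, T>(
  agent: A,
  work: (agent: A) => Promise<T>,
): Promise<T> {
  try {
    return await work(agent);
  } finally {
    await agent.disconnect();
  }
}
```

You could then run your session logic as `await withAgent(agent, async (a) => { /* listen and speak */ })`, so the agent stops generating usage as soon as the work ends.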

Pricing

Agents are billed as if they were normal peers.

For example, a room with 2 peers and 1 agent connected will be billed as if there were 3 peers connected to the room. Exact pricing values can be found on our pricing page.

Using agents without our SDKs

If you are using Fishjam's REST API directly, check out the Agent Internals page.

See also

Learn more about how agents work: Agent Internals

Writing a backend server with Fishjam: Backend Quick Start