
@ariontalk/token-server

@ariontalk/token-server is a lightweight Node.js server that mints ephemeral authentication tokens for Gemini Live sessions. It keeps your GEMINI_API_KEY on the server and issues short-lived, single-use tokens to the browser client.

Setup

Clone the repository and install dependencies:

git clone https://github.com/luixaviles/ariontalk.git
cd ariontalk
pnpm install

Create a .env file in packages/token-server/ (or set the variable in your hosting environment):

GEMINI_API_KEY=your-api-key-here

Start the server in development mode (from the repo root):

pnpm token-server

Or run the built server directly:

cd packages/token-server
node dist/index.js

The server listens on port 3001 by default. Set the PORT environment variable to override.
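
The port fallback described above can be sketched as follows. This is a minimal illustration, not the package's actual code: resolvePort and DEFAULT_PORT are hypothetical names, and falling back to the default for an invalid value is an assumption.

```typescript
// Default port stated in the docs above.
const DEFAULT_PORT = 3001;

// Hypothetical helper: pick the listen port from an environment map
// (for example process.env in Node.js).
function resolvePort(env: Record<string, string | undefined>): number {
  const raw = env.PORT;
  if (raw === undefined || raw.trim() === "") return DEFAULT_PORT;
  const port = Number(raw);
  // Assumption: an invalid value falls back to the default instead of failing.
  if (!Number.isInteger(port) || port < 1 || port > 65535) return DEFAULT_PORT;
  return port;
}
```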

Environment Variables

Variable | Required | Description
GEMINI_API_KEY | Yes | Google AI API key used to mint ephemeral tokens via the Gemini SDK.
PORT | No | HTTP listen port. Defaults to 3001.

API Endpoint

POST /api/token

Mints an ephemeral token scoped to a single Gemini Live session.

Request Body

{
  model?: string;       // Gemini model identifier (defaults to 'gemini-3.1-flash-live-preview')
  voice?: string;       // Voice name for TTS (defaults to 'Kore')
  lang?: string;        // BCP-47 language code (defaults to 'en')
  pageTitle?: string;   // Host page title, injected into the system prompt
  pageUrl?: string;     // Host page URL, injected into the system prompt
  pageContent?: string; // Extracted page text, injected into the system prompt
}

Response

{ "token": "<ephemeral-token-string>" }

On error:

{ "error": "<message>" }

Status 500 is returned if GEMINI_API_KEY is not set or token creation fails.
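
A client might call the endpoint roughly as sketched below. The field names and the two response shapes come from the sections above; requestToken, its injectable fetchImpl parameter, and the error message format are hypothetical choices made for this illustration.

```typescript
// Request body documented above; every field is optional.
interface TokenRequest {
  model?: string;
  voice?: string;
  lang?: string;
  pageTitle?: string;
  pageUrl?: string;
  pageContent?: string;
}

// The two response shapes documented above.
type TokenResponse = { token: string } | { error: string };

// Hypothetical helper: POST to /api/token and return the token string.
// fetchImpl is injectable so the function can be exercised without a live server.
async function requestToken(
  baseUrl: string,
  body: TokenRequest = {},
  fetchImpl: typeof fetch = fetch,
): Promise<string> {
  const res = await fetchImpl(new URL("/api/token", baseUrl), {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  const data = (await res.json()) as TokenResponse;
  if (!res.ok || "error" in data) {
    // Surface the server's error message when one is present.
    const message = "error" in data ? data.error : `HTTP ${res.status}`;
    throw new Error(`Token request failed: ${message}`);
  }
  return data.token;
}
```

For example, requestToken('http://localhost:3001', { voice: 'Kore', lang: 'en' }) would resolve to the token string when the server is running locally on its default port.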

Token Constraints

The server creates tokens with these constraints baked in:

Constraint | Value
uses | 1 (single-use)
expireTime | 30 minutes from creation
newSessionExpireTime | 2 minutes from creation
responseModalities | [Modality.AUDIO]
speechConfig | Prebuilt voice config with the requested voice name
inputAudioTranscription | Enabled (empty config)
outputAudioTranscription | Enabled (empty config)
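
To make the two expiry windows concrete, the sketch below assembles the constraint values as a plain object. It does not call the Gemini SDK: buildTokenConfig is a hypothetical helper, and the nesting under liveConnectConstraints, the ISO timestamp format, and the speechConfig field names are assumptions about what the SDK expects, so check them against the SDK before relying on them.

```typescript
// Durations taken from the constraints listed above.
const TOKEN_LIFETIME_MS = 30 * 60 * 1000;    // expireTime: 30 minutes
const NEW_SESSION_WINDOW_MS = 2 * 60 * 1000; // newSessionExpireTime: 2 minutes

// Hypothetical helper: assemble the constraint values for one token.
function buildTokenConfig(model: string, voiceName: string, now: Date = new Date()) {
  return {
    uses: 1, // single-use
    expireTime: new Date(now.getTime() + TOKEN_LIFETIME_MS).toISOString(),
    newSessionExpireTime: new Date(now.getTime() + NEW_SESSION_WINDOW_MS).toISOString(),
    liveConnectConstraints: {
      model,
      config: {
        responseModalities: ["AUDIO"], // stands in for Modality.AUDIO from the SDK
        speechConfig: {
          // Assumed shape of the prebuilt voice config.
          voiceConfig: { prebuiltVoiceConfig: { voiceName } },
        },
        inputAudioTranscription: {},  // empty config enables transcription
        outputAudioTranscription: {}, // empty config enables transcription
      },
    },
  };
}
```

Both timestamps are measured from the same creation instant: a client has 2 minutes to open a session with the token, and the token itself expires 30 minutes after creation.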

System Prompt

When pageContent is provided in the request body, the server builds a system instruction from a Markdown template file (src/prompts/voice-assistant.md). The template uses placeholder variables that are replaced at runtime:

Placeholder | Replaced With
{{lang}} | Display name of the language (e.g. 'English', 'Spanish')
{{pageTitle}} | Value of pageTitle from the request
{{pageUrl}} | Value of pageUrl from the request
{{pageContent}} | Value of pageContent from the request

The resulting system instruction is embedded in the token’s liveConnectConstraints, so the Gemini model receives it automatically when the client connects — no client-side prompt configuration is needed.

If pageContent is omitted or empty, no system instruction is set.
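
The placeholder substitution could look something like the sketch below. It is an illustration of the technique rather than the server's actual code: renderTemplate and buildSystemInstruction are hypothetical names, the template string is passed in instead of being read from src/prompts/voice-assistant.md, and leaving unknown placeholders untouched is an assumption.

```typescript
// Values for the four placeholders listed above.
interface PromptVars {
  lang: string; // display name, e.g. 'English'
  pageTitle: string;
  pageUrl: string;
  pageContent: string;
}

// Replace each {{name}} with the matching value.
// Assumption: unknown placeholders are left as-is.
function renderTemplate(template: string, vars: PromptVars): string {
  return template.replace(/\{\{(\w+)\}\}/g, (match: string, name: string) =>
    Object.prototype.hasOwnProperty.call(vars, name)
      ? vars[name as keyof PromptVars]
      : match,
  );
}

// Return undefined when pageContent is missing or empty, mirroring the rule
// that no system instruction is set in that case.
function buildSystemInstruction(template: string, vars: PromptVars): string | undefined {
  if (!vars.pageContent) return undefined;
  return renderTemplate(template, vars);
}
```

A replacer function is used instead of a replacement string so that characters such as $ in the page content are inserted literally.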

CORS

The /api/token endpoint uses Hono’s cors() middleware with default settings, allowing requests from any origin. For production deployments, configure CORS to restrict allowed origins.
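
One way to restrict origins is an allow-list check, sketched below as a standalone function. ALLOWED_ORIGINS and resolveCorsOrigin are hypothetical names and the example origins are placeholders. The commented line shows how such a function might be passed to Hono's cors() through its origin option; treat that wiring as an assumption to verify against the Hono version in use.

```typescript
// Hypothetical allow-list; replace with the origins that host your client.
const ALLOWED_ORIGINS: readonly string[] = [
  "https://example.com",
  "https://app.example.com",
];

// Return the origin when it is on the allow-list, otherwise null so that
// no Access-Control-Allow-Origin value is granted for it.
function resolveCorsOrigin(
  origin: string,
  allowed: readonly string[] = ALLOWED_ORIGINS,
): string | null {
  return allowed.includes(origin) ? origin : null;
}

// Possible wiring with Hono (not executed here):
//   app.use("/api/token", cors({ origin: (origin) => resolveCorsOrigin(origin) }));
```

If the list is static, passing the array directly as the origin option is likely simpler; the function form is mainly useful when the list is loaded from configuration at runtime.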

Deployment

The server source is split between src/app.ts (the Hono app, exported for testing) and src/index.ts (the entry point that binds @hono/node-server). It can be deployed as any standard Node.js HTTP server:

  • Local development: node dist/index.js or via the monorepo pnpm token-server script
  • Google Cloud Run: A Dockerfile and deploy script are included. Run pnpm deploy-token-server to build and deploy to Cloud Run with the API key stored in Secret Manager
  • Docker / cloud: Set GEMINI_API_KEY and PORT as environment variables
  • Serverless: Hono supports adapters for Cloudflare Workers, Vercel, Deno Deploy, and others — swap the server adapter if needed

The server is stateless; every token request is independent.