@ariontalk/token-server

@ariontalk/token-server is a lightweight Node.js server that mints ephemeral authentication tokens for Gemini Live sessions. It keeps your GEMINI_API_KEY on the server and issues short-lived, single-use tokens to the browser client.

Setup

Clone the repository and install dependencies:

git clone https://github.com/luixaviles/ariontalk.git
cd ariontalk
pnpm install

Create a .env file in packages/token-server/ (or set the variable in your hosting environment):

GEMINI_API_KEY=your-api-key-here

Start the server in development mode (from the repo root):

pnpm token-server

Or run the built server directly:

cd packages/token-server
node dist/index.js

The server listens on port 3001 by default. Set the PORT environment variable to override.
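The port rule above can be sketched as a small helper. This is an illustration only: `resolvePort` is a hypothetical name, and falling back to the default when PORT is not a valid port number is an assumption, not confirmed behavior of this package.

```typescript
// Hypothetical helper: picks the listen port from an environment-like object.
// Assumption: an unset or invalid PORT falls back to the documented default, 3001.
function resolvePort(
  env: Record<string, string | undefined>,
  fallback: number = 3001,
): number {
  const parsed = Number(env['PORT']);
  const valid = Number.isInteger(parsed) && parsed > 0 && parsed < 65536;
  return valid ? parsed : fallback;
}
```

In a Node.js entry point this would be called as `resolvePort(process.env)`.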

Environment Variables

Variable	Required	Description
`GEMINI_API_KEY`	Yes	Google AI API key used to mint ephemeral tokens via the Gemini SDK.
`PORT`	No	HTTP listen port. Defaults to `3001`.

API Endpoint

`POST /api/token`

Mints an ephemeral token scoped to a single Gemini Live session.

Request Body

{
  model?: string;       // Gemini model identifier (defaults to 'gemini-3.1-flash-live-preview')
  voice?: string;       // Voice name for TTS (defaults to 'Kore')
  lang?: string;        // BCP-47 language code (defaults to 'en')
  pageTitle?: string;   // Host page title — injected into the system prompt
  pageUrl?: string;     // Host page URL — injected into the system prompt
  pageContent?: string; // Extracted page text — injected into the system prompt
}

Response

{ "token": "<ephemeral-token-string>" }

On error:

{ "error": "<message>" }

Status 500 is returned if GEMINI_API_KEY is not set or token creation fails.
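A client call to this endpoint might look like the following minimal sketch. `requestToken`, `TokenRequest`, and `FetchLike` are hypothetical names introduced for illustration. The fetch implementation is passed in so the helper can be exercised without a network; in a browser you would pass the global `fetch`.

```typescript
// Minimal fetch-like signature so the sketch does not depend on DOM typings.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Mirrors the documented request body; every field is optional.
interface TokenRequest {
  model?: string;
  voice?: string;
  lang?: string;
  pageTitle?: string;
  pageUrl?: string;
  pageContent?: string;
}

// Hypothetical helper: POSTs to /api/token and returns the token string.
// Throws with the server's error message when the response carries one.
async function requestToken(
  baseUrl: string,
  body: TokenRequest,
  fetchImpl: FetchLike,
): Promise<string> {
  const res = await fetchImpl(`${baseUrl}/api/token`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  });
  const data = (await res.json()) as { token?: string; error?: string };
  if (!res.ok || !data.token) {
    throw new Error(data.error ?? `Token request failed with status ${res.status}`);
  }
  return data.token;
}
```

For example, `await requestToken('http://localhost:3001', { voice: 'Kore', lang: 'en' }, fetch)` would request a token from a locally running server.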

Token Constraints

The server creates tokens with these constraints baked in:

Constraint	Value
`uses`	`1` (single-use)
`expireTime`	30 minutes from creation (the token is rejected after this time)
`newSessionExpireTime`	2 minutes from creation (deadline for starting a new session with the token)
`responseModalities`	`[Modality.AUDIO]`
`speechConfig`	Prebuilt voice config with the requested voice name
`inputAudioTranscription`	Enabled (empty config)
`outputAudioTranscription`	Enabled (empty config)
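The constraints in the table can be pictured as a plain configuration object. The sketch below is an assumption about shape, not the server's actual code: `buildTokenConfig` is a hypothetical name, the nesting under `liveConnectConstraints` is a guess at how the fields are grouped, and the string `'AUDIO'` stands in for the SDK's `Modality.AUDIO` so the example has no dependencies.

```typescript
// Hypothetical builder for the token constraints listed in the table above.
// Assumption: 'AUDIO' stands in for the SDK's Modality.AUDIO enum member.
function buildTokenConfig(model: string, voice: string, now: Date = new Date()) {
  // ISO timestamp n minutes after `now`.
  const minutesFromNow = (n: number): string =>
    new Date(now.getTime() + n * 60 * 1000).toISOString();

  return {
    uses: 1, // single-use token
    expireTime: minutesFromNow(30),
    newSessionExpireTime: minutesFromNow(2),
    liveConnectConstraints: {
      model,
      config: {
        responseModalities: ['AUDIO'],
        speechConfig: {
          voiceConfig: { prebuiltVoiceConfig: { voiceName: voice } },
        },
        // Empty objects enable transcription with default settings.
        inputAudioTranscription: {},
        outputAudioTranscription: {},
      },
    },
  };
}
```

Taking `now` as a parameter keeps the two expiry times consistent with each other and makes the function deterministic when a fixed date is passed in.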

System Prompt

When pageContent is provided in the request body, the server builds a system instruction from a Markdown template file (src/prompts/voice-assistant.md). The template uses placeholder variables that are replaced at runtime:

Placeholder	Replaced With
`{{lang}}`	Display name of the language (e.g. `'English'`, `'Spanish'`)
`{{pageTitle}}`	Value of `pageTitle` from the request
`{{pageUrl}}`	Value of `pageUrl` from the request
`{{pageContent}}`	Value of `pageContent` from the request

The resulting system instruction is embedded in the token’s liveConnectConstraints, so the Gemini model receives it automatically when the client connects. No client-side prompt configuration is needed.

If pageContent is omitted or empty, no system instruction is set.
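The placeholder substitution described in this section could be implemented roughly as follows. This is a sketch under stated assumptions: `buildSystemInstruction` and `LANGUAGE_NAMES` are hypothetical names, the two-entry language table is illustrative only, and missing `pageTitle` or `pageUrl` values are assumed to become empty strings.

```typescript
// Hypothetical lookup table; the real server may derive display names differently.
const LANGUAGE_NAMES = new Map<string, string>([
  ['en', 'English'],
  ['es', 'Spanish'],
]);

interface PromptInput {
  lang: string;
  pageTitle?: string;
  pageUrl?: string;
  pageContent?: string;
}

// Returns the rendered system instruction, or undefined when there is no page content.
function buildSystemInstruction(
  template: string,
  input: PromptInput,
): string | undefined {
  if (!input.pageContent) return undefined;

  const values = new Map<string, string>([
    ['lang', LANGUAGE_NAMES.get(input.lang) ?? input.lang],
    ['pageTitle', input.pageTitle ?? ''],
    ['pageUrl', input.pageUrl ?? ''],
    ['pageContent', input.pageContent],
  ]);

  // A replacer function avoids special handling of `$` sequences in page text.
  // Unknown placeholders are left untouched.
  return template.replace(
    /\{\{(\w+)\}\}/g,
    (match: string, name: string) => values.get(name) ?? match,
  );
}
```

Passing the template as a string keeps the function independent of the file system; the caller would read src/prompts/voice-assistant.md once and reuse its contents.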

CORS

The /api/token endpoint uses Hono’s cors() middleware with default settings, allowing requests from any origin. For production deployments, configure CORS to restrict allowed origins.
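One way to restrict origins is to pass an allowlist function as the `origin` option of Hono's `cors()` middleware. The sketch below keeps the allowlist logic dependency-free; `ALLOWED_ORIGINS` and `resolveOrigin` are hypothetical names, and the example origins are placeholders.

```typescript
// Hypothetical allowlist; replace with the origins that host your client.
const ALLOWED_ORIGINS: string[] = [
  'https://example.com',
  'https://docs.example.com',
];

// Returns the origin to echo back in the CORS header, or null to reject it.
function resolveOrigin(origin: string): string | null {
  return ALLOWED_ORIGINS.indexOf(origin) !== -1 ? origin : null;
}

// Wiring it into the app would look roughly like this (requires the hono package):
//   import { cors } from 'hono/cors';
//   app.use('/api/token', cors({ origin: resolveOrigin }));
```

Matching on the exact origin string means scheme, host, and port must all agree, so `http://example.com` is rejected even though `https://example.com` is allowed.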

Deployment

The server source is split between src/app.ts (the Hono app, exported for testing) and src/index.ts (the entry point that binds @hono/node-server). It can be deployed as any standard Node.js HTTP server:

Local development: run node dist/index.js, or use the monorepo pnpm token-server script.
Google Cloud Run: a Dockerfile and deploy script are included. Run pnpm deploy-token-server to build and deploy to Cloud Run with the API key stored in Secret Manager.
Docker / cloud: set GEMINI_API_KEY and PORT as environment variables.
Serverless: Hono supports adapters for Cloudflare Workers, Vercel, Deno Deploy, and others; swap the server adapter if needed.

The server is stateless; every token request is independent.