Clustering

Luminarys supports multi-node deployments with a master-slave architecture. Clustering distributes skills across multiple nodes while presenting a unified MCP server to clients.

Roles

  • Master — runs the MCP server, maintains the unified skill registry, and routes calls to the correct node. Only the master serves external clients.
  • Slave — registers its skills with the master and executes them on demand. Slaves do not expose MCP or HTTP endpoints.

Clients connect to the master and see all skills from all nodes as a single flat list. Routing is transparent — the client doesn't know which node executes which skill.
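The flat view can be pictured as a simple merge of each node's skill list into one deduplicated list. The sketch below is illustrative only; the node and skill names are taken from the example configs later in this page, not from Luminarys internals.

```python
# Sketch: how the master's unified skill view could be assembled.
# Node and skill names are illustrative examples.
def unified_skill_list(nodes: dict[str, list[str]]) -> list[str]:
    """Merge per-node skill lists into one flat, sorted, deduplicated list."""
    skills: set[str] = set()
    for node_skills in nodes.values():
        skills.update(node_skills)
    return sorted(skills)

nodes = {
    "master": ["echo", "fs"],
    "slave-1": ["echo", "fs", "git", "go-toolchain"],
}
# Clients see one flat list, regardless of which node owns a skill.
print(unified_skill_list(nodes))  # ['echo', 'fs', 'git', 'go-toolchain']
```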

Message bus

Clustering uses a NATS server as the message bus for:

  • Node discovery and heartbeat
  • Skill registration and deregistration
  • Cross-node skill invocations
  • File transfer coordination

A standard NATS server is sufficient for skill invocations and node discovery. If you need file transfer between nodes, use the Luminarys NATS server fork — it includes a built-in file transfer relay.
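The bus traffic above is organized by NATS subject. The exact subject names Luminarys uses are internal and not documented here, so the naming scheme below is purely a hypothetical illustration of how per-cluster, per-node subjects are typically structured:

```python
# Purely illustrative subject-naming scheme -- NOT the actual subjects
# Luminarys uses. Shows the common cluster/kind/node hierarchy pattern.
def subject(cluster_id: str, kind: str, node_id: str) -> str:
    return f"cluster.{cluster_id}.{kind}.{node_id}"

print(subject("my-cluster", "heartbeat", "slave-1"))
# cluster.my-cluster.heartbeat.slave-1
```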

Configuration

Master node

master.yaml
http:
  addr: ":8080"

runtime:
  memory_limit_mb: 64
  cache_dir: "/app/data/cache"

log:
  level: "info"
  stderr: true

skills:
  - config/skills/echo.yaml
  - config/skills/fs.yaml

cluster:
  enabled: true
  cluster_id: "my-cluster"
  node_id: "master"
  role: "master"
  nats_url: "nats://nats:4222"
  relay_port: 4223
  file_transfer:
    allowed_nodes: ["*"]
    local_dirs: ["/data:rw"]
  auth:
    token: "${NATS_TOKEN}"

Slave node

slave.yaml
runtime:
  memory_limit_mb: 64
  cache_dir: "/app/data/cache"

log:
  level: "info"
  stderr: true

skills:
  - config/skills/echo.yaml
  - config/skills/fs.yaml
  - config/skills/git.yaml
  - config/skills/go-toolchain.yaml

cluster:
  enabled: true
  cluster_id: "my-cluster"
  node_id: "slave-1"
  role: "slave"
  nats_url: "nats://nats:4222"
  relay_port: 4223
  file_transfer:
    allowed_nodes: ["*"]
    local_dirs: ["/data:rw"]
  auth:
    token: "${NATS_TOKEN}"

Note: slaves don't need an http section — they don't serve external clients.
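The "${NATS_TOKEN}" placeholders in the configs above are presumably resolved from the environment before the YAML is parsed. A minimal sketch of that expansion step, assuming simple ${VAR} substitution:

```python
import os
import re

# Sketch: expanding "${VAR}" placeholders such as "${NATS_TOKEN}" in a config
# file before parsing. An assumption about the mechanism, not Luminarys code.
def expand_env(text: str) -> str:
    return re.sub(r"\$\{(\w+)\}", lambda m: os.environ.get(m.group(1), ""), text)

os.environ["NATS_TOKEN"] = "s3cret"
print(expand_env('token: "${NATS_TOKEN}"'))  # token: "s3cret"
```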

Authentication

NATS connections can be authenticated with a token:

cluster:
  auth:
    token: "${NATS_TOKEN}"

This corresponds to the NATS server's authorization { token: "..." } configuration.
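On the server side, the matching block in a standard NATS server configuration file looks like this (the token value is a placeholder):

```
authorization {
  token: "s3cret"
}
```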

TLS

For encrypted NATS connections:

cluster:
  tls:
    enabled: true
    ca_file: "/etc/luminarys/tls/ca.crt"
    cert_file: "/etc/luminarys/tls/client.crt"
    key_file: "/etc/luminarys/tls/client.key"
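The corresponding server-side block in a standard NATS server configuration (file paths here are illustrative; verify: true makes the server require client certificates):

```
tls {
  cert_file: "/etc/nats/tls/server.crt"
  key_file:  "/etc/nats/tls/server.key"
  ca_file:   "/etc/nats/tls/ca.crt"
  verify: true
}
```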

Skill distribution

Each node loads its own set of skills defined in its config. When a slave starts:

  1. Connects to NATS
  2. Sends a connect request to the master
  3. Receives the current cluster skill registry
  4. Registers its own skills with the master
  5. Begins sending periodic heartbeats

The master broadcasts skill additions and removals to all nodes.

Cross-node invocation

When a client calls a skill on a remote node:

  1. Master looks up the skill in the cluster registry
  2. Routes the request via NATS to the owning node
  3. The remote node executes the skill locally
  4. The result is sent back to the master and returned to the client

From the client's perspective, all skills appear local. The MCP API is identical for local and remote skills.
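The routing decision in steps 1-2 reduces to a registry lookup followed by a local-versus-remote branch. A minimal sketch of that decision, with illustrative names; the "local"/"remote" strings stand in for "execute in-process" versus "forward over NATS":

```python
# Sketch of the master's routing decision for a skill call.
# Registry maps skill name -> owning node (illustrative data).
def route_call(registry: dict[str, str], local_node: str, skill: str) -> str:
    node = registry.get(skill)
    if node is None:
        raise KeyError(f"unknown skill: {skill}")
    # The client never sees this branch; the MCP response is identical.
    return "local" if node == local_node else f"remote:{node}"

registry = {"echo": "master", "git": "slave-1"}
print(route_call(registry, "master", "echo"))  # local
print(route_call(registry, "master", "git"))   # remote:slave-1
```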

File transfer

Nodes can transfer files to each other:

  1. The sender announces the file (name, size) via NATS
  2. The receiver accepts the transfer
  3. File data is streamed through the relay (TCP on relay_port)
  4. Integrity is verified after transfer
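The integrity check in step 4 amounts to hashing the streamed bytes on both ends and comparing digests. The hash algorithm Luminarys uses is not documented here; SHA-256 in this sketch is an assumption:

```python
import hashlib

# Sketch of the post-transfer integrity check (step 4 above).
# SHA-256 is an assumption about the algorithm, chosen for illustration.
def stream_digest(chunks) -> str:
    """Hash a file incrementally, as it would arrive over the relay."""
    h = hashlib.sha256()
    for chunk in chunks:
        h.update(chunk)
    return h.hexdigest()

# Sender and receiver compare digests once streaming completes; chunk
# boundaries don't matter, only the byte content does.
sent = [b"hello ", b"world"]
received = [b"hello world"]
print(stream_digest(sent) == stream_digest(received))  # True
```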

File transfer permissions are controlled per-skill in the manifest:

permissions:
file_transfer:
enabled: true
allowed_nodes: ["master"]
local_dirs: ["/data/shared:ro"]

Skills use cross-node paths with the syntax node-id:///absolute/path:

source: /data/shared/archive.tar.gz
dest: master:///data/result/archive.tar.gz

Node health

  • Heartbeat — each slave sends a heartbeat at the configured interval (default: 5s).
  • TTL — if no heartbeat is received within the TTL period (default: 15s), the node is considered offline. Its skills are removed from the registry.
  • Master restart — when the master starts, it broadcasts a "master online" message. Slaves respond by re-registering their skills, rebuilding the registry automatically.
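The TTL rule above reduces to a single comparison: a node is considered offline once its last heartbeat is older than the TTL. A sketch with the defaults stated above (heartbeat every 5s, TTL 15s):

```python
# Sketch of the TTL liveness rule: offline once the last heartbeat is
# older than the TTL. Defaults from above: heartbeat 5s, TTL 15s.
def is_offline(last_heartbeat: float, now: float, ttl: float = 15.0) -> bool:
    return (now - last_heartbeat) > ttl

print(is_offline(last_heartbeat=100.0, now=110.0))  # False (10s, within TTL)
print(is_offline(last_heartbeat=100.0, now=120.0))  # True  (20s, TTL exceeded)
```

With a 5s heartbeat interval and a 15s TTL, a node survives up to two missed heartbeats before its skills are pulled from the registry.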

Docker Compose example

A complete two-node cluster demo is available at github.com/LuminarysAI/mcp-demo-cluster.