Minutes MCP Server

Local-first conversation memory MCP server: records, transcribes, and indexes meetings on-device, exposing them to AI agents as plain markdown.

Collaboration by silverstein (open-source) None active
Overview

Minutes is an open-source, privacy-first conversation memory layer for AI agents. It captures audio from meetings and voice memos, transcribes locally using Whisper, Parakeet, or Apple Speech, performs speaker diarization, and stores everything as plain markdown with YAML frontmatter in ~/meetings/. The MCP server then exposes that corpus to any MCP-compatible agent (Claude Desktop, Claude Code, Codex, OpenCode, Gemini CLI, Cursor) without API keys or cloud dependencies.

Beyond raw transcripts, Minutes extracts structured data: decisions, action items, commitments, and people. Through 29 MCP tools and 7 resources, agents can search across meetings, build cross-meeting profiles of contacts, surface overdue tasks, flag conflicting decisions, and stream live transcripts mid-meeting for real-time coaching. In clients that support MCP Apps, an interactive dashboard renders meeting lists and people profiles as inline UI.

What makes Minutes notable is its ownership model. Files are plain markdown that grep, ripgrep, or any future LLM can read without an SDK. There is no vendor account, no API key, and no cloud sync by default. The MCP layer is the active interface over a durable file substrate, designed so the corpus outlives any specific tool that reads it.
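To illustrate the "no SDK required" point, here is a minimal sketch that reads a meeting file with nothing but the Python standard library. It assumes flat `key: value` frontmatter between `---` delimiters; the field names `title` and `date` and the sample text are hypothetical, and nested YAML would need a real YAML parser.

```python
from pathlib import Path


def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split a markdown document into (frontmatter fields, body).

    Only flat `key: value` lines are handled; nested YAML is skipped.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    for end, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            break
    else:
        return {}, text  # no closing delimiter: treat everything as body
    fields = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        # Skip indented (nested) lines and comments.
        if sep and key.strip() and not line.startswith((" ", "\t", "#")):
            fields[key.strip()] = value.strip().strip('"')
    body = "\n".join(lines[end + 1:])
    return fields, body


def search_meetings_dir(root: Path, needle: str) -> list[tuple[str, dict[str, str]]]:
    """Return (filename, frontmatter) for each .md file whose body mentions needle."""
    hits = []
    for path in sorted(root.glob("*.md")):
        fields, body = split_frontmatter(path.read_text(encoding="utf-8"))
        if needle.lower() in body.lower():
            hits.append((path.name, fields))
    return hits


# Hypothetical sample; real Minutes files may use different field names.
SAMPLE = """---
title: Pricing sync
date: 2025-01-15
---
We agreed to test annual pricing in Q2.
"""

fields, body = split_frontmatter(SAMPLE)
print(fields["title"], "|", body.strip())
```

Pointing `search_meetings_dir` at `Path.home() / "meetings"` gives a crude stand-in for full-text search. It is not how the MCP server indexes the corpus, only a demonstration that the files stay usable without it.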

Tools

Tool                   Description
start_recording        Begin local audio capture for a new meeting.
stop_recording         Finalize the active recording and trigger transcription.
process_audio          Ingest an external audio file and transcribe it into the corpus.
list_meetings          Browse recent recordings.
get_meeting            Retrieve the full transcript and structured frontmatter for a meeting.
search_meetings        Full-text search across all meetings.
get_person_profile     Build a cross-meeting profile of a contact including topics and commitments.
list_people            List all known contacts with frequency and relationship health signals.
consistency_report     Flag contradicting decisions and stale commitments across meetings.
get_action_items       Query open and overdue tasks, optionally filtered by assignee.
get_decisions          Retrieve decisions by topic or date.
get_commitments        Track promises made to specific people.
start_live_transcript  Open a real-time transcription stream written as JSONL.
read_live_transcript   Delta read of an in-progress live transcript for mid-meeting coaching.
list_voices            List enrolled speaker voice profiles.
confirm_speaker        Manually attribute a speaker segment to a known person.
vault_sync             Bidirectional sync with an Obsidian or Logseq vault.
start_dictation        Streaming speech-to-text dictation outside of meeting context.

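The "delta read" idea behind read_live_transcript can be sketched as follows: remember how far into the JSONL file you have read, and on each poll return only the complete lines appended since then. This illustrates the pattern and is not the server's actual implementation; the file name and the `speaker`/`text` fields are hypothetical.

```python
import json
import os
import tempfile


def read_delta(path: str, offset: int) -> tuple[list[dict], int]:
    """Return JSONL records appended since `offset`, plus the new offset.

    A trailing line without a newline is treated as still being written
    and is left for the next call.
    """
    with open(path, "rb") as fh:
        fh.seek(offset)
        chunk = fh.read()
    records = []
    consumed = 0
    for raw in chunk.splitlines(keepends=True):
        if not raw.endswith(b"\n"):
            break  # partial line: the writer has not finished it yet
        consumed += len(raw)
        if raw.strip():
            records.append(json.loads(raw))
    return records, offset + consumed


# Demo with a temporary file standing in for a live transcript.
with tempfile.TemporaryDirectory() as tmp:
    live = os.path.join(tmp, "live.jsonl")
    with open(live, "w", encoding="utf-8") as fh:
        fh.write('{"speaker": "A", "text": "hello"}\n')
    first, pos = read_delta(live, 0)
    with open(live, "a", encoding="utf-8") as fh:
        # One complete record followed by a half-written one.
        fh.write('{"speaker": "B", "text": "hi"}\n{"speaker": "A", "te')
    second, pos = read_delta(live, pos)

print(len(first), len(second))
```

Tracking a byte offset instead of a line count keeps each poll proportional to the new data only, which matters when an agent polls every few seconds during a long call.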
Setup Guide

Prerequisites

  • Node.js (for the npx install path) or the Rust toolchain to build from source
  • Minutes CLI configured with a meeting directory at ~/meetings/
  • A local transcription engine: Whisper, Parakeet, or Apple Speech (configured via ~/.config/minutes/config.toml)

Quick install

Run the MCP server directly with npx (no global install needed):

npx minutes-mcp

Claude Desktop config

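Add to claude_desktop_config.json (on macOS: ~/Library/Application Support/Claude/claude_desktop_config.json; on Windows: %APPDATA%\Claude\claude_desktop_config.json):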

{
  "mcpServers": {
    "minutes": {
      "command": "npx",
      "args": ["minutes-mcp"]
    }
  }
}

Other MCP-compatible clients (Codex, OpenCode, Gemini CLI, custom agents) can point at the same stdio server.
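If the client config already lists other servers, pasting the JSON above over it would drop them. The sketch below merges the `minutes` entry into an existing file instead. The config path is passed in by the caller, and the `other` server in the demo is a made-up placeholder.

```python
import json
import tempfile
from pathlib import Path


def add_minutes_server(config_path: Path) -> dict:
    """Add the `minutes` entry to an MCP client config, keeping other entries."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8") or "{}")
    servers = config.setdefault("mcpServers", {})
    servers["minutes"] = {"command": "npx", "args": ["minutes-mcp"]}
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Demo against a throwaway file that already lists another (hypothetical) server.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "claude_desktop_config.json"
    path.write_text('{"mcpServers": {"other": {"command": "other-server"}}}', encoding="utf-8")
    merged = add_minutes_server(path)

print(sorted(merged["mcpServers"]))
```

Back up the real config before running anything like this against it, and restart the client afterwards so it picks up the new server.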

Optional configuration

Local tuning lives in ~/.config/minutes/config.toml:

[transcription]
model = "small"      # tiny, base, small, medium, large-v3
engine = "whisper"   # or "parakeet", "apple-speech"

[summarization]
engine = "none"      # let the calling agent summarize via MCP
# or: "agent", "ollama", "openai-compatible"

[diarization]
engine = "auto"

No API keys are required. The server reads and writes only local files.

Use Cases

  • Ask an agent "What did I promise Sarah in our last three meetings?" and get a synthesized answer pulled from local transcripts and commitments.
  • Run a daily standup briefing where the agent reads minutes://actions/overdue and minutes://actions/open to surface what slipped.
  • Get live coaching mid-call: call start_live_transcript, have the agent poll read_live_transcript for new segments, and let it suggest follow-ups in real time.
  • Audit consistency across decisions: run consistency_report to find contradictory pricing, scope, or timeline calls made in different meetings.
  • Sync transcripts into an Obsidian vault via vault_sync so meeting notes live alongside personal knowledge management.

Example Prompts

  • "List my meetings from the last 7 days and pull the open action items assigned to me."
  • "Search my meetings for any discussion of the Q2 pricing experiment and summarize the decisions."
  • "Start a live transcript for this call and flag any commitments I make as they happen."
  • "Build a profile for Alex: what topics have we covered, what did I commit to, and what is still open?"
  • "Run a consistency report and tell me where I have made contradicting decisions about hiring this quarter."

Pros

  • Fully local and private: no API keys, no cloud sync, no vendor account required.
  • Plain markdown storage with YAML frontmatter, so the corpus stays readable by grep and any future tool.
  • Broad tool surface (29 tools, 7 resources) covering recording, retrieval, people intelligence, live coaching, and vault sync.
  • Works with multiple transcription backends (Whisper, Parakeet, Apple Speech) and multiple MCP clients out of the box.

Limitations

  • Local-only by design: sharing a meeting corpus across machines or teammates requires manual sync setup.
  • Standard recordings are transcribed as a batch after stop_recording, not streamed; only the start_live_transcript path provides real-time partial results.
  • Speaker diarization accuracy degrades on noisy or heavily overlapping multi-party audio.
  • Single-writer assumption: multiple agents editing the same meeting file simultaneously can conflict.
Alternatives