Safari MCP: Native Browser Automation for AI Agents on macOS

Safari MCP is an open-source Model Context Protocol server (MIT license) that gives AI agents 80 native Safari tools on macOS without launching Chrome. In our testing it uses roughly 60% less CPU than Chrome on Apple Silicon, runs commands in ~5ms via a persistent osascript process, operates in the background (no focus stealing), and uses your real Safari sessions (Gmail, GitHub, Slack already logged in). It works with Claude Code, Cursor, Windsurf, and any MCP-compatible client.

Most browser automation tools built on the Model Context Protocol (Playwright, Chrome DevTools, Browserbase) default to Chrome or Chromium. On macOS, that means a separate Chromium process eating 200-400MB of RAM, spinning up your fans, and running a browser you don’t actually use for anything else.

Safari MCP takes a different approach: automate the browser you already have open.

Q2 2026 Update — What Changed for Browser Automation in 2026

Three shifts in Q2 2026 make this server more relevant than it was at launch:

  • AI inference costs dropped 40-60% (Q2 2026) — Claude Haiku 4.5 ($1/1M input, $5/1M output) and GPT-4o-mini make iterative browser-driven workflows (snapshot → reason → action → loop) economically feasible. A 50-step QA loop that cost $0.50 in late 2025 now costs ~$0.20. Sources: Anthropic Pricing, OpenAI Pricing.
  • Native LLM browser features ship (early 2026) — Chrome added an “AI Assistant” sidebar (Gemini); Edge integrates Copilot directly. But both still require the agent to drive the page — and that agent runs locally via the protocol. The Safari-based server slots into the workflow that Comet/Atlas/Edge AI users are converging on, but stays inside the user’s existing session (cookies, logins, extensions intact).
  • Ecosystem matured (Q2 2026) — The protocol spec now has officially-supported servers from Anthropic, GitHub, Cloudflare, and others. Pairing this Safari integration with n8n 1.115’s Memory Tools (April 2026 release — n8n release notes) lets you build “browse → remember → act later” workflows in 30 minutes that previously needed a custom Python script with Playwright.

Bottom line: The 2026 reality is that AI agents drive real browsers, and the cheapest way to do that on macOS is to drive the browser you already trust. The Safari server fits that pattern.

What is MCP?

“MCP is an open protocol that standardizes how applications provide context to LLMs. Think of MCP like a USB-C port for AI applications.” — Anthropic, Introducing the Model Context Protocol

Model Context Protocol (MCP) is the open standard for connecting AI agents to external tools. When your AI coding assistant needs to browse the web, fill a form, or take a screenshot, it calls an MCP server. That server executes the action and returns the result. If you’re new to AI agents, we wrote a practical guide to AI agents for small business covering what they actually do and how to build one.
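
As a rough sketch of that round trip: MCP messages are JSON-RPC 2.0, and a tool invocation uses the `tools/call` method. The tool name `safari_navigate`, its `url` argument, and the stub handler below are made up for illustration; they are not the server's documented API.

```javascript
// Hypothetical tool table; a real server would drive the browser here.
const tools = {
  safari_navigate: ({ url }) => `Navigated to ${url}`,
};

// Build a JSON-RPC 2.0 request for the MCP "tools/call" method.
function buildToolCall(id, name, args) {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Minimal stand-in for the server side: look up the tool, run it,
// and wrap the outcome in an MCP-style result with text content.
function handleRequest(request) {
  const { id, method, params } = request;
  if (method !== "tools/call") {
    return { jsonrpc: "2.0", id, error: { code: -32601, message: "Method not found" } };
  }
  if (!Object.hasOwn(tools, params.name)) {
    return { jsonrpc: "2.0", id, error: { code: -32602, message: `Unknown tool: ${params.name}` } };
  }
  const text = tools[params.name](params.arguments);
  return { jsonrpc: "2.0", id, result: { content: [{ type: "text", text }], isError: false } };
}

const request = buildToolCall(1, "safari_navigate", { url: "https://example.com" });
const response = handleRequest(request);
console.log(JSON.stringify(response.result.content));
```

In practice the client and server exchange these messages over stdio or HTTP; the in-process call here only shows the message shapes.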

“Safari uses WebKit, the open source web rendering engine that Apple developed and uses across its operating systems.” — Apple Safari Developer page

Most browser-automation servers wrap Chrome via DevTools Protocol or Playwright. The Safari approach wraps native Safari via AppleScript and a persistent Swift helper process.

Why Safari Instead of Chrome?

“Safari delivers up to 60% faster JavaScript performance than Chrome on Apple Silicon, while using significantly less memory and power.” — Apple, “Safari is faster”

“AppleScript is a scripting language created by Apple Inc. that facilitates automated control over scriptable applications on the macOS operating system.” — Apple Developer, AppleScript Overview

The practical differences matter for daily use:

Performance

Safari MCP runs commands in ~5ms through a persistent osascript process, with no browser startup and no DevTools Protocol overhead. On Apple Silicon, in our testing, Safari’s native WebKit engine uses roughly 60% less CPU than Chrome for the same pages, which translates directly to less heat and longer battery life.

Your Real Browser

Chrome automation tools launch a fresh browser profile. That means no logins, no cookies, no sessions. Every test starts from scratch.

Safari MCP uses your actual Safari: Gmail, GitHub, Slack, Ahrefs, whatever you’re logged into. Your AI agent browses as you, with your sessions intact.

No Focus Stealing

Chrome DevTools MCP and Chrome extensions steal window focus on every tool call. You’re typing in your editor, and Chrome jumps to the foreground. This is a known pain point for developers using AI browser tools.

The Safari approach operates entirely in the background. Safari never comes to the foreground unless you explicitly ask it to.

Safari MCP by the Numbers

These are the performance and adoption figures behind the server. Benchmark figures come from our own testing on Apple Silicon (M-series Macs); adoption figures are current as of May 2026:

  • Native Safari tools: 80 — navigation, forms, screenshots, network, storage, accessibility
  • CPU usage on Apple Silicon: in our testing, ~40% baseline vs Chrome’s ~100% — roughly 60% less
  • Command latency: ~5ms in our runs via persistent osascript, against ~50-200ms typical for Playwright and ~30-150ms for Chrome DevTools MCP
  • RAM not consumed: 200-400MB — the separate Chromium process that Chrome-based tools require
  • Browser automation tasks covered: ~95% in our experience — the remaining ~5% (Lighthouse, cross-browser) still need Chrome
  • Focus interruptions: 0 — the server runs entirely in the background, by design
  • Iterative AI cost: in our QA runs, a 50-step browser loop ≈ $0.20 with Claude Haiku 4.5
  • License: $0 — MIT open-source, with no API keys and no usage limits
  • Adoption: 67 GitHub stars and 3,700+ npm downloads per month (May 2026)
  • Setup footprint: 1 JSON config line plus 2 macOS toggles, across 5 supported MCP clients

Quick Setup

Add Safari MCP to your MCP client config:

{
  "mcpServers": {
    "safari": {
      "command": "npx",
      "args": ["-y", "safari-mcp"]
    }
  }
}

Or use the one-click install buttons for VS Code and Cursor.

Then enable two macOS permissions:

  1. Safari → Develop menu → Allow JavaScript from Apple Events
  2. Grant Automation → Safari to your IDE in System Settings

That’s it. Your AI agent can now browse Safari.
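
For context on what the first toggle permits: Safari's AppleScript dictionary includes a `do JavaScript` command, and Safari refuses it unless Allow JavaScript from Apple Events is on. The sketch below only builds such a script string. `buildDoJavaScript` and `escapeForAppleScript` are hypothetical helpers, and whether Safari MCP composes its scripts this way is an assumption.

```javascript
// Escape a string so it can sit inside an AppleScript double-quoted literal.
function escapeForAppleScript(text) {
  return text.replace(/\\/g, "\\\\").replace(/"/g, '\\"');
}

// Compose an AppleScript one-liner that evaluates `js` in Safari's first document.
function buildDoJavaScript(js) {
  return `tell application "Safari" to do JavaScript "${escapeForAppleScript(js)}" in document 1`;
}

const script = buildDoJavaScript('document.querySelector("h1").textContent');
console.log(script);
// On macOS, a string like this can be handed to the osascript command to run it.
```

Escaping backslashes before quotes matters: doing it the other way round would double the backslashes that the quote escaping just added.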

What Can It Do? (80 Tools)

The server provides 80 tools organized by category:

Navigation & Tabs — Open URLs, manage tabs, go back/forward, reload. Tab operations are safe — the AI agent tracks which tabs it opened and never touches yours.
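
One way to picture the tab-safety rule is an ownership registry: the agent records every tab it opens and refuses to act on anything else. This is a minimal sketch under that assumption; `TabRegistry` and its methods are hypothetical names, not the server's internals.

```javascript
// Tracks which tabs the agent opened so it can refuse to touch any others.
class TabRegistry {
  #owned = new Set();

  // Record a tab the agent itself opened (tabId is whatever handle the browser returns).
  register(tabId) {
    this.#owned.add(tabId);
  }

  owns(tabId) {
    return this.#owned.has(tabId);
  }

  // Run `closeFn` only for agent-owned tabs; the user's tabs are rejected.
  close(tabId, closeFn) {
    if (!this.#owned.has(tabId)) {
      throw new Error(`Refusing to close tab ${tabId}: not opened by the agent`);
    }
    closeFn(tabId);
    this.#owned.delete(tabId);
  }
}

const registry = new TabRegistry();
registry.register(7);
const closed = [];
registry.close(7, (id) => closed.push(id));
console.log(closed, registry.owns(7));
```

Failing loudly on an unowned tab, rather than silently skipping it, makes a misbehaving agent easy to spot in logs.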

Page Interaction — Click elements, fill forms, type text, hover, drag, scroll. Includes native CGEvent-based clicking for sites that require trusted events.
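
The reason native clicks matter is the `isTrusted` flag on DOM events: an event constructed by script always reports `false`, and some sites ignore such events. The guard function below is a hypothetical example of a page-side check, not code from any particular site.

```javascript
// Events created by script always report isTrusted === false;
// only events generated by real user or OS-level input are trusted.
const synthetic = new Event("click", { bubbles: true, cancelable: true });

// A guard like the ones some sites use to ignore scripted clicks.
function acceptClick(event) {
  return event.isTrusted === true;
}

console.log(synthetic.isTrusted, acceptClick(synthetic)); // false false
```

A click posted at the OS level reaches the page as ordinary user input, which is why it passes checks like this one.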

Data Extraction — Read page content, extract links, tables, images, metadata. Get computed styles, accessibility snapshots, structured data.

Screenshots & Visual — Capture full-page or element-specific screenshots. Device emulation for responsive testing.

Network & Performance — Monitor network requests, mock API responses, throttle connections, measure page load performance.

Storage & State — Read and write cookies, localStorage, sessionStorage, IndexedDB. Export and import storage snapshots.
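
A storage snapshot can be thought of as a plain key/value copy taken through the standard Web Storage methods (`length`, `key`, `getItem`, `setItem`, `clear`). The sketch below assumes that shape; `exportStorage`, `importStorage`, and the in-memory `MemoryStorage` stand-in are hypothetical and exist only so the example runs outside a browser.

```javascript
// Copy every key/value pair out of a Storage-like object (localStorage, sessionStorage).
function exportStorage(storage) {
  const snapshot = {};
  for (let i = 0; i < storage.length; i++) {
    const key = storage.key(i);
    snapshot[key] = storage.getItem(key);
  }
  return snapshot;
}

// Write a snapshot back, replacing whatever the storage currently holds.
function importStorage(storage, snapshot) {
  storage.clear();
  for (const [key, value] of Object.entries(snapshot)) {
    storage.setItem(key, value);
  }
}

// In-memory stand-in for the Web Storage interface, for running under Node.
class MemoryStorage {
  #map = new Map();
  get length() { return this.#map.size; }
  key(index) { return [...this.#map.keys()][index] ?? null; }
  getItem(key) { return this.#map.has(key) ? this.#map.get(key) : null; }
  setItem(key, value) { this.#map.set(key, String(value)); }
  clear() { this.#map.clear(); }
}

const source = new MemoryStorage();
source.setItem("theme", "dark");
source.setItem("token", "abc123");

// Round-trip through JSON, as a snapshot saved to disk would be.
const target = new MemoryStorage();
importStorage(target, JSON.parse(JSON.stringify(exportStorage(source))));
console.log(target.getItem("theme"), target.length);
```

In a page context the same two functions would take `window.localStorage` or `window.sessionStorage` directly.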

Accessibility — Full accessibility tree snapshots for testing screen reader compatibility and WCAG compliance.

Framework Support

Modern web apps use frameworks that intercept standard DOM events. Setting .value on a React input doesn’t trigger state updates — the framework’s internal state tracker ignores it.

Safari MCP solves this with framework-aware form filling:

  • React — Uses _valueTracker to reset React’s change detection before dispatching events
  • Vue.js — Triggers input events that v-model listens for
  • Angular — Dispatches both input and change events for reactive forms
  • ProseMirror / Draft.js / Slate / Lexical — Uses execCommand('insertText') + OS-level clipboard paste for rich text editors. The deep-dive on why MCP tools silently fail on rich-text editors explains the isTrusted security boundary that trips up LinkedIn, Notion, and Google Docs automation.
  • Shadow DOM — Reaches into closed shadow roots that standard automation can’t access
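
To make the React case concrete, here is a minimal sketch of the commonly used technique: write the value through the prototype's native setter, reset the tracker, then dispatch the events frameworks listen for. `fillControlledInput` is a hypothetical helper, `_valueTracker` is a React internal that may change between versions, and this is not necessarily how Safari MCP implements it.

```javascript
// Set a value on an input in a way React's change detection will notice.
function fillControlledInput(input, value) {
  const previous = input.value;

  // Call the prototype's value setter directly, bypassing any
  // instance-level override the framework installed on this node.
  const proto = Object.getPrototypeOf(input);
  const descriptor = Object.getOwnPropertyDescriptor(proto, "value");
  if (descriptor && descriptor.set) {
    descriptor.set.call(input, value);
  } else {
    input.value = value;
  }

  // React internal (not public API): point the tracker back at the old
  // value so the upcoming event is seen as a real change.
  const tracker = input._valueTracker;
  if (tracker) tracker.setValue(previous);

  // Fire the events that React, Vue's v-model, and Angular forms listen for.
  input.dispatchEvent(new Event("input", { bubbles: true }));
  input.dispatchEvent(new Event("change", { bubbles: true }));
}
```

In a page, this would be called with a real element, for example the result of `document.querySelector` on an input. Dispatching both `input` and `change` covers frameworks that listen for either one.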

Safari MCP vs Playwright vs Chrome DevTools MCP

| Feature | Safari MCP | Playwright | Chrome DevTools MCP |
|---|---|---|---|
| Browser engine | Native WebKit (Safari) | Chromium/Firefox/WebKit | Chromium |
| CPU usage on Apple Silicon | ~40% baseline | ~100% baseline (Chrome) | ~100% baseline (Chrome) |
| Real user sessions (cookies/logins) | ✅ Yes, your actual Safari | ❌ Fresh profile | ❌ Fresh profile |
| Focus stealing | ❌ Never (background) | ⚠️ Sometimes | ⚠️ Frequently |
| Command latency | ~5ms (persistent osascript) | ~50-200ms | ~30-150ms |
| Setup complexity | One JSON line + 2 toggles | npm install + browsers | Chrome flags + DevTools |
| License | MIT (open-source) | Apache 2.0 (Microsoft) | MIT (Google) |
| Tool count | 80 native tools | ~40 built-in | ~30 built-in |
| Lighthouse audits | ❌ Not supported | ⚠️ Via API | ✅ Native |
| Cross-browser testing | ❌ Safari only | ✅ Chrome/Firefox/WebKit | ❌ Chrome only |
| Rich-text editors (Lexical/ProseMirror) | ✅ Native paths (v2.9.4+) | ⚠️ Synthetic events fail | ⚠️ Synthetic events fail |
| Shadow DOM access | ✅ Includes closed roots | ⚠️ Open only | ⚠️ Open only |
| Network mocking | ✅ Yes | ✅ Yes | ✅ Yes |
| Best for | Daily AI agent browsing on macOS | Cross-browser CI/CD | Chrome perf diagnostics |

When to Use Chrome Instead

The Safari server handles 95% of browser automation tasks. For the remaining 5%, Chrome DevTools MCP is better:

  • Lighthouse audits — Chrome-only feature
  • Chrome-specific DevTools — Performance traces, Memory snapshots, Coverage reports
  • Cross-browser testing — When you specifically need Chrome rendering behavior

The two servers complement each other. Use Safari MCP for daily browsing and testing, Chrome DevTools MCP for Chrome-specific diagnostics. For a broader look at how we use automation in production, see our business automation guide, our n8n self-hosted setup guide for the orchestration layer, and our deep-dive on MCP browser automation for rich text editors. For comparing automation platforms see n8n vs Make vs Zapier. For the strategic framing of where browser-driving AI fits in a small-business stack overall, see our AI in business overview.

Open Source

Safari MCP is MIT licensed and open source. The entire codebase is two JavaScript files — safari.js for the automation layer and index.js for the MCP server.

As of May 2026 the project has 67 GitHub stars and 3,700+ npm downloads per month, with community contributions including TypeScript declarations, bug fixes, and cross-project collaboration with other macOS MCP tool authors.


Achiya Cohen

Business Automation Expert · Building bots since 2023

Built 50+ automation systems for businesses — WhatsApp bots, CRM integrations, and automated workflows that save hours of work every day. Specializing in n8n, Make, and WhatsApp Business API.


Frequently Asked Questions

What is Safari MCP?
It is an open-source Model Context Protocol server that gives AI agents like Claude Code, Cursor, and Windsurf native control over Safari on macOS. The server provides 80 browser automation tools — navigation, form filling, screenshots, network monitoring, and more — without requiring Chrome, Puppeteer, or Playwright.
How does it compare to Playwright or Chrome DevTools?
The Safari-based server uses native AppleScript and WebKit instead of Chrome DevTools Protocol. This means 60% less CPU usage on Apple Silicon, no separate browser instance, and access to your existing Safari logins and sessions. The Chrome alternatives require launching Chrome, which consumes significantly more resources.
Does it work with Claude Code?
Yes. The server works with any compatible client, including Claude Code (CLI and VS Code extension), Cursor, Windsurf, Claude Desktop, and VS Code with Continue. Setup is a single config entry: add the server to your client's MCP settings, then enable the two macOS toggles described under Quick Setup.
Is it free?
Yes. The project is fully open-source under the MIT license. It is free to use, modify, and distribute. No API keys, no subscriptions, no usage limits.
What macOS version does it require?
The server works on any macOS version that includes Safari with AppleScript support. It runs natively on both Apple Silicon (M1/M2/M3/M4) and Intel Macs. Apple Silicon benefits from significantly lower power consumption compared to running Chrome.
Can it fill forms on React and Angular sites?
Yes. The form-filling implementation works with React controlled inputs, Vue.js v-model bindings, Angular forms, and Svelte — using native value setters that trigger the correct framework events. It also supports Shadow DOM, ProseMirror editors, and Draft.js.