The AI that never phones home.

Halfmoon is full-featured AI chat that runs entirely in your browser. Open-source models on your own device — no cloud, no account, nothing to leak.

Start chatting

Free · No sign-up · Works offline

0

servers involved

160+

open models to choose from

100%

runs on your device

Why Halfmoon

Privacy you don't have to take on faith.

Private by physics

Not by promise. The model runs on your GPU, inside your browser. Your words never cross the network, so there's nothing to intercept, store, or subpoena.

Offline is a feature

Add it to your home screen like an app. Once a model is downloaded, Halfmoon works on a plane, in a tunnel, off the grid.

Bring any mind

Llama, Qwen, Gemma, Phi, Mistral — 160+ open models from phone-sized to desktop-class, with full control over system prompt and sampling.

How it works

Three steps. Zero accounts.

Pick a model

Small ones fly on phones; big ones think harder on desktops. Swap anytime.

Wait once

The model downloads to your browser's cache — a first-time-only wait.

Chat forever

Streaming replies, code, markdown, history. All local, all free.

Questions

The fine print, unfined.

Is it really free?

Yes. Your device does the computing, so there's nothing for us to meter. No subscription, no token limits, no ads.

How is this private, exactly?

Halfmoon uses WebGPU to run the language model inside your browser tab. Prompts and responses are processed on your hardware and stored only in your browser. There is no backend — the page is just files.

What do I need?

A browser with WebGPU — Chrome or Edge 113+, Safari 26+ (including iPhones on iOS 26+), or Firefox 141+ on Windows — and 1–2 GB of free space for a model. Recent phones handle the small models comfortably.

Do I have to install anything?

No. It's a web page. But if you tap “Add to Home Screen,” it behaves like a native app — full-screen, with an icon, working offline.

Chat like nobody's watching.

Because nobody is.

Open Halfmoon