See how your prompt performs across LLMs — side by side, in seconds.
promptside runs the same prompt across multiple language models and shows you a beautiful side-by-side comparison of outputs, token usage, latency, and cost. Built for the workflow every AI dev now has: "a new model dropped — did my prompts regress?"
npx promptside "Explain transformers to a 10-year-old" \
--models claude-opus-4-7,gpt-5,gemini-2.5-flashOutputs a side-by-side terminal view and a self-contained HTML report you can share.
Every time a new frontier model ships, you want to know:
- Does my prompt still work?
- Which model gives the best answer for my use case?
- What's the cost/latency tradeoff?
Existing tools (Promptfoo, Braintrust, etc.) are powerful but heavy — config files, eval frameworks, dashboards, signups. promptside is the opposite: one command, no signup, instant visual diff.
```bash
npm install -g promptside
# or run directly
npx promptside
```

Then run a prompt across models:

```bash
promptside "Write a haiku about debugging" \
  --models claude-opus-4-7,gpt-5,gemini-2.5-flash
```

Or create a `.prompt.md` file (see `examples/` for more):
```markdown
---
models:
  - anthropic:claude-opus-4-7
  - openai:gpt-5
  - google:gemini-2.5-flash
max_tokens: 64
---
Write a haiku about debugging.
```

Then:
```bash
promptside run examples/demo-haiku.prompt.md
```

Re-run automatically on file save:

```bash
promptside run myprompt.prompt.md --watch
```

To generate a shareable HTML report:

```bash
promptside "your prompt" --models claude-opus-4-7,gpt-5 --html report.html
open report.html
```

Set API keys in your environment:
```bash
export ANTHROPIC_API_KEY=...
export OPENAI_API_KEY=...
export GOOGLE_API_KEY=...
```

promptside only calls the providers you actually use.
Note: Gemini's free tier has aggressive rate limits and may return 503/429 errors during peak demand. If you hit this, wait a few minutes or switch to a paid API key at aistudio.google.com/apikey.
Each run captures, per model:
- Full output text
- Input / output tokens
- Latency (ms)
- Cost (USD)
- Character-level diff against the other models' outputs
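If it helps to picture the data, a per-model record along these lines would cover everything above. This is a hedged sketch; the field names are illustrative, not promptside's actual schema:

```ts
// Illustrative shape only; not promptside's actual types.
interface ModelRunResult {
  model: string;                 // e.g. "anthropic:claude-opus-4-7"
  output: string;                // full output text
  inputTokens: number;
  outputTokens: number;
  latencyMs: number;
  costUsd: number;
  diffs: Record<string, string>; // char-level diff vs. each other model's output
}

// Cost is derived from token counts and per-million-token prices, roughly like this
// (the exact rates depend on the provider and model):
function estimateCostUsd(
  r: { inputTokens: number; outputTokens: number },
  pricePerMTokIn: number,
  pricePerMTokOut: number,
): number {
  return (
    (r.inputTokens / 1_000_000) * pricePerMTokIn +
    (r.outputTokens / 1_000_000) * pricePerMTokOut
  );
}
```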
| | promptside | Promptfoo | Braintrust |
|---|---|---|---|
| Setup time | 30 seconds | ~10 min | Signup required |
| Config | Optional `.prompt.md` | YAML eval files | Cloud dashboard |
| Local-first | ✅ | ✅ | ❌ |
| Visual diff | ✅ | ❌ | Partial |
| Eval framework | ❌ (by design) | ✅ | ✅ |
| Best for | Quick prompt comparisons | Full eval pipelines | Team prompt management |
promptside is the tool you reach for when a model drops and you want to know in 30 seconds whether your prompts still work. For full eval pipelines, use Promptfoo. For team workflows, use Braintrust.
- Anthropic, OpenAI, Google adapters
- Terminal + HTML renderers
- `.prompt.md` files with frontmatter
- Watch mode
- Local model support (Ollama)
- Streaming output
- Variable substitution in prompt files
- CI mode (exit code on regression)
PRs welcome. Adapter contributions are especially appreciated; see `src/adapters/` for the pattern.
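As a rough sketch only (not the actual interface in `src/adapters/`; every name, URL, and env var below is hypothetical), an adapter boils down to: take a model id and a prompt, call the provider, and return text plus token counts and latency:

```ts
// Hypothetical adapter sketch; check src/adapters/ for the real pattern.
interface AdapterResult {
  text: string;
  inputTokens: number;
  outputTokens: number;
  latencyMs: number;
}

interface Adapter {
  provider: string; // prefix used in --models, e.g. "anthropic"
  complete(model: string, prompt: string, maxTokens: number): Promise<AdapterResult>;
}

// Example against a generic OpenAI-compatible chat endpoint (URL and env var are made up).
const exampleAdapter: Adapter = {
  provider: "example",
  async complete(model, prompt, maxTokens) {
    const start = Date.now();
    const res = await fetch("https://api.example.com/v1/chat/completions", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${process.env.EXAMPLE_API_KEY}`,
      },
      body: JSON.stringify({
        model,
        max_tokens: maxTokens,
        messages: [{ role: "user", content: prompt }],
      }),
    });
    const data: any = await res.json();
    return {
      text: data.choices[0].message.content,
      inputTokens: data.usage.prompt_tokens,
      outputTokens: data.usage.completion_tokens,
      latencyMs: Date.now() - start,
    };
  },
};
```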
License: MIT
Built by @lucalouren. If promptside saves you time, a star helps a lot. ⭐
