# llm-comparison

Here are 38 public repositories matching this topic...

arc53 / llm-price-compass

This project collects GPU benchmarks from various cloud providers and compares them with fixed per-token costs. Use our tool to choose GPUs for LLM inference efficiently and to find cost-effective AI models. Features: LLM provider price comparison, GPU-benchmark-to-price-per-token calculation, and a GPU benchmark table.

benchmark gpu hacktoberfest llm llm-inference llm-price llm-comparison inference-comparison

Updated Dec 16, 2024
TypeScript

woniu9524 / ParallelChat

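ParallelChat is an open-source desktop application for parallel conversations with multiple AIs. It lets you use ChatGPT, Kimi, Qwen, DeepSeek, GLM, Doubao, Yuanbao, Grok, and other mainstream large models side by side in a single interface, with no API key required and completely free.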

chatbot model-comparison no-api-key open-source-ai multi-ai llm-comparison parallel-chat free-ai-tool ai-aggregator

Updated Apr 14, 2026
TypeScript

Imboyeong / Fact-vs-Fiction-LLM-Test-

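An all-in-one learning hub platform based on LLM comparison research | Building on an accuracy analysis of GPT-4o · Gemini 2.0 Flash · Claude 4.5 Sonnet, we propose a platform that helps students select and use the model best suited to their purpose.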

gemini model-selection korean gpt claude ai-education gpt-4 claude-ai student-learning llm-comparison interactive-platform

Updated Dec 16, 2025
Jupyter Notebook

Alorse / cc-compatible-models

Complete guide for using alternative AI models with Claude Code — including DeepSeek, Qwen, MiniMax, Kimi, GLM, MiMo, StepFun, and more. Pricing, configs, and coding plans.

glm minimax model-comparison api-integration ai-models anthropic qwen deepseek llm-pricing llm-comparison claude-code ai-coding-assistant kimi-k2

Updated Apr 20, 2026

Supahands / llm-comparison-backend

This is an open-source project that lets you compare two LLMs head to head on a given prompt. This repository covers the backend, which allows LLM APIs to be incorporated and used in the front end.

ai llm chatgpt llm-eval llm-api llm-comparison

Updated Jan 13, 2026
Python

petmal / MindTrial

MindTrial: Evaluate and compare AI language models (LLMs) on text-based tasks with optional file/image attachments and tool use. Supports multiple providers (OpenAI, Google, Anthropic, DeepSeek, Mistral AI, xAI, Alibaba, Moonshot AI, OpenRouter), custom tasks in YAML, and HTML/CSV/JSON reports.

Updated May 29, 2026
Go

desktop-commander / best-value-ai

Where should you get your AI tokens from — local GPU, pay-per-token API, or flat-fee subscription? Compare 34+ models by quality-adjusted tokens per dollar.

ai price-comparison claude llm chatgpt local-llm ollama llm-comparison token-calculator

Updated May 11, 2026
HTML

jvrck / openrouterlist

OpenRouter model information

github-pages datatables static-site cost-estimation cost-calculator github-actions llm openrouter llm-pricing openai-compatible-api llm-comparison openrouter-api

Updated Jun 11, 2026
HTML

ctala / ai-benchmarks-alternativos

Open Spanish-language benchmark of 141 LLMs (89 with 13K+ real runs and an independent Phi-4 judge). Quality, cost, speed, long-context, and credential leakage as separate dimensions. Alternatives to Claude, GPT, and Gemini for n8n/OpenClaw agents. Interactive calculator with your own weights.

ai-agents startup-tools n8n ai-models emprendedores openrouter ollama llm-evaluation llm-comparison ai-benchmark llm-benchmark spanish-ai openclaw hermes-agent claude-alternatives gpt-alternatives phi4-judge benchmark-en-espanol emprendedores-ia

Updated Jun 10, 2026
Python

attogram / small-models

Comparison of small open-source LLMs (8B parameters or fewer)

llm-evaluation llm-comparison ai-model-comparison llm-compare attogram-project

Updated Aug 30, 2025

sammy995 / Local-LLM-Arena

Privacy-first local AI model comparison platform with blind evaluation, per-model hyperparameters, and multi-configuration testing. Compare 2-6 models side-by-side through Ollama with zero cloud dependencies.

chatbot hyperparameters-tuning ai-evaluation local-ai ollama llm-comparison blind-testing

Updated Apr 4, 2026
JavaScript

bejranonda / openclaw-eval

Comprehensive benchmark of OpenRouter free-tier LLMs for practical applications. Evaluates models for coding, Thai language, and general use.

thai-language thai-nlp gemma model-evaluation openrouter ai-gateway nemotron llm-comparison llm-benchmark free-ai-models openclaw step-flash trinity-mini chatbot-benchmark free-tier-llm

Updated Feb 24, 2026

AkashKobal / aiverse

Open-source multimodal AI comparison platform to send one prompt and compare responses from multiple AI models side-by-side in real time.

java open-source machine-learning ai spring-boot web-application developer-tools ai-platform ai-tools llm prompt-engineering ai-companion llm-comparison multimodal-ai

Updated Feb 20, 2026
HTML

doubleoevan / chatwar

A fight club for LLMs. 🤫 Live demo → https://chatwar.ai

react ai google-maps chartjs http-streaming ndjson fastify vite llm llm-comparison

Updated Feb 24, 2026
TypeScript

lukecarr / litmus

Specification testing for structured LLM responses.

specification-test openrouter llm-testing llm-comparison

Updated Jun 5, 2026
Go

shimafoolad / llm-quality-evaluator

Automated evaluation framework for comparing LLM model versions using multi-metric assessment and Opik integration.

evaluator automated-testing opik llm llm-comparison

Updated Apr 17, 2026
Python

rahatmoktadir03 / llm-evaluation-platform

A full-stack web application for comparing and analyzing the performance of large language models (LLMs). Features include side-by-side prompt evaluation, performance metrics visualization, and an analytics dashboard. Built with React, Tailwind CSS, Node.js, and MongoDB.

react typescript ai fullstack-development tailwind-css large-language-models prompt-engineering llm-evaluation llm-comparison

Updated Jan 6, 2025
TypeScript

nickcarndt / document-summarizer

Compare Claude and OpenAI responses side-by-side with a built-in evaluation framework. Upload PDFs, generate summaries, ask questions using RAG, and track metrics.

typescript nextjs openai claude tailwindcss rag vercel pdf-summarization anthropic drizzle-orm llm-evaluation llm-comparison

Updated Dec 31, 2025
TypeScript

vanderheijden86 / showdown-claude-skill

Claude Code skill that pits Claude, ChatGPT, and Gemini against each other, then lets them cross-judge each other blind

gemini ai-tools chatgpt anthropic-claude llm-comparison claude-code llm-benchmark claude-skill claude-plugin

Updated Feb 11, 2026
Shell

wordenneapolitan768 / llm-pricing

Track standardized pricing for 300+ LLM APIs in one JSON file for cost tools, comparisons, and billing dashboards

Updated Jun 11, 2026
