- 📜 Abstract
- 📂 File Structure
- 💾 Datasets
- 🤖 Models & Checkpoints
- 🚀 Training
- ⚡ Inference & Evaluation
- 📄 Citation
Recent advancements have expanded the role of Large Language Models in board games from playing agents to creative co-designers. However, a critical gap remains: current systems lack the capacity to offer constructive critique grounded in the emergent user experience. Bridging this gap is fundamental for harmonizing Human-AI collaboration, as it empowers designers to refine their creations via external perspectives while steering models away from biased or unpredictable outcomes. Automating critique for board games presents two challenges: inferring the latent dynamics connecting rules to gameplay without an explicit engine, and modeling the subjective heterogeneity of diverse player groups. To address these, we curate a dataset of 1,727 structurally corrected rulebooks and 150K reviews selected via quality scoring and facet-aware sampling. We augment this data with Mechanics-Dynamics-Aesthetics (MDA) reasoning to explicitly bridge the causal gap between written rules and player experience. We further distill player personas and introduce MeepleLM, a specialized model that internalizes persona-specific reasoning patterns to accurately simulate the subjective feedback of diverse player archetypes. Experiments demonstrate that MeepleLM significantly outperforms the latest commercial models (e.g., GPT-5.1, Gemini3-Pro) in community alignment and critique quality, achieving a 70% preference rate in user studies assessing utility. MeepleLM serves as a reliable virtual playtester for general interactive systems, marking a pivotal step towards audience-aligned, experience-aware Human-AI collaboration.
```
.
├── assets/        # Project images and figures
├── data/
│   ├── metadata/    # Meta-info (Game IDs, names, BGG stats, splits)
│   ├── finetuning/  # Alpaca-formatted datasets
│   ├── reviews/     # Filtered review data
│   └── rulebooks/   # Structured Markdown rulebooks
├── checkpoints/   # LoRA adapters for MeepleLM & Ablations
├── training/      # YAML configurations for LLaMA-Factory
├── inference/     # Inference scripts (vLLM example)
└── results/       # Generated critiques
```
We provide the complete pipeline data, from raw sources to instruction-tuning ready files.
- `data/metadata/`: `game_info.json` maps Game IDs to metadata (Name, Rank, Weight, Year); `test_games_list.json` is the official evaluation split (207 games) used in the paper.
- `data/finetuning/`: Ready-to-use Alpaca-format datasets for SFT. Each folder contains `_train.json` and `_test.json` files.
  - `MeepleLM/`: Full dataset with MDA CoT reasoning chains.
  - `wo_MDA/`: Ablation without reasoning chains.
  - `wo_Persona/`: Ablation without persona profiles.
  - `wo_Rulebook/`: Ablation without rule context.
- `data/rulebooks/`: The corpus of 1,727 processed rulebooks in Markdown format.
- `data/reviews/`: The filtered high-quality review corpus used to construct the training data.
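The metadata files above can be joined to recover the evaluation split. A minimal sketch, assuming `game_info.json` maps game-ID strings to metadata dicts and `test_games_list.json` is a JSON list of game IDs (adjust the keys if the released files use a different schema):

```python
import json

def load_test_games(game_info_path, test_list_path):
    """Return the metadata entries for the official evaluation split.

    Assumes game_info.json maps game IDs to metadata dicts and
    test_games_list.json is a list of game IDs; the exact schema of
    the released files may differ.
    """
    with open(game_info_path, encoding="utf-8") as f:
        game_info = json.load(f)
    with open(test_list_path, encoding="utf-8") as f:
        test_ids = json.load(f)
    # Keep only games that appear in both files, preserving split order.
    return {gid: game_info[gid] for gid in test_ids if gid in game_info}
```
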
We provide LoRA adapters trained on Qwen3-8B. These can be loaded easily using vLLM.
| Model Variant | Description | Path |
|---|---|---|
| MeepleLM (Ours) | Full model with Persona-conditioning and MDA reasoning. | ./checkpoints/MeepleLM/ |
| w/o MDA | Ablation removing Chain-of-Thought reasoning. | ./checkpoints/wo_MDA/ |
| w/o Persona | Ablation using a generic player prompt. | ./checkpoints/wo_Persona/ |
| w/o Rulebook | Ablation relying solely on internal knowledge. | ./checkpoints/wo_Rulebook/ |
You can serve the model with the LoRA adapter enabled. For example, to serve MeepleLM:

```bash
vllm serve Qwen/Qwen3-8B \
  --enable-lora \
  --lora-modules MeepleLM=checkpoints/MeepleLM \
  --served-model-name MeepleLM \
  --port 8000
```
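Once the server is running, it exposes an OpenAI-compatible chat-completions endpoint. A minimal query sketch (the prompt wording and `build_payload` helper below are illustrative, not the exact template used by the repository's scripts):

```python
import json
import urllib.request

API_URL = "http://localhost:8000/v1/chat/completions"  # local vLLM server

def build_payload(rulebook_excerpt, persona, model="MeepleLM"):
    """Assemble a chat-completions request for the served adapter.

    The system/user prompt structure here is a placeholder; the paper's
    inference script applies its own persona template.
    """
    return {
        "model": model,  # must match --served-model-name
        "messages": [
            {"role": "system",
             "content": f"You are a board-game playtester with this persona: {persona}"},
            {"role": "user",
             "content": f"Critique the following rulebook:\n{rulebook_excerpt}"},
        ],
        "temperature": 0.7,
    }

def query(payload):
    """POST the payload to the vLLM server and return the reply text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

payload = build_payload("Players take turns placing meeples...",
                        "strategy-focused veteran")
# query(payload) returns the generated critique once the server is up.
```
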
All models were trained using the LLaMA-Factory framework. We provide the exact YAML configurations used for our experiments in the training/ directory.
To reproduce the training process:
- Install LLaMA-Factory: Please refer to the official LLaMA-Factory repository for installation instructions.
- Register Datasets: Add the paths from `data/finetuning/` to LLaMA-Factory's `data/dataset_info.json`.
- Run Training:

  ```bash
  llamafactory-cli train training/train_meeplelm.yaml
  ```
(Note: Config files for ablation studies are also provided in the training/ folder.)
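The dataset registration step might look like the following entry in LLaMA-Factory's `data/dataset_info.json`. The entry name and file name here are illustrative assumptions; match the entry name to the `dataset` field of the YAML config and the file name to the actual `_train.json` file in `data/finetuning/`:

```json
{
  "meeplelm_train": {
    "file_name": "/path/to/data/finetuning/MeepleLM/meeplelm_train.json",
    "formatting": "alpaca"
  }
}
```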
The inference/ directory contains scripts to generate virtual playtest results.
- `playtest_inference.py`: A sample script designed to work with the MeepleLM checkpoint served via vLLM. It iterates through the test-set games, applying the persona constraints to generate reviews.
- `results/`: Stores the output JSON files generated by the model (e.g., `results/inference_meeplelm/`).
Note: The provided inference script is configured for the MeepleLM LoRA adapter and a local vLLM server. If you wish to evaluate other models or use different API endpoints, please modify the `API_URL` and `MODEL_NAME` parameters in the script accordingly.
If you use MeepleLM, the rulebook dataset, or the persona taxonomy in your research, please cite our paper:
@article{li2026meeplelm,
title={MeepleLM: A Virtual Playtester Simulating Diverse Subjective Experiences},
author={Li, Zizhen and Li, Chuanhao and Wang, Yibin and Feng, Yukang and Sun, Jianwen and Ai, Jiaxin and Zhang, Fanrui and Sun, Mingzhu and Huang, Yifei and Zhang, Kaipeng},
journal={arXiv preprint arXiv:2601.07251},
year={2026}
}