
🎲 MeepleLM

A Virtual Playtester Simulating Diverse Subjective Experiences in Board Games

MeepleLM Framework Overview

📜 Abstract

Recent advancements have expanded the role of Large Language Models in board games from playing agents to creative co-designers. However, a critical gap remains: current systems lack the capacity to offer constructive critique grounded in the emergent user experience. Bridging this gap is fundamental for harmonizing Human-AI collaboration, as it empowers designers to refine their creations via external perspectives while steering models away from biased or unpredictable outcomes. Automating critique for board games presents two challenges: inferring the latent dynamics connecting rules to gameplay without an explicit engine, and modeling the subjective heterogeneity of diverse player groups. To address these, we curate a dataset of 1,727 structurally corrected rulebooks and 150K reviews selected via quality scoring and facet-aware sampling. We augment this data with Mechanics-Dynamics-Aesthetics (MDA) reasoning to explicitly bridge the causal gap between written rules and player experience. We further distill player personas and introduce MeepleLM, a specialized model that internalizes persona-specific reasoning patterns to accurately simulate the subjective feedback of diverse player archetypes. Experiments demonstrate that MeepleLM significantly outperforms the latest commercial models (e.g., GPT-5.1, Gemini3-Pro) in community alignment and critique quality, achieving a 70% preference rate in user studies assessing utility. MeepleLM serves as a reliable virtual playtester for general interactive systems, marking a pivotal step towards audience-aligned, experience-aware Human-AI collaboration.


📂 File Structure

.
├── assets/                    # Project images and figures
├── data/
│   ├── metadata/              # Meta-info (Game IDs, names, BGG stats, splits)
│   ├── finetuning/            # Alpaca-formatted datasets
│   ├── reviews/               # Filtered review data
│   └── rulebooks/             # Structured Markdown rulebooks
├── checkpoints/               # LoRA adapters for MeepleLM & Ablations
├── training/                  # YAML configurations for LLaMA-Factory
├── inference/                 # Inference scripts (vLLM example)
└── results/                   # Generated critiques


💾 Datasets

We provide the complete pipeline data, from raw sources to instruction-tuning ready files.

  • data/metadata/:

    • game_info.json: Mappings of Game ID to metadata (Name, Rank, Weight, Year).
    • test_games_list.json: The official evaluation split (207 games) used in the paper.
  • data/finetuning/: Ready-to-use Alpaca format datasets for SFT. Each folder contains _train.json and _test.json.

    • MeepleLM/: Full dataset with MDA CoT reasoning chains.
    • wo_MDA/: Ablation without reasoning chains.
    • wo_Persona/: Ablation without persona profiles.
    • wo_Rulebook/: Ablation without rule context.
  • data/rulebooks/: The corpus of 1,727 processed rulebooks in Markdown format.

  • data/reviews/: The filtered high-quality review corpus used to construct the training data.
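
The SFT files above follow the standard Alpaca schema (`instruction` / `input` / `output`). A minimal sketch of validating records after `json.load()`-ing one of the `*_train.json` files (the sample payload below is made up for illustration):

```python
import json

# The Alpaca format used in data/finetuning/ carries three keys per record.
ALPACA_KEYS = {"instruction", "input", "output"}

def well_formed(records):
    """Keep only records that carry all three standard Alpaca fields."""
    return [r for r in records if ALPACA_KEYS <= r.keys()]

# Tiny in-memory example in place of a real *_train.json payload:
sample = json.loads(
    '[{"instruction": "Review this game as a strategy enthusiast.",'
    ' "input": "Rulebook excerpt...", "output": "MDA reasoning..."},'
    ' {"instruction": "incomplete record"}]'
)
print(len(well_formed(sample)))  # only the complete record survives
```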


🤖 Models & Checkpoints

We provide LoRA adapters trained on Qwen3-8B. These can be loaded easily using vLLM.

| Model Variant | Description | Path |
|---|---|---|
| MeepleLM (Ours) | Full model with Persona-conditioning and MDA reasoning. | ./checkpoints/MeepleLM/ |
| w/o MDA | Ablation removing Chain-of-Thought reasoning. | ./checkpoints/wo_MDA/ |
| w/o Persona | Ablation using a generic player prompt. | ./checkpoints/wo_Persona/ |
| w/o Rulebook | Ablation relying solely on internal knowledge. | ./checkpoints/wo_Rulebook/ |

Serving with vLLM

You can serve the model with the LoRA adapter enabled. For example, to serve MeepleLM:

vllm serve Qwen/Qwen3-8B \
    --enable-lora \
    --lora-modules MeepleLM=checkpoints/MeepleLM \
    --served-model-name MeepleLM \
    --port 8000
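
Once the server is up, you can query the adapter through vLLM's OpenAI-compatible chat endpoint. A minimal client sketch using only the standard library; the persona text and prompt contents are placeholders, not the repository's distilled personas:

```python
import json
import urllib.request

API_URL = "http://localhost:8000/v1/chat/completions"

def build_request(persona: str, rulebook: str) -> dict:
    """Assemble a chat-completion payload addressed to the LoRA adapter."""
    return {
        "model": "MeepleLM",  # matches --served-model-name above
        "messages": [
            {"role": "system", "content": persona},
            {"role": "user", "content": rulebook},
        ],
        "temperature": 0.7,
    }

def playtest(persona: str, rulebook: str) -> str:
    """POST the payload to the local vLLM server and return the review text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_request(persona, rulebook)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```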

🚀 Training

All models were trained using the LLaMA-Factory framework. We provide the exact YAML configurations used for our experiments in the training/ directory.

To reproduce the training process:

  1. Install LLaMA-Factory: Please refer to the official LLaMA-Factory repository for installation instructions.
  2. Register Datasets: Add the paths from data/finetuning/ to LLaMA-Factory's data/dataset_info.json.
  3. Run Training:
llamafactory-cli train training/train_meeplelm.yaml

(Note: Config files for ablation studies are also provided in the training/ folder.)
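
Step 2 can also be scripted. The sketch below merges new entries into LLaMA-Factory's data/dataset_info.json, assuming the plain Alpaca entry shape ({"file_name": ...}); the dataset keys and paths are illustrative, not the names used in the paper:

```python
import json
from pathlib import Path

def register(info_path: Path, entries: dict) -> dict:
    """Merge dataset entries into dataset_info.json and write it back."""
    info = json.loads(info_path.read_text(encoding="utf-8")) if info_path.exists() else {}
    info.update(entries)
    info_path.write_text(json.dumps(info, indent=2), encoding="utf-8")
    return info

# Illustrative entries; point file_name at your local data/finetuning/ copies.
new_entries = {
    "meeplelm_train": {"file_name": "/path/to/data/finetuning/MeepleLM/..._train.json"},
}
```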


⚡ Inference

The inference/ directory contains scripts to generate virtual playtest results.

  • playtest_inference.py: A sample script designed to work with the MeepleLM checkpoint served via vLLM. It iterates through the test set games, applying the Persona constraints to generate reviews.
  • results/: Stores the output JSON files generated by the model (e.g., results/inference_meeplelm/).

Note: The provided inference script is configured for the MeepleLM LoRA adapter and local vLLM server. If you wish to evaluate other models or use different API endpoints, please modify the API_URL and MODEL_NAME parameters in the script accordingly.
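
The loop that playtest_inference.py is described as running can be sketched as follows. API_URL and MODEL_NAME mirror the parameters mentioned above, while the shape of test_games_list.json (a flat list of game IDs) and the prompt template are assumptions for illustration:

```python
API_URL = "http://localhost:8000/v1/chat/completions"
MODEL_NAME = "MeepleLM"

def make_prompt(game_id, persona: str) -> str:
    """Combine a persona constraint with a pointer to the game under test."""
    return f"[Persona] {persona}\n[Game] {game_id}\nWrite a playtest review."

def enumerate_jobs(game_ids, personas):
    """One inference job per (game, persona) pair, ready to send to the API."""
    return [(g, p, make_prompt(g, p)) for g in game_ids for p in personas]
```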


📄 Citation

If you use MeepleLM, the rulebook dataset, or the persona taxonomy in your research, please cite our paper:

@article{li2026meeplelm,
  title={MeepleLM: A Virtual Playtester Simulating Diverse Subjective Experiences},
  author={Li, Zizhen and Li, Chuanhao and Wang, Yibin and Feng, Yukang and Sun, Jianwen and Ai, Jiaxin and Zhang, Fanrui and Sun, Mingzhu and Huang, Yifei and Zhang, Kaipeng},
  journal={arXiv preprint arXiv:2601.07251},
  year={2026}
}
