Stars
[Roadmap] Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling
[RSS 2026] Code for RISE: Self-Improving Robot Policy with Compositional World Model
OpenGame: Open Agentic Coding for Games
Matrix-Game 3.0: Real-Time and Streaming Interactive World Model with Long-Horizon Memory
Awesome Multimodal Modeling [Covers MLLM, UMM, and NMM]
Unified Codebase for Advanced World Models.
Gen-Searcher: Reinforcing Agentic Search for Image Generation
[CVPR2026 Highlight] Cubic Discrete Diffusion: Discrete Visual Generation on High-Dimensional Representation Tokens https://arxiv.org/abs/2603.19232
JavaScript in-page GUI agent. Control web interfaces with natural language.
[CVPR 2026🔥] Enhancing Spatial Understanding in Image Generation via Reward Modeling
BitDance custom nodes for ComfyUI with unified loader, text encode, sampler, and VAE nodes.
BitDance & UniWeTok: Open-source autoregressive model with binary visual tokens. A research project for building powerful multimodal autoregressive models.
[ICML 2026] Scaling Interactive World Models to 1000-Frame Horizons via Pose-Free Hierarchical Memory
[NeurIPS 2025 DB] OneIG-Bench is a meticulously designed comprehensive benchmark framework for fine-grained evaluation of T2I models across multiple dimensions, including subject-element alignment,…
[ICLR 2026 Oral] DiffusionNFT: Online Diffusion Reinforcement with Forward Process
omo; the best agent harness - previously oh-my-opencode
verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework
[PyTorch] Generative retrieval model using semantic IDs from "Recommender Systems with Generative Retrieval"
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-R1, GLM-5.1, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Gemma4, Llava, …
[NeurIPS 2025] DIPO: Dual-State Images Controlled Articulated Object Generation Powered by Diverse Data
🔥 OneThinker: All-in-one Reasoning Model for Image and Video [CVPR 2026]
[CVPR 2026] An official implementation of Adv-GRPO. The Image as Its Own Reward: Reinforcement Learning with Adversarial Reward for Image Generation.
From Flatland to Space (SPAR). Accepted to NeurIPS 2025 Datasets & Benchmarks. A large-scale dataset & benchmark for 3D spatial perception and reasoning in VLMs.




