ManimBench V0.5 public suite
Open benchmark for ManimCE animation quality
Suite: V0.5 public
Models: 0
Updated: Not yet
Top score: n/a

V0.5 engine release
six refreshed ManimCE tasks

  • OpenRouter generation
  • Cursor Composer route
  • Source checks
  • Parallel renders
  • Immutable manifests
  • Draft/live publish

Leaderboard

# | Model | Score | Visual | Cost | Time | Output tokens | Review

Overall ManimBench score

Higher is better. Automated checks plus optional visual review.

V0.5 is active

The engine now uses the V0.5 suite by default. Generation, rendering, reports, and publishing are wired through a stable orchestrator API.

  • Generation. OpenRouter handles public API models. Composer 2.5 runs through Cursor Agent CLI because OpenRouter does not publish a Composer slug.
  • Safety. Tasks whose outputs are already complete are skipped unless a rerun is forced. Checkpoint state and JSONL generation logs live under .manimbench/runs/.
  • Publishing. Draft and live site updates are built as full bundles, then committed and pushed once.
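
The skip-unless-forced rule and the JSONL log described above can be sketched roughly as follows. This is an illustration only, not the engine's code: the helper names should_generate and log_event, the file name generation.jsonl, and the definition of "complete" as "the file exists and is non-empty" are all assumptions made for the example.

```python
import json
import time
from pathlib import Path


def should_generate(output_file: Path, force: bool = False) -> bool:
    """Return True when a task still needs generation.

    Assumption: "complete" is approximated as "file exists and is non-empty";
    the real engine may apply a stricter check.
    """
    if force:
        return True
    return not (output_file.is_file() and output_file.stat().st_size > 0)


def log_event(run_dir: Path, event: dict) -> None:
    """Append one JSON object per line (JSONL) to a hypothetical log file."""
    run_dir.mkdir(parents=True, exist_ok=True)
    record = {"ts": time.time(), **event}
    with (run_dir / "generation.jsonl").open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
```

Appending one record per line means an interrupted run leaves a log that is still readable up to the last completed write, which is presumably why a line-oriented format suits checkpointed batch jobs.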

Run V0.5

Generate one ManimCE file per task, render the output folder, then publish the report bundle when the run is complete.

  1. python -m pip install -e ".[dev,render]"
  2. manimbench generate-batch --models gpt-5-5,opus-4-8 --smoke
  3. manimbench generate --model composer-2-5 --provider cursor
  4. manimbench run-file-matrix --model-output composer-2-5=outputs/composer-2-5 --parallel 4
  5. manimbench report --run-dir runs/<run_id>

Each output file must define exactly one ManimCE scene class named MainScene. The default suite expects six files: coordinate_system_animation.py, derivative_motion_story.py, matrix_transformation_grid.py, geometric_area_proof.py, probability_distribution_simulation.py, and fourier_series_decomposition.py.
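
Before rendering, it can help to confirm that an output folder meets these requirements. The sketch below is a standalone pre-flight check, not the engine's own source checks: the function names are hypothetical, and it assumes MainScene is declared as a top-level class. It parses each file with the standard ast module, so ManimCE does not need to be installed and no generated code is executed.

```python
import ast
from pathlib import Path

# File names taken from the default suite listed above.
EXPECTED_FILES = [
    "coordinate_system_animation.py",
    "derivative_motion_story.py",
    "matrix_transformation_grid.py",
    "geometric_area_proof.py",
    "probability_distribution_simulation.py",
    "fourier_series_decomposition.py",
]


def defines_main_scene(source: str) -> bool:
    """True if the source has exactly one top-level class named MainScene."""
    tree = ast.parse(source)
    names = [node.name for node in tree.body if isinstance(node, ast.ClassDef)]
    return names.count("MainScene") == 1


def check_output_dir(output_dir: Path) -> dict[str, str]:
    """Map each expected file to 'ok', 'missing', 'syntax error', or 'no MainScene'."""
    results = {}
    for name in EXPECTED_FILES:
        path = output_dir / name
        if not path.is_file():
            results[name] = "missing"
            continue
        try:
            ok = defines_main_scene(path.read_text(encoding="utf-8"))
        except SyntaxError:
            results[name] = "syntax error"
            continue
        results[name] = "ok" if ok else "no MainScene"
    return results
```

For example, check_output_dir(Path("outputs/composer-2-5")) would return one status per expected file, which makes it easy to spot a missing or malformed task before spending time on renders.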