Combine multiple reward signals (correctness, efficiency, style, safety) into a single scalar.
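As a rough illustration of what combining these signals can look like, here is a minimal sketch that assumes a weighted average, with each signal already normalized to the range [0, 1]. The function name `combine_rewards`, the `DEFAULT_WEIGHTS` values, and the choice of a linear combination are all assumptions made for this example; the bundle itself may use a different scheme.

```python
from typing import Mapping

# Hypothetical weights chosen for illustration only; the source does not specify any.
DEFAULT_WEIGHTS = {
    "correctness": 0.5,
    "efficiency": 0.2,
    "style": 0.1,
    "safety": 0.2,
}


def combine_rewards(
    signals: Mapping[str, float],
    weights: Mapping[str, float] = DEFAULT_WEIGHTS,
) -> float:
    """Return the weighted average of the reward signals named in `weights`.

    Each signal is assumed to lie in [0, 1], so the result does too
    (given non-negative weights).
    """
    missing = set(weights) - set(signals)
    if missing:
        raise KeyError(f"missing reward signals: {sorted(missing)}")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    # Normalize by the total weight so the weights need not sum to exactly 1.
    return sum(weights[name] * signals[name] for name in weights) / total_weight


if __name__ == "__main__":
    scores = {"correctness": 1.0, "efficiency": 0.6, "style": 0.8, "safety": 1.0}
    print(combine_rewards(scores))
```

A weighted average is only one option. If a signal such as safety should act as a hard constraint rather than a trade-off, a gate (for example, returning zero when safety falls below a threshold) may be a better fit than a linear weight.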
This is the published source version. Installing it creates a private draft in your workspace, where you can edit, run experiments, and iterate. Your edits and self-improvement runs do not change the published bundle.