Each skill bundle packages a reusable agent behavior — a prompt, supporting files, and evaluation criteria. Browse the public catalog, review the full source, then install a private copy you can edit and experiment with.
109 published bundles ready to inspect and install
Systematically find ways an agent could game the reward function
Maintain and switch between agent versions when new RL training degrades performance
Monitor deployed RL-trained agents for performance drift, reward hacking in the wild, and distribution shift