Each skill bundle packages a reusable agent behavior — a prompt, supporting files, and evaluation criteria. Browse the public catalog, review the full source, then install a private copy you can edit and experiment with.
109 published bundles ready to inspect and install
Build evals for specialized verticals (legal, medical, finance, engineering)
Ensure training data and eval data don't overlap
Create evals specifically designed to find failure modes and edge cases
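As an illustration of the train/eval overlap check described above, here is a minimal sketch, not taken from any bundle in the catalog. It assumes both datasets are plain lists of strings and flags only exact duplicates after light normalization. The names `normalize`, `fingerprint`, and `find_overlap` are hypothetical. A real bundle would likely also need near-duplicate detection such as n-gram overlap.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences do not hide a duplicate."""
    return re.sub(r"\s+", " ", text.strip().lower())


def fingerprint(text: str) -> str:
    """Return a stable hash of the normalized text."""
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()


def find_overlap(train: list[str], evals: list[str]) -> list[str]:
    """Return eval items whose normalized text also appears in the training set."""
    train_hashes = {fingerprint(t) for t in train}
    return [e for e in evals if fingerprint(e) in train_hashes]


# Hypothetical sample data, for illustration only.
train_examples = ["What is 2 + 2?", "Summarize the contract clause."]
eval_examples = ["what is  2 + 2?", "Explain the dosage limits."]

leaked = find_overlap(train_examples, eval_examples)
print(leaked)
```

Hashing the normalized text keeps the check linear in the size of both sets, at the cost of missing paraphrased or partially overlapping items.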