Forge

stdio

GPU kernel optimization: a swarm of 32 agents converts PyTorch code into fast CUDA/Triton kernels, running on real datacenter GPUs, with speedups of up to 14x.

npx -y @rightnow/forge-mcp-server
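
Since the server uses the stdio transport, an MCP client would typically launch the command above as a subprocess. A minimal sketch of a client configuration entry follows, assuming a client that reads the common `mcpServers` JSON layout (as Claude Desktop does). The key name `forge` is an arbitrary label chosen here, and the listing does not say whether the server needs any environment variables or extra arguments, so none are shown.

```json
{
  "mcpServers": {
    "forge": {
      "command": "npx",
      "args": ["-y", "@rightnow/forge-mcp-server"]
    }
  }
}
```

Check the server's own documentation for any required credentials or settings before relying on this entry.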

★ 13 GitHub stars