BIG-bench

Evaluation · infrastructure · open · #615 of 884 · +49 · Rising

66.0

Low

High confidence

Beyond the Imitation Game collaborative benchmark for measuring and extrapolating the capabilities of language models

Pillar Breakdown

Adoption (35%): 73.3
Maintenance (30%): 57.9
Friction (20%): 94.0
Ecosystem (15%): 44.4

Momentum

0.53 Rising
7d change -0.24
High confidence

In Evaluation

Ranked #31 of 57

Similar Tools