Anthropic is launching a new program to fund the creation of benchmarks for better assessing AI model performance and its impact. In its blog post, Anthropic stated that it will offer grants to third-party organisations developing improved methods for evaluating advanced AI model capabilities.
Urging the AI research community to develop more rigorous benchmarks that address societal and security implications, Anthropic advocated for revising existing methodologies through new tools, infrastructure, and methods. Highlighting its aim to develop an early warning system to identify and assess risks, the company specifically called for tests that evaluate a model's ability to conduct cyberattacks, enhance weapons of mass destruction, and manipulate or deceive individuals.
Anthropic also aims for the new program to support research into benchmarks and tasks that explore AI's potential in scientific study, multilingual communication, bias mitigation, and self-censorship of toxicity. In addition to grants, researchers will have the chance to consult with the company's domain experts. The company also expressed interest in potentially investing in or acquiring the most promising projects, offering various 'funding options tailored to the needs and stage of each project'.
A benchmark is a procedure for evaluating the quality of an AI system. The evaluation is typically a fixed process for assessing the capability of an AI model, usually in a single area, whereas models like Anthropic's Claude and OpenAI's ChatGPT are designed to perform a host of tasks. Developing robust and reliable model evaluations is therefore complex and riddled with challenges. Anthropic's initiative to support new AI benchmarks is commendable, with the stated objective that the program serve as a catalyst for progress toward a future where comprehensive AI evaluation is an industry standard. However, given the company's own commercial interests, the initiative may raise trust concerns.
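To illustrate why single-area benchmarks fall short for general-purpose models, the following is a minimal sketch of what such a fixed evaluation looks like in practice: a repeatable harness that scores a model's answers on one narrow task using exact-match accuracy. The dataset and the `ask_model` callable are hypothetical placeholders for illustration only; they are not drawn from any real benchmark or from Anthropic's program.

```python
# A minimal sketch of a fixed, single-task benchmark harness (hypothetical).
from typing import Callable, List, Tuple

def exact_match_accuracy(
    examples: List[Tuple[str, str]],
    ask_model: Callable[[str], str],
) -> float:
    """Return the fraction of prompts whose answer exactly matches the reference."""
    correct = 0
    for prompt, reference in examples:
        prediction = ask_model(prompt).strip().lower()
        if prediction == reference.strip().lower():
            correct += 1
    return correct / len(examples) if examples else 0.0

if __name__ == "__main__":
    # Toy reference set covering one narrow capability (arithmetic).
    toy_examples = [("What is 2 + 2?", "4"), ("What is 3 * 5?", "15")]
    # Stand-in for a real model call, purely for demonstration.
    dummy_model = lambda prompt: "4" if "2 + 2" in prompt else "15"
    print(f"accuracy = {exact_match_accuracy(toy_examples, dummy_model):.2f}")
```

Because each harness of this kind measures only one capability, evaluating a general-purpose model requires assembling and maintaining many such tests, which is part of the complexity the program seeks to address.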