AI Governance 101

Advancing AI Safety: Credo AI Supports MLCommons to Set New v0.5 Benchmark

Credo AI is thrilled to support the MLCommons AI Safety working group's release of its v0.5 AI Safety benchmark proof-of-concept.

April 16, 2024
Ian Eisenberg

This milestone represents a significant step toward standardized AI safety benchmarks for a range of use cases, helping to guide responsible AI development across the industry.

We strongly believe in the necessity of improved evaluations and benchmarks for AI systems. While evaluations are critical throughout the AI development lifecycle, industry-wide benchmarks play a special role in establishing trust, driving innovation, and improving communication across a complex supply chain. Many of the most widely used benchmarks today focus on aspects of performance rather than safety, and they are rapidly becoming stale as AI capabilities advance. We clearly need new benchmarks focused on the risks of generative AI models.

The MLCommons AI Safety Benchmark

The MLCommons AI Safety benchmark contributes to this critical mission by establishing a broad-based benchmark focused on hazards caused by LLMs. The MLCommons AI Safety v0.5 proof-of-concept, built by an open consortium of industry, academia, and civil society experts, demonstrates the technical progress made to date on the MLCommons AI Safety benchmarking effort. It is being shared with the community to validate the approach and gather feedback for a comprehensive v1.0 release, planned for later this year. Read more about the benchmark at the MLCommons announcement.

By collaborating with MLCommons and contributing its expertise in AI governance and risk management, Credo AI aims to support the development of robust, credible, and practical AI safety benchmarks. These benchmarks will become a vital component of an overall approach to AI safety, aligning with responsible development and risk-based policy frameworks, such as the voluntary commitments made by companies to the White House, NIST's AI Risk Management Framework, and the EU AI Act.

Connecting Evaluations with Governance

AI governance is an oversight and coordination function deeply intertwined with the measurement of AI system behavior. Evaluations are critical inputs to governance, but their effective use is often hindered by a lack of direction and context. Benchmarks, like the MLCommons AI Safety Benchmark, provide a reference point and comparison, increasing the utility of evaluations and serving as a starting point for marketplace-wide standards. They incentivize improvements in alignment with the benchmark's focus areas, enabling organizations to manage AI systems responsibly.
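To make this concrete, the idea of using a benchmark as a governance reference point can be sketched in a few lines of code. The hazard categories, grade scale, and function names below are purely illustrative assumptions for this sketch, not the official MLCommons v0.5 taxonomy or any Credo AI API:

```python
# Hypothetical sketch: using benchmark-style safety grades as a
# deployment gate. Categories and grades here are illustrative only.

GRADE_ORDER = ["low", "moderate", "high"]  # ascending risk


def deployment_gate(results: dict, max_grade: str = "moderate") -> bool:
    """Return True if every hazard category's risk grade is at or
    below the organization's chosen threshold (max_grade)."""
    limit = GRADE_ORDER.index(max_grade)
    return all(GRADE_ORDER.index(grade) <= limit for grade in results.values())


# Illustrative results for a candidate model under evaluation
candidate = {"violent_content": "low", "hateful_content": "moderate"}
print(deployment_gate(candidate))  # no category exceeds "moderate", so True
```

The point of the sketch is the pattern, not the specifics: a shared benchmark turns raw evaluation scores into a common scale that governance policies can reference consistently across teams and vendors.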

Credo AI's Governance Platform complements the MLCommons AI Safety Benchmark by providing companies with the scalable oversight and coordination needed to incorporate these benchmarks into their overall AI strategy. As the evaluation and benchmarking landscape evolves, Credo AI's Policy Intelligence helps organizations ensure that they are directly benefiting from these tools, using them where and when they are needed to support responsible AI development and deployment.

Credo AI remains committed to collaborating with MLCommons and other industry partners to advance the science of AI safety. By supporting the development of standardized benchmarks like the MLCommons AI Safety Benchmark and providing complementary governance solutions, Credo AI aims to ensure that AI technologies are developed and deployed in a secure, fair, and transparent manner. We encourage the AI community to join the MLCommons AI Safety working group and provide feedback on their approach, contributing to the collective effort to shape the future of responsible AI.

  • 📚 Learn AI governance with our AI Governance Academy!
  • 💌 Subscribe to our monthly newsletter to keep up to date with the latest advancements in the AI industry.
  • ☎️ Ready to take the next step in your AI governance journey? Talk to our expert team.

DISCLAIMER. The information we provide here is for informational purposes only and is not intended in any way to represent legal advice or a legal opinion that you can rely on. It is your sole responsibility to consult an attorney to resolve any legal issues related to this information.