Credo AI’s Reflections on How AI Systems Behave, and Who Should Decide

How should AI systems behave, and who should decide?

February 27, 2023
Ian Eisenberg

How should AI systems behave, and who should decide? OpenAI posed these critical questions in a recent post outlining their future strategy. At Credo AI, our focus is AI Governance, a field concerned with these same questions! Given the importance of increasingly general AI models, including “Generative AI” systems and “Foundation Models”, we believe it is important to communicate our thoughts on these weighty questions.

Key Takeaways:

  • The safety and trust of powerful AI systems is enhanced by redundant and independent safeguards.
  • The specific balance of personal freedom vs. societal safeguards should be contextual for AI systems, as it is for other aspects of society.
  • Societal bounds on transformative technologies should be defined by a diversity of actors, who create regulations, standards and soft-laws relevant for individual domains.
  • Leading AI organizations should support the rapid iteration and development of good societal bounds by sharing knowledge about AI system uses and misuses.

How should AI systems behave, and who should decide?

OpenAI’s answer to the large questions of how AI systems should behave and who should decide is that individuals should have the power to reflect their own values in AI systems within broad societal bounds. Of course, the details of what those bounds are and how exactly individuals are empowered to customize the AI systems are critical in evaluating this approach. OpenAI discusses soliciting public input on behavioral bounds and hints that users will have more control over the safety filters. The latter will likely entail a greater ability to turn off or customize safety filters currently in use for models like ChatGPT.

We are eager to see how these details develop, just as we are following Anthropic’s constitutional AI work, DeepMind’s work on AI alignment, the community-building efforts and transparency work of Hugging Face, and so many other developments. However, we also believe that a comprehensive answer to these questions requires developments that extend beyond the agency of for-profit AI companies alone. Academic initiatives like Stanford HAI, nonprofits like GovAI and the Center for AI Safety, and the growing ecosystem of invested AI stakeholders, including governments, policymakers, standard-setting bodies, industry-specific experts and governance-focused companies, all have roles to play. In the remainder of this piece, we articulate a vision of collaboration that promotes the responsible adoption and deployment of these new technologies.

Contextualized societal bounds

At the highest level, the idea that AI systems must respect a balance between individual control and broader societal bounds is a straightforward extension of how we treat any action in a society that respects multiple human values. Like any action that affects others, AI systems should balance personal freedom and societally beneficial safeguards. Given the generality of modern AI systems, it’s impossible to describe a single criterion that determines when individual freedom should be curtailed. Instead, these criteria must be contextual and respect the circumstances in which the system is employed.

Societally-defined boundaries and values

Given that safeguards should be contextual, how do we define a set of rules that apply to a specific context? Underlying this question is a more fundamental one: how do we categorize contexts in a pragmatic way that supports standardized boundaries without ignoring important context-specific considerations? Thankfully, these are not new questions and we don’t have to reinvent the wheel for AI! Instead, we can take advantage of existing taxonomies which divide human endeavors into relevant “contexts” (e.g., based on industry, risk-level, etc.). 

Some institutions will define general AI standards which become part of AI developers’ cultural milieu and the basis for soft law. Deciding on these broad standards is a current focus of many organizations, and leading AI companies should consult with the broader public when defining their own strategies. This is a good first step. Societally-defined bounds, however, can and should be adapted to more specific use-cases. For instance, decisions about how we allocate critical resources like jobs or financial loans are already subject to many regulations, which differ from the safeguards on advertising. The challenge of answering “how should AI systems behave” is not necessarily one of inventing new principles or contexts, but rather figuring out how to apply existing expertise (in the form of industry-specific laws, standards, etc.) to AI systems. One can see this as an intermediate position between “broad societal boundaries” and “individual control”. The balance of freedom and societal boundaries is hierarchical, determined by the needs of the particular use-case and supported by a web of intersecting perspectives and expertise.

Put this way, governing AI systems seems very complex and burdensome! Leading AI organizations aren’t solely responsible for understanding and applying the diverse boundaries any more than they are responsible for developing the myriad inventive ways AI systems will be used. Indeed, the developers of foundation models likely aren’t best positioned to take on this mission, though they do have an important role to play. Which brings us to our next point.

Accelerated governance innovation through industry partnership

One difficulty with governing new AI systems is a lack of information regarding how they behave and how they are used. This information problem is exacerbated by three facts: there is no single answer for how foundation models behave (though there are some emerging trends), AI systems are rapidly advancing, and the broader user base is actively exploring and inventing new uses. While it is challenging for AI companies to keep up with this blizzard of changes, doing so is virtually impossible for organizations multiple steps removed from AI system development. How should an expert on the ethics of hiring engage with AI systems if they have a limited understanding of the new technology?

While expertise gaps will always exist, we believe leading AI organizations have the important responsibility to enable other actors in society to define the best context-sensitive boundaries possible. This means education and information-sharing, where the latter is particularly important. In what ways are models like ChatGPT being used? Are there emerging trends? Internally, how do AI companies characterize problematic use-cases and how do they evaluate their models? Publishing and sharing AI evaluation methodology is a fantastic step, but more can and needs to be done. We don’t believe AI companies are solely responsible for developing the societal knowledge necessary for appropriately governing AI systems (there is a large role for academia and nonprofits here), but they certainly are well-positioned to help! With better information sharing, we believe the gap between “technological innovation” and “governance innovation” can be bridged. Governance needs to keep pace with AI development for us to unlock the benefits of this transformational technology for our society and economy.

AI Safety benefits from redundancy and independence

Even after contextualized societal bounds are established, their operationalization is its own technical and procedural challenge. As the developers of foundation models, companies like OpenAI are critical to ensuring that broadly alignable models are available. Whether through fine-tuning, prompt engineering or other methods, downstream users should be able to responsibly mold AI system behavior towards the goals of the use-case.

Nevertheless, foundation model developers are not necessarily incentive-aligned with a broader societal movement toward greater governance, and expertise in the development of AI systems is not the same as expertise in their beneficial use. Beyond these reasons, a simple appeal to the idea of safeguard redundancy in critical technological infrastructure points to independent safety-focused organizations playing a role in deploying powerful AI systems. We believe the final operationalization of “who should decide” how AI systems behave should be independent of the AI developer. In the future, the safeguards that gate AI systems should be defined by use-case specific institutions and applied through responsibility-focused technologies. This is a clear case where separation of concerns and expertise is a boon for the final system. Above and beyond the direct reduction of safety failures, a greater emphasis on safety will deserve and engender more trust, helping society reap the benefits of AI more broadly and quickly.


It is critical that leading AI organizations are focused on the challenge of aligning powerful systems with a diversity of values. That said, we strongly believe that answering “how AI systems should behave” requires a broader coalition of actors. As articulated in OpenAI’s charter, this is both necessary, due to the distribution of expertise throughout the world, and desired, to avoid the concentration of power in too few hands. The creators of powerful foundation models have different expertise than those defining their context-sensitive beneficial use, and different incentives than those focused solely on ensuring value alignment. These actors cannot achieve the joint mission of creating beneficial AI systems for humanity on their own, and we encourage each member of the ecosystem to consider how they can support the missions of their partners.

DISCLAIMER. The information we provide here is for informational purposes only and is not intended in any way to represent legal advice or a legal opinion that you can rely on. It is your sole responsibility to consult an attorney to resolve any legal issues related to this information.