Credo AI Lens™: the ultimate open-source framework for Responsible AI assessments

Fabrizio Puletti
Sr Data Scientist at Credo AI
January 5, 2023
Contributor(s):
Ian Eisenberg
Catharina Doria

Machine learning is increasingly interwoven into every industry. While generative AI systems have captured the lion's share of recent interest, a subtler shift has taken place over the last decade: the growth of a vast machine learning (ML) infrastructure informing the decisions of many industries and organizations. This infrastructure comprises ML systems of all types—from the most advanced generative systems already mentioned to the humble logistic regression. 

While ML capabilities have developed at a staggering pace, guidelines and processes to mitigate ML risks have lagged behind. That’s where AI Governance comes in: it defines the policies and processes needed to safeguard AI systems. Operationalizing the responsible development of AI systems has many components, but a chief need is assessing AI systems to evaluate whether they behave adequately for their intended purpose. This assessment challenge is central to Credo AI’s efforts to develop AI Governance tools. In this blog post, we introduce Credo AI Lens, our open-source assessment framework built to support the assessment needs of your AI Governance process.

What is Credo AI Lens™ (Lens)?

Lens is an open-source Python package that aims to be a jack-of-all-trades tool for AI assessment. Lens focuses on the dimensions of AI systems necessary for governance, including how the system performs, its fairness characteristics, and the characteristics of its dataset. We’ll explore each of these capabilities later.

First, we’ll describe the characteristics we believe are critical for any assessment tool: transparency, adaptability, perspective, and connectivity. Each of these characteristics is reflected in how Lens was developed. 

1) Transparency

No trustworthy assessment tool, particularly in the realm of Responsible AI, should exist without full transparency. The procedures applied to data and models—their origin and implementation—must be fully available for inspection by any interested party. For this reason, we decided to create Lens as an open-source project.

This enables: 

  1. Correctness: any methodological error is more likely to be discovered.
  2. Trust: any party using the tool can explore the code base and be satisfied that the assessments meet their expected requirements.
  3. Community: any interested user can contribute to the package by implementing new assessments or expanding existing ones.

2) Adaptability

The landscape of AI assessment tools is constantly evolving. Numerous open-source packages already explore dataset and model characteristics along Responsible AI dimensions like fairness, security, privacy, and performance. Moreover, academic research in these fields is thriving and continuously generates new concepts and frameworks. In short, measurement best practices are changing rapidly, and the challenge of assessing AI systems is far from solved. This, combined with the rapid development of AI models themselves, requires an assessment framework to be highly adaptable.

To keep pace with this rapidly changing field, Lens needs to be able to incorporate any existing technology within a consistent ecosystem. Internally, Lens relies on a class of objects called evaluators.

From a computer science perspective, evaluators are an instance of the adapter design pattern, which allows objects with incompatible interfaces to work together. Within the Lens ecosystem, evaluators allow us to bring any useful AI assessment tool into Lens and have it work seamlessly with the rest of the framework. This means that Lens can incorporate new approaches and libraries, and so keep up with the latest innovations in the AI assessment field.
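
To make the adapter idea concrete, here is a minimal sketch in plain Python. It is not Lens's actual evaluator code: the two "third-party" tools, the `Evaluator` base class, and the `evaluate` method name are all hypothetical, chosen only to show how tools with different interfaces can be wrapped behind one common call.

```python
from abc import ABC, abstractmethod


# Two hypothetical third-party tools with incompatible interfaces.
class AccuracyTool:
    def score(self, y_true, y_pred):
        correct = sum(t == p for t, p in zip(y_true, y_pred))
        return correct / len(y_true)


class ErrorRateTool:
    def compute(self, pairs):
        # Expects a list of (truth, prediction) pairs and returns a dict.
        errors = sum(t != p for t, p in pairs)
        return {"error_rate": errors / len(pairs)}


# Hypothetical common interface that every adapter exposes.
class Evaluator(ABC):
    @abstractmethod
    def evaluate(self, y_true, y_pred) -> dict:
        """Return a dict mapping metric names to values."""


class AccuracyEvaluator(Evaluator):
    def __init__(self):
        self._tool = AccuracyTool()

    def evaluate(self, y_true, y_pred):
        # Translate the tool's bare float into the common dict format.
        return {"accuracy": self._tool.score(y_true, y_pred)}


class ErrorRateEvaluator(Evaluator):
    def __init__(self):
        self._tool = ErrorRateTool()

    def evaluate(self, y_true, y_pred):
        # Translate the common inputs into the pair list this tool expects.
        return self._tool.compute(list(zip(y_true, y_pred)))


y_true = [1, 0, 1, 1]
y_pred = [1, 0, 0, 1]

# The caller only ever sees the common interface.
results = {}
for evaluator in (AccuracyEvaluator(), ErrorRateEvaluator()):
    results.update(evaluator.evaluate(y_true, y_pred))

print(results)  # {'accuracy': 0.75, 'error_rate': 0.25}
```

The calling loop never needs to know which underlying tool produced a given metric, so supporting a new tool only requires writing one more adapter.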

3) Perspective 

Given the vast field of AI assessment, Lens can only represent a subset of the existing tools/technologies. In this sense, Lens provides a continuously curated selection of methods to assess datasets and models. This curation represents Credo AI’s point of view on which assessments are currently the most useful or sought out.

This is particularly useful for users who want to approach Responsible AI assessments but have little prior experience. Lens provides an intuitive taxonomy of assessments and can ingest a wide variety of datasets and models. This makes running a full suite of assessments fairly trivial.

4) Connectivity

At Credo AI, we are building an AI Governance Platform that ensures an AI system’s compliance, fairness, transparency, and auditability. Central to this task is the capability of running technical assessments of the AI system, and Lens is the core tool used for these assessments.

When combined with credoai-connect and Credo AI Platform, Lens capabilities are expanded to include:

  1. Customization of the set of assessments to be run.
    This is encoded in a policy pack defined within Credo AI Platform. The policy pack contains all the instructions that Lens requires to run a suite of assessments.
  2. Programmatic run of the assessments.
    The user can simply reference the policy pack directly within Lens rather than manually specifying which assessment to run. This further simplifies an already simple pipeline definition.
  3. Generation of reports.
    The standardized output of Lens evaluations can be exported to the platform with a single command. This information can be used in the platform to monitor your AI system and build auditable reports.

What can Lens do?

Lens has a number of evaluators that cover different dimensions of AI assessment. Particular focus is paid to dimensions that are components of an effective AI governance strategy. A list of the official assessments currently supported by Lens can be found below: 

  1. Equity: assesses the equality of outcomes across a sensitive feature*.
  2. Fairness: assesses how a sensitive feature* relates to other features in the dataset (i.e., proxy detection) and how model performance varies based on the sensitive feature.
  3. Performance: assesses model performance according to user-specified metrics and disaggregates the information across sensitive features.
  4. Explainability: assesses feature importance for a model based on SHAP values.
  5. Data Profiling: provides descriptive statistics about a dataset.

* “sensitive feature” typically refers to an individual's characteristics or attributes related to their identity, such as race, gender, age, or sexual orientation. Sensitive features are often protected by laws and regulations that aim to prevent discrimination or unfair treatment based on these characteristics.
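
To illustrate what "disaggregating" a metric across a sensitive feature means, the sketch below computes recall separately for each group and then takes the gap between groups. It uses plain Python rather than Lens, the data is made up, and the helper names are hypothetical. The gap in true positive rate is a common definition of an equal-opportunity metric; whether Lens computes its `equal_opportunity` metric in exactly this way is an assumption here.

```python
from collections import defaultdict


def recall(y_true, y_pred):
    """True positive rate: share of actual positives predicted as positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == 1]
    if not preds_for_positives:
        return float("nan")
    return sum(p == 1 for p in preds_for_positives) / len(preds_for_positives)


def recall_by_group(y_true, y_pred, sensitive):
    """Compute recall separately for each value of the sensitive feature."""
    truths, preds = defaultdict(list), defaultdict(list)
    for t, p, g in zip(y_true, y_pred, sensitive):
        truths[g].append(t)
        preds[g].append(p)
    return {g: recall(truths[g], preds[g]) for g in truths}


# Made-up labels and predictions for two groups, "a" and "b".
y_true = [1, 1, 1, 1, 0, 1, 1, 1, 1, 0]
y_pred = [1, 1, 1, 0, 0, 1, 1, 0, 0, 1]
sensitive = ["a"] * 5 + ["b"] * 5

per_group = recall_by_group(y_true, y_pred, sensitive)
gap = max(per_group.values()) - min(per_group.values())

print(per_group)  # {'a': 0.75, 'b': 0.5}
print(gap)        # 0.25
```

A single overall recall of 0.625 would hide the fact that the model finds positives in group "a" noticeably more often than in group "b", which is why per-group reporting matters.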

Lens also includes several experimental features. Please check our official documentation page for a full overview of Lens capabilities. Below is a basic code example; the goal is to evaluate a classification model along the dimensions of Fairness and Performance based on a set of metrics.

How to use Lens in Python for AI Assessment

Installing Lens is easy through pip. See the setup documentation for directions. We encourage the interested reader to run the quickstart demo to get started with Lens. We will briefly summarize its usage below.

After the typical ML workflow, where you prepare a dataset and train a model, you are ready to use Lens. Using Lens comes down to doing four things:

  • Wrapping ML artifacts (like models and data) in Lens objects
  • Initializing an instance of Lens. Lens is the main object that performs evaluations. Under the hood, it creates a pipeline of evaluations that are run.
  • Adding evaluators to Lens.
  • Running Lens. 

# Imports from the credoai-lens package
from credoai.artifacts import ClassificationModel, TabularData
from credoai.evaluators import ModelFairness, Performance
from credoai.lens import Lens

# Wrap ML artifacts in Lens objects.
# These ML artifacts (model, X_test, y_test, sensitive_features_test) would
# have been created earlier by a typical ML pipeline. We are using the test
# data to perform our assessments.
credo_model = ClassificationModel(
    name="credit_default_classifier",
    model_like=model,
)
credo_data = TabularData(
    name="UCI-credit-default",
    X=X_test,
    y=y_test,
    sensitive_features=sensitive_features_test,
)

# Initialize the Lens object
lens = Lens(model=credo_model, assessment_data=credo_data)

# Initialize the evaluators and add them to Lens.
# In this case, we are evaluating the model for Performance
# and Fairness using the defined metrics.
metrics = ['precision_score', 'recall_score', 'equal_opportunity']
lens.add(ModelFairness(metrics=metrics))
lens.add(Performance(metrics=metrics))

# Run Lens
lens.run()

As exemplified above, only a few lines of code are enough for Lens to generate multiple assessments on models and datasets.
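
For readers curious about what "a pipeline of evaluations" could look like, the toy class below mimics the add-then-run workflow in plain Python. It is a hypothetical stand-in and not the real Lens class: `ToyLens`, its method signatures, and the two evaluation functions are invented for illustration only.

```python
class ToyLens:
    """Toy stand-in for a pipeline runner; not the real Lens class."""

    def __init__(self, model, data):
        self.model = model
        self.data = data
        self._pipeline = []
        self._results = {}

    def add(self, name, evaluation_fn):
        # Queue an evaluation step; returning self allows chained calls.
        self._pipeline.append((name, evaluation_fn))
        return self

    def run(self):
        # Execute every queued step against the same model and data.
        for name, evaluation_fn in self._pipeline:
            self._results[name] = evaluation_fn(self.model, self.data)
        return self

    def get_results(self):
        return dict(self._results)


def always_one(x):
    """A trivial 'model' that predicts the positive class for any input."""
    return 1


def accuracy(model, data):
    preds = [model(x) for x in data["X"]]
    return sum(p == t for p, t in zip(preds, data["y"])) / len(preds)


def positive_rate(model, data):
    preds = [model(x) for x in data["X"]]
    return sum(preds) / len(preds)


data = {"X": [0, 1, 2, 3], "y": [1, 0, 1, 1]}

toy = ToyLens(model=always_one, data=data)
toy.add("accuracy", accuracy).add("positive_rate", positive_rate).run()
toy_results = toy.get_results()

print(toy_results)  # {'accuracy': 0.75, 'positive_rate': 1.0}
```

Separating the "add" step from the "run" step means the full list of evaluations is known before anything executes, so it can be inspected or assembled programmatically.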

The connection between Lens and AI Governance

The value of AI assessment is truly realized within a comprehensive AI governance process. While working in unison with Credo AI Platform, Lens can translate a list of governance requirements previously defined on the Platform into a set of assessments on models and data. The assessment results—or “evidence” in Credo AI parlance—can then be exported to the Platform, where they are tracked and translated into standardized reports.

At Credo AI, we aim to empower companies to deliver Responsible AI at scale via our AI Governance software Platform—a multi-stakeholder SaaS platform for managing AI risk and compliance at scale. Our AI Governance Platform supports organizations with three action pillars. First, it provides context by examining guardrails within the ecosystem, including regulations, standards, company policies, and industry best practices. Second, it promotes continuous oversight and accountability by testing data sets and models against guardrails. Finally, it converts outputs into governance artifacts to increase transparency among stakeholders. 

Integrations between the AI Governance Platform and Credo AI Lens make it easy for any technical team to assess models and datasets for Responsible AI (RAI) considerations like fairness, explainability, performance, robustness, security, and privacy throughout the ML lifecycle.

Who is it for?

Lens is generally aimed at data scientists and analysts who need a single entry point to a varied set of Responsible AI assessments. Within this category, three types of users come to mind, differentiated by their Responsible AI knowledge:

  1. Newcomer: a user approaching the field for the first time. Lens is particularly useful in this case because it provides a curated set of methodologies for Responsible AI assessments and an easy-to-use interface. Running a default set of assessments can be achieved with just a few lines of code. See the quickstart notebook for further details.
  2. Experienced: a user with substantial experience running AI system assessments. Such a user can benefit from Lens’ taxonomy of evaluators while retaining the ability to fine-tune which assessments are run. Given the open-source nature of the package and its exhaustive documentation, an experienced user can also decide to create their own evaluators.
  3. RAI team member: a user with any level of experience who works within a larger Responsible AI team. This technical user needs to assess an AI system and report the results in a standardized way that non-technical team members can explore and understand. In combination with credoai-connect and Credo AI Platform, Lens provides the capability of running a custom-tailored suite of assessments and exporting the results to the Platform for report generation.

Conclusion

As the jack of all trades for AI assessment, Lens positions itself at the heart of the Responsible AI life cycle. It allows for a customizable set of assessments to be run on a wide range of model frameworks and datasets. When used in conjunction with Credo AI SaaS Platform, it allows for the completion of the Responsible AI loop entailing:

  1. Alignment: the definition of requirements for a specific AI system.
  2. Assessment: the conversion of requirements into assessments (the purpose of Lens).
  3. Reporting: the generation of easy-to-understand documentation fully characterizing the AI system.