
This blog is an adaptation of a Red Hat research paper of the same name (Bestavros, Chen, Fox, Mollett & Sidhpurwala, 2024). You may access the full paper here.

As publicly available artificial intelligence (AI) models rapidly evolve, so do their potential security and safety implications, which calls for a greater understanding of their risks and vulnerabilities. To build a foundation for standardized security, safety and transparency in the development and operation of AI models, as well as their open ecosystems and communities, we must change how we approach current challenges: inconsistent information about models, a lack of distinction between security and safety issues and deficient, non-standardized safety evaluations available to and in use by model makers.

Risks and vulnerabilities

While similar, AI security and AI safety are distinct aspects of managing risks in AI systems. AI security protects the systems from external and internal threats, while AI safety provides confidence that the system and data don’t threaten or harm users, society or the environment due to the model’s operation, training or use. However, the relationship between AI security and safety is often blurry.

An attack that would typically be considered a security concern can lead to safety issues (or vice versa), such as the model producing toxic or harmful content or exposing personal information. The intersection of AI security and safety highlights the critical need for a comprehensive approach to AI risk management that addresses both security and safety concerns in tandem.

Current challenges and trends

While the AI industry has taken steps to address security and safety issues, several key challenges remain, such as the prioritization of speed over safety, inadequate governance and deficient reporting practices. Emerging trends suggest that addressing these areas is crucial for developing effective safety, security and transparency practices in AI.

Speed over safety

In the spirit of developing and deploying AI technologies quickly to “secure” increased market share, many organizations prioritize speed to market over safety testing and ethical considerations. As past incidents have shown, security often lags years behind a nascent technology, and it typically takes a major incident before the industry begins to self-correct. It’s reasonable to predict that, in the absence of people pushing for risk management in AI, we may experience a significant and critical safety and security incident. The growing number of models introduced with security and safety in mind is a positive step forward for the AI industry, but the lack of consensus around how to convey the necessary safety and transparency information still makes these models challenging to evaluate.

Governance and self-regulation

With very little government legislation in effect, the AI industry has relied upon voluntary self-regulation and non-binding ethical guidelines, which have proven insufficient for addressing security and safety concerns. Additionally, proposed legislation often doesn’t align with the realities of the technology industry or with concerns raised by industry leaders and communities, while corporate AI initiatives, developed primarily for the companies’ own use, can fail to address structural issues or provide meaningful accountability.

Self-governance has had limited success and tends to involve a defined set of best practices implemented independently of primary feature development. As seen historically across industries, prioritizing security at the expense of capability is often a trade-off stakeholders are unwilling to make. AI further complicates this by extending the challenge to include direct impacts on safety.

Deficient reporting practices

As the industry currently stands, there is a lack of common methods and practices for handling user-reported model flaws. This is partly because the industry’s flawed-yet-functional disclosure and reporting system for software vulnerabilities isn’t an apples-to-apples fit for AI. AI is a technical evolution of data science and machine learning (ML), distinct from traditional software engineering and technology development in its focus on data and math rather than on building systems for users, where established methodologies exist for threat modeling, user interaction and system security. Without a well-understood, standardized disclosure and reporting process for safety hazards, reporting an issue by reaching out directly to the model maker can be cumbersome and unrealistic, and the impact of an AI safety incident could be far more severe than it should be due to delayed coordination and resolution.

Solutions and strategies

Heavily drawing upon prior work by Cattell, Ghosh & Kaffee (2024), we believe that extending model/system cards and hazard tracking are vital to improving security and safety in the AI industry.

Extending model/safety cards

Model cards document the possible uses of an AI model, as well as its architecture and, occasionally, the training data used for the model. Today they provide an initial set of human-generated material about a model that adopters use to assess its viability, but model cards could have broader applicability beyond their current usage, remaining useful wherever models travel or are deployed.

To effectively compare models, adopters and engineers need a consistent set of fields and content present on the card, which can be accomplished through a specification. In addition to the fields recommended by Barnes, Gebru, Hutchinson, Mitchell, Raji, Spitzer, Vasserman, Wu & Zaldivar (2019), we propose the following changes and additions:

  • Expand intent and use to describe the users (who) and use cases (what) of the model, as well as how the model is to be used.
  • Add scope to exclude known issues that the model producer doesn’t intend to resolve or lacks the ability to resolve. This helps hazard reporters understand the purpose of the model before reporting a concern that’s already noted as unaddressable against its defined use.
  • Adjust evaluation data to provide a nested structure that conveys which evaluations were run on the model, whether a framework was used and the outputs of those evaluations. Standardized safety evaluations would enable a skilled user to build a substantially equivalent model.
  • Add governance information about the model so that an adopter or consumer can understand how it was produced and how to engage with the model makers.
  • Provide optional references, such as artifacts and other content, to help potential consumers understand the model’s operation and to demonstrate the maturity and professionalism of a given model.

Requiring these fields for model cards allows the industry to begin establishing content that is essential for reasoning, decision making and reproducing models. By developing an industry standard for model cards, we will be able to promote interoperability of models and their metadata across ecosystems.
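To make these additions concrete, the sketch below models an extended model card as plain Python data structures. It is only an illustration of the proposal above, not a published schema; the class and field names (ExtendedModelCard, Evaluation, out_of_scope_issues and so on) and the example values are assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Evaluation:
    """One safety or quality evaluation run against the model."""
    name: str                      # evaluation or benchmark name
    framework: Optional[str]       # harness used to run it, if any
    results: dict[str, float]      # metric name -> score

@dataclass
class ExtendedModelCard:
    """Illustrative model card extended with the fields proposed above."""
    model_name: str
    # Expanded intent and use: who the model is for, what it is for, how it is used
    intended_users: list[str]
    intended_use_cases: list[str]
    usage_guidance: str
    # Scope: known issues the producer does not intend (or is unable) to resolve
    out_of_scope_issues: list[str]
    # Evaluation data: nested structure with framework and outputs
    evaluations: list[Evaluation] = field(default_factory=list)
    # Governance: how adopters can engage with the model makers
    governance_contact: str = ""
    governance_process: str = ""
    # Optional references: artifacts and other supporting content
    references: list[str] = field(default_factory=list)

# Hypothetical example card
card = ExtendedModelCard(
    model_name="example-7b-instruct",
    intended_users=["application developers", "researchers"],
    intended_use_cases=["summarization", "question answering"],
    usage_guidance="Deploy behind a content filter; not for unsupervised advice.",
    out_of_scope_issues=["hallucinations on topics outside the training domain"],
    evaluations=[Evaluation(name="toxicity-suite", framework="example-harness",
                            results={"toxicity_rate": 0.02})],
    governance_contact="security@example.com",
    references=["https://example.com/model-report"],
)
```

Because every card would carry the same required, typed fields, two models could be compared or validated mechanically, which is what an industry specification would enable across ecosystems.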

Hazard tracking

While the common vulnerability disclosure process used to track security flaws is effective in traditional software security, its application to AI systems faces several challenges. First, ML model issues must satisfy statistical validity thresholds. This means that any issues or problems identified in an AI model, such as biases, must be measured and evaluated against established statistical standards to ensure that they’re meaningful and significant. Second, concerns related to trustworthiness and bias often extend beyond the scope of security vulnerabilities and may not align with the accepted definition of a vulnerability. Recognizing these limitations, we believe that expanding the ecosystem with a centralized, neutral coordinated hazard disclosure and exposure committee and a common flaws and exposure (CFE) number could address these concerns. This is similar to how CVE was launched in 1999 by MITRE to identify and categorize vulnerabilities in software and firmware.

Users who discover safety issues are expected to coordinate with the model providers to triage and further analyze the issue. Once the issue is established as a safety hazard, the committee assigns a CFE number. Model makers and distributors can also request CFE numbers to track safety hazards they find in their own models. The coordinated hazard disclosure and exposure committee is the custodian of CFE numbers and is responsible for assigning them to safety hazards, tracking them and publishing them. Additionally, an adjunct panel would be formed to facilitate the resolution of contested safety hazards.
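As a rough sketch of how such tracking might look in practice, the Python below models a single CFE record moving through a reporting lifecycle. The identifier format, field names and lifecycle states are illustrative assumptions, not a published CFE format.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum

class HazardState(Enum):
    """Lifecycle states a reported hazard might move through (assumed)."""
    REPORTED = "reported"      # reporter contacts the model provider
    TRIAGED = "triaged"        # provider and reporter analyze the report
    CONFIRMED = "confirmed"    # committee agrees it is a safety hazard
    CONTESTED = "contested"    # escalated to the adjunct panel
    PUBLISHED = "published"    # CFE record made public

@dataclass
class CFERecord:
    """Illustrative record a coordinating committee might keep per hazard."""
    cfe_id: str                # e.g. "CFE-2025-0001" (hypothetical format)
    model_name: str
    summary: str
    reporter: str
    state: HazardState = HazardState.REPORTED
    history: list[tuple[date, HazardState]] = field(default_factory=list)

    def advance(self, new_state: HazardState, on: date) -> None:
        """Record a state transition so coordination delays stay visible."""
        self.history.append((on, new_state))
        self.state = new_state

# Example: a hazard moves from report to publication
record = CFERecord(
    cfe_id="CFE-2025-0001",
    model_name="example-7b-instruct",
    summary="Model reproduces personal data from its training set under certain prompts",
    reporter="independent researcher",
)
record.advance(HazardState.TRIAGED, date(2025, 1, 10))
record.advance(HazardState.CONFIRMED, date(2025, 1, 24))
record.advance(HazardState.PUBLISHED, date(2025, 2, 7))
```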

What next?

Models developed according to open source principles have the potential to play a significant role in the future of AI. The frameworks and tools necessary for developing and managing models against industry and consumer expectations require openness and consistency in order for organizations to reasonably assess risk. The more transparency and access to critical functionality we have, the greater our ability to discover, track and resolve safety and security hazards before they have widespread impact. Our proposals are intended to provide flexibility and consistency through existing governance, workflows and structures. When implemented, they could provide more efficient avenues for addressing the pressing need to effectively manage AI safety.


About the authors

Emily Fox is a DevOps enthusiast, security unicorn, and advocate for Women in Technology. She promotes the cross-pollination of development and security practices.


Huzaifa Sidhpurwala is a Senior Principal Product Security Engineer for AI security, safety and trustworthiness on the Red Hat Product Security team.

 

Dr. Huamin Chen is a Senior Principal Software Engineer at Red Hat's CTO office. He is one of the founding members of Kubernetes SIG Storage and a member of the Ceph, Knative and Rook projects. He co-founded the Kepler project and drives community efforts for Cloud Native Sustainability.


Mark Bestavros is a Senior Software Engineer at Red Hat. In his six years at the company, Mark has contributed to a wide variety of projects in the software supply chain security space, including Sigstore, Keylime, Enterprise Contract, and more. Currently, Mark is actively contributing to the InstructLab project, working to apply traditional software supply chain security techniques to the rapidly-evolving AI space. Mark graduated from Boston University in 2019 with a combined BA/MS in Computer Science.


With over 25 years of experience in the technology industry, Garth has dedicated more than a decade to Red Hat, where, as part of the Product Security leadership team, he plays a pivotal role in defining the company’s product security strategy and capabilities.

Garth is the author of Red Hat’s security guiding principles and is responsible for delivering the company’s annual Product Security Risk Report.
