Making Open Source AI Safe

Abstract: 

As practical AI systems become more capable and more general-purpose, aligning them with the intent of developers, users, and other stakeholders becomes increasingly important. Aligning a system means preventing or mitigating the potential harms it might cause or assist with, and insufficient alignment can lead to negative consequences whose severity scales with the system's capabilities. A properly aligned system prevents both unintended behaviors initiated by the system itself and abuse of the system by users.

There are many aspects of a system to align and many pitfalls to address, from disinformation to active manipulation to self-proliferation. In this talk, we address risk management methods for mapping and prioritizing the harms to mitigate in a system or model, and ways to begin performing those mitigations, framed in the context of implementing the NIST AI Risk Management Framework. We will review taxonomies of risks from sources including DeepMind and OpenAI, considerations for detecting and measuring risks, methods for prioritizing risk mitigation efforts, and approaches to aligning a system so as to significantly mitigate those risks. We will drill down in particular on a prioritization analysis that compares a system's offensive capability within a given domain against its defensive capability in that domain.
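To make the offense-versus-defense framing concrete, here is a minimal illustrative sketch, not taken from the talk itself: each domain is assigned hypothetical offensive-uplift and defensive-uplift estimates, and mitigation effort is directed first at domains where the offense-to-defense gap, weighted by harm severity, is largest. All names and scores below are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class DomainRisk:
    """Hypothetical per-domain risk entry (all fields and values illustrative)."""
    domain: str
    offensive_uplift: float  # assumed uplift the system gives attackers, 0..1
    defensive_uplift: float  # assumed uplift the system gives defenders, 0..1
    severity: float          # assumed harm severity if the risk is realized, 0..1

    @property
    def priority(self) -> float:
        # Prioritize domains where the capability helps offense far more
        # than defense, weighted by how severe the resulting harm would be.
        return max(0.0, self.offensive_uplift - self.defensive_uplift) * self.severity

# Invented example entries for the domains mentioned in the abstract.
risks = [
    DomainRisk("disinformation", offensive_uplift=0.8, defensive_uplift=0.4, severity=0.6),
    DomainRisk("manipulation", offensive_uplift=0.7, defensive_uplift=0.3, severity=0.7),
    DomainRisk("self-proliferation", offensive_uplift=0.5, defensive_uplift=0.1, severity=0.9),
]

# Domains with the largest severity-weighted offense/defense gap come first.
for r in sorted(risks, key=lambda r: r.priority, reverse=True):
    print(f"{r.domain}: priority={r.priority:.2f}")
```

In practice such scores would come from structured risk assessment rather than point guesses, but the ranking step itself is this simple: estimate offensive and defensive uplift per domain, then spend mitigation effort where offense most outpaces defense.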

Bio: 

Richard Mallah has provided leadership for responsible AI R&D in industry for two decades, has used open source tools for over three decades, and has aided civil society's efforts toward safer AI for a decade. He currently serves as Principal AI Safety Strategist at the Future of Life Institute, as Executive Director of the Center for AI Risk Management & Alignment, and as Senior Advisor to Lionheart Ventures. He holds a degree in Intelligent Systems from Columbia University.
