AI-generated code could be a disaster for the software supply chain. Here’s why.
Summary
A study that used 16 of the most widely used large language models to generate 576,000 code samples found that a substantial share of the samples referenced dependencies on non-existent packages.
Open-source models fared worst, with 21% of their dependencies referencing non-existent packages.
These “hallucinated” dependencies could leave code open to so-called dependency confusion attacks, in which a malicious actor publishes a poisoned package under the same name as a package that a piece of software expects to depend on.
If users do not carefully verify package origins, the package manager may resolve the name to the attacker’s package instead, installing it and running its payload on the user’s system.
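One practical mitigation is to check each declared dependency against a trusted allowlist of known registry packages before installing anything, so hallucinated or typo'd names are flagged rather than resolved. A minimal Python sketch of that idea, where the package names and allowlist are purely illustrative:

```python
def parse_requirement(line: str) -> str:
    """Extract the bare package name from a requirements.txt-style line."""
    for sep in ("==", ">=", "<=", "~=", "!=", ">", "<"):
        if sep in line:
            line = line.split(sep, 1)[0]
    return line.strip().lower()

def find_unknown_packages(requirements, known_packages):
    """Return requirement lines whose package is absent from the known set."""
    known = {name.lower() for name in known_packages}
    return [r for r in requirements if parse_requirement(r) not in known]

if __name__ == "__main__":
    # Illustrative allowlist; in practice this would come from a registry
    # snapshot or an internal package index.
    known = {"requests", "numpy", "flask"}
    reqs = ["requests==2.31.0", "numpy>=1.26", "flask-sqlalchemyy"]
    # The last entry is a hallucinated/misspelled name and gets flagged.
    print(find_unknown_packages(reqs, known))  # -> ['flask-sqlalchemyy']
```

A real pipeline would also pin exact versions and verify hashes (e.g., `pip install --require-hashes`), so that even a name collision cannot substitute an attacker's artifact.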
Attacks of this kind were demonstrated in 2021, reaching the internal networks of Apple, Microsoft, and Tesla, among others.