Summary

  • A study that used 16 of the most widely used large language models to generate 576,000 code samples found that 440,000 of the package dependencies in those samples referenced non-existent packages.
  • Open-source models hallucinated the most: 21% of the dependencies in their output referenced non-existent packages.
  • These “hallucinated” dependencies could leave code open to so-called dependency confusion attacks, in which a malicious actor publishes a malicious package under the same name as a legitimate package that a piece of software relies on.
  • The software will likely select the attacker’s package, which then runs its payload on the user’s system unless the user carefully verifies where the package came from.
  • This type of attack was demonstrated in 2021, when it affected the networks of Apple, Microsoft, and Tesla, among others.
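
One way to picture the "verify package origins" step is a pre-install check that flags any dependency name not on a list the team has already vetted. The following is a minimal sketch under stated assumptions, not a method described in the study: the function names, the vetted list, and the package name `fastjson-utilz` are all hypothetical and chosen only for illustration. It assumes a pip-style requirements list and compares names after the normalization described in PEP 503.

```python
import re


def normalize(name: str) -> str:
    """Normalize a package name as in PEP 503 (case-insensitive, -, _ and . equivalent)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_requirement_names(lines):
    """Extract bare package names from pip-style requirement lines.

    Simplified on purpose: comments, blank lines, and option lines
    (those starting with '-') are skipped; version specifiers are dropped.
    """
    names = []
    for line in lines:
        line = line.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(normalize(match.group(0)))
    return names


def find_unvetted(requirements, vetted):
    """Return the requirement names that are absent from the vetted list."""
    vetted_normalized = {normalize(v) for v in vetted}
    return [
        name
        for name in parse_requirement_names(requirements)
        if name not in vetted_normalized
    ]


if __name__ == "__main__":
    # Hypothetical input: the last name stands in for a hallucinated dependency.
    generated_requirements = [
        "requests==2.31.0",
        "numpy>=1.26  # numerical work",
        "fastjson-utilz",
    ]
    vetted_packages = {"requests", "NumPy"}
    print(find_unvetted(generated_requirements, vetted_packages))
    # prints ['fastjson-utilz']
```

A name flagged this way is not necessarily malicious. It only means a person should confirm that the package exists and comes from the expected publisher before it is installed.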

By Dan Goodin

Original Article