Now You See Me, Now You Don’t: Using LLMs to Obfuscate Malicious JavaScript
Summary
An increasingly popular way for malicious actors to distribute malware is through malicious JavaScript delivered via phishing, exploit kits or weaponized documents.
Criminals use large language models (LLMs) to rewrite or obfuscate existing malware, making it harder to detect.
Obfuscation performed with off-the-shelf tools is comparatively easy to detect because these tools apply consistent, well-known transformations.
By prompting LLMs to perform transformations that look more natural, criminals can produce rewrites that are much harder to flag, making detection evasion more effective.
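For illustration, a prompt along the following lines could ask an LLM to apply one small, behavior-preserving change at a time. This is only a sketch: the wording and the specific transformations listed are assumptions, not the exact prompt used in this research.

```python
# Hypothetical sketch of a rewrite prompt; the template text and the
# transformation list are illustrative assumptions only.
REWRITE_PROMPT = """You are refactoring JavaScript.
Apply exactly ONE of the following changes to the code below, keeping its behavior identical:
- rename variables to plausible, human-looking names
- split or join string literals
- reorder independent statements
- add harmless helper functions
Return only the rewritten code.

{code}
"""

def build_rewrite_prompt(js_code: str) -> str:
    """Fill the template with the JavaScript sample to be rewritten."""
    return REWRITE_PROMPT.format(code=js_code)
```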
An algorithm was developed that uses LLMs to rewrite malicious JavaScript step by step, repeatedly applying such transformations until the result fools a static analysis model.
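A minimal sketch of such a rewriting loop appears below. The helpers llm_rewrite and score_malicious are hypothetical stand-ins for an LLM call and the static analysis model, and the greedy accept-a-rewrite-only-if-the-score-drops strategy is an assumption rather than the exact algorithm described here.

```python
def evade_rewrite(js_code: str, llm_rewrite, score_malicious,
                  threshold: float = 0.5, max_steps: int = 20) -> str:
    """Iteratively rewrite a JavaScript sample until the static model's
    maliciousness score falls below the detection threshold.

    llm_rewrite(code) -> str       : asks an LLM for one behavior-preserving rewrite
    score_malicious(code) -> float : static model's maliciousness score in [0, 1]
    """
    best_code = js_code
    best_score = score_malicious(js_code)

    for _ in range(max_steps):
        if best_score < threshold:
            break  # the sample now evades the static model
        candidate = llm_rewrite(best_code)
        candidate_score = score_malicious(candidate)
        # Keep the rewrite only if it lowers the detector's score.
        if candidate_score < best_score:
            best_code, best_score = candidate, candidate_score

    return best_code
```

Because each iteration needs no human input, running this loop over a corpus of known malicious samples yields many distinct variants automatically.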
This makes it possible to generate a large number of malicious code variants at scale, without any manual intervention.
A new malicious JavaScript detection model was then trained on tens of thousands of samples of this LLM-generated code, and that model is now deployed in production.
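As a rough sketch of how LLM-rewritten variants might be folded back into training data, the following uses scikit-learn and a hypothetical extract_features helper as placeholders; the actual feature set and model architecture are not specified here.

```python
from sklearn.ensemble import GradientBoostingClassifier

def retrain_with_variants(extract_features, original_samples, original_labels,
                          llm_variants):
    """Augment the training set with LLM-rewritten malicious variants and retrain.

    extract_features(code) -> list[float] : hypothetical static feature extractor
    original_samples / original_labels    : existing corpus (label 1 = malicious)
    llm_variants                          : LLM-rewritten malicious samples
    """
    X = [extract_features(code) for code in list(original_samples) + list(llm_variants)]
    y = list(original_labels) + [1] * len(llm_variants)  # rewrites remain malicious

    # Placeholder classifier; the production model could differ substantially.
    model = GradientBoostingClassifier()
    model.fit(X, y)
    return model
```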