PyPI halted new users and projects while it fended off supply-chain attack

Supply-chain attacks, like the latest PyPI discovery, insert malicious code into seemingly functional software packages used by developers. They're becoming increasingly common. — Enlarge / Supply-chain attacks, like the latest PyPI discovery, insert malicious code into seemingly functional software packages used by developers. They’re becoming increasingly common.

Getty Images

PyPI, a vital repository for open source developers, temporarily halted new project creation and new user registration following an onslaught of package uploads that executed malicious code on any device that installed them. Ten hours later, it lifted the suspension.

Short for the Python Package Index, PyPI is the go-to source for apps and code libraries written in the Python programming language. Fortune 500 corporations and independent developers alike rely on the repository to obtain the latest versions of code needed to make their projects run. At a little after 7 pm PT on Wednesday, the site started displaying a banner message informing visitors that the site was temporarily suspending new project creation and new user registration. The message didn’t explain why or provide an estimate of when the suspension would be lifted.

Enlarge / Screenshot showing temporary suspension notification.

About 10 hours later, PyPI restored new project creation and new user registration. Once again, the site provided no reason for the 10-hour halt.

According to security firm Checkmarx, in the hours leading up to the closure, PyPI came under attack by users who likely used automated means to upload malicious packages that, when executed, infected user devices. The attackers used a technique known as typosquatting, which capitalizes on typos users make when entering the names of popular packages into command-line interfaces. By giving the malicious packages names that are similar to popular benign packages, the attackers count on their malicious packages being installed when someone mistakenly enters the wrong name.

“The threat actors target victims with Typosquatting attack technique using their CLI to install Python packages,” Checkmarx researchers Yehuda Gelb, Jossef Harush Kadouri, and Tzachi Zornstain wrote Thursday. “This is a multi-stage attack and the malicious payload aimed to steal crypto wallets, sensitive data from browsers (cookies, extensions data, etc.) and various credentials. In addition, the malicious payload employed a persistence mechanism to survive reboots.”

Enlarge / Screenshot showing some of the malicious packages found by Checkmarx.

The post said the malicious packages were “most likely created using automation” but didn’t elaborate. Attempts to reach PyPI officials for comment weren’t immediately successful. The package names mimicked those of popular packages and libraries such as Requests, Pillow, and Colorama.

The temporary suspension is only the latest event to highlight the increased threats confronting the software development ecosystem. Last month, researchers revealed an attack on open source code repository GitHub that was flooding the site with millions of packages containing obfuscated code that stole passwords and cryptocurrencies from developer devices. The malicious packages were clones of legitimate ones, making them hard to distinguish to the casual eye.

The party responsible automated a process that forked legitimate packages, meaning the source code was copied so developers could use it in an independent project that built on the original one. The result was millions of forks with names identical to the original ones. Inside the identical code was a malicious payload wrapped in multiple layers of obfuscation. While GitHub was able to remove most of the malicious packages quickly, the company wasn’t able to filter out all of them, leaving the site in a persistent loop of whack-a-mole.

Similar attacks are a fact of life for virtually all open source repositories, including npm pack picks and RubyGems.

Earlier this week, Checkmarx reported a separate supply-chain attack that also targeted Python developers. The actors in that attack cloned the Colorama tool, hid malicious code inside, and made it available for download on a fake mirror site with a typosquatted domain that mimicked the legitimate files.pythonhosted.org one. The attackers hijacked the accounts of popular developers, likely by stealing the authentication cookies they used. Then, they used the hijacked accounts to contribute malicious commits that included instructions to download the malicious Colorama clone. Checkmarx said it found evidence that some developers were successfully infected.

In Thursday’s post, the Checkmarx researchers reported:

The malicious code is located within each package’s setup.py file, enabling automatic execution upon installation.

In addition, the malicious payload employed a technique where the setup.py file contained obfuscated code that was encrypted using the Fernet encryption module. When the package was installed, the obfuscated code was automatically executed, triggering the malicious payload.

Upon execution, the malicious code within the setup.py file attempted to retrieve an additional payload from a remote server. The URL for the payload was dynamically constructed by appending the package name as a query parameter.

Enlarge / Screenshot of code creating dynamic URL.

The retrieved payload was also encrypted using the Fernet module. Once decrypted, the payload revealed an extensive info-stealer designed to harvest sensitive information from the victim’s machine.

The malicious payload also employed a persistence mechanism to ensure it remained active on the compromised system even after the initial execution.

Enlarge / Screenshot showing code that allows persistence.

Besides using typosquatting and a similar technique known as brandjacking to trick developers into installing malicious packages, threat actors also employ dependency confusion. The technique works by uploading malicious packages to public code repositories and giving them a name that’s identical to a package stored in the target developer’s internal repository that one or more of the developer’s apps depend on to work. Developers’ software management apps often favor external code libraries over internal ones, so they download and use the malicious package rather than the trusted one. In 2021, a researcher used a similar technique to successfully execute counterfeit code on networks belonging to Apple, Microsoft, Tesla, and dozens of other companies.

There are no sure-fire ways to guard against such attacks. Instead, it’s incumbent on developers to meticulously check and double-check packages before installing them, paying close attention to every letter in a name.

Source