Evan Boehs

Npm install everything, and the complete and utter chaos that follows

Date: 1/5/2024

We are hackers, but not the steal-all-your-data type. We are hackers in the best sense: creatives who enjoy a good challenge and aren’t afraid to get our hands dirty. To quote Wikipedia:

The hacker culture is a subculture of individuals who enjoy—often in collective effort—the intellectual challenge of creatively overcoming the limitations of software systems or electronic hardware, to achieve novel and clever outcomes.

And that is exactly what we did. Our particular challenge? Install every single package on NPM with one artful command. Our first attempt was incredibly simple: create a package, add every other package to its dependency list, and hit install. Instantly, we were met with our first obstacle: you can only directly depend on 800 packages, a far cry from the 2.5 million on the registry. Rats. But it wouldn’t be a satisfactory conclusion if we called it there; I know that I’ve installed a whole lot more than 800 packages in one command, since dependencies bring along dependencies of their own. What if we created a chunking system, subdividing our package into thousands of smaller pieces? We gave it a try, and it seemingly worked. Time to share with the world.
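To make the chunking idea concrete, here is a minimal sketch of how such a tree of packages could be generated. It is not our actual script: the scope `@example-everything`, the `chunk-<depth>-<index>` naming scheme, and the function name `buildTree` are all hypothetical, and the `"*"` version range is an assumption chosen for illustration. The only number taken from the story is the limit of 800 direct dependencies.

```javascript
// Hypothetical sketch, not the real upload script.
// MAX_DEPS mirrors the 800-direct-dependency limit described above.
const MAX_DEPS = 800;

// Builds package.json-like manifests level by level: the first level depends
// on the real packages, each following level depends on the chunks beneath
// it, until a single root package remains.
function buildTree(packageNames, scope = "@example-everything") {
  const manifests = [];
  let names = packageNames;
  let depth = 0;
  while (true) {
    const level = [];
    for (let i = 0; i < names.length; i += MAX_DEPS) {
      const dependencies = {};
      for (const name of names.slice(i, i + MAX_DEPS)) {
        dependencies[name] = "*"; // assumption: wildcard version range
      }
      level.push({
        name: `${scope}/chunk-${depth}-${level.length}`, // made-up naming scheme
        version: "1.0.0",
        dependencies,
      });
    }
    manifests.push(...level);
    // Zero or one package at this level means we have reached the root.
    if (level.length <= 1) {
      return { root: level[0], manifests };
    }
    names = level.map((manifest) => manifest.name);
    depth += 1;
  }
}

// Usage: three packages fit in a single chunk, which is also the root.
const demoTree = buildTree(["left-pad", "lodash", "express"]);
console.log(demoTree.root.name, Object.keys(demoTree.root.dependencies));
```

With 2.5 million names, the first level alone comes to 3,125 chunks, which is itself more than 800, so under these assumptions at least two further levels are needed before everything hangs off one root.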

We began uploading our monstrosity to the registry. We were careful to scope all our packages into our own namespace to prevent disruption, and we created a small JavaScript file to be executed at the end:

console.log("You have installed everything… but at what cost?")

In the process of uploading, we faced little opposition. NPM allowed us to upload 500 chunks in one go, a very generous number, so we ran our script intermittently over the span of a day. The total size of all the package.json files was 82 MB. We then rented a large VM and began a live-streamed attempt to execute npm install everything. NPM hung, PNPM did not even attempt it, and Yarn tried its very hardest, lasting an hour before giving out. We considered this a wonderful first attempt, and began discussing the possibility of creating our own makeshift package manager to finish the job.

The whole process was very exciting to me. I was immensely curious what the VM would look like after finally executing all the post-install scripts. We also began developing a website for our project, so that when it was all over we might have some artifact to share with the world. I think a lot of people can relate to the immense joy of creation, and we were certainly basking in it. We were relatively confident that we had done our due diligence, so that our little tomfoolery would not bother anybody. We were so very wrong.

The registry, and its integrity

A few years ago now, the developer of a package named kik, who had a long-running connection to NPM and the open source community, was shafted by NPM and the company named Kik, who wished to use the package name. He fought for the package, but ultimately the rules governing the registry at the time were abandoned, and Kik was granted the package name. The developer decided to unpublish all of his packages in protest, including one named left-pad. Left-pad was a small package that did exactly what it says on the tin: pad the left side of a string, in a wee 17 lines of code. Unfortunately, through a large dependency chain, the removal of this code would wreak havoc on the JavaScript ecosystem, and would spur serious discussions on the feasibility of open source, and how to maintain trust in a supply chain controlled by so many.

NPM, with good reason, also began a process of introspection. “How can this sort of event that makes our customers angry be prevented in the future?”, they asked themselves. Their answer was a new rule: any version of a package that has dependents cannot be removed from the registry. It also just so happens that when you write * you depend on every version of a package in NPM’s eyes. If package a depends on the “*” version of package b, every version of package b is essentially rendered immutable (it’s my belief that such an ambiguity is very problematic, and star versions should at the very least be ignored when it comes to unpublishing, or removed entirely, as the fallout of faker showed). We hadn’t been aware of this issue when we uploaded our packages, but this ignorance ended very, very quickly.
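The interaction between that rule and star versions can be shown with a toy model. This is a sketch of the rule as described above, not NPM’s actual implementation: the registry shape, the function names `rangeMatches` and `canUnpublish`, and the package names are all made up, and only two kinds of range are modeled (the star and an exact version), whereas real version ranges are far richer.

```javascript
// Toy model only; none of this is NPM's real code or data format.
// Registry shape (assumed): { packageName: { version: { depName: range } } }

// Only "*" and exact versions are modeled here.
function rangeMatches(range, version) {
  return range === "*" || range === version;
}

// A version can be unpublished only if no published package depends on a
// range that this version satisfies.
function canUnpublish(registry, name, version) {
  for (const versions of Object.values(registry)) {
    for (const dependencies of Object.values(versions)) {
      const declared = Object.prototype.hasOwnProperty.call(dependencies, name);
      if (declared && rangeMatches(dependencies[name], version)) {
        return false;
      }
    }
  }
  return true;
}

const demoRegistry = {
  "tiny-lib": { "1.0.0": {}, "1.1.0": {} },
  "pinned-app": { "1.0.0": { "tiny-lib": "1.0.0" } },
};

// Before: only the pinned version 1.0.0 of tiny-lib is locked in place.
console.log(canUnpublish(demoRegistry, "tiny-lib", "1.1.0")); // true

// After a hypothetical chunk depends on "*" of everything, itself included:
demoRegistry["everything-chunk"] = {
  "1.0.0": { "tiny-lib": "*", "everything-chunk": "*" },
};
console.log(canUnpublish(demoRegistry, "tiny-lib", "1.1.0")); // false
console.log(canUnpublish(demoRegistry, "everything-chunk", "1.0.0")); // false
```

In this model a single star dependency is enough to pin every version of its target, and a package that lists itself can no longer be removed either.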

The registry, and the community that relies on it

Just a few hours after our last package hit the site, the first issue was filed. It acutely explained that we had rendered them incapable of removing their own package, which formerly had zero dependents. We instantly halted our celebrations, and tried to get to the root of the problem. We knew this was big trouble, but for now, we just wanted to help them delete their package. From a ski chairlift, I drafted a solution… we could create a service that allowed people to remove their packages from our chunks, and then upload a new version. This, I would quickly learn, is impossible, because the previous version would still exist.

At this point, we knew we had punched well above our weight, and attempted to take our own package down, in an effort to prevent further harm we might inadvertently cause. The only problem: in attempting to install, and by extension depend on, everything, we depended on ourselves.

Immediately, we reached out to NPM support explaining the situation: that we had accidentally rendered all packages unremovable, and that we believed this was a significant flaw in NPM policy that could cause future issues, from hackers like us and bad actors alike. Unfortunately, this happened around Christmas time, and it seemed as if NPM was out of office. As a last Hail Mary, we reached out to both NPM’s security team and contacts at GitHub. We have yet to hear a response through any of these avenues. What we did hear from was the community.

The registry, and the community we share

Over the 5 days after we began awaiting a response from NPM, we did hear from a rightfully frustrated community. At every chance we got, we rehashed the following points:

  1. We intended no harm, we did not know this would happen.
  2. We are very, very sorry for whatever damage this might have caused you.
  3. We tried everything in our power to resolve this issue as soon as we discovered it, but all we can do now is wait.
  4. It’s our belief that the policy we broke is flawed, and should be revised going forward to avoid similar issues to the one we accidentally created.

Many forgave us, and for that I am eternally grateful. Some others were very frustrated, and voiced their frustration through vulgar language and personal insults. I understand this frustration completely, but I don’t believe it is constructive in any form. The analogy I will use is as follows:

We tried to hang a pretty picture on a wall, but accidentally opened a small hole. This hole caused the entire building to collapse. While we did not intend to create a hole, and feel terrible for all the people impacted by the collapse, we believe it’s also worth investigating what failures of compliance testing & building design could allow such a small hole to cause such big damage.

Multiple parties involved, myself included, are still students and/or do not code professionally. How could we have been allowed to do this by accident? Is it acceptable to say these things to a high school student? Does our little hack deserve all this hate, all this blame? Imagine how scared shitless I, still young and inexperienced, must have been when I received an email from Laura French, a news reporter previously employed by USA TODAY, CBS, and Vanity Fair. She requested my comment on the catastrophe our small experiment had caused, and I was terrified knowing that the words I used to defend our once little creative project could very easily be twisted.

Thankfully, this did not happen, and I find myself eternally grateful to Laura. Waiting for the article to drop was nerve-racking, but we all breathed a sigh of relief when we saw the work she produced. It was remarkably neutral, and took the opinions of both sides seriously. If Laura ever reads this, please know I have nothing but respect for you.

At last, the registry speaks

After Laura released her article, making numerous links to comments made in our repository, action was finally taken. Our repository disappeared.

In one way, this is good. It means the storms have subsided, but it also means that an archive of all the things said to us, and all the things we said back, is in effect gone. This will prove to be important, as one day later, my friend got an email at the same time GitHub made a public comment on the matter:

GitHub support makes some pretty egregious claims here, let’s walk through them one by one:

Harassing other users of the Service is never tolerated, whether via public or private media.

We did not harass anybody, and we are essentially unable to prove it now. It’s now their word (a trillion dollar corporation) against ours (a few random netizens with an average age of 20). Luckily, I have the full issue history saved in screenshots. These will be attached at the end of the article.

Next, they cite the GitHub policy on disruptive conduct:

We do not allow content or activity on GitHub that is off-topic, or interacts with platform features in a way that significantly or repeatedly disrupts the experience of other users

It feels like this is specifically referring to onsite conduct that impacts other GitHub users, and the document they link just makes that clearer, using examples like “Opening empty or meaningless issues or pull requests” and “Creating nonsensical or irrelevant code reviews”. Why, then, is GitHub policy being used to police disruption that happened off site? Especially, might I add, when tools made explicitly with the intent of disruption are freely allowed on GitHub’s platform:

https://github.com/MatrixTM/MHDDoS

The difference between us is as follows: We caused accidental disruption on properties owned by the same company. The large assortment of DDoS tools that are available on GitHub openly cause deliberate disruption on the properties of others. Would they have taken this same action if they didn’t own NPM? I doubt it.

The final reason they state is this:

Specifically, the content or activity that was reported included adding thousands of packages with the explicit purpose of preventing package unpublishing, which we found to be in violation of our policies

Which is libelous. We’ve been abundantly clear from the very get-go that we never had the intent of preventing package unpublishing, and we reached out numerous times to NPM requesting they remove our package because it was causing this very issue. This claim is outrageous.

GitHub has responded to Laura’s article, saying that the packages have been removed because “We found the project to be in violation of GitHub’s Acceptable Use Policies”, despite never naming a single rule we broke that didn’t involve blatantly false information. They have not, to this day, responded to or even acknowledged our numerous futile attempts to contact them.

Conclusion

I’ve learned a lot of valuable things from all this. I’ve experienced my first moment in the public eye, I’ve experienced first hand the dangers of jumping to conclusions, and I’ve certainly become abnormally familiar with NPM internals. I’m also faced with a great deal of unanswered questions regarding the open source community and my role in it. I love creative projects, the sorta things we do just because we can, just to satisfy some little nudge in the back of our minds. Now knowing the damage those little impulses can cause, however, makes me scared to follow them again.

We may never know what would have happened had our VM-based attempt succeeded. That itself is a little frustrating, but obviously, I think we’re all a little afraid to touch this ever again. I suppose I’ve also learned when to give up and move on.
