by Alex Becker

Python Packages Since 2005

One of the cool things about building a PyPI mirror is having so much data about the Python ecosystem at my fingertips. I decided to explore how the ecosystem has been evolving since PyPI was created in 2003. Most of my analysis starts at 2005, since that's when PyPI added the upload_time field.

The Python ecosystem has grown steadily throughout this period. After the first few years of hypergrowth, as PyPI gained near-full adoption in the Python community, the number of packages actively developed each year (meaning they had at least one release or new distribution uploaded) has increased by 28% to 48% every year.
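For readers who want to reproduce this kind of count, here is a minimal sketch of the "active packages per year" metric and its year-over-year growth. It assumes the upload records have already been extracted as (package name, upload time) pairs. The function names and the toy data are hypothetical, not taken from the actual analysis.

```python
from collections import defaultdict
from datetime import datetime


def active_packages_per_year(uploads):
    """Count packages with at least one upload in each calendar year.

    `uploads` is an iterable of (package_name, upload_time) pairs,
    where upload_time is a datetime (an assumed input format).
    """
    active = defaultdict(set)
    for name, upload_time in uploads:
        active[upload_time.year].add(name)
    return {year: len(names) for year, names in sorted(active.items())}


def yearly_growth(counts):
    """Year-over-year growth as a fraction, for consecutive years only."""
    years = sorted(counts)
    return {
        year: counts[year] / counts[prev] - 1
        for prev, year in zip(years, years[1:])
        if year == prev + 1
    }


# Toy data, purely illustrative.
uploads = [
    ("alpha", datetime(2005, 3, 1)),
    ("alpha", datetime(2005, 9, 1)),
    ("beta", datetime(2005, 6, 1)),
    ("alpha", datetime(2006, 1, 1)),
    ("beta", datetime(2006, 2, 1)),
    ("gamma", datetime(2006, 5, 1)),
]
counts = active_packages_per_year(uploads)
growth = yearly_growth(counts)
print(counts)
print(growth)
```

Using a set per year means a package with many uploads in one year is still counted once, which matches the "at least one release or new distribution" definition above.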

Active Python Packages 2005-2018

As this graph shows, the majority (~66%) of actively maintained packages each year are new, and the majority of those do not continue to be maintained. However, there is still steady and robust growth in the number of packages maintained for more than 1 year. Growth in releases has been even stronger, at 31% to 59% every year, although it has slowed down somewhat. Because releases are growing faster than active packages, packages are getting more releases on average, which is a decent proxy for becoming more mature and better-maintained.
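The split between new and previously existing packages can be sketched in a few lines. This is only an illustration under assumptions: a package counts as "new" in the year of its first recorded upload and as "returning" in any later year in which it has an upload. The function name and the toy data are hypothetical.

```python
from datetime import datetime


def split_new_and_returning(uploads):
    """For each year, count active packages that are new vs. returning.

    `uploads` is an iterable of (package_name, upload_time) pairs
    (an assumed input format).
    """
    first_year = {}
    active = {}
    for name, upload_time in uploads:
        year = upload_time.year
        # Track the earliest year in which each package appears.
        first_year[name] = min(year, first_year.get(name, year))
        active.setdefault(year, set()).add(name)

    result = {}
    for year, names in sorted(active.items()):
        new = sum(1 for n in names if first_year[n] == year)
        result[year] = {"new": new, "returning": len(names) - new}
    return result


# Toy data, purely illustrative.
uploads = [
    ("alpha", datetime(2005, 3, 1)),
    ("beta", datetime(2005, 6, 1)),
    ("alpha", datetime(2006, 1, 1)),
    ("beta", datetime(2006, 2, 1)),
    ("gamma", datetime(2006, 5, 1)),
]
split = split_new_and_returning(uploads)
print(split)
```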

Releases per Year 2005-2018

The most surprising result I stumbled upon came from ranking packages by their number of releases. Some of the top entries I was expecting based on my personal experience upgrading dependencies frequently, such as AWS's botocore at #15. But the cryptocurrency trading library ccxt immediately stuck out like a sore thumb. With 4659 releases at the time of writing, it has more than 3 times as many releases as any other package, despite being less than 2 years old! Its page usually times out after 30 seconds when I try to load it. I am not sure whether this is excellent or terrible maintainership, but it is certainly impressive.

Another interesting thing to look at is how practices around distributing Python packages are changing. The biggest change was of course the release of Python 3. Binary wheels, introduced in 2012 and codified in PEP 427, are generally accepted as the best way to distribute Python packages. But adoption among package authors has taken time. The Python Wheels site tracks the adoption of wheels among PyPI's 360 most downloaded packages; at the time of writing, 82.5% of them include wheel distributions in their releases. The long tail of the other >60k packages is lagging a bit behind, but the percentage of all releases that include at least 1 wheel distribution is growing steadily and just crossed 50% in 2018.
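As a rough sketch of the "fraction of releases with at least 1 wheel" metric, the snippet below groups uploaded files by release and checks whether any of them is a wheel. It assumes each file record carries a `packagetype` value of `"bdist_wheel"` for wheels, as PyPI's JSON API reports. The tuple layout, the function name, the choice to date a release by its earliest upload, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def wheel_fraction_by_year(files):
    """Fraction of releases per year that include at least one wheel.

    `files` is an iterable of (package, version, upload_year, packagetype)
    tuples (an assumed input format). A release is assigned to the year
    of its earliest uploaded file.
    """
    release_year = {}
    has_wheel = defaultdict(bool)
    for package, version, year, packagetype in files:
        key = (package, version)
        release_year[key] = min(year, release_year.get(key, year))
        if packagetype == "bdist_wheel":
            has_wheel[key] = True

    totals = defaultdict(int)
    with_wheel = defaultdict(int)
    for key, year in release_year.items():
        totals[year] += 1
        with_wheel[year] += has_wheel[key]  # True counts as 1
    return {year: with_wheel[year] / totals[year] for year in sorted(totals)}


# Toy data, purely illustrative.
files = [
    ("alpha", "1.0", 2017, "sdist"),
    ("alpha", "1.1", 2018, "sdist"),
    ("alpha", "1.1", 2018, "bdist_wheel"),
    ("beta", "0.1", 2018, "sdist"),
]
fractions = wheel_fraction_by_year(files)
print(fractions)
```

Counting per release rather than per file keeps a release with many platform-specific wheels from being weighted more heavily than one with a single universal wheel.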

Fraction of Releases with Wheels

Not every package will be distributed as a wheel; in particular, psycopg2 will soon stop publishing wheels due to conflicts between the bundled libssl and the system's pre-existing libssl. But very few packages have such a reason not to be distributed as wheels, so I expect adoption to stay strong until it reaches 90% or more, which this graph suggests will happen by the end of 2022.