Big publishers think libraries are the enemy

The recent Second Circuit decision in Hachette v. Internet Archive is only the latest battle in the war on libraries and the freedom to read.

Molly White

September 12th 2024, 3:40 pm — 18 min read


I’ve seen quips to the effect of “if public libraries were invented today, they’d be outlawed.” The joke is increasingly becoming reality, most recently thanks to a decision in the Second Circuit Court of Appeals.

Particularly in a country where we’re seeing rapidly intensifying campaigns against books, libraries, and librarians, I am extremely concerned by an outcome that not only imposes further limits on how libraries can provide books to the people who need them, but seems to view libraries as detrimental to society. We must fight to protect our rights to read freely, and fight back against the censorship, surveillance, and rent-seeking that publishers and book distribution platforms have been working to not only normalize, but protect by law.

My beliefs are simple, and hardly radical: Libraries are critical infrastructure. Access to information is a human right. When you buy a book you should truly own it. When a library buys a book, they should be able to lend it. Readers should be able to read without any third parties spying over their shoulders, or preventing them from accessing the materials they have legally obtained.

Linocut relief print, in black ink, of a person reaching for a book on a large bookshelf. Above it reads “Free people read freely”, and below: “Defend the Internet Archive” — Linocut relief print by Molly White (CC BY 4.0)

On September 4, the Second Circuit Court of Appeals upheld most of^a a lower court’s March 2023 ruling that aspects of the Internet Archive’s digital booklending program violate copyright law. The case was brought in June 2020 by four publishing powerhouses: Hachette, HarperCollins, Wiley, and Penguin Random House. And when I say powerhouses, I mean it — Hachette, HarperCollins, and Penguin Random House are three of the “big five” publishers who (along with Macmillan and Simon & Schuster) collectively controlled around 80% of the trade market for books in the United States as of 2022.¹ Hachette and the other plaintiff publishers have argued that, by lending out one-to-one digital copies of books they have legally purchased, the Internet Archive’s Open Library is infringing upon the publishers’ copyright and damaging their sales. And, without any evidence of actual harm to the publishers, the Second Circuit went right along with it. They also went a step further, again without evidence, to suggest that libraries are inherently detrimental to society.

The Internet Archive is a digital library, archive, and unsung hero of the web. Best known for its Wayback Machine, the service that crawls much of the web to preserve copies of hundreds of billions of webpages on hundreds of petabytes of storage, the non-profit Internet Archive also operates a whole host of other services. Among them is the Open Library: a project with the lofty goal of creating a webpage for every book ever published. Ideally, these pages will all contain full-text versions of the books, print copies of which are each either purchased by or donated to the Internet Archive, and then scanned and made searchable. Many of these books are (or were, until this case) available in the Archive’s controlled digital lending program, a model that replicates the traditional one-to-one library lending model for physical books. With the Open Library’s CDL program, a scanned copy of any given book is loaned to a single patron for a period of up to two weeks, and during that time, the physical book^b and its digitized version are unavailable to others until the book is returned. These digital books utilize digital rights management (DRM) software to prevent patrons from creating and re-sharing their own copies.
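To make the one-to-one constraint concrete, here is a minimal Python sketch of an owned-to-loaned tracker. It is an illustration only, not the Open Library's actual implementation: the class name `ControlledDigitalLending`, its methods, and the fixed 14-day period (standing in for "up to two weeks") are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

# Assumption: a fixed 14-day loan, standing in for "up to two weeks".
LOAN_PERIOD = timedelta(days=14)


@dataclass
class ControlledDigitalLending:
    """Hypothetical owned-to-loaned tracker (not the Open Library's real code)."""

    owned_copies: dict[str, int]  # title -> number of physical copies owned
    loans: dict[str, dict[str, date]] = field(default_factory=dict)  # title -> {patron: due date}

    def _expire(self, title: str, today: date) -> None:
        # Drop loans whose due date has arrived, freeing those copies.
        active = self.loans.setdefault(title, {})
        for patron in [p for p, due in active.items() if due <= today]:
            del active[patron]

    def borrow(self, title: str, patron: str, today: date) -> bool:
        # A loan succeeds only while active loans < copies owned.
        self._expire(title, today)
        active = self.loans[title]
        if patron in active or len(active) >= self.owned_copies.get(title, 0):
            return False
        active[patron] = today + LOAN_PERIOD
        return True

    def give_back(self, title: str, patron: str) -> None:
        # Returning early frees the copy for the next patron.
        self.loans.get(title, {}).pop(patron, None)
```

With one owned copy, a second patron's `borrow` call returns `False` until the first patron returns the book or the loan period lapses, which is the same limit a physical shelf imposes.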

While the Open Library program offers similar benefits to library e-book programs, and digital scans of physical books share some similarities with e-books, these things are crucially not the same. For one, there are no geographical or institutional requirements to access materials offered through the Open Library, unlike regional public libraries that typically require proof of residency within that library’s territory, or academic libraries that require university affiliation. There is also, critically, no large-scale surveillance of readers akin to what is happening via many traditional e-book providers. Secondly, the Open Library makes it possible to link directly to a book: something perhaps dismissed as trivial, but which is truly invaluable when it comes to providing references that a wide audience can actually check. Thirdly, although it was overlooked by the court in this decision, the scanned books are not one-to-one replacements for e-books, which tend to be much easier to read, and come with bells and whistles that allow you to do things like adjust the appearance (font size, color scheme, etc.), navigate throughout the book from a table of contents, view endnotes inline, and navigate to links from the book text.

A scan of a somewhat yellowed book, showing the dedication (“for Sally”) and the first page of Hitchhikers Guide to the Galaxy. — A page from the Open Library’s copy of *The Hitchhiker’s Guide to the Galaxy*. E-books distributed by publishers tend to be a lot more reader-friendly.

Digital booklending might seem like something that ought to be simple and beneficial to all parties — readers, authors, libraries, booksellers,^c and, yes, publishers — and it ought to be! It is not.

Rather than implementing their own expensive and labor-intensive lending systems, libraries typically sign up with a provider like Overdrive (the creator of the Libby app) or Hoopla. Through these platforms, they purchase metered e-book licenses at rates that are typically multiple times what they would spend on a physical copy of the same book.^d These licenses permit the libraries to lend out their e-books, typically, to a single patron at a time per copy, for a fixed number of times or for a fixed duration. This is ostensibly to mimic the wear and tear on typical physical books that forces libraries to periodically purchase new copies, but in reality seems to reflect hypothetical wear and tear on books if they were made of tissue paper and loaned only to people who promise to exclusively read them in the bathtub.^e Other restrictions may also apply: for example, some publishers only allow each library to purchase a single e-book copy of newly released books, for fear of libraries “cannibalizing” their print sales.² Some e-book publishers do not offer library licenses on any terms whatsoever.³ The whole model is premised on the idea that libraries and their patrons are the enemy of publishers — and, by extension, the authors they claim to represent.
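The metered model can be sketched the same way. The Python below is a hypothetical illustration, not any vendor's real licensing logic: `MeteredLicense` and its fields are invented for this example. The only figures taken from this piece are the ones in the footnotes: a 26-loan cap and duration-based terms of 12 or 24 months.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class MeteredLicense:
    """Hypothetical e-book license: one patron at a time, capped by loans and/or date."""

    max_loans: Optional[int] = None    # e.g. 26 for a loan-count cap
    expires_on: Optional[date] = None  # e.g. purchase date plus 12 or 24 months
    loans_used: int = 0
    checked_out: bool = False

    def can_lend(self, today: date) -> bool:
        if self.checked_out:
            return False  # single patron at a time per licensed copy
        if self.max_loans is not None and self.loans_used >= self.max_loans:
            return False  # loan cap exhausted; the library must buy again
        if self.expires_on is not None and today >= self.expires_on:
            return False  # term expired, however few times it was lent
        return True

    def lend(self, today: date) -> bool:
        if not self.can_lend(today):
            return False
        self.loans_used += 1
        self.checked_out = True
        return True

    def give_back(self) -> None:
        self.checked_out = False
```

Unlike a physical book, which a library can keep lending for as long as it survives, a license modeled this way stops working at the cap or the expiry date regardless of condition.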

Publishers treat this new^f e-book lending model as some sort of natural law of How Digital Books Must Be Loaned rather than a horrendously extractive scheme they’ve recently come up with themselves, to benefit themselves at everyone else’s expense. To be clear: this model is not something enshrined in law (yet) or based in the fundamental principles behind copyright (a legal concept which, I must point out, was designed not for the purposes of enriching publishers or even authors, but rather to promote the progress of arts and science). Today’s e-book lending is a system created by the publishers, for the publishers, and it is one which those publishers are now working hard to codify and protect.

This e-book lending model is also nothing like the model for physical booklending in the United States, where a library can lend out any book they want, whether they purchased it new directly from a publisher or bookseller, purchased it used, received it as a donation, or, hell, found it on the side of the road. They own the book, they can lend the book, no further discussion necessary.^g There is no special expensive “library license” required,^h because the rightsholder has sold that copy of the book and, in doing so, exhausted their rights to the item.

In fact, by fighting CDL, publishers are seeking to overstep the established boundaries of intellectual property law to exert continued control over an item that has already been purchased from them. And they are seeking to diminish the critical rights of readers to read the books they want without being subjected to censorship and surveillance. This is part and parcel of other attempts by digital publishers (of books, but also of films, video games, and other media) to turn media purchases into rentals, so as to extract endless money and private data from their customers.

A LED sign truck reads “No tech reading over abortion patients’ shoulders” on the side, and “Defend the Internet Archive” on the back panel. — (Brandon Colbert, via Fight for the Future)

The publishers in the Hachette case are doing this by arguing that the Open Library, and other CDL programs, are infringing upon their copyrights by making unauthorized copies. The Internet Archive rightly argues that their copying constitutes fair use, a complex doctrine that involves tests around the purpose and character of the use, the nature of the copyrighted work, the amount and substantiality of the copying, the effect upon the copied work’s value, and the benefit to the public.

The court had to do some pretty weird gymnastics to convince themselves that the Open Library’s scanning was not transformative, that the Library is copying more than is necessary, and finally that the lending hinders publishers’ abilities to sell books. Once they convinced themselves it wasn’t transformative, they then leaned on that heavily throughout the rest of the case, rather than considering other arguments properly.

Mike Masnick at TechDirt has already picked apart the court’s fair use analysis in detail, so I won’t repeat his excellent work, but I do want to linger on the final point of the analysis for a moment: The court decided not only that the digital copies “function as a competing substitute” for e-books, but that widespread lending in this fashion “would decimate Publishers’ markets for the Works in Suit across formats.” And to be clear, the widespread lending that we’re talking about is the very same type of lending performed by traditional libraries with physical books. They are saying libraries will decimate publishers’ markets.

Mind you, the Internet Archive provided very convincing evidence to challenge the publishers’ unsupported claims that the lending program was causing them harm. The Archive cited an expert economist who performed a study that found there was no measurable impact from the CDL program on the demand for e-books from libraries — and even no measurable impact during a brief period when the Internet Archive removed the one-to-one limits on digital lending (more on that in a moment). An expert library administrator told the court she had no awareness of any incidents throughout her decades-long career in which the existence of a book in a CDL program impacted libraries’ decisions to purchase e-book licenses. And when looking at retail e-book sales, the economist found no support for the claim that people were buying fewer e-books when those books were available via the Open Library. A second expert economist performed a similar analysis on print sales and reached the same conclusion. The Internet Archive’s conclusion on this point was convincing:

The absence of any effect on Publishers’ markets for book sales makes sense. First, when assessing the effect of controlled digital lending, the relevant comparison is traditional library lending, not no lending at all. Publishers’ complaint that people will not buy books they can borrow for free applies to all library lending. Second, unlike traditional libraries, which lend books immediately after publication, IA waits five years before lending—after most of the book’s lifetime sales have already occurred. (90% of lifetime sales to date for most Works in Suit occurred in first five years). Third, borrowing a book may actually increase sales through the “discovery effect” when borrowers who enjoyed a book buy a copy or recommend it to others.

Despite that, the Second Circuit decided, without much in the way of convincing explanation, that the expert opinions were “ill supported”, and continued:

Although they do not provide empirical data of their own, Publishers assert that they (1) have suffered market harm due to lost eBook licensing fees and (2) will suffer market harm in the future if IA’s practices were to become widespread. .... We agree with Publishers’ assessment of market harm.

In other words, they didn’t think the Internet Archive’s evidence was strong enough, but the publishers’ “we said so” was perfectly adequate.

Wheatpaste posters with an owl that say: “Defend the Internet Archive. We need digital libraries. Let them own books. BattleForLibraries.com” — (Randi Rosenblum, via Fight for the Future)

Possibly the worst part of this decision, however, is the analysis and rejection of the Internet Archive’s argument that their lending provides public benefit that would outweigh market harm to the publishers, even if that harm was real.

The list of public benefits from the Open Library is too long to reasonably reprint here, though I will provide a few testimonials. Others are available on the Internet Archive’s blog, Twitter account, and on BattleForLibraries.com:

“Books in Vietnam are significantly less accessible and my economic background doesn’t allow me to afford these things.” — Tran in Vietnam
“Internet Archive gives me access to scholarly information that is not afforded to those outside of the post-secondary education system. The Internet Archive helps bridge the gap when it comes to literacy, comprehension of history, and the discovery of new works that are otherwise gate-kept from the average person.” — Tamia in Canada
“Most of literature I’ve been using from IA are ones I couldn’t find in my city’s library, either public or academic. Without IA, my academic progress would be halted.” – Poppy in Indonesia
“Internet Archive had everything I needed to go through college, whilst not having ANY library available in my home country and with college books costing hundreds of dollars on top of import fee and taxes (which alone could be the salary of a person here).” – Jefferson in Nicaragua
“Internet Archive allows me to search a large number of books by keyword/name and it triggered my buying a lot of hard copies of books I would have never even known existed.” – Chloe in the United Kingdom

Others have cited relying on the Open Library to get access to banned books, provide links for fact checking, access books that are not available for purchase in their country, or get digital access to print books they cannot read due to print disabilities or disabilities that prevent them from going to a physical library.

Personally, I rely on the Open Library extremely heavily as a Wikipedia editor. Wikipedians are all volunteers, and in addition to not being paid for our work, there is no stipend with which to purchase reference material. I do routinely purchase reference material out of pocket, but especially now that I am self-employed, the number of books I reference would far outpace my budget if I were to purchase them all, and so I rely heavily on libraries. My local public library and programs like the Wikipedia Libraryⁱ go a long way, but I regularly come across sources that aren’t available in those places (either already cited within Wikipedia articles but in need of verification, clarification, or expansion, or that I am hoping to use for new writing).

To give a concrete example, I wrote a Wikipedia article on the Come!Unity Press collective two weeks ago that I would not have been able to write without access to a copy of an issue of Signal: a Journal of International Political Graphics & Culture (which, fortunately, is not among the roughly 500,000 works the Internet Archive has already been forced to remove from the Open Library). While writing it, I came across a second Wikipedia article that mentioned the collective, which contained a reference I was unable to verify because the book it cited was removed, isn’t available through my public library, and would cost me at least $30 plus shipping just to do a 30-second reference check.

Person arguing with me in DMs that the Hachette decision is good for authors and not detrimental to readers who can just use a library: “I’m predicting that your IA experience will not be significantly diminished.”

Me, literally yesterday: pic.twitter.com/odqcmxVgsg
— Molly White (@molly0xFFF) September 4, 2024

Despite the truly unquantifiable benefit to the public, the Second Circuit decided:

Within the framework of the Copyright Act, IA’s argument regarding the public interest is shortsighted. True, libraries and consumers may reap some short-term benefits from access to free digital books, but what are the long-term consequences? If authors and creators knew that their original works could be copied and disseminated for free, there would be little motivation to produce new works. And a dearth of creative activity would undoubtedly negatively impact the public. It is this reality that the Copyright Act seeks to avoid.

In other words: even though libraries have been around far longer than the Copyright Act itself, libraries are now a threat to authors. The true meaning is clear: publishers’ abilities to extract exorbitant rents and exert control over readers outweigh the incredible benefits of increased public access to books.

Addendum: National Emergency Library

Publishers have seized on a brief program by the Internet Archive to vilify their controlled digital lending program, and it seems they have had some success in misleading the public on this point, so it is necessary to address it here.

The National Emergency Library was a decision by the Internet Archive to lift the one-to-one lending restrictions on its online catalog during the early stages of the COVID-19 pandemic, when many libraries were closed. On March 24, 2020, the Archive announced that for a temporary period:

Users will be able to borrow books from the National Emergency Library without joining a waitlist, ensuring that students will have access to assigned readings and library materials that the Internet Archive has digitized for the remainder of the US academic calendar, and that people who cannot physically access their local libraries because of closure or self-quarantine can continue to read and thrive during this time of crisis, keeping themselves and others safe.

On June 16, the Archive ended the program two weeks earlier than intended due to the lawsuit from the publishers. Many have suggested that the Internet Archive provoked the lawsuit by implementing the NEL, and that if only they hadn’t pissed off these powerful publishers, this wouldn’t have happened.

For one, that gets the history wrong. As Gio points out, the plaintiffs themselves have acknowledged that they were preparing their lawsuit against the Internet Archive well before the NEL:

As a point of clarity, we sued Internet Archive on June 1, 2020, for its entire practice of “controlled digital lending,” not only the extra-extreme version that it rolled out in March 2020 with its hyperbolic “National Emergency Library” (NEL) and shut down on June 16, 2020, shortly after the U.S. Copyright Office suggested it was likely outside the bounds of fair use. We previewed a suit in February 2019 with this public statement, which regrettably was ignored. When the pandemic hit, the underlying suit was already being prepared.

Furthermore, the opinion extends to one-to-one controlled digital lending, not just the more extreme lending through the NEL. Even if the court decided the NEL lending was disallowed, that is not itself a reason to overstep and prohibit controlled digital lending entirely. “They had it coming” is typically not a great legal theory.

There is some extremely weird victim blaming happening around this case, like the Internet Archive shouldn’t have been walking down that alley wearing such a short skirt.
— Eva (@evacide) September 4, 2024

Artificial intelligence

This decision is coming at a strange time, as AI companies have been openly training models on every scrap of content they can get their hands on, copyrighted or not. This, quite understandably, rubs a lot of people the wrong way.

Indeed, there are court battles playing out between AI companies and the publishers, authors, artists, and others who contend that these companies are infringing their copyrights. Many of these suits have been dismissed before trial, and those that remain have not yet gone to trial.

While I agree that this just feels wrong, I disagree that copyright is the tool with which to protect artists and writers against non-consensual AI scraping. Copyright has, generally speaking, been a bad deal for actual creators, and it is the media monopolies that have reaped the benefits of copyright expansions. As Cory Doctorow writes:

Under these [monopoly] conditions, giving a creator more copyright is like giving a bullied schoolkid extra lunch money. It doesn't matter how much lunch money you give that kid – the bullies will take it all, and the kid will still go hungry (that's still true even if the bullies spend some of that stolen lunch money on a PR campaign urging us all to think of the hungry children and give them even more lunch money).

If any plaintiffs prevail in these copyright suits, it is the big tech companies and publishing conglomerates that will benefit — not creators. Creators need true worker protections, and should not buy the story that copyright will somehow protect them this time when it has demonstrably done the opposite in the past.⁴

It could be worse

There’s a lot of bad news in the Hachette decision. I am both devastated and terrified by it. I am hoping the Internet Archive will appeal to the Supreme Court, but I am also extremely cynical about this Supreme Court’s ability to make any good decisions, and frightened by the possibility they could set damaging precedent.

However, although there is a lot of bad news, it is not all bad. I wouldn’t say any of it is really good news, per se — but it could be worse.

Some interpreted this recent news about the lawsuit to mean that the Internet Archive or the Open Library will be shutting down wholesale. I haven’t seen anything that suggests that, and would be surprised if it came to that as a result of this case. Beyond this case, there are constant threats to the Internet Archive (such as a separate lawsuit from a group of music industry giants^j seeking $400 million in damages) that could be existential. As I’ve pointed out elsewhere, the Internet Archive’s whole existence pushes the boundaries of copyright law, and so threats like this come with the territory. But hopefully, and with our support, they will continue to weather the storm.

Additionally, although around 500,000 books have been removed from the Open Library’s lending program (including 1,300 banned books) at publishers’ request, many still remain. The Internet Archive is still able to make the removed books available via programs including interlibrary loan and their project to provide access to those with qualified print disabilities. The Archive is also still able to display short previews of removed books, such as where Wikipedia citations reference a specific book page. Finally, the decision does not impact the lending of books that do not have e-book versions offered for sale.

We still need to fight like hell to reverse this decision, preferably not just by seeking to have it overturned in the courts, but by proactively enshrining in law the right for people to read freely, and creating properly equitable protections for writers and other creators that do not pit them against those seeking to enjoy their work. Despite what publishers might like you to believe, readers and libraries are not threats to authors^k — they are allies.

Footnotes

^a The appeals court did, thankfully, overturn the lower court’s ruling that, by soliciting donations as a non-profit, everything the Internet Archive does becomes “commercial activity”. This is at least a moment of sanity in an otherwise highly problematic decision, as the original ruling would have, if upheld, turned practically any activity by non-profits into “commercial activity”. However, the finding that the Internet Archive’s activity is non-commercial did not ultimately help much as far as the remainder of the decision, as non-commerciality is not the sole determiner of fair use.
^b Physical copies of books owned by the Open Library are put into archival storage and not made available for physical lending. Other libraries that implement CDL sometimes lend both digital and physical copies of the same book, making the physical copy unavailable for lending while borrowed digitally, and vice versa. The Open Library partners with some of these libraries to increase the number of copies of books already within its collection that are available for lending.
^c The predatory model also extends to e-books purchased directly by consumers, rather than borrowed through libraries, but that’s off topic for this piece.
^d A 2019 Congressional brief by the American Library Association gave two examples of what they describe as “abusive” pricing practices for libraries by e-book publishers. An individual seeking to purchase an indefinite-access e-book copy of the 1967 non-fiction book The Codebreakers could get it for (the still incredibly high price of) $59.99; a two-year library license to lend one copy of the book to a single person at a time was $239.99. An individual could purchase an e-book version of the 2014 novel All the Light We Cannot See for $12.99; a two-year library license was $51.99.³ A comparison published by the Timberland Regional Library in 2024 likewise found markups for two-year library e-book licenses to generally be around 3–4× the cost of a physical copy.⁵
^e HarperCollins permits libraries to lend e-books only 26 times before they self-destruct. Other publishers offer duration-based e-book licenses, which are often 12 or 24 months.
^f And it is very new, despite those fond of arguing that CDL is somehow a violation of “how we’ve always done things”. Publishers used to sell indefinite licenses to e-books in their catalog for approximately the same price as physical copies; in 2013 they jacked up prices; and in 2018 or so they largely switched to metered access models (without reducing said jacked-up prices).⁶
^g Well... no further copyright discussion necessary. Sadly there are other discussions libraries often have to endure, particularly in recent years, and particularly if a book dares to suggest that queerness or racism are things that exist.
^h I have come across a few people who think this is a thing, and my best guess is that they have developed this belief after coming across “library editions” of books. These are typically hardcover copies that are printed to be more durable than typical retail copies to stand up to heavier use. They do tend to be a little more expensive, but it has nothing to do with rights.
^i The Wikipedia Library is a wonderful program by the Wikimedia Foundation that partners with academic publishers, newspapers and newspaper archives, and other publishers of paywalled materials to give active Wikipedia editors free access to the types of sources that typically require university affiliation.
^j Noticing a pattern?
^k The weird framing of “readers vs. authors” also ignores the rather plain fact that many if not most authors are themselves heavy readers. Many also rely heavily on tools like the Open Library for research.

References

¹ “The planned Penguin Random House-Simon & Schuster merger has been struck down in court”, Vox.
² “You May Have To Wait To Borrow A New E-Book From The Library”, NPR.
³ “Competition in Digital Markets”, American Library Association brief before the U.S. House of Representatives Committee on the Judiciary.
⁴ Further reading on copyright, protections for artists, and AI: Chokepoint Capitalism by Rebecca Giblin and Cory Doctorow. “How Allowing Copyright On AI-Generated Works Could Destroy Creative Industries”, TechDirt. “Stop Rushing To Copyright As A Tool To ‘Solve’ The Problems Of AI”, TechDirt. “If Creators Suing AI Companies Over Copyright Win, It Will Further Entrench Big Tech”, TechDirt.
⁵ “The Real Costs of Digital Content: eBook and Digital Audiobooks”, Timberland Regional Library.
⁶ “eLending position paper”, Readers First.

Social share image is “Free people read freely”, a linocut print by Molly White, CC BY 4.0.
