• Date of Class: 2/12
  • First due: 2/19
  • Comments due: 2/26
  • Revisions due: 3/4

Our class started off by reviewing the main ideas of each reading, as well as their sources, to better establish the various points of view present in the discussion. The first reading, The Authors Guild v. HathiTrust, No. 156, was an official court document that presented the legal case and proceedings between the Authors Guild and HathiTrust. This document was written by the United States District Court, Southern District of New York. It is important to emphasize that this case took place in 2012, which was the beginning of the end for the Authors Guild v. Google Books case. The second document was Coda: A Short History of Book Piracy. This document described the history of piracy and the political influences on media with regard to the censorship of publishing, printing, and distributing text, with a focus on European history. The next document, titled Brief of Digital Humanities and Law Scholars as Amici Curiae in Authors Guild v. Google, was a supporting document defending Google. It went into the details of why Google Books was not infringing copyright law, and presented various arguments on the distinction between Google Books as a tool versus a library, between “fact” and “expression”, and on what counts as “fair use” in the world of digitization. The fourth document was a Wired article titled How Google Book Search Got Lost. This article analyzed what happened during the Authors Guild v. Google Books case, why Google Books vanished after the fact, and what Google Books should have done differently to avoid the legal battle in the first place. The next article, titled Authors Guild v. HathiTrust Litigation Ends in Victory for Fair Use, focused more on the digitization of books by libraries and the fair use involved in establishing digital libraries. The next document, titled Summary: Authors Guild v. Google Inc, was a document from the US Copyright Office that outlined the most important and basic facts of what the case was about and what happened. The last document was a research article titled Reconstructing Textual Documents from N-grams, which walked through a successful experiment that was able to reconstruct documents from very limited snippets of information.

Once we established the various topics, we began with the third reading (Brief of Digital Humanities and Law Scholars as Amici Curiae in Authors Guild v. Google) and discussed the difference between “reading” and “analyzing” a Google Books text. I personally did not realize that digitally analyzing or text mining a work would not be considered an act of reading, but rather a form of “non-expressive use”. The brief further argues that non-expressive use may lead “to additional expressive” models or visualizations of data, such as the Google n-gram viewer. However, the actual digitization of the books in order to create these databases is not a copyright infringement, because copying information to create metadata associates that data with facts and not with the author’s original expression. This concept then led us to discuss what we thought the difference between fact and expression was, and how viewing digitized text as metadata rather than normal text is one of the core reasons why Google Books was able to win the case. In reference to copyright law, this document stated that “protection granted to a copyrightable work extends only to the particular expression of an idea and never to the idea itself”, which led to my question of what qualifies as an author’s original expression. We discussed how different genres make it clearer what an author’s original expression is. For example, fictional literature is considered original expression because it is not factual information and is created by the author. However, when we started debating research articles and texts that quote factual information, we were unsure whether quoting, citing, or referencing other information in order to develop the author’s original expression should be considered copyright infringement or not.
In other words, is crediting the original source enough to prevent copyright infringement, or should there be other forms of attribution, such as monetary compensation? As students, we felt that providing monetary compensation every time we quoted or referenced texts in our papers would be impractical. However, when we changed the example from student papers quoting other sources to Google Books providing “snippets” (which are like quotes) of digitized books, providing monetary compensation to the original authors or publishers didn’t seem as impractical. We further discussed the issues around this gray area of how Google Books only provides “snippets” of information when used as a research tool and not a library, and started to become more convinced that the digitization process was not a copyright infringement.

Unfortunately, we were still left with a few loose ends. For example, when we brought the last reading into this discussion (Reconstructing Textual Documents from N-grams. Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining), we saw that its experiment proposed that it is possible to reconstruct texts from databases like Google Books that only provide “snippets”. This possibility of reconstructing works changed our perspective on how much power Google Books actually has as a tool, and on how it barely needs any licensing or permission in order to continue growing its database and digitizing more books. We began questioning why the Authors Guild did not have more convincing arguments to fight for the rights of authors and publishers, and discovered that the Authors Guild is a relatively small group in comparison to a company like Google. We proposed that their lack of larger representation may have been one of the reasons their arguments were not as convincing or strong, especially when taking into consideration the Americans with Disabilities Act, which was supported by Google and HathiTrust as a means to allow for the digitization of books and texts. Furthermore, when we transitioned to the Wired article, we noted that the case mainly focused on the custody of orphaned books (books out of print but under copyright) between the two entities. Since the authors of the books in question were all dead even though the copyright was still valid, we concluded that the lack of active representation may have weakened their argument.

However, my main question while reading this article was: why did the judge rule against the first compromise of the Google Books settlement? If it had been approved, authors and publishers would have been provided with monetary compensation, and the text would have been available as a digital database to use. In our discussion, we realized that if the settlement had been approved, Google would have agreed to provide monetary compensation but would not have been actively responsible for reaching out to authors and publishers to notify them that it had digitized their work. The more pressing issue was that private for-profit companies would have been able to do the same thing and profit from orphaned books, without being required to allocate monetary compensation to the authors of those books because the authors are no longer alive.

I related this point to how the music industry is also facing issues with copyright and digital distribution, in that digitization companies such as Spotify, Pandora, etc., are able to profit significantly from artists’ works while the actual artists do not receive nearly as much monetary compensation as they should. In private remix settings on SoundCloud, samples or “snippets” are taken or “quoted” from original artists in order to create a new work. Aside from attribution of credit to the original artists, there is no monetary arrangement or legal copyright case between a remixer and the original artist. On the other hand, when “snippets” of a song are used in a YouTube video, these cases are considered legal copyright infringements simply due to a change of digital interface and audience. The main difference between the two platforms in music distribution comes down to who gets paid for the content published, which is the root of the argument for why private for-profit companies would most likely take advantage of original content creators (artists, authors, etc.) if compromises like the Google Books settlement were approved. Furthermore, we brought up specific examples of privately owned music companies in the early 2000s that sold digitized songs and claimed to provide the original artists with monetary compensation, but did not go out of their way to do so. We also discussed how Taylor Swift’s legal battle with her management company prevented her from performing and profiting from her older songs, because all of her older recordings belonged to her record label and not to her, even though she is the creator. After going through similar examples and drawing comparisons between injustices in the music industry and the publishing industry, we were in agreement that the original Google Books settlement would have been very unjust. We also briefly mentioned how a similar settlement would cause a disruption in the patent industry.
The increase in digital tools allowing for the mass distribution and sharing of patents, texts, music, and visual art has allowed for more methods to claim creator and usage rights.

In regard to the rest of the Wired article, we also discussed why the Google Books project ended up vanishing after winning the legal case. Aside from the more than ten years spent in a legal battle, the emergence of ebooks, audiobooks, and new forms of digitization originating from the publishing industry, including online distribution and subscription services, meant that the time, money, and effort required to continue the Google Books project would lead to little payoff. In the end, the project would be more of an analysis and research tool than its original vision of a “library of utopia”. We also briefly touched upon how Google Books changed the publishing industry by opening up the possibility of making money from selling digital texts as well as physical copies. A proposed experiment from this reading would be to explore further how the digitization of traditionally physical media (text, music, art) has helped or harmed the labor forces in the publishing, printing, and distribution sectors.

Our conversation about private companies profiting if the Google Books settlement had been approved led us to the second reading about piracy (Coda: A Short History of Book Piracy. In Joe Karaganis, Media Piracy in Emerging Economies). Although we did not dive into the depths of this document, we focused on the two roles that pirate publishers played in history, as stated in this document: “they printed censored texts, and they introduced cheap reprints that reached new reading publics”. I brought up the question of whether or not today’s publishing companies censor works or require reprints in any way. We found that today’s publishing companies may not censor directly; however, in political settings, when a government official in the White House publishes a book (we specifically referenced the book about the Mueller Report), a procedure takes place to review the contents of the to-be-published text to make sure that classified information is not disclosed. With regard to requiring reprints, we discussed how textbooks need to be reprinted in updated editions in order to keep books in circulation and continue to bring in profits. Our last example of censorship concerned private companies such as Amazon. Although Amazon is not a publishing company, as a distribution service it does not sell any pro-Nazi texts. As a group, we agreed with this form of censorship, and understood that a company would not want to be associated with socially inappropriate opinions or viewpoints. However, while we did not have time to address this question, it should be asked whether private companies that distribute text or any form of media are socially responsible for similar forms of censorship or not.

With the little time left at the end, we very briefly concluded with the fifth reading, regarding the ability of libraries to digitize their texts and why Google Books and HathiTrust were as successful as they turned out to be in their cases against the Authors Guild (Authors Guild v. HathiTrust Litigation Ends in Victory for Fair Use. Association of Research Libraries Policy Notes). This was mainly due to the support for the Americans with Disabilities Act, in that a lack of digital access to ebooks or audiobooks in public libraries would be viewed as a denial of equal opportunity and as discrimination against individuals with disabilities. While we did not have enough time to go into depth, this argument was extremely pivotal in pushing for digital resources and databases in libraries, and in ruling out the digitization process as a copyright infringement. I personally believe that this argument was extremely important and ties directly into the Google Books case when discussing the terms of “fair use” of digitized texts. We can tie the “fair use” of texts for all individuals, regardless of physical or mental disability, to the universal human right to education. Not allowing digital access to text in order to provide an educational service for all individuals goes beyond the copyright infringement battles between Google Books or HathiTrust and the Authors Guild; it is a moral and ethical violation of human rights. Therefore, while this argument was used in a copyright infringement case, its value in terms of equal opportunity and education is far more significant and impactful in the long run as more texts and media formats become digitized.