COPYRIGHT: M02T04 - Compilation Challenges in Databases

Special Challenges of Compilations in Databases

  • Databases present unique challenges at the intersection of information access, data extraction, compilations, and copyright law, particularly magnified by the emergence of big data and advancements in artificial intelligence.

Key Terms and Definitions
  • The 1976 Copyright Act lacks specific definitions for "data" or "databases," leading to ambiguity in their legal treatment.

  • Compilations: Defined in Section 101 of the Copyright Act as works formed by the collection and assembling of pre-existing materials or data that are selected, coordinated, or arranged in such a way that the resulting work as a whole constitutes an original work of authorship. Originality necessitates creativity in the selection, coordination, or arrangement of data.

  • Feist Publications, Inc. v. Rural Telephone Service Co. (1991): This landmark case established that the "sweat of the brow" doctrine does not confer copyright protection; originality requires creativity and not merely effort or labor.

Originality in Compilations
  • The scope and nature of choices made in the arrangement of elements are critical factors in determining a database's originality and, consequently, its copyright protection.

  • Although the originality rule seems straightforward, its application is often ambiguous and contentious in practice.

Hypothetical: AI-Generated Art
  • Scenario: A client commissions a painting, titled "The Next Rembrandt," created using artificial intelligence (AI) by analyzing and replicating elements from Rembrandt's existing works. Assume all original Rembrandt paintings are still protected by copyright.

  • Question 1: Does the AI-generated painting qualify as an original work of art?

    • If the AI-generated work does not copy protected elements from the original Rembrandt paintings and exhibits a modicum of creativity in its composition, it may be deemed an original expression.

    • A thorough investigation is essential to ascertain that the AI-generated painting is not substantially similar to any specific copyrighted Rembrandt works.

  • Question 2: Does the AI-generated painting meet the criteria of a compilation, considering it was created using an AI program trained on a database of all identified Rembrandt paintings?

    • If the AI-generated work involves arranging or combining elements from authorized Rembrandt paintings, it could be classified as a compilation.

    • Copyright protection for such a work would be limited: it would reach only unauthorized reproduction of the work as a whole and would extend only to the selection and arrangement of the pre-existing works.

The Database Distinction
  • Factual data within databases, if consisting purely of facts and not pre-existing copyrighted works, generally does not qualify for copyright protection.

  • Feist underscores a tension between the unprotected status of facts within a database and the potential copyright protection afforded to the compilation itself.

  • Addressing and resolving this distinction presents ongoing challenges in copyright law.

Sections 101 and 103 Emphases
  • Section 101: Focuses on the protectability of compilations, emphasizing that the selection, arrangement, and coordination of data must qualify as original to warrant copyright protection.

  • Section 103: Clarifies that an author of a compilation can only protect their original contributions to the database.

  • This framework results in a "thin copyright," wherein the use of unprotected data cannot be legally restricted.

Creation of Databases and AI-Generated Works
  • Compilations typically incorporate pre-existing materials; incorporating copyright-protected works into a database does not grant additional rights over those pre-existing materials.

  • Many AI-generated databases rely on others' works for machine learning purposes, potentially leading to copyright infringement.

AI-Generated Work: The Next Rembrandt Revisited
  • Further consideration of the database created to train the AI system for producing "The Next Rembrandt."

  • Question: Does the database itself possess originality?

  • If the database creation involved no choices in terms of content inclusion, as the intention was to create a painting in the style of Rembrandt, the resulting work may not be genuinely new.

  • Taxonomies are significant; absent choices in what should be included, the database may not be eligible for even minimal copyright protection.

Big Data and Compilations
  • Databases play a crucial role in big data applications, such as those used for targeted advertising. The copyrightable nature of these compilations becomes particularly complex when they consist primarily of factual data.

  • Section 101's definition of compilations raises intricate questions about originality, especially when the compilation comprises facts.

Experian Case Analysis
  • Experian's Case: Experian compiled personal data (e.g., name, address, purchasing history) for use in marketing campaigns.

  • Value as a Criterion: The Court clarified that value is not the determinant for copyright protection; originality is paramount.

  • Feist Standard: The standard emphasizes creativity in the selection, arrangement, and presentation of data.

Experian's Data Compilation Process
  • Data Compilation: Experian compiled data from multiple reliable sources, verified those sources, and checked for inconsistencies.

  • The court found Experian's Compiled Consumer View Database (CVD) protectable.

  • Data Removal: Experian removed data for individuals deemed not valuable for marketing (e.g., those in prison or the very elderly).

Originality in Experian's Database
  • The court emphasized the need to select actual data and not accept all data presented.

  • Number of sources used and determination of which data to include for specific purposes were critical.

  • The court cited the directory of businesses of interest to the Chinese-American community in Key Publications as a favorable example.

  • Experian made choices in data selection rather than using all data from a single source.

Sweat of the Brow vs. Creative Choices
  • Rejection of "Sweat of the Brow": The "sweat of the brow" doctrine—industrious collection—was rejected as a basis for originality in Feist.

  • Culling Process: The question arises whether Experian's culling process simply represents another version of the rejected "sweat of the brow" doctrine.

  • Creative Choices: It is questionable whether Experian's choices regarding inclusion in a consumer marketing database constitute true creativity, particularly considering the low level of choice involved (removing erroneous names, inmates, and the elderly).

  • Treatment of Culling: A critical question is whether the treatment of the culling process in Experian aligns with Maclean Hunter.

  • Threshold of Originality: The fundamental question is whether any level of choice can elevate a database to the threshold of originality.

Revisiting the Next Rembrandt
  • Question 1: If the database used to create the painting were composed of all known Rembrandt paintings, would it qualify for copyright protection under Experian's reasoning?

    • The key consideration is whether there is sufficient selectivity to demonstrate creativity.

  • Question 2: If the painting itself is considered a compilation, would it meet the originality standard for compilations?

Database Originality: The Next Rembrandt Case
  • Database Creation: The creation of the database might qualify for originality if choices were made in determining which works are authentic Rembrandts. This includes whether to include or exclude works attributed to, but not confirmed as, authentic.

Protectability of the AI-Generated Portrait
  • The portrait should be protectable as a compilation because choices were made to create a work not substantially similar to pre-existing works.

  • Since it doesn't look like any single painting, the AI system likely drew elements from various sources in the database.

  • Whether it ultimately qualifies for copyright protection, given its AI authorship, is a separate question.

Copyright Issues with AI-Generated Content
  • Creating a new AI-generated Fred Flintstone cartoon raises serious issues because the cartoons forming the database are copyright protected.

  • Using copyrighted works without permission to create a database for an AI program can violate the copyright holder's rights.

  • If the database violates copyright in the pre-existing works, the new Fred Flintstone cartoon also violates those rights.

Database Taxonomies
  • Taxonomies and classification schemes are not automatically excluded from copyright eligibility as systems or processes under Section 102(b).

Protectable Taxonomies
  • Explore when taxonomies are protectable and the impact of that protection on the ability to use the components used to create the taxonomy.

Compilation and Originality
  • If a compilation is created through the exercise of opinion or judgment in selecting items, it's likely original.

  • If different people would create the same database, it shouldn't be protectable.

Telephone Directories and Originality
  • Why isn't creating a telephone directory of all residents in a given location an original taxonomy?

Classification Systems and Creativity
  • In American Dental Association, the court recognized that classification systems can be creative and subject to protection.

  • The creativity of the taxonomy marks the expression even after the fundamental scheme has been devised.

  • Even short descriptions and classification numbers qualify as original works of authorship.

Parts Numbering System and Originality
  • In ATC Distribution Group, Inc. v. Whatever It Takes Transmissions & Parts, Inc., the court addressed protection for a parts numbering system.

  • The court cited American Dental's language but held that original and creative ideas are not copyrightable.

  • How can an idea be creative and not original?

  • Can this rationale be reconciled with Feist and Maclean Hunter?

Randomness and Creativity
  • To support its determination, the court in Whatever It Takes emphasized the randomness of the numbering system.

AI-Generated Works Revisited: Next Rembrandt
  • Consider the database created for the AI-generated work.

  • Is there anything original in the choice of what to include?

  • Are you limited by the fact that if you choose to do a work about Rembrandt, you necessarily will include all works identified as Rembrandt's?

Concluding Thoughts
  • Concluding thoughts about compilations, collective works, and the exclusion of ideas, facts, processes, and systems under Section 102.

Collective Works Under Section 101
  • A collective work includes periodical issues, anthologies, and similar works in which a number of contributions are assembled into a collective whole.

Collective Works vs. Factual Compilations
  • Collective works are defined as compilations under Section 101, so the same tests apply for originality.

  • Both are protected only to the extent that their selection, arrangement, or coordination is creative.

Critical Difference
  • Facts are not protectable, but copyrighted works retain their own protection based on their individual creativity.

  • Just because a work is part of a database doesn't make it freely available for use.

  • Owners of works lawfully included in a collective work retain their own rights in those works.

  • The editor of a collective work can only claim originality in selection, arrangement, and presentation.

Surprising Cases
  • Maclean Hunter is surprising due to the conclusion that the value of a car is not a fact but a constructed value.

  • The rejection of copyright protection for the Bikram yoga sequence in Bikram's Yoga College of India, L.P. v. Evolation Yoga, LLC is also surprising.

Bikram Yoga and Healing Art System
  • The court found Bikram Yoga to be an unprotectable healing art system.

  • The rejection of copyright for the sequence of exercises, whose order had been set by the creator, was surprising.

  • Aesthetic choices should have separated this yoga regime from others, but the court did not agree.

Slippery Slope
  • The case shows how slippery the slope can be when something is classified as a process, even though illustrations of it remain protectable.

Medical Benefits and Copyright Protection
  • Statements insisting on the medical benefits of Bikram Yoga made it difficult to argue for copyright protection.

  • The book stated that Bikram's 26 exercises move fresh blood to 100% of the body, restoring systems to a healthy order.

  • The sequence was designed to scientifically warm and stretch muscles, ligaments, and tendons in a specific order.

Choice and Aesthetic Elements
  • The existence of numerous other possible yoga sequences did not make the selection of the Bikram sequence protectable.

  • The court rejected any argument that the creator's intent to incorporate aesthetic elements should have impacted its decision.

Aesthetic Preferences vs. Function
  • The beauty of a process does not permit one who describes it to gain copyright to exclude others from practicing it.

  • The sequence is considered unprotectable as a process, primarily reflecting function rather than expression.

Copyright Office Rule
  • The Copyright Office issued a rule in 2012 stating that it would not register a system or process for exercise routines resulting in health improvements.

  • However, copyright protection would be available for photographs, depictions, and other illustrations of these routines.

Compilations and Unprotected Materials
  • Feist recognized that compilations could be protected even if composed of unprotected materials (facts and ideas), as long as they showed creativity in selection, arrangement, and coordination.

  • There has been a push to apply compilation analysis to a wider variety of works.

  • Such analysis can provide a basis for seeking copyright protection for works composed of wholly unprotected elements.

  • However, applying this to non-fact-based works could reduce the potential scope of protection.

Thin Copyright
  • No matter how much work goes into a compilation, it still receives only a thin copyright.

Analysis to Determine Copyright Protection for Compilations
  • When determining protectable expression of a photo, courts sometimes use an analysis that comes close to that used for compilations.

  • The question the court was trying to decide was how much similarity the two photos share.

Idea-Expression
  • To determine copyright protection under this compilation-style analysis, start with the idea-expression distinction: what idea was the photographer trying to convey?

Mannion v. Coors Brewing Analysis
  • According to the court, there were three possible ideas: (1) a businessman contemplating suicide by jumping from a building; (2) a businessman contemplating suicide by jumping from a building, seen from the vantage point of the businessman with his shoes set against the street far below; or (3) something even more general, such as a sense of desperation produced by urban professional life.

  • The idea that is being expressed will have a strong impact on the extent to which the similarities between the two photos are seen to result from the common idea (which would not be copyright protectable).

  • This is an example of the difficulty of applying the idea-expression distinction in a compilation-style analysis.