When Data Scraping and the Computer Fraud and Abuse Act Collide

Mar 1, 2018

By Linda Henry



As the volume of data available on the internet continues to increase at an extraordinary pace, it is no surprise that many companies are eager to harvest publicly available data for their own use and monetization. Data scraping has come a long way since its early days, which involved manually copying data visible on a website. Today, data scraping is a thriving industry, and high-performance web scraping tools are fueling the big data revolution. As with many technological advances, though, the law has not kept pace with the technology that enables scraping. As a result, the state of the law on data scraping remains in flux.

The federal Computer Fraud and Abuse Act (CFAA) is one statute frequently used by companies that seek to stop third parties from harvesting data. The CFAA imposes liability on anyone who “intentionally accesses a computer without authorization, or exceeds authorized access, and thereby obtains … information from any protected computer.” The Supreme Court has held that the CFAA “provides two ways of committing the crime of improperly accessing a protected computer: (1) obtaining access without authorization; and (2) obtaining access with authorization but then using that access improperly.” (Musacchio v. United States).

The CFAA’s applicability to data scraping is not clear, though, because it was originally intended as an anti-hacking statute, and scraping typically involves accessing publicly available data on a public website. To meet the CFAA’s requirement that a third party engage in unauthorized or improper access of a website, companies often argue that use of a website in violation of the applicable terms of use (e.g., by harvesting data) constitutes unauthorized access in violation of the CFAA.

Over the past year, a handful of cases in California challenging the legality of web scraping offer a few clues as to how courts may approach future challenges to web scraping using the CFAA. In one of the most high-profile cases involving data scraping during 2017 (hiQ Labs, Inc. v. LinkedIn Corp.), a U.S. District Court granted a preliminary injunction requested by hiQ Labs, a small workforce analytics startup, and ordered LinkedIn to remove technology that would prevent hiQ Labs from accessing information on public profiles. LinkedIn argued that hiQ Labs was violating LinkedIn’s terms of use as both a user and an advertiser by using bots to scrape data from LinkedIn users’ public profiles. hiQ Labs rejected LinkedIn’s argument that the CFAA applied, and maintained that because social media platforms should be treated as a public forum, hiQ Labs’s data scraping activities are protected by the First Amendment.

In hiQ, U.S. District Court Judge Chen found, in part, that because authorization is not necessary to access publicly available profile pages, LinkedIn was not likely to prevail on its CFAA claim even if hiQ Labs had violated the terms of use. Judge Chen did note that LinkedIn’s construction of the CFAA was not without basis, because “visiting a website accesses the host computer in one literal sense, and where authorization has been revoked by the website host, that ‘access’ can be said to be ‘without authorization.’ However, whether access to a publicly viewable site may be deemed ‘without authorization’ under the CFAA where the website host purports to revoke permission is not free from ambiguity.”

Judge Chen reasoned that LinkedIn’s interpretation of the CFAA would allow a company to revoke authorization to a publicly available website at any time and for any reason, and then invoke the CFAA for enforcement, exposing an individual to both criminal and civil liability. He characterized the possibility of criminalizing the act of viewing a public website in violation of an order from a private entity as “effectuating the digital equivalence of Medusa.”

While LinkedIn waits for the Ninth Circuit to hear oral arguments in hiQ, yet another company (3taps Inc.) has filed a similar suit against LinkedIn, seeking a declaratory judgment that 3taps is not violating the CFAA and thus should be permitted to continue to extract data from public LinkedIn profile pages. (3taps Inc. v. LinkedIn Corp.). In addition, because 3taps successfully argued that the court should deem the 3taps and hiQ matters related so that both would be heard by the same judge, on February 22, 2018, Judge Chen ordered the reassignment of the 3taps case from the Northern District of California’s San Jose court to Judge Chen’s court in San Francisco.

In addition to hiQ, the recent dismissal of a CFAA claim brought by Ticketmaster against a company engaged in data scraping further calls into question whether companies will be successful in using the CFAA to stop web scraping. (Ticketmaster L.L.C. v. Prestige Entertainment, Inc.). In January 2018, a California district court dismissed, with leave to amend, Ticketmaster’s CFAA claim against a ticket broker that used bots to purchase tickets in bulk from the Ticketmaster site. The court noted that although Ticketmaster outlined the defendants’ terms of use violations in a cease-and-desist letter, Ticketmaster did not actually revoke access authority and implied that the defendants could continue to use Ticketmaster’s website as long as they abided by the terms of use. In addition, the court maintained that Ticketmaster could not base a CFAA claim on an argument that the defendants exceeded authorized access unless Ticketmaster could demonstrate that the defendants were inside hackers who accessed unauthorized information.

hiQ, 3taps and Ticketmaster demonstrate the inherent difficulty in trying to apply a statute that pre-dates the internet age to modern technology. Although courts have not been consistent in their opinions as to whether violation of a company’s terms of use constitutes unauthorized or improper access under the CFAA, Ticketmaster and hiQ offer data scrapers hope that courts will continue to question whether the CFAA should prohibit harvesting publicly available data. Companies that utilize data scraping should, however, consider that a court would be more likely to impose liability under the CFAA if the data collected is not publicly available or the methods used to obtain the data can more clearly be characterized as unauthorized access. The Ninth Circuit is expected to hear oral arguments in hiQ in March, and the court’s interpretation of the CFAA is likely to have a significant impact on the use of automated processes to use third-party data.
