
AWS Discovers Perplexity’s Web Scraping Violation: Is This Legal?

An investigation by Amazon Web Services (AWS) has found that Perplexity, a company that uses AWS servers to train its AI models, is using web scraping to collect data from certain websites. Web scraping, also called data scraping, uses software to download the HTML of web pages and then automatically filter and store the information it contains.
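To make the idea of "extracting HTML and filtering information" concrete, here is a minimal sketch in Python using only the standard library. The sample page, the `LinkExtractor` class, and the `extract_links` helper are hypothetical names chosen for illustration; nothing here reflects how Perplexity or any particular scraper actually works.

```python
from html.parser import HTMLParser

# Hypothetical page content; a real scraper would download this over HTTP.
SAMPLE_HTML = (
    '<html><body><h1>News</h1>'
    '<a href="/a">A</a>'
    '<p>text <a href="https://example.com/b">B</a></p>'
    '<a>no href</a>'
    '</body></html>'
)


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag it encounters."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html: str) -> list[str]:
    """Parse an HTML string and return the link targets it contains."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    print(extract_links(SAMPLE_HTML))
```

The filtering step here keeps only link targets; a scraper built for another purpose would keep different elements, such as headlines or article text.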

Developer Robb Knight and Wired uncovered evidence that Perplexity violated the Robots Exclusion Protocol by scraping data from specific websites without permission. Under the Robots Exclusion Protocol, website owners can place a robots.txt file on their domain to specify which pages robots and crawlers should not access. The protocol is a voluntary convention: it works only when crawlers choose to honor it.
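As a rough illustration of how a crawler that respects robots.txt might check a URL before fetching it, the sketch below uses Python's standard `urllib.robotparser` module. The robots.txt content, the `ExampleBot` user agent, the example.com URLs, and the `is_allowed` helper are all assumptions made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ExampleBot is barred from /private/,
# while every other crawler is allowed everywhere.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /private/

User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines; a live crawler would instead
    # call set_url() and read() to download the site's robots.txt.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("ExampleBot", "https://example.com/private/page"),
        ("ExampleBot", "https://example.com/public/page"),
        ("OtherBot", "https://example.com/private/page"),
    ]:
        print(agent, url, is_allowed(ROBOTS_TXT, agent, url))
```

Nothing in this check stops a crawler from requesting the page anyway, which is why ignoring robots.txt is a question of policy and terms of service rather than a technical barrier.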

AWS has strict policies that prohibit customers from engaging in illegal activities and require them to comply with all applicable laws. Perplexity claims to adhere to the Robots Exclusion Protocol and states that its services do not violate the AWS terms of service, except in rare cases where its bot ignores robots.txt in order to retrieve specific URLs.

However, Wired’s investigation suggests that Perplexity’s chatbot sometimes ignores robots.txt to collect information without authorization, raising concerns about potential violations of the AWS terms of service and the legality of Perplexity’s data collection methods.

Samantha Jones

As a content writer at newsnnk.com, I weave words into captivating stories that inform and engage our readers. With a passion for storytelling and an eye for detail, I strive to deliver high-quality and engaging content that resonates with our audience. From breaking news to thought-provoking features, I am dedicated to providing informative and compelling articles that keep our readers informed and entertained. Join me on this journey as we explore the world through the power of words.

