Content Extraction System

This extraction service, offered by researchers at the Technical University of Valencia (UPV), automatically detects and extracts the different parts of a web page (main content, template, menu, user comments, boilerplate…) using scraping.

Free demo available on request!

Main content

Extract the core information of a web page, such as the article, post or main block of text, without distractions.

Reviews or comments

Extract the reviews or comments section of a web page, to analyze it or simply read it.
