The Datawake project consists of server and database technologies and a Firefox plugin that together aggregate user browsing data using domain-specific searches. The captured, or extracted, data is organized into browse paths and elements of interest. The user or a team can then analyze the data and use it to seed crawlers.
Firefox plugin showing entities extracted from the current page.
Graph of all extracted data from the pages visited by the user.
Datawake Depot administrative dashboard.
Applied Technology
This work was funded by DARPA’s Memex program and leverages several technologies from DARPA’s Open Catalog. Datawake is available on the Memex Open Catalog.
Datawake uses the following DARPA technologies:
MITIE: MIT Information Extraction - MIT-LL
Topic Clustering - MIT-LL
Tangelo - Kitware