The Datawake project combines server and database technologies with a Firefox plugin to aggregate user browsing data during domain-specific searches. The captured (extracted) data is organized into browse paths and elements of interest. The user or a team can then analyze the data and use it to seed crawlers.
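
As a rough illustration of how browse paths and elements of interest might be organized, the sketch below models a browse path as an ordered list of page visits, each carrying the entities extracted from that page. Every name here (`PageVisit`, `BrowsePath`, `elements_of_interest`, `crawler_seeds`) is hypothetical and chosen for this example only. It does not reflect Datawake's actual schema or API.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PageVisit:
    """One page in a browse path (hypothetical record, not Datawake's schema)."""
    url: str
    # Maps an entity type (e.g. "email") to the values extracted from the page.
    entities: dict[str, set[str]] = field(default_factory=dict)


@dataclass
class BrowsePath:
    """Ordered page visits captured during one domain-specific search."""
    domain: str
    visits: list[PageVisit] = field(default_factory=list)

    def elements_of_interest(self) -> dict[str, set[str]]:
        """Merge the extracted entities across every page in the path."""
        merged: dict[str, set[str]] = {}
        for visit in self.visits:
            for kind, values in visit.entities.items():
                merged.setdefault(kind, set()).update(values)
        return merged

    def crawler_seeds(self) -> list[str]:
        """Return visited URLs, de-duplicated in visit order, as crawl seeds."""
        return list(dict.fromkeys(visit.url for visit in self.visits))


# Assumed sample data, for illustration only.
path = BrowsePath(domain="example-domain")
path.visits.append(PageVisit("http://example.com/a", {"email": {"a@example.com"}}))
path.visits.append(
    PageVisit("http://example.com/b", {"email": {"b@example.com"}, "phone": {"555-0100"}})
)
path.visits.append(PageVisit("http://example.com/a", {"email": {"a@example.com"}}))

print(path.crawler_seeds())
print(path.elements_of_interest())
```

Keeping the visits in order preserves the path the user actually followed, while the merged entity view gives a team a single place to review what was found across the whole search.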

Firefox Plugin: entities extracted from the current page.

Build Forensic Graph: graph of all extracted data from the pages visited by the user.

Datawake Depot: administrative dashboard.

Applied Technology

This work was funded by DARPA’s Memex program and leverages several technologies from DARPA’s Open Catalog. Datawake is available on the Memex Open Catalog.

Datawake uses the following DARPA technologies:

MITIE: MIT Information Extraction - MIT-LL
Topic Clustering - MIT-LL
Tangelo - Kitware

Datawake integrates with the following DARPA Memex products:

Memex Explorer
Domain Discovery Tool
ImageSpace