DCMetaSpider is a system, currently under development, that spiders a collection of URIs, retrieves any metadata present, and presents that metadata to the content owner for review. It is a companion project to Dublin Core for Drupal, an implementation of Dublin Core metadata for the Drupal CMS. The two projects share a basic database schema that covers most of the provisions of Dublin Core but is designed to be extended with other metadata schemes (Dublin Core being one scheme, in this context) and predicates (the equivalent of Dublin Core terms).
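The scheme/predicate arrangement described above can be illustrated with a small sketch. This is not the project's actual schema: the table names (`scheme`, `predicate`, `statement`), the column names, and the helper functions are all assumptions made for illustration, and SQLite (via Python's standard library) stands in for MySQL so that the example is self-contained. The point it shows is that adding a new scheme or predicate is a matter of inserting rows, not altering tables.

```python
import sqlite3

# Hypothetical schema: these table and column names are illustrative only.
SCHEMA = """
CREATE TABLE scheme (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE predicate (
    id        INTEGER PRIMARY KEY,
    scheme_id INTEGER NOT NULL REFERENCES scheme(id),
    name      TEXT NOT NULL,
    UNIQUE (scheme_id, name)
);
CREATE TABLE statement (
    id           INTEGER PRIMARY KEY,
    uri          TEXT NOT NULL,
    predicate_id INTEGER NOT NULL REFERENCES predicate(id),
    value        TEXT NOT NULL
);
"""


def open_store():
    """Create an in-memory database holding the illustrative schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_statement(conn, uri, scheme, predicate, value):
    """Record one metadata value, creating the scheme and predicate rows if needed."""
    conn.execute("INSERT OR IGNORE INTO scheme (name) VALUES (?)", (scheme,))
    scheme_id = conn.execute(
        "SELECT id FROM scheme WHERE name = ?", (scheme,)
    ).fetchone()[0]
    conn.execute(
        "INSERT OR IGNORE INTO predicate (scheme_id, name) VALUES (?, ?)",
        (scheme_id, predicate),
    )
    predicate_id = conn.execute(
        "SELECT id FROM predicate WHERE scheme_id = ? AND name = ?",
        (scheme_id, predicate),
    ).fetchone()[0]
    conn.execute(
        "INSERT INTO statement (uri, predicate_id, value) VALUES (?, ?, ?)",
        (uri, predicate_id, value),
    )


def statements_for(conn, uri):
    """Return (scheme, predicate, value) tuples for a URI, in insertion order."""
    rows = conn.execute(
        """
        SELECT scheme.name, predicate.name, statement.value
        FROM statement
        JOIN predicate ON predicate.id = statement.predicate_id
        JOIN scheme ON scheme.id = predicate.scheme_id
        WHERE statement.uri = ?
        ORDER BY statement.id
        """,
        (uri,),
    )
    return rows.fetchall()
```

In this sketch, a call such as `add_statement(conn, "http://example.org/", "DC", "title", "Home")` stores a Dublin Core title, and a later call naming a different scheme adds that scheme without any change to the table definitions.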
The essential components are a Web robot, written in Perl and using standard Perl modules; a MySQL database to store the data; and Perl CGI programmes that allow the user to control the robot and review the data through a Web interface.
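As a rough illustration of the robot's metadata-retrieval step, the sketch below pulls scheme-prefixed `<meta>` elements out of an HTML page. It is written in Python, not the project's Perl, and it is not the project's code. It assumes that metadata is embedded using the common `name="DC.title"` style of HTML `<meta>` element, and the names `MetaExtractor` and `extract_metadata` are hypothetical. Fetching the page over HTTP is deliberately left out, so the function takes the HTML as a string.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collect (scheme, predicate, value) triples from <meta> elements."""

    def __init__(self):
        super().__init__()
        self.records = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if not name or content is None:
            return
        # Assumption: "DC.title" means scheme "DC", predicate "title".
        scheme, separator, predicate = name.partition(".")
        if separator and predicate:
            self.records.append((scheme, predicate, content))


def extract_metadata(html):
    """Return the scheme-prefixed metadata found in an HTML document."""
    parser = MetaExtractor()
    parser.feed(html)
    parser.close()
    return parser.records
```

Elements whose `name` has no scheme prefix (for example `viewport`) are skipped in this sketch; a real robot would need its own policy for those, and for metadata carried in other forms.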