Before & after XML to PBCore in ResourceSpace

I’m interested in learning about different applications of ResourceSpace for audiovisual digital preservation and collection management and wanted to explore PBCore XML data exports. Creating PBCore XML is possible in ResourceSpace, but it is dependent on each installation’s metadata field definitions and data model. Out of the box, ResourceSpace allows mapping of fields to Dublin Core fields only.

Before: default XML file created in ResourceSpace

After: PBCore XML formatting for data fields

There was talk on an old thread on the ResourceSpace Google Group about the possibility of offering PBCore templates, or sets of predefined PBCore metadata fields, because none currently exist. I did not create KBOO’s archive management database with all possible PBCore metadata fields; instead, it was important for me to allow KBOO to enter information in a streamlined, simplified format without all fields open for editing. I can imagine that a template would restrict users to entering data a certain way and might not offer enough flexibility for various organizations.

Data created in ResourceSpace is flat, so it exports to CSV in a nice, readable way, but any hierarchical relationships (e.g. a PBCore asset and its instantiations, or an essence track and its child fields) need to be defined with the metadata mapping and the XML export file.
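
To make the flat-versus-hierarchical point concrete, here is a minimal sketch in Python (not the PHP used by ResourceSpace) that nests one flat record into PBCore-style elements. The record keys and values are hypothetical, the element names are taken from PBCore 2.x, and the output is not schema-complete: required elements such as pbcoreIdentifier and the PBCore namespace are left out for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical flat record, shaped like one row of a CSV export.
flat_record = {
    "title": "Evening News",
    "asset_date": "1985-03-12",
    "preservation_filename": "kboo_0001.wav",
    "preservation_format": "audio/x-wav",
    "track_type": "audio",
    "track_sampling_rate": "48 kHz",
}


def build_pbcore(record):
    """Nest flat fields as asset > instantiation > essence track."""
    root = ET.Element("pbcoreDescriptionDocument")

    # Asset-level fields sit directly under the root.
    ET.SubElement(root, "pbcoreAssetDate").text = record["asset_date"]
    ET.SubElement(root, "pbcoreTitle").text = record["title"]

    # Instantiation fields are children of pbcoreInstantiation.
    inst = ET.SubElement(root, "pbcoreInstantiation")
    ident = ET.SubElement(inst, "instantiationIdentifier", source="filename")
    ident.text = record["preservation_filename"]
    ET.SubElement(inst, "instantiationDigital").text = record["preservation_format"]

    # Essence track fields are grandchildren, nested inside the instantiation.
    track = ET.SubElement(inst, "instantiationEssenceTrack")
    ET.SubElement(track, "essenceTrackType").text = record["track_type"]
    ET.SubElement(track, "essenceTrackSamplingRate").text = record["track_sampling_rate"]
    return root


if __name__ == "__main__":
    print(ET.tostring(build_pbcore(flat_record), encoding="unicode"))
```

The parent/child relationships live entirely in the export code: nothing in the flat record says that the sampling rate belongs to an essence track, which is why the mapping has to be spelled out somewhere.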

I learned some important things when building off of code from the function “update_xml_metadump”:

  • “Order by” metadata field order matters. It’s easier to reuse this function if the order of metadata fields follows the PBCore element order and hierarchy.
  • Entering/storing dates formatted as YYYY-MM-DD makes things easier. In ResourceSpace, I defined the date fields as text and put in tooltip notes for users to always enter dates as YYYY-MM-DD. I also defined a value filter. A value filter allows data entered and stored as YYYY-MM-DD to display in different ways, such as MM/DD/YYYY.
  • It is important to finalize the use of all ResourceSpace tools (such as Exiftool, ffmpeg, staticsync) because this may affect use, display, and order of metadata fields.
  • The hardest part was figuring out how the data is structured in the database and how the original function loops through it, so that I could loop in a way that puts the data into a hierarchical structure. My end result comes from “might” and not necessarily “right”, meaning someone with more advanced knowledge of ResourceSpace could probably make the PHP file cleaner. I ended up creating a separate function each time I needed a special hierarchical set of data, e.g. one function for the asset data, one for the physical instantiation, one for the preservation instantiation, and so on. Each function is called based on an expected required data field. For example, the loop for the preservation instantiation will only run if a preservation filename exists.
  • Overall, if you know what you’re looking at, you’ll notice that my solution is not scalable “as is” but hopefully this information provides ideas and tips on how to get your own PBCore XML export going in ResourceSpace.
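
Two of the points above, the date display filter and the one-function-per-section approach, can be sketched roughly as follows. This is an illustration in Python under stated assumptions, not the PHP from the actual export file: every function name, record key, and gating field here is hypothetical.

```python
from datetime import datetime
import xml.etree.ElementTree as ET


def display_date(stored):
    """Show a stored YYYY-MM-DD string as MM/DD/YYYY.

    Raises ValueError if the string cannot be parsed as a year-month-day date.
    """
    return datetime.strptime(stored, "%Y-%m-%d").strftime("%m/%d/%Y")


def add_asset(root, record):
    ET.SubElement(root, "pbcoreAssetDate").text = record.get("asset_date", "")
    ET.SubElement(root, "pbcoreTitle").text = record.get("title", "")


def add_physical_instantiation(root, record):
    inst = ET.SubElement(root, "pbcoreInstantiation")
    ET.SubElement(inst, "instantiationIdentifier", source="local").text = record["physical_id"]
    ET.SubElement(inst, "instantiationPhysical").text = record.get("physical_format", "")


def add_preservation_instantiation(root, record):
    inst = ET.SubElement(root, "pbcoreInstantiation")
    ident = ET.SubElement(inst, "instantiationIdentifier", source="filename")
    ident.text = record["preservation_filename"]
    ET.SubElement(inst, "instantiationDigital").text = record.get("preservation_format", "")


# Each section is paired with the field that must be present for it to run.
# None means the section always runs. List order sets the output order.
SECTIONS = [
    (None, add_asset),
    ("physical_id", add_physical_instantiation),
    ("preservation_filename", add_preservation_instantiation),
]


def export_record(record):
    root = ET.Element("pbcoreDescriptionDocument")
    for required_field, builder in SECTIONS:
        if required_field is None or record.get(required_field):
            builder(root, record)
    return root
```

A record with no preservation filename simply produces no preservation instantiation, which avoids emitting empty elements. Storing dates in one sortable format and reformatting only for display keeps the exported value predictable.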

The work done:

1. Reviewed metadata field information and defined metadata mapping definitions in the config.php file

2. Created a new PHP file based on the ResourceSpace function “update_xml_metadump”, which exports XML data in the default ResourceSpace template and also offers renaming of tags mapped to Dublin Core tags.

3. Created a new PHP file to call the new PBCore XML metadump function, based on the existing /pages/tools/update_xml_metadump.php file

4. Ran the update file. XML files are exported to the /filestore directory.
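
The overall shape of these four steps (a field mapping, an export function, and a script that runs it over every resource) might look something like the sketch below. It is written in Python for illustration only and is not ResourceSpace code: the mapping keys, the "ref" identifier key, the output file naming, and the output directory are all assumptions.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Hypothetical mapping from local field names to PBCore element names.
# Dicts keep insertion order, so this order becomes the element order.
FIELD_TO_PBCORE = {
    "date": "pbcoreAssetDate",
    "title": "pbcoreTitle",
    "description": "pbcoreDescription",
}


def record_to_tree(record):
    """Rename mapped fields to PBCore tags, skipping empty values."""
    root = ET.Element("pbcoreDescriptionDocument")
    for field, tag in FIELD_TO_PBCORE.items():
        value = record.get(field)
        if value:
            ET.SubElement(root, tag).text = value
    return ET.ElementTree(root)


def export_all(records, out_dir):
    """Write one XML file per record and return the paths written."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for record in records:
        path = out_dir / f"pbcore_{record['ref']}.xml"
        record_to_tree(record).write(str(path), encoding="utf-8", xml_declaration=True)
        written.append(path)
    return written
```

Keeping the mapping in one place, separate from the export loop, mirrors the split between the config.php definitions and the export function, and makes it easier to adjust field names without touching the loop.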

This post was written by Selena Chau, resident at KBOO Community Radio.
