Documentation for collections data from Science Museum, National Media Museum, National Railway Museum (NMSI) released as CSV
About this data
These data sets contain information about objects from the collections of the Science Museum, the National Media Museum and the National Railway Museum. They include many items not on display in our galleries, as well as authority records about related people and organisations, events and image files.
The collections include objects relating to aeronautics, agriculture, astronomy, cinematography, medicine, materials, space, television, time measurement, transport and more. They range in size from contact lenses to Concorde 002.
We've published three data sets:
- 218,822 object records (currently in 4 files, each up to 15 MB) (NMSI_object1_20110304.csv, NMSI_object2_20110304.csv, NMSI_object3_20110304.csv, NMSI_object4_20110304.csv)
- 40,596 media records (metadata about images already published online) (NMSI_media_20110304.csv)
- 173 event records (NMSI_events_20110304.csv)
We hope to publish our lists of c. 9,000 people and organisations related to these objects soon, alongside a table linking objects to events.
The data is supplied in CSV (comma-separated values) format, exported from Excel. The first line of each file contains the field headings. Files may be up to 15 MB in size.
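As a minimal sketch of reading these files, the Python below uses the standard csv module and treats the first line of each file as the field headings. The function name read_records and the default encoding are assumptions, not part of the release; because the files were exported from Excel, another encoding such as cp1252 may be needed if utf-8 fails.

```python
import csv

# File names as listed in this document; they are assumed to sit in the
# current working directory.
OBJECT_FILES = [
    "NMSI_object1_20110304.csv",
    "NMSI_object2_20110304.csv",
    "NMSI_object3_20110304.csv",
    "NMSI_object4_20110304.csv",
]


def read_records(paths, encoding="utf-8"):
    """Yield every row of every file as a dict keyed by the field headings.

    The encoding is an assumption; pass encoding="cp1252" if utf-8 fails.
    """
    for path in paths:
        # newline="" lets the csv module handle line breaks inside quoted fields.
        with open(path, newline="", encoding=encoding) as handle:
            # DictReader takes the field names from the first line of the file.
            yield from csv.DictReader(handle)
```

For example, list(read_records(OBJECT_FILES)) would load all four object files into memory, while iterating over read_records(OBJECT_FILES) directly processes one record at a time.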
The data is released under the Creative Commons Attribution-NonCommercial-ShareAlike (CC BY-NC-SA) licence (http://creativecommons.org/licenses/by-nc-sa/3.0/). Please contact us if you would like to use this data under different conditions.
Why we're releasing the data
We have been providing access to a searchable database of our collections online at http://collectionsonline.nmsi.ac.uk/ for some time now. However, through staff attendance at various hack days, we've learned that this interface does not support programmatic search or exploration of the data. We've also learned (through the Cosmos & Culture project) that a number of people found the XML provided by the default .NET service that published the API too complex. CSV is a very simple format, accessible to a wider range of people, and we hope that most people will be able to use it.
We're publishing the data in CSV format now as a relatively lightweight experiment. We'd like to understand whether, and if so, how, people would use our data. We'd also like to explore the benefits for the museum and for programmers using our data - your feedback would inform decisions about future investment in more structured data as well as helping shape our understanding of the requirements of those users.
We hope you will be creative with it, but please use it responsibly. If you're not sure whether the museum would be comfortable with your idea, please drop us a line to discuss it.
How you can help
You can help us to improve this resource - let us know if you have any information about our objects, or if you find any errors, though we will probably not republish this data set in the short term. Please quote the Object Number/s and email: Collections.Online@nmsi.ac.uk
We'd like this experiment to help us understand the needs of potential users but we can only do that with your help - we'd love to hear your comments on how you've used the data, and how we could improve it. If possible, we'd like to feature mashups or other applications made with our data. Please email us at web.team@nmsi.ac.uk, send @sciencemuseum a message on Twitter or leave a comment at http://sciencemuseumdiscovery.com/blogs/museumdev.
Objects
NMSI_object1_20110304.csv, NMSI_object2_20110304.csv, NMSI_object3_20110304.csv, NMSI_object4_20110304.csv.
Column title | What is it? |
ID_NUMBER | The unique identifier for a record, based on the museum's own accession number. The number may refer to a single object or (historically) to a collection of objects. |
ITEM_NAME | Object name - a simple name or common name. Where possible this is from an established thesaurus (i.e. http://museum-api.pbworks.com/f/NMSI_draft200903_object_name.csv) |
TITLE | A short one-line caption or brief description of the object, derived from the existing data. The title should be a summary capturing the essence of an object. Often includes related place and date. |
MAKER | The name of the person or company or other organisation that made the object. The Maker field is indexed and linked to the People/Organisation records (to be released shortly) - links should be made by matching strings (internal IDs are not available). |
DATE_MADE | The date when an object was made (production date). Dates should be recorded consistently and ranges should be in the format <earlier year>-<later year> e.g. 1671-1700. Approximate dates are written as e.g. c. 1936. This field also contains various strings, including 'Unknown'. |
PLACE_MADE | Place names are indexed in the database and linked into a hierarchy (Getty Thesaurus of Geographic Names with in-house modifications i.e. http://museum-api.pbworks.com/f/NMSI_draft200903_place.csv) and should be recorded consistently because they are derived from a term list. Where known with certainty or reasonable probability the town or city of production is recorded. As a minimum the nation/country of origin or the probable nation/country of production should be recorded. If there is some uncertainty this can be explained in the general description. |
MATERIALS | Records what the object is made of and what part of the object is made of that material. |
MEASUREMENTS | Record the type of measurements that are most useful for an object, with 'overall' being the most usual dimensions recorded. Overall will be the amount of space the object takes up when it first arrives in the museum and is stored. Measurements must be recorded consistently in metric units. Compulsory measurements are Size and Weight. The default units of measurement are millimetres and kilograms. Example: overall: 51 mm x 95 mm x 80 mm, 0.371kg, |
DESCRIPTION | In this field we try to capture the what, when, why, where and who of the object: what it is, what it does, what it is made of, who made it, where it was made and what makes it unique. This field should be exported as plain text (without markup). The information here is used by the museum to audit an object so it should be described well with each part defined. It should also contain all the information about the object so that an interpreted description can be written (suitable for publication). Technical terms have been avoided as far as possible. Names, dates, places and significant events should be recorded here in a normalized form but will also be recorded in other indexed fields. As far as possible the following are recorded: <number of objects> <name of object, qualifier> <model name, number> <what is the type of object?> <specific information>:<made by…> <type of object> <place made> <date made> <any associated relevant fact> <materials> <colour> <serial number> <containers> <accessories> <dimensions> <condition and completeness> <identification of parts> <acquisition/provenance information> <story of display, conservation etc.> <other details> |
WHOLE_PART | Mostly an internal field. |
COLLECTION | A broad subject specialism applied during the Acquisition/Entry process. NMeM: National Media Museum; NRM: National Railway Museum; SCM: Science Museum. Collection terms are listed at http://museum-api.pbworks.com/w/page/36515349/NMSI-Collections-list |
For more information on authority records, see http://en.wikipedia.org/wiki/Authority_control
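The DATE_MADE formats described in the table above (a year range, an approximate year written with "c.", and free strings such as 'Unknown') could be handled along these lines. This is only a sketch: the function name parse_date_made is hypothetical, four-digit years are assumed, a bare single year is assumed to occur even though the table does not say so, and anything that does not match is returned as None rather than guessed at.

```python
import re

# Patterns follow the examples in the field table: "1671-1700" and "c. 1936".
# Four-digit years are an assumption; the real data may contain other forms.
RANGE = re.compile(r"^(\d{4})\s*-\s*(\d{4})$")
APPROX = re.compile(r"^c\.?\s*(\d{4})$", re.IGNORECASE)
YEAR = re.compile(r"^(\d{4})$")


def parse_date_made(value):
    """Return (earliest_year, latest_year, approximate), or None if unparsed."""
    text = value.strip()
    match = RANGE.match(text)
    if match:
        return int(match.group(1)), int(match.group(2)), False
    match = APPROX.match(text)
    if match:
        year = int(match.group(1))
        return year, year, True
    match = YEAR.match(text)
    if match:
        year = int(match.group(1))
        return year, year, False
    # Strings such as 'Unknown' and any unanticipated format end up here.
    return None
```

Counting how many values come back as None is a quick way to see how much of the field falls outside these three patterns before relying on the parsed years.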
Media
This table contains information relating object records to images already published online at http://collectionsonline.nmsi.ac.uk/.
You can use it to construct URLs to images of the objects. (The images are hosted on a site built with a third-party solution so the URLs aren't ideal.)
objects.ID_NUMBER is equivalent to media.OBJECT (e.g. 1999-719), giving you a link between the object and media tables. The media.MEDIAKEY (e.g. 125972) can then be included in a URL; for example, the image file URL uses the media key: http://collectionsonline.nmsi.ac.uk/grabimg.php?wm=1&kv=125972
Column title | What is it? |
MEDIA_ID | e.g. 10327065.jpg |
OBJECT | The object ID_NUMBER e.g. 1999-719 |
MEDIAKEY | e.g. 125972 |
CAPTION | Optional. E.g. 'Class 84 locomotive at Barrow Hill, sanding and filling in progress, August 1984' |
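The link between the two tables and the image URL pattern described above can be sketched as follows. The helper names image_url and media_by_object are hypothetical, and the wm=1 parameter is simply copied from the example URL in this document; its meaning is not documented here.

```python
from collections import defaultdict
from urllib.parse import urlencode

# Endpoint taken from the example URL in this document.
IMAGE_ENDPOINT = "http://collectionsonline.nmsi.ac.uk/grabimg.php"


def image_url(mediakey):
    """Build an image URL from a media table MEDIAKEY value."""
    # wm=1 is copied from the documented example; its meaning is not stated.
    return IMAGE_ENDPOINT + "?" + urlencode({"wm": 1, "kv": mediakey})


def media_by_object(media_rows):
    """Group media rows (dicts keyed by column title) by their OBJECT value.

    OBJECT holds the object ID_NUMBER, so the result can be looked up with
    the ID_NUMBER from an object record. One object may have several images.
    """
    index = defaultdict(list)
    for row in media_rows:
        index[row["OBJECT"].strip()].append(row)
    return index
```

With the media file loaded as a list of dicts, media_by_object(rows)[object_row["ID_NUMBER"]] gives the media rows for one object, and image_url(row["MEDIAKEY"]) gives the URL for each of them.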
Events
Currently this data set has patchy coverage but we would be interested to see whether people find the content useful. If an object is linked to any significant event (historical, political, developmental or other milestone events) or featured at some significant and well-known event or activity, it may be recorded in this table.
Column title | What is it? |
Event Name | Includes location and date/date range. |
Event Short Name | Event title without location or date (usually) |
Event Category | Values include era, war, exhibition, expedition (term list?) |
Occurrence Type | E.g. one-time, periodic, annual. Optional |
Event Start Date | Single date as year or y/m/d. Mixed formats (sorry!). Also includes BCE dates expressed as negative integers e.g. -3100. Optional |
Event End Date | As for Event Start Date. Optional |
Display Date | ? |
Duration | Integer - use with Duration Unit. Optional |
Duration Unit | E.g. days, months, years. Use with Duration. Optional |
Event Description | Text. Optional |
Description Source(s) | May be a URL. Optional |
Sort Name | Internal use version of event name |
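Because Event Start Date and Event End Date mix plain years, y/m/d dates and negative integers for BCE years, a tolerant parser is one way to approach them. The sketch below makes several assumptions: the function name parse_event_date is hypothetical, y/m/d is taken literally as year first with slash separators, and values in any other format are returned as None. Tuples are used rather than datetime.date because Python's date type cannot represent BCE years.

```python
import re

# "y/m/d" is read literally: year first, slash-separated. This is an
# assumption about the files, which are described only as "mixed formats".
YMD = re.compile(r"^(\d{1,4})/(\d{1,2})/(\d{1,2})$")
# A bare integer is treated as a year; a leading minus sign marks BCE.
YEAR = re.compile(r"^-?\d+$")


def parse_event_date(value):
    """Return (year, month, day) with None for unknown parts, or None."""
    text = (value or "").strip()
    if not text:
        return None  # the field is optional
    match = YMD.match(text)
    if match:
        year, month, day = (int(part) for part in match.groups())
        return year, month, day
    if YEAR.match(text):
        return int(text), None, None
    return None  # unrecognised format; left for manual inspection
```

Sorting events by the first element of the returned tuple then places BCE events such as -3100 before later ones without any special handling.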
Produced for the Science Museum, London. Last updated by Mia Ridge, March 2011. With thanks to the web, database and documentation teams at NMSI for their support and assistance. Thanks also to @rboulton for testing the documentation.