Results of the ARA SAT consultation on Records in Contexts

Jenny Bunn Monday, 19 December 2016 07:50

Please find below the text of the response that was sent to EGAD.


Introduction – How this response was formulated.

On 17 November 2016, around 20 individuals from the UK (mostly archivists, but with one or two from museum/library contexts and one archival systems supplier) attended an event at the Wellcome Trust in London to talk about RiC-CM. Bill Stockting (a member of EGAD) gave a brief presentation on the model and then the participants broke into four groups to discuss the following questions:

  • Do you agree with the membership of the list of primary entities? Should anything else be included as a primary entity? Should anything be taken off this list?
  • Do you have any specific comments on any of the entities in particular, e.g. changes to wording, additional examples, confusion about usage?
  • Do you agree with the lists of properties for each entity? Should anything be added/taken away?
  • Do you have any specific comments on any of the properties in particular, e.g. changes to wording, additional examples, confusion about usage?
  • Do you agree with the lists of relations? Can you suggest further relations?
  • How should these relations be presented? What information do you need/would you like about each relation?

The results of the discussions were captured on paper and are recorded in the table in Appendix A. At the end of the discussions, each group was asked to highlight three main points arising from their discussions. These are marked with *** in the table.

In the afternoon, a smaller group (of around 8 individuals) tried to take a more practical view – looking through the properties for the Record entity as if they were going to have to describe their material using them. They did this in two groups. One group used this exercise to generate more property comments (marked # in the table). The other started to list the properties they thought they might actually use in this situation. This is their selection, as far as they got:

*P1 Global persistent identifier

*P2 Local identifier

*P3 Name (title or label)

P9 Scope and content (description)

P5 Authenticity and integrity note and P20 History (provenance and source of acquisition)

*P7 Content extent

P8 Quality of information (condition)

The ones marked with an * would be the absolute minimum for an individual item (not an aggregation).

Subsequently this report was compiled and its editor produced the following reflection.  

Reflection
Although the initial intention of the event was to come up with some answers to the specific questions being raised (see previous section), the main point at issue seemed to be that the group did not quite know how and in what way they should be engaging (and would in the future be expected to engage) with RiC.

The tenor of the group’s comments (and the direction the event took in the afternoon) seemed to suggest that they were engaging with it as a standard or metadata schema that would dictate what information they had to record about their collections. As such there were concerns about how resource-intensive it would be to record all this information, what would or would not be considered mandatory and how it would help users/researchers. There were also concerns about how it would work with or relate to other standards (including library ones such as RDA), particularly perhaps with regard to technical metadata for digital records. This sort of metadata seemed to be held apart from descriptive metadata (both in the minds of the participants and in the practice they reported - in separate digital repository systems).

In this way, the group did start to engage with RiC at a more conceptual level. The group recognised that RiC was trying to encompass and allow for digital understandings of record, but also felt that this did not always work. For example, there was confusion about the distinction made between information, representation and carrier, and attempts at translating the more ‘digital’ of the properties to analogue material produced mixed results. This feeling perhaps reiterates a confusion between engaging with RiC as a metadata schema and as an expression of conceptual understandings of record. Conceptual understandings of record clearly need to be expanded in the light of and to include born digital material, but at the level of metadata, it is difficult to envisage that born digital material will not always need a more extended and perhaps different set (of metadata) than analogue material.

On reflection then, the main point we wish to report back to EGAD is that we are not sure if RiC is intended to be a) a metadata schema or b) an expression of conceptual understandings of the record (a conceptual model) or c) both. We are also not sure in which of these ways we (as the ARM profession) and others will be expected to or be able to engage with and profit from it in the future. Towards the end of the afternoon we did wonder whether a better approach might not have been for us to a) consider first the entities and relationships at that more conceptual understanding level and then, once that was sorted, b) consider the properties needed to describe those entities and relationships in a metadata schema mind set, but we did not have time to try this approach out.

We are aware that this response may seem somewhat limited, but we also feel that it is important for us to reflect back honestly to EGAD that we did find RiC problematic in this way. Appendix A does contain more specific comments on more specific points on which various degrees of consensus are displayed, but these comments have the most value perhaps in that the process of generating them (our discussions) helped us to work out what our underlying problem was (detailed above).

Postscript
As this report was written by the editor, but reports a group discussion, it was subsequently circulated to all those who attended for comment. This was to allow for the recording of any alternative, additional or supporting views from within the group. Where such comments were received, these have been included in Appendix B.


Appendix A – Comments recorded

General Questions

Is it for archivists or researchers?

Better layer – Entities, properties and relationships display together

Does help with digital records, but is that the focus elsewhere?

***Transition and buy-in. What does changing to this mean in practice and how do you change?

Introduction – is the language right?

Introduction – does it make clear that this is a Western govt style from a Western perspective?

***Less text, more diagrams. The language is very difficult

How does it relate to other standards?

Is RiC too resource heavy to be practical?

***What is the difference between an entity and a property?

The examples provided worked well but the more specific the examples the better

Don’t make it too abstract otherwise people won’t use it.

What do users want? How does this serve the user?

How is this envisaged to work operationally?


Entities

Reasonable enough group

***Are all the entities mandatory? If so there are too many. If not there should be some ?more

***What is mandatory?

Where would a ‘project’ fit, e.g. the Human Genome Project?

Make Scope and Content and Access Entities in their own right so that they can have their own properties.

Need to model order, but this is probably a property of record

Entities need to be things that you can describe in their own right, e.g. the Battle of the Somme, the reign of Henry VIII.

Event - a missing entity?


***E1-3 Record Component and Record Set are well received.

***E1-3 Can see potential for linking data across repositories but relies on similar identification of record sets/records.

***E1-3 A record component could be a record when digitised.

***E1-3 What is the difference between a record component that forms part of a record and a compound record? What is a compound record? Is this too similar to record set? What is the distinction? Is this helpful?

***E1-3 Support principle of record/component/set but wonder about practical implementation – which to apply in practice?

E1 How does the distinction between information, representation and carrier map onto the OAIS information model?

***E4 How many will be interchangeable across archives? Agents can span repositories. Will there be a central authority file? Will there be a controlled vocabulary for places etc? How do we define relationships?

E4-6 Agent, occupation, position – will lead to overlapping e.g. archivist so application will be specific to situation

E5 Is Occupation an entity or a property?

E6 Should position be a property in occupation? Appears to be duplication, e.g. the role of Headmaster generates the records irrespective of the agent. Why do we need position too?

***E7-9 Function (abstract) appears to be the useful one. Why do we need activity too?

***E7-9 Is there a need for function (abstract) and function?

E10 Mandate could be a useful (optional) thing to supplement provenance. Definition seems confused?

E10 Mandate – happy for it to be there, but would we actually use it? Some examples would be helpful, e.g. Henry Wellcome’s will. Multiple mandates, e.g. hold collections under mandate of Henry Wellcome’s will, close them under Data Protection Act. Legal status of records – of wider mandate of archive services.

E11 Documentary form should be a property not an entity. Duplicating properties here – physical media etc.

E11 Documentary form: Could lead to misinterpretation? More a genre? Is this required? But what is alternative? We have no idea?

E12 Date is a property of something

***E12 Should dates be modelled in more detail, e.g. creation, last modified, birth, death?

E12-13 Date and place right as entities – especially re Linked Data.

E14 Concept/thing – can we think of another title for this? An entity that could incorporate events? Could it be called subject or keyword like a tag to link multiple records? Needs controlled authority standard.

***E14 Sceptical of Concept/Thing. Certain concept/things, e.g. projects, brands, events, collective enterprises will require some kind of entity in the absence of Concept/Thing.


Properties

***Are they trying to do too much to apply to both digital and physical? The greater granularity required for digital does not work for analogue.

Lack of scope notes – makes it difficult to understand intended implementation

No reference to existing standards for comparison of content – will be important for adoption – helpful to reference applicable section of e.g. ISAD(G)

The ordering of the properties does not always flow, e.g. P12 should follow P10.

Some properties are very specific, others are very broad.

Add digital file format


#P1 – agreed it was a good addition, but that it would require a change in practice

Do we need former identifier as a property?

#P2 – agreed, but questioned what you would do with former references?

#P3 – agreed, but questioned how this related to name authority files, e.g. what if the country was bi-lingual? Could you have parallel names in both languages? How is this dealt with? Do you simply repeat the property? Perhaps more of an issue for this property when applied to other entities, e.g. Agent and Place

P3 More used to title than name. Could we call it label?

#P4 – agreed. Can understand how to apply this.

Some find general note useful, some would delete it

Do we need general note? Should we call it local note instead?

#P5 Problematic – not sure how best to make it work. Overlap with P8.

P5 - Authenticity and Integrity note – needs more clarification in scope notes – between record and record set – practical implementation of difference between integrity and ‘quality’ properties

P5 – Authenticity and integrity note. Should there be separate authenticity and integrity notes? Do we really need a note for this? Are we recording data here that we are recording in other properties as well?

#P6 Problematic – overlap with P10, P12, P15? Could work as telling the user what they are getting, e.g. a map?

P6 Content type and P12 Media type should be linked

#P7 – time is measurable not countable.

P7 Content extent and physical extent. Improvement on current standards but definitions need to be clarified – the borderline needs to be clarified

P7 Is it helpful to make a distinction between content extent and physical or logical extent?

#P8 see P5. Also perhaps overlap with P16.

P8, P15 and P16 seem to be doing the same thing.

Is ‘Quality of information’ a good term? Subjective judgement of quality.

Add Quality of metadata as quality of information does not cover it

#P9 agreed. Can understand how to apply this.

P9 Scope and content has in the past captured many RiC properties. Is it helpful to say it may include many relations, aren’t those being captured differently – through relationships?

#P10 At the moment just Mime type. Useful for some analogue too. Could it just be a controlled term?

P10 Encoding Format – restricted to digital, but could be applied to analogue and AV. Needs to be broken down more but we can’t remember why

P10 Encoding format should also refer to analogue A/V material (film, video, tape)

#P11 agreed. Can understand how to apply this.

#P12 Pointless. Overlap with P6

P12 Media Type and P14 Medium may overlap but both need to be there. Need to improve the wording of this definition

#P13 Production technique. Could be used for maps/artwork, e.g. coloured, woodcut etc.

#P14 Hard to differentiate from P16

#P15 Distinction from P7 is meaningful.

#P16 Overlap with P14. Seems to equate to condition status?

#P17 Hard to make sense of this.

P17 Classification definition does not seem consistent with scope and examples

P17 If a record is part of multiple record sets does that mean there could be multiple classification terms?

#P18 agreed. Can understand how to apply this.

#P19 agreed. Can understand how to apply this.

P19 Wording of example could be better

P18 and P19 Conditions of access/use – assumed to be sequential activities – sensible but needs clarification, e.gs would be closed. TNA – scale of collection – requires more detail but RiC-CM may not be place for this

#P20 Like the widening of this to encompass what archivists have done to the record. Lack of examples.

P20 History – Could it be one shared amongst entities?

#P21 Confusing name. Need to rethink name for multiple manifestations (not necessarily original versus copy), e.g. different versions of born digital, multiple copies in a film archive.

***P36 Gender is important but should not be binary or time based (it can change over time)

***P36 Gender – strong views on both sides

***P36 Would gender be better as an entity? With a relationship agent identifies as? Usefulness for users’ requirements

***Ethnicity could be added

***P38-41 lopsided – corporate body has certain properties but not others – are there others that may be useful – something that may evolve over time

P38-41 Not the appropriate place for contact info and opening hours

P64 Places being areas as well as a point. P64 helps in terms of politically charged descriptions of places, e.g. old colonies, contested lands


Relationships

Generally they look very comprehensive

We understand from Bill’s presentation that only bi-directional relations are included. This would potentially involve tri- and quad-relations

Consider going towards a broader classification of relationships, making more flexible for local use

Needs ‘catch all’ option

Concept/Thing – needs to be able to be related to other Concept/Things in a more nuanced way

In relationships – still thinking in terms of copies rather than (digital) manifestations

***It would be best to avoid past and present tense and replace with date information

***Is/Was – Things that are current will not always be current

***Do you need different tenses represented in relations list?

***List too long – will people actually look at it?

***Too detailed. Is this level what we really want in the long term?

***Needs something (a website where you can select) more dynamic for presentation of relationships – requires a multi-dimensional standard for multi-dimensional descriptions/catalogues. E.g. have categories of relationships e.g. ‘creation’ that open up into more expansive lists perhaps referencing other standards, e.g. RDA

Agent relationships look very complete

Relation of agent to record. Could there be more relationships? Problems of creator field in ISAD(G). Use RDA – have many more options especially for films that have many more relationships to agents. Difficulties of mapping onto other standards/systems.

Perhaps have a family relationship instead of specifying all – in the way that the standard defines extent but lets you populate it how you like, the standard could present categories of relationships that an institution breaks down with specifics how they like using other standards/ontologies, i.e. RDA

Suggest to look at RDA

Include collaborates with


Appendix B – Additional comments received

#1
What I felt about the introduction is that whilst it is an interesting, well written, scholarly essay, I do not agree with all of the statements it contains, particularly the footnote on p. 3.  The first sentence is fine but the rest makes me quite angry with its sweeping generalisations and equivocal attitude towards records.  Any statement put in front of this kind of document needs to represent the views of the archival community as a whole.  

#2
This looks fine to me. You’ve done well to pull together a lot of meandering talking and thinking! I have no additions / changes.

#3
The aims and the purpose of RiC-CM are less clear than EGAD seems to have hoped. As I understand it, RiC-CM is intended to sit above and behind the four existing standards and tie them together, but I am aware that some colleagues in and beyond my own organisation have understood it to be more of a new or replacement metadata standard or schema. It would be very helpful to have this ambiguity clarified.

As it stands, the proposed model seems to be neither one thing nor the other. If the true point of RiC-CM is a conceptual model to create a coherent whole that draws on and feeds back into best practice, then the entities and properties need to be closer to those specified in the existing standards to facilitate its use. If, however, the point is also to try to meet particular needs that are widely thought to be not well met by the existing standards (e.g. for the description of born-digital materials), then the properties in particular are perhaps not different enough from existing standards to support this. Either way, explicit crosswalks to the existing standards are vital to spell out clearly where the differences lie. The first paragraph of section 1.8 states that ‘there has been extensive analysis of each of the existing standards’, so cross-referencing back to them should not be an unreasonable burden.

I have one other extra comment: I suggest adding a fourth role of records descriptions (1.6.4) to the existing three in section 1.6. This should acknowledge archival descriptions as a source of research data (and potentially even big data) in their own right, as well as something that supports the management, preservation and use of the records that they describe.

#4
I think the report strikes the right notes.