Developing Electronic File Structures: ARMA TR 23-2013
Let's see what's in this standard.
In the rationale we see that this standard is about the development of electronic file plans and should be used in conjunction with ISO 15489-1. That's good because I also got a copy of that standard via ILL.
Apparently a file plan is "a classification scheme that defines and identifies all files, including indexing and storage of the files, and referencing the disposition schedule for each file." It introduces some interesting concepts such as "data atlas", "data map", etc.
The standard sets up two axes of analysis: structured vs. unstructured, and record vs. non-record. It also presents different interpretations of the records lifecycle:
Information lifecycle model
- distribution and use
- storage and maintenance
- retention and disposition
- archival preservation
Progression of actions model
- storage and maintenance
Three ages model
- current records
- semi-current records
- non-current records
Vital records are particularly important. Long-term preservation may be particularly challenging. A controlled language is also very important (e.g., controlled vocabularies, glossaries, indices, ontologies, taxonomies, and thesauri). These tools are important for standardizing naming conventions, reducing duplication, etc. The standard gives some pretty blunt advice on terms:
- hire a consultant to create a controlled language tool
- locate an organization with similar functions and purchase the rights to use and customize its tool
- collaborate with similar organizations on the development of controlled language tools
- locate and use publicly or privately created controlled language tools
Lambe's Organizing Knowledge would be a pretty good reference here!
The standard then applies particular attention to taxonomies: "A taxonomy can be viewed as a type of geographical representation of an organization's controlled vocabulary... Since users in an organization possess varying levels of familiarity and skills, taxonomies help reduce information overload by providing multiple contextual cues ranging from functional to subject or series identification." The given example is functional:
- 0400 Financial management (FUNCTION)
- 0405 Accounting and revenue (ACTIVITY)
- - 05 Accounts payable (SERIES/SUBJECT)
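The functional example above can be sketched as a small tree. This is only an illustration of the function/activity/series idea: the `Node` class, the `paths` helper, and the hyphen-joined code path are my own assumptions, not anything the standard prescribes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One level of a functional taxonomy (function, activity, or series)."""
    code: str
    label: str
    level: str
    children: list["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        """Attach a child node and return it so calls can be chained."""
        self.children.append(child)
        return child


def paths(node, prefix=()):
    """Yield (code path, label path) for the node and all of its descendants."""
    chain = prefix + (node,)
    # Joining codes with "-" is an assumption made for this sketch.
    yield "-".join(n.code for n in chain), " / ".join(n.label for n in chain)
    for child in node.children:
        yield from paths(child, chain)


function = Node("0400", "Financial management", "FUNCTION")
activity = function.add(Node("0405", "Accounting and revenue", "ACTIVITY"))
activity.add(Node("05", "Accounts payable", "SERIES/SUBJECT"))

for code, label in paths(function):
    print(code, label)
```

Walking the tree gives every record series a full path back to its function, which is the kind of contextual cue the quoted passage describes.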
There are, of course, challenges with big bucket schemes and the standard references ANSI/NISO Z39.19-2005 (R2010). It notes that "one of the key features of taxonomic design is that it supports information-seeking behaviors such as browsing and searching."
The standard articulates several search approaches:
- boolean search
- full-text search
- keyword search
- metadata search
- natural language search
- phrase search
- proximity search
- SQL search
- truncated search
- wildcard operator search
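To make a few of these concrete, here is a rough sketch of boolean, wildcard/truncated, phrase, and proximity matching over a single string. The function names and the naive tokenizer are my own assumptions for illustration; a real search engine does all of this very differently (indexes, stemming, ranking).

```python
import fnmatch
import re


def tokens(text):
    """Lowercase word tokens; a deliberately naive tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def boolean_and(text, *terms):
    """True if every term appears as a token in the text."""
    words = set(tokens(text))
    return all(term.lower() in words for term in terms)


def wildcard_search(text, pattern):
    """True if any token matches a shell-style pattern such as 'invoic*'."""
    return any(fnmatch.fnmatchcase(tok, pattern.lower()) for tok in tokens(text))


def phrase_search(text, phrase):
    """True if the tokens of the phrase appear consecutively in the text."""
    words, target = tokens(text), tokens(phrase)
    return any(words[i:i + len(target)] == target
               for i in range(len(words) - len(target) + 1))


def proximity_search(text, a, b, max_gap):
    """True if tokens a and b occur within max_gap positions of each other."""
    words = tokens(text)
    pos_a = [i for i, w in enumerate(words) if w == a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == b.lower()]
    return any(abs(i - j) <= max_gap for i in pos_a for j in pos_b)


doc = "Accounts payable invoices approved by the finance manager in March"
print(boolean_and(doc, "finance", "march"))
print(wildcard_search(doc, "invoic*"))
print(phrase_search(doc, "accounts payable"))
print(proximity_search(doc, "invoices", "manager", 5))
```

Truncated search is treated here as a wildcard with a trailing `*`, which is a simplification.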
Of course, we could segue into a significant conversation related to search engine tuning and optimization, its impact on information architecture, etc.
We then move on to classification:
"To classify is to create order out of chaos, to put like things together." The challenge occurs when there is a lack of controlled language which might lead to non-functional classification... which isn't good. This issue leads us into naming conventions. Considerations include:
- order of terms in file titles
- use of spaces and special characters
- format of dates or other numerical information
- acronyms or abbreviations
- document type labels
- version identifiers
- addition of confidential, draft, or final status designators
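A naming convention that covers the considerations above could be enforced with a small helper. This is a minimal sketch: the element order, the underscore and hyphen separators, the `YYYYMMDD` date format, and the status vocabulary are all assumptions of mine, not rules taken from the standard.

```python
import re
from datetime import date


def build_filename(doc_type, subject, created, version, status=None, ext="pdf"):
    """Build a name shaped like DOCTYPE_subject_YYYYMMDD_vNN[_STATUS].ext.

    Every convention encoded here is hypothetical and chosen for illustration.
    """
    allowed_status = {None, "DRAFT", "FINAL", "CONFIDENTIAL"}
    if status not in allowed_status:
        raise ValueError(f"unknown status: {status!r}")
    # Collapse spaces and special characters into single hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", subject.lower()).strip("-")
    parts = [doc_type.upper(), slug, created.strftime("%Y%m%d"), f"v{version:02d}"]
    if status:
        parts.append(status)
    return "_".join(parts) + "." + ext


print(build_filename("rpt", "Q3 Budget & Forecast", date(2013, 9, 30), 2, "DRAFT"))
# prints RPT_q3-budget-forecast_20130930_v02_DRAFT.pdf
```

Putting the date in year-month-day order means the names sort chronologically, which is one common reason conventions fix the date format.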
The standard recommends using the information lifecycle as an approach for determining the type of metadata to be assigned. "Common metadata elements" include:
- unique identifier
- date of capture
- document type
- taxonomic attributes
- subject headings
- retention information
- change logs
- logical format
- file format
- other technical specifications (7,18,19)
Morville and Rosenfeld's Information architecture for the World Wide Web also has some decent guidance on this topic and the Dublin Core is always a good reference.
Some good advice: "To ensure that a metadata protocol is consistently applied and not overly burdensome to organizations, metadata creation and collection may be automated or inherited from higher levels of the taxonomy, whenever possible." Metadata has to consider business drivers, risk factors, length of retention, regulatory environment, format specification, legal liability, and IT infrastructure.
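The idea of inheriting metadata from higher levels of the taxonomy can be sketched with layered dictionaries. The field names loosely follow the "common metadata elements" listed earlier, but the values (including the "7 years" retention period) and the `effective_metadata` helper are hypothetical.

```python
from collections import ChainMap

# Metadata attached at each taxonomy level; all values are made up.
function_meta = {"function": "0400 Financial management", "retention": "7 years"}
activity_meta = {"activity": "0405 Accounting and revenue"}
series_meta = {"series": "05 Accounts payable", "document_type": "invoice"}

# Only the fields that are specific to one record are entered by hand.
record_meta = {
    "unique_identifier": "AP-000123",
    "date_of_capture": "2013-10-01",
    "file_format": "PDF",
}


def effective_metadata(*levels):
    """Merge levels given from most specific to least specific.

    ChainMap looks keys up in the first mapping that has them, so a value
    set on the record overrides one inherited from the series or function.
    """
    return dict(ChainMap(*levels))


meta = effective_metadata(record_meta, series_meta, activity_meta, function_meta)
print(meta["retention"])       # inherited from the function level
print(meta["document_type"])   # inherited from the series level
```

The person filing the record supplies three fields and inherits the rest, which is the burden reduction the quoted advice is after.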
Ultimately, the project has to involve business management, records and information management, and IT. There is also a need for policies and procedures, typically:
- Employee (HR) policy or handbook
- Employee use of infrastructure policy
- Information governance policy
- Intellectual property policy
- Internet policy
- Mobile device policy
- Privacy/security policy
- Records policy
- Social media policy
Now that we're into the process of actually creating the file structure we have to do a few things:
- The records inventory. Interview and survey to develop a list of the organization's records, including types, location, classification systems, and usage data. Consider email, storage, physical records, application databases, archives, backup, etc.
- Create a data map/data atlas.
- Data map. This is where we get more details. The data map is more than a records inventory. It is a "sweeping, defensible depiction of an organization's electronically stored information (ESI)." Include systems, media, business units, stewards, and custodians. The map gives one level of granularity but enables analysis. For example, it could facilitate a discussion related to the location of recruiting records. During eDiscovery, for example, a data map could be produced to identify potentially relevant ESI.
- Data atlas. It's more than a data map. It's a catalog of ESI and "may consist of maps, charts, lists, spreadsheets, tables with supplementary illustrations, databases, and analyses."
- access and security controls
- disposition procedures
- legal holds and eDiscovery
- organizational understanding of RIM/training
- physical and electronic file plans
- policies and procedures related to RIM
- retention and vital records schedules
- roles and responsibilities related to RIM
- identification of persons/timelines associated with file plan revision
- identification of triggers for revision/modification
- examination of continued relevance of controlled language tools
- adequacy of storage infrastructure
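Going back to the data map item in the list above: a very rough sketch of what one might look like as data, using the recruiting records example. The standard does not supply a template, so the `DataMapEntry` fields (taken from the "systems, media, business units, stewards, and custodians" wording), the sample rows, and the `locate` helper are all my own assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataMapEntry:
    """One row of a hypothetical data map: where a record type lives and who owns it."""
    record_type: str
    system: str
    media: str
    business_unit: str
    steward: str
    custodian: str


# Sample rows are invented for illustration only.
DATA_MAP = [
    DataMapEntry("recruiting records", "applicant tracking system", "database",
                 "HR", "HR director", "IT operations"),
    DataMapEntry("recruiting records", "shared drive", "file server",
                 "HR", "HR director", "IT operations"),
    DataMapEntry("invoices", "ERP", "database",
                 "Finance", "controller", "IT operations"),
]


def locate(data_map, record_type):
    """Return the sorted, de-duplicated systems holding a given record type."""
    return sorted({e.system for e in data_map if e.record_type == record_type})


print(locate(DATA_MAP, "recruiting records"))
```

Even a flat table like this answers the "where are the recruiting records?" question, and the same rows could be filtered by custodian when identifying potentially relevant ESI.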
We then get into a discussion of storage which is... not terribly informative. We have some information on media types, obsolescence, conversion and migration, DR and BC, etc. Smallwood's Information Governance book is more informative on the topic.
An interesting comment on shared drives: "Nearly every organization has network shared drives offering file storage for its employees. In many instances, however, RIM policies for the use of these drives are non-existent. If policies are in place, compliance and oversight are inadequate, allowing individual departments to develop information silos rife with duplicative and unnecessary files."
The guide continues with some concerns related to access, noting best practices by the American Health Information Management Association (34), the International Association of Privacy Professionals (35), NIST, and ISACA. There is also a bit of commentary on security and privacy regulations. Again, Smallwood is probably a better resource.
FRCP 26 and 34 get some mention. It notes that FRCP 30(b)(6) allows an organization to name an individual to testify on its behalf. A file plan is a good resource for that witness. Other resources include the EDRM, IGRM, GARP, Sedona principles, etc.
Then there's a technology discussion: ECM, EDMS, ERMS, etc.
Section 9 is about change management and training. It promises that Appendix A will serve as a guide. Key principles in change management include securing executive support, communicating the need for change, and establishing a change process that includes training. A communications plan could consider social platforms, newsletters, and promotion campaigns. There is some guidance for training but it seems pretty crappy. Apparently the American Management Association and the American Society for Training and Development have better guidance (a BOK maybe?).
Appendix A provides a WBS for a project.
Appendix B offers a pretty good case study.
The standard is interesting but it really doesn't give us a template for a data map/data atlas... which would be nice.
- ARMA TR 22-2012 Glossary of Records and Information Management Terms.
- ANSI/ARMA 5:2010 -- vital records programs: identifying, managing, and recovering business-critical records.
- ARMA -- Controlled language in records and information management
- ISO 5963:1985 Documentation -- methods for examining documents, determining their subjects, and selecting index terms
- ANSI/NISO Z39.85-2012, The Dublin Core metadata element set
- ARMA TR3-2009, Metadata: a basic tutorial for records managers
- ARMA Retention management for records and information.
- ARMA Guideline for outsourcing records storage to the cloud.
- ARMA Records management responsibility in litigation support
- DoD 5015.02-STD
- Naming conventions for electronic documents (Alberta Government Services)
- Best practice for file-naming (North Carolina Department of Cultural Resources)
- Managing electronic records in shared network drives -- good practice guidance. University of Stirling Records Management Office.