
Data Management Planning for Secure Services (DMP-SS)




G-CLOUD provider meets NHS IG Toolkit requirements

By F D ( Tito ) Castillo, on 21 May 2013

Following our earlier post announcing that the epiLab-SS service now meets the NHS criteria for information security and governance (Level 2), we can confirm that AIMES Grid Services CIC Ltd, the data centre provider for epiLab-SS, have recently been notified that their submission to the NHS Information Governance Toolkit team has been reviewed and found to meet the requirements. This means that, in addition to their pre-existing ISO-27001 certification and G-Cloud Assured Services, AIMES now also meets the NHS criteria for information security and governance (Level 2).

AIMES status can be viewed here.

epiLab-SS status can be found here.

This will add to the dual certification (cloud/institution) model of information security assurance that we have been collaborating on and we look forward to improving it even further during future projects.

Information Security Explained

By F D ( Tito ) Castillo, on 26 March 2013


The value of information

Academic research involves the collection and management of information from disparate sources to build upon or refine a body of knowledge. Although research in itself should have some intrinsic value to society, the costs of the associated activities can also be considerable. These costs are not merely financial, since they also involve time and effort as well as potential ethical compromises “for the greater good”, as in animal experimentation or placebo-controlled trials.

Since research is costly, it follows that the component parts that are derived from or support this activity must be of value. In everyday life most people understand the need to protect valuables and typically carry out their own personal risk assessment to determine how to secure their own possessions, in many cases by locking doors, shredding papers or employing trusted third-party services. Generally, this is done without consciously thinking about the process, adopting societal norms (or ‘standards’) in respect of most security-related decisions.

Organizations are not individuals, and cannot carry out this instinctive risk assessment without a helping hand from some man-made constructs. James Reason, in his 2000 BMJ article “Human error: models and management”[1], elegantly describes the need to apply a System Approach, based on the assumption that human error is inevitable and that it is better to adapt the conditions within which humans work than to embark on futile attempts to change the human condition.

“When an adverse event occurs, the important issue is not who blundered, but how and why the defences failed.” Reason, J (2000)

Just as Reason’s paper has fundamentally affected our approach to risk management in UK healthcare, so it should also highlight the wider issues in relation to the risk of information security incidents in all aspects of the research data life-cycle. It clearly articulates the rationale for well-understood standards to support information security, in what is commonly referred to as an Information Security Management System (ISMS).

A standard for Information Security (ISO-27001)

Although it is perfectly reasonable to attempt to implement an ISMS without reference to existing standards, it is highly desirable to make use of them. A standard provides a well-established framework drawn from the past experience (and mistakes) of others. More importantly, standards offer reference points against which systems may be benchmarked and audited. Although it is not possible to measure security, it is possible to measure conformance to a prescribed standard. By adopting a suitable information security standard and being audited successfully against it, an organization can assure others that appropriate controls, with associated governance, are in place.

The internationally recognized information security standard is called ISO-27001[2] and forms the ‘requirements’ of an ISMS. Each of these requirements specifies things that ‘shall’ be done.  ISO-27002 is the associated Code of Practice for information security management [3] which describes what ‘should’ be done to implement the standard. The subtle distinction is that this second document simply provides recommendations for implementation of an ISO-27001 compliant ISMS.

ISO-27001 provides a taxonomy of 138 security controls plus an introductory clause introducing risk assessment and treatment. Each of the security categories contains one or more controls that are designed to meet the control objective. The controls that are described within the standard are not an exhaustive list and, depending on the results of risk assessment, not all controls will be required for a given ISMS.

Properties of an ISO-27001 ISMS


Any meaningful discussion of information security must begin with a simple question: ‘what are we seeking to secure?’ Although this may seem to be a trivial question, it is actually of fundamental importance because the scope of the system must be defined; in other words, the boundaries of the information to be secured must be clearly described.

The development of an ISMS that complies with the complete ISO-27001 standard is a major challenge for any organization, and success depends on clearly defining the scope of such a system: too small and the process is rarely cost-effective, but too large and it may be unachievable. In practice, an initial high-level risk assessment and cost-benefit analysis should help to identify the appropriate focus for such a system.

Risk Assessment

The cornerstone of an ISMS is effective risk assessment. Risk assessments are difficult to carry out and there is no silver bullet. The key point is that risk assessment is part of an on-going process of continuous improvement. In basic terms there is a series of steps that need to be followed.

  1. Identify the information assets that need to be protected.
  2. Identify any vulnerabilities that relate to these assets.
  3. Identify threats that need to be guarded against.
  4. Estimate the likelihood of threats exploiting vulnerabilities (otherwise known as risks).

To be systematic you need to define a threshold level of ‘acceptable risk’ above which additional controls will be required.
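The four steps and the ‘acceptable risk’ threshold can be sketched as a very small risk register. This is only an illustration: the 1 to 5 scales, the use of an impact rating alongside likelihood, the threshold value and the example entries are all assumptions made for the sketch, not something prescribed by ISO-27001.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    asset: str          # step 1: the information asset to protect
    vulnerability: str  # step 2: a weakness relating to that asset
    threat: str         # step 3: what might exploit the weakness
    likelihood: int     # step 4: estimated on an assumed 1-5 scale
    impact: int         # assumed extra factor, also on a 1-5 scale

    @property
    def score(self) -> int:
        # A common simple scoring choice: likelihood multiplied by impact.
        return self.likelihood * self.impact


# Hypothetical threshold of 'acceptable risk'.
ACCEPTABLE_RISK = 8


def needs_controls(risks, threshold=ACCEPTABLE_RISK):
    """Return the risks scoring above the threshold, highest first."""
    above = [r for r in risks if r.score > threshold]
    return sorted(above, key=lambda r: r.score, reverse=True)


# Invented example entries, for illustration only.
register = [
    Risk("patient-level dataset", "unencrypted laptop", "theft", 3, 5),
    Risk("survey codebook", "single copy on shared drive", "accidental deletion", 2, 2),
    Risk("analysis server", "unpatched software", "remote intrusion", 2, 5),
]

for risk in needs_controls(register):
    print(f"{risk.asset}: {risk.threat} via {risk.vulnerability} (score {risk.score})")
```

Because risk assessment is an on-going process, a register like this would be re-scored at each review rather than filled in once.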


The ISO 27000 series documents provide a taxonomy of 138 controls, along with guidance on their implementation. A key facet of all controls is that they need to be owned by someone (i.e. a responsible party or organization), and it should be possible to define means by which the effectiveness of each control may be assured and audited. The list of 138 controls is not intended to be exhaustive and it’s important to consider additional controls, if required, that are not explicitly referred to in the standard.

Statement of Applicability (SoA)

ISO-27001 prescribes the creation of a summary document that itemizes all of the 138 controls plus any additional controls. For each control it states clearly whether the control has been selected and, if so, where evidence of the control can be found. Where controls have not been selected there should be clearly stated reasons for this. The SoA acts as a summary reference document that, taken in conjunction with the Scope Statement, should provide an auditor with a high-level view of an ISMS.
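As a rough sketch of what an SoA records, each control can be held as one entry with a selected flag, a pointer to evidence and a reason for exclusion, and the document can then be checked for gaps. The field names, control identifiers and entries here are hypothetical placeholders and do not follow the numbering used in the standard.

```python
from dataclasses import dataclass


@dataclass
class SoAEntry:
    control_id: str          # placeholder identifier, not the standard's numbering
    title: str
    selected: bool
    evidence: str = ""       # where evidence of a selected control can be found
    justification: str = ""  # why an unselected control was left out


def soa_gaps(entries):
    """Return a list of problems; an empty list means the SoA is complete."""
    problems = []
    for entry in entries:
        if entry.selected and not entry.evidence.strip():
            problems.append(f"{entry.control_id}: selected but no evidence reference")
        if not entry.selected and not entry.justification.strip():
            problems.append(f"{entry.control_id}: excluded without a stated reason")
    return problems


# Invented entries, for illustration only.
soa = [
    SoAEntry("C-01", "Backup of research data", True, evidence="Backup procedure, section 3"),
    SoAEntry("C-02", "Controls for e-commerce services", False, justification="No e-commerce in scope"),
    SoAEntry("C-03", "Visitor access to server room", True),
    SoAEntry("X-01", "Additional local control: data release review", False),
]

for problem in soa_gaps(soa):
    print(problem)
```

In this example the check reports C-03 (selected, but with no evidence reference) and X-01 (excluded without a reason), which are exactly the omissions an auditor would look for.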

Dynamic characteristics

Like many similar management systems, an ISMS is dynamic and should follow the plan-do-check-act cycle (also known as the Deming Cycle). Made popular by Dr W. Edwards Deming, the father of modern quality control, the approach involves a process of continuous improvement through multiple iterations. It is worth noting that other management system standards, like ISO-9001, apply similar cyclical process models, and a suitably-designed ISMS should be able to accommodate many of the requirements of these other systems.

The standard outlines the requirements of each of these four steps in the cycle concisely, within just four pages, before going on to provide requirements for:

  1. Documentation (including document and record control)
  2. Management responsibility in respect of their own commitment, provision of resources and programmes of training and awareness.
  3. Internal audit
  4. Management review
  5. Continuous Improvement, including corrective and preventive action

In practice, the dynamic aspect of the management of an ISMS is often the most difficult part to get right but this is where the iterative technique allows for successive improvement over time.


1. Reason J: Human error: models and management. BMJ 2000, 320(7237):768-770.

2. BSI: Information technology. Security techniques. Information security management systems. Requirements. In: BS ISO/IEC 27001:2005/BS 7799-2:2005. Edited by IST/33: BSI; 2005.

3. BSI: Information technology. Security techniques. Code of practice for information security management. In: BS ISO/IEC 27002:2005, BS 7799-1:2005, BS ISO/IEC 17799:2005. Edited by IST/33: BSI; 2005.


Data management planning – Report on 18th July 2012 Workshop

By F D ( Tito ) Castillo, on 14 January 2013

Data management planning

As a research community, we all appreciate that our research data are important assets. A great deal of time and money is dedicated to collecting and processing them. But their value beyond the initial research lies in the ease with which they can be shared and re-used to support further research. Researchers and the institutions they are part of need to plan for the complete data life-cycle – from data collection through to archiving – to facilitate this and ensure their data can realize their full potential.

Data management plans: why do we need them?

For researchers, data management planning should be an integral part of their research planning, allowing research projects to be more accurately costed and resourced.

For institutions, having standard data management processes and clear guidance for researchers on data management allows the institution to be confident its research retains value and safeguards against reputational damage.

UK research councils are placing increasing emphasis on the importance of publicly-funded research being shared in a timely way. To ensure that grant applicants are able to meet this requirement, a data management plan (DMP) is now required for grant applications by most research councils.

Meeting data management requirements

Data management poses a number of challenges to researchers and to institutions. Data storage requires infrastructure with capacity we could hardly have anticipated a few decades ago, and with it come the associated costs of hardware, facilities and energy.

Ensuring that data collected has adequate metadata using suitable metadata standards (for example Data Documentation Initiative (DDI), widely used in social science) is another element that needs consideration if data is to be easy to find and share. Not all areas of research have adopted common metadata frameworks as yet, although in some academic areas these are already well established.

Then there are issues around security that need to be considered, including controlling access to data, protecting confidential data and ensuring data is backed up appropriately.

Different areas of research have different requirements in each of these areas. Some, such as astrophysics, require far greater storage capacity and computational power, while others, such as health, need to consider patient confidentiality and often require far higher levels of data security.

What standards are required for data management?

Research funding councils do not stipulate specific standards for data management at this stage. They do require that it meet generally accepted standards and follow best practice.

Relating to this, it seems unlikely that funders will have the resources to check that researchers are complying with their DMPs. However, researchers who do not follow best practice risk significant damage to their reputations, and institutions risk losing valuable assets. It is in everyone’s best interests that good data management systems and standards are supported.

There are certified standards for data management and data security that could be used in some areas of research. Although externally audited information security management systems are not currently a requirement, internationally recognized certification would provide independent assurance of a high level of risk management and data security.

Making DMPs more than just a box-ticking exercise


To support researchers in data management planning, the Digital Curation Centre (DCC) has developed DMPonline, a web-based tool for creating DMPs. The tool was developed to incorporate the data management requirements of all the UK research councils. By mapping each council’s requirements to its 118 questions, DMPonline allows researchers to create tailored DMPs.
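The idea of mapping funder requirements onto a shared bank of questions can be illustrated with a toy example. This is not how DMPonline itself is implemented: the question texts, identifiers and funder names below are invented for the sketch.

```python
# Hypothetical shared question bank (DMPonline's real bank has 118 questions).
QUESTIONS = {
    "q1": "What data will be collected or created?",
    "q2": "Which metadata standard will be used?",
    "q3": "How will confidential data be protected?",
    "q4": "Where will the data be archived, and for how long?",
}

# Hypothetical mapping from each funder to the questions it requires.
FUNDER_REQUIREMENTS = {
    "Funder A": ["q1", "q2", "q4"],
    "Funder B": ["q1", "q3", "q4"],
}


def tailored_plan(funder):
    """Return the questions a researcher must answer for the given funder."""
    return [QUESTIONS[qid] for qid in FUNDER_REQUIREMENTS[funder]]


for question in tailored_plan("Funder B"):
    print(question)
```

Keeping one question bank and a separate mapping per funder means a change to a funder’s requirements only touches the mapping, not the questions themselves.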

Developed on an open source platform, DMPonline has the potential to integrate with a number of different systems. It also allows research councils to manage their specific guidance for each question.

Bridging the gap between researchers and local data services

Data management is not a task that can be undertaken by researchers in isolation from service managers and data resource planners. To be effective, it requires everyone to work together through every phase of data management planning, and this is where carefully considered DMPs would be invaluable.

For researchers in large institutions, finding the right person to speak to about data storage and processing can be time-consuming and confusing. Having clear guidance on who to contact could save time and streamline the production of DMPs.

Researchers also need to understand what the best practice is for managing and securing their particular type of data. The resilience and accessibility of data have implications for data storage costs. For example, could your data be backed up to tape rather than to another server? Retrieving it from tape in the event of system failure would take a couple of days, but the cost of the storage would be significantly less.

DMP tools like DMPonline have the potential to be used by institutions to bridge the gap between researchers and data services. They could either be hosted by the institution, or hosted by DMPonline and customized to incorporate the institution’s administrative requirements as well as those of particular funders. Contact details for the people responsible for different aspects of data management planning could then be provided as part of the guidance notes associated with specific questions. Additional questions around capacity and timings could be incorporated, allowing IT departments to better manage their resources.

There is also the potential for institution- or department-based systems to share information with DMPonline.

Data preservation: what’s involved?

Archiving data in an immutable form is the final phase in the data life-cycle. Most funders require that data be preserved for between five and ten years beyond the time-frame of the project, although some require longer. However, there is no specific guidance for this.

To facilitate sharing and re-use, data need to be in a format that won’t degrade and they need to be deposited in data repositories that are relatively easy to search and retrieve files from. This requires agreement on common metadata standards and the inclusion of the metadata in catalogues that other researchers can find easily.

A number of leading research publications now also require details on data management for published articles and for data to be available for scrutiny for a certain time after the publication date.

Although the cost of archiving data is far less than that of storing ‘live’ data, how this preservation is funded is under debate. Some councils do expect to see costs for data preservation included in the grant application, and some have their own data repositories, while other funders see data preservation as the responsibility of the institution.

Data management at UCL

At UCL, initial estimates from the newly formed Research Data Project are that the institution will require 2 petabytes (1 petabyte = 1,000 terabytes) for collecting and processing ‘live’ data, and another 2 petabytes for archiving research data, by the end of 2014. However, this will not comprise all the institution’s data, much of which is currently held on departmental networks. Exactly how the system will be funded in the long term is still being considered.

UCL generates vast quantities of research data, but researchers are often not clear on where their data needs to be stored or who to talk to about DMPs. Besides upgrading Legion (UCL’s current platform for computationally intensive research), there are a number of initiatives underway to investigate options for improving data management and to cope with the volume of data that is being produced.

The Research Data Project is in the early stages of considering how to increase the institution’s capacity for the storage of live data and for archiving data. Many departments do not have the funds to support this themselves. The Economic and Social Research Council (ESRC) now also requires all institutions to have a road-map of how they will be supporting data management in the future and UCL is developing a Research Data Policy which defines standards and identifies roles and responsibilities in the creation, storage and archiving of research data at UCL.

On a departmental level, epiLab-SS (a secure computing service at the UCL MRC Centre of Epidemiology for Child Health) is looking at how data management planning and information security can be integrated, and at creating a single secure system for data collection and management. epiLab-SS is currently entering the second stage in its bid to become an ISO27001:2005 Certified Data Centre (certification awarded Sept 2012). The JISC-funded DMP-SS (Secure Service) project is looking at using DDIv3 as a metadata standard for marking up the entire data life-cycle for epidemiology and public health research, from planning through collection to archiving. The information regarding data management could then be shared with DMPonline to generate DMPs.

Looking at data management challenges beyond the institution, UCL’s Medical School is part of a bid to set up an MRC e-Health Informatics Research Centre which will look at how NHS and research data can be integrated to support research (awarded in July 2012 and due to launch 1st May 2013).

In conclusion

Managing data is a complex undertaking. It would seem that infrastructure, costs and responsibilities for data preservation are issues that will take some time to resolve within institutions, especially those as large as UCL.

However, DMPs have the potential to support better resource planning and project costing if they become more than just a box-ticking exercise. For this to happen institutions need to put in place systems that encourage communications between researchers and data services and ensure that researchers understand the importance of data security and what this entails.

This report is based on the presentations and discussions around data management planning and public health research held at the Institute of Child Health on 18 July 2012. Organiser: Dr Tito Castillo, Senior Information Systems Consultant, MRC Centre of Epidemiology for Child Health