Time Series Data Quality
Duration: Feb 2016 - May 2017
Phase 1: Time Series Data Analysis
Distribution Network Operators (DNOs) keep historic time series data in a number of databases which can be interrogated on an ‘as needed’ basis, generally for planning purposes. Because of the sheer volume of data, errors, omissions and underlying trends are difficult to spot by manual inspection alone. This project investigated the use of data analytics to assess data quality and to identify such trends and issues.
Established ‘Big Data’ analytics techniques were used to process the huge amounts of time series data held for one licence area in order to identify data quality issues and emerging trends.
Several years’ worth of historic time series data for the whole WPD licence area was taken for analysis. A set of repeatable and scalable processes was established to:
- Identify gaps, errors, zeros and "stuck" values in the data
- Identify suspect/defective data
- Create rules to replace missing/defective data
- Assign directions to power flows
- Identify the causes of suspect data through common patterns
- Conduct value-adding data analysis alongside the above to investigate some of the potential uses for this data.
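The first of these processes can be illustrated with a minimal sketch. It is not the project's actual tooling: the function name `flag_quality_issues`, the half-hourly sampling interval and the threshold of four repeated samples for a "stuck" value are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def flag_quality_issues(readings, interval=timedelta(minutes=30), stuck_run=4):
    """Return (timestamp, flag) pairs for suspect points in one analogue.

    readings:  list of (datetime, float or None), sorted by time.
    interval:  expected spacing between samples (assumed half-hourly).
    stuck_run: number of identical consecutive non-zero samples treated
               as a "stuck" value (assumed threshold).
    """
    flags = []
    prev_time = None
    prev_value = None
    run_length = 0
    for ts, value in readings:
        # Gap: the previous sample is further back than the expected interval.
        if prev_time is not None and ts - prev_time > interval:
            flags.append((ts, "gap"))
        # Missing or zero readings are flagged individually.
        if value is None:
            flags.append((ts, "missing"))
        elif value == 0:
            flags.append((ts, "zero"))
        # Stuck: the same non-zero value repeated for stuck_run samples or more.
        if value is not None and value != 0 and value == prev_value:
            run_length += 1
        else:
            run_length = 1
        if run_length >= stuck_run:
            flags.append((ts, "stuck"))
        prev_time, prev_value = ts, value
    return flags
```

Flags produced this way only mark candidates for review. Whether a zero or a repeated value is a genuine defect depends on the site, which is why the identification of causes through common patterns is listed as a separate step.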
The project also looked at what other information could be incorporated into the ‘Big Data’ analytics to further validate data quality.
Objectives
- Produce repeatable processes that can accurately identify the conditions set out above
- Produce lists of actions regarding any defects found and pass to BAU for rectification under a separate programme of work
- Produce recommendations to improve time series data quality based on the outcomes of the project
Success Criteria
- Accurate identification of the conditions set out in the scope
- Successful correction of defects found
- Improvements to business processes based on recommendations from the project
Performance against Objective and Success Criteria
The project identified and classified large numbers of issues and passed them to a new work programme charged with rectification. To achieve this, high-level management engagement was sought and a commitment to the work was obtained. Large numbers of analogues have been adjusted as a result of this subsequent action. The analytical tools developed during the project have been made available to the wider engineering teams and are operated on request. Some of these tools (or components of them) have also proved useful in other applications: not only error detection in source monitoring analogues, but also data visualisation and some innovation project work.
A new tool, which compared the sum of the currents on feeders with the current through transformers, has also helped to identify analogues that were incorrectly configured and analogues that failed to show attached generation.
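The idea behind such a comparison can be sketched as follows. This is an illustrative reconstruction, not the tool itself: the function name `current_sum_mismatches`, the use of signed current values in amps and the 10% tolerance are assumptions. In practice currents are phasors, so a simple sum of magnitudes is only an approximation.

```python
def current_sum_mismatches(transformer_amps, feeder_amps, tolerance=0.1):
    """Return sample indices where feeder currents do not sum to the transformer current.

    transformer_amps: list of floats, one per sample.
    feeder_amps:      dict of feeder name -> list of floats of the same length.
    tolerance:        allowed relative mismatch (assumed value, 10%).
    """
    mismatches = []
    for i, tx in enumerate(transformer_amps):
        # Sum the currents of every feeder supplied by this transformer.
        feeder_total = sum(series[i] for series in feeder_amps.values())
        # Guard against division by zero when the transformer reads 0 A.
        reference = max(abs(tx), 1e-9)
        if abs(feeder_total - tx) / reference > tolerance:
            mismatches.append(i)
    return mismatches
```

A persistent mismatch across most samples would point to a configuration problem such as a wrong scaling factor or sign, while a mismatch that appears only at certain times of day might indicate generation that the feeder analogue does not show.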
Installation of the analogues is now carried out to enhanced instructions.
Phase 2: Time Series Data Tool Feasibility
Phase 2 of the project looked at the feasibility of developing an integrated Time Series Database, mainly for analogue data gathered by the WPD IT systems. A fundamental requirement of the DNO transition towards DSO is a greater understanding and visibility of historic and real-time energy flows, as well as a much more complete data capability.
The solution being sought would allow permanent rectification of known data errors and would provide data visualisation, analysis and reporting capabilities. It was initially intended to conduct a competitive tender for a pilot solution. By way of preparation, a user engagement exercise was conducted to gather requirements so that these could be mapped onto potential solutions, and an initial survey of the solution landscape was carried out.
The project looked at developing a solution for improved collection, analysis and rectification of existing SCADA data used for network planning and control purposes.
The objective of the project was to establish the DNO's present and future requirements for the storage, analysis and interrogation of SCADA time series data, together with the associated development cost.
The project included the following main identifiable steps, each of which can be assessed for completeness as an indication of successful conclusion:
- User engagement, requirements gathering and user interaction, allowing progression to the next step;
- Requirements specification – production of a complete specification against which the existing market can be compared and potential suppliers may quote;
- Tender process resulting in identifiable solutions and costs;
- Selection of a preferred solution;
- Follow-on instructions and details for the subsequent implementation project.
Performance against Objective and Success Criteria
The project proceeded through the user engagement, requirements gathering and specification phases, and prepared and released the resulting documents. In doing the work, however, it was realised that there was considerable overlap with another innovation project, INM/CIM, which is preparing a “big data” abstraction layer view at a level above the main WPD business systems (GIS System, Control System and EAM systems) and which includes details of the “measurement” elements of the overall landscape. It was also found that, without a formal tender process to back the requests for information, suppliers were not forthcoming with details of their systems beyond standard sales literature, so a full assessment was not possible. As INM/CIM is continuing to a new level following its initial success, this work is being associated with that project to take this element forward.