Digital Systems Engineering Framework
EPRI's Digital Systems Engineering Framework is a collection of guides and methods designed to work as a single, integrated process to manage the lifecycle activities for modern digital technology integration in nuclear power environments. This process addresses the emerging complexity of nuclear plant integration to achieve safe and reliable plants and brings together the key technical topics that are relevant to the functional safety, security, and reliability of digital systems.
The Framework's products use the same modern methods and international standards successfully utilized in other safety-related industries to reduce implementation cost across the system and facility lifecycle, which lowers total digital systems lifecycle cost. The Framework enables control and monitoring innovation for new reactor designs as well as upgrades and retrofits of legacy plants.
The four principles upon which the Framework has been conceived are:
- Utilization of Industry Standards - IEC safety I&C standards from non-nuclear industries are utilized.
- Use of Systems Engineering - Replacing the current engineering process used in the nuclear industry. This includes System-Theoretic Process Analysis (STPA)[1] as a key diagnostic method.
- Risk Informed Engineering - Providing credible graded approaches, effectively managing the resources and replacing more deterministic methods.
- Capable Workforce - Developing the knowledge, skills and abilities to leverage new methods and challenge us all to think differently.
Digital Systems Engineering (DSE) Framework Elements
The DSE Framework is an ensemble of guidance documents that function in an integrated fashion to guide practitioners through the lifecycle activities necessary to safely and efficiently integrate digital I&C systems in nuclear power plants. The figure below shows all the DSE Framework Elements and describes the information flows among them.
The Digital Engineering Guide (DEG) takes the input from the Lifecycle Strategy Guide and develops the design with a systems engineering approach. HAZCADS is EPRI's method for analyzing the hazards and consequences of the design proposed by the DEG in order to identify the Unsafe Control Actions (UCAs) and Risk Reduction Targets (RRTs). UCAs and RRTs are passed to the downstream processes, along with the design elements, for development of the loss scenarios for each technical discipline (Reliability, Cyber Security, Human Factors and Electromagnetic Compatibility). The downstream processes return control methods to the DEG to inform a new iteration of the design and the subsequent evaluation cycle. Other Framework elements assist the practitioner with the development of requirements, testing, network design and configuration management processes. Lastly, the bounded design is used by the Digital Maintenance Guide (DMG) to propose the best maintenance strategies, methods and tools.
The subsections below provide a high-level description of these elements, and the table at the end of this section contains links to their latest versions on the EPRI website.
Digital Engineering Guide (DEG)
The Digital Engineering Guide (DEG) is the backbone of the Framework and provides an actionable, systems engineering-based approach, structured around defined activities. The DEG guides engineers in planning, designing, implementing, testing, installing, operating, and maintaining digital I&C systems and components in nuclear power facilities. The DEG includes a graded approach to all the required activities to achieve risk-informed engineering efficiency.
To systematically propose a graded approach, the DEG uses the following concepts:
- the configurability of the system or component subject to the design,
- the consequences of error in the design process, and
- the level of interest of the engineering design.
The DEG merges a variety of technical disciplines into an output product that achieves technical completeness. The activities required to address all the technical disciplines and their dependencies are described in detail throughout the different phases of the systems engineering process. The topics listed in the figure below are carried out throughout the project timeline in activities designed for each stage of the systems engineering Vee model.
Hazards and Consequences Analysis for Digital Systems (HAZCADS)
Faults and misbehaviors in digital I&C systems manifest themselves through their influence on controlled equipment, and risk models indicate that controlled equipment has varying degrees of importance relative to top events such as core damage or lost generation. Prior research has shown that there is no consensus method for accurately estimating the probability of systematic errors in digital I&C systems, which can lead to significant conservatism in the application of technical and administrative control methods to reduce the risk. Work published in EPRI 3002000509 has shown that System-Theoretic Process Analysis (STPA) is effective in identifying unsafe control actions (UCAs) in digital systems, and that Fault Tree Analysis (FTA) is effective in identifying random hardware failures in digital systems and their sensitivity relative to top events.
HAZCADS is a method developed by EPRI that expands on prior work with the goal of producing a practical, systems-oriented approach to addressing the risk of digital systems that leverages both STPA and FTA methods. The objectives of HAZCADS are:
- Establish an actionable risk-informed digital system hazards identification and assessment approach, structured around defined activities.
- Provide a production-ready, risk-informed hazard analysis approach that integrates qualitative and quantitative insights.
HAZCADS has been developed as a diagnostic tool for the design. It interfaces with downstream processes that identify risk control methods for each causal factor to complete the digital design evaluation and refinement.
The Downstream Processes - DRAM, TAM, HFAM and EMCAM
DRAM: Digital Reliability Analysis Methodology
This guideline adds causal factor and reliability analysis that develops loss scenarios and control measures based on the designs produced by the DEG and the hazards, unsafe control actions (UCAs) and risk reduction targets (RRTs) insights developed in HAZCADS. The insights from reliability analysis can identify the most effective control methods to eliminate or mitigate vulnerabilities by increasing random capability and systematic capability of the design elements as needed. The proposed control methods can be scored and combined to reveal their total effectiveness and selected based on the desired RRTs for each UCA. Iteratively applying the reliability analysis from early in the conceptual design phase and continuing into the detailed design phases provides an organized refinement method that achieves a sound converged design rapidly.
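The idea of scoring control methods, combining them, and selecting them against an RRT can be illustrated with a minimal sketch. This article does not describe DRAM's actual scoring scheme, so everything below is an assumption made for illustration: each control method is given a hypothetical probability of failure on demand (`pfd`), controls are treated as independent so their effects multiply (the convention used in layer-of-protection style analyses), and the RRT is expressed as a required risk reduction factor. The names `ControlMethod`, `combined_risk_reduction` and `select_controls` are invented for this example.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class ControlMethod:
    """Hypothetical control method; pfd is an assumed score, not a DRAM value."""
    name: str
    pfd: float  # assumed probability of failure on demand, 0 < pfd <= 1


def combined_risk_reduction(controls):
    """Risk reduction factor of the given controls, assuming independence.

    With no controls the product is 1, so the factor is 1 (no reduction).
    """
    return 1.0 / prod(c.pfd for c in controls)


def select_controls(candidates, rrt):
    """Greedily add the most effective candidates until the target is met.

    rrt is the required risk reduction factor for one UCA (assumed encoding).
    Returns the chosen controls and the factor they achieve together.
    """
    chosen = []
    for control in sorted(candidates, key=lambda c: c.pfd):
        if combined_risk_reduction(chosen) >= rrt:
            break
        chosen.append(control)
    return chosen, combined_risk_reduction(chosen)
```

A caller would compare the returned factor with the RRT: if the candidates run out before the target is met, the result signals that further control methods or a design change are needed, which mirrors the iterative refinement described above.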
TAM: Cyber Security Technical Assessment Methodology
Cyber security vulnerabilities and exploits are a digital hazard and reliability consideration that is best addressed using an engineering method. The Technical Assessment Methodology (TAM) guides the user through a methodical process that efficiently converges the assessment and mitigation activities to an effective result. The TAM develops a baseline system/device configuration that provides the assessment starting point and reduces the attack surface as far as possible while still achieving functional objectives. It further narrows the assessment to only the exploit objectives and attack pathways that are present. The TAM identifies the exploit mechanisms that remain and assembles them into exploit sequences, which are used to select appropriate mitigating control methods. The TAM then applies the best control methods based on the methods' effectiveness and implementation burden. These methods include intrinsic device and system features that can be leveraged to mitigate an exploit sequence and raise the exploit difficulty to a sufficiently high level.
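The narrowing steps described above (baseline configuration, pathways that are actually present, control selection) can be sketched as follows. This is only an illustration under stated assumptions, not the TAM's own data model: the class names, the 1 to 5 scales for effectiveness and burden, and the ranking rule (highest effectiveness first, lowest burden as the tie-breaker) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttackPathway:
    """Hypothetical attack pathway and the interfaces it depends on."""
    name: str
    requires: frozenset


@dataclass(frozen=True)
class Control:
    """Hypothetical control method with assumed 1-5 scores."""
    name: str
    mitigates: str      # name of the pathway this control addresses
    effectiveness: int  # assumed scale: higher is better
    burden: int         # assumed scale: lower is better


def harden(enabled, needed):
    """Baseline configuration: keep only interfaces the function needs."""
    return enabled & needed


def pathways_present(pathways, baseline):
    """Keep only pathways whose required interfaces all remain enabled."""
    return [p for p in pathways if p.requires <= baseline]


def best_control(controls, pathway):
    """Pick the most effective control for a pathway, lowest burden on ties.

    Returns None when no candidate control addresses the pathway.
    """
    options = [c for c in controls if c.mitigates == pathway.name]
    return min(options, key=lambda c: (-c.effectiveness, c.burden), default=None)
```

Disabling unused interfaces first means that pathways depending on them drop out before any control is evaluated, which is the sense in which the baseline reduces the assessment effort.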
HFAM: Human Factors Analysis Methodology
The Human Factors Analysis Methodology (HFAM) guide merges a variety of human factors methods and tools with the digital I&C engineering activities to produce Human System Interface designs that accommodate the users' abilities and limitations. HFAM develops loss scenarios for human unsafe control actions (UCAs) and adapts Human Reliability Analysis methods to verify that the proposed design achieves the desired levels of human reliability, proposing control methods to design out human error. HFAM provides criteria to systematically and consistently risk-inform the human factors program elements, grading the level of effort so that it is commensurate with the required risk reduction targets (RRTs).
EMCAM: Electromagnetic Compatibility Assessment Methodology
The Electromagnetic Compatibility Assessment Methodology (EMCAM) is a risk-informed I&C equipment assessment and control method allocation process designed to remove conservatism during equipment assessment where acceptable for a given application and to tailor the testing scope, level and control method allocation to that application. This can potentially reduce testing costs and supply greater decision-making flexibility during testing specification and assessment, as well as during the allocation of additional electromagnetic interference (EMI) controls. EMCAM enables a user to decide which technical or administrative control methods to apply during system design and assessment to achieve a risk reduction target (RRT) value and Control Effectiveness Profile (CEP) score provided by a corresponding hazards analysis process, such as Hazards and Consequences Analysis for Digital Systems (HAZCADS). EMCAM uses the CEP score value and equipment location information to identify recommended EMC testing parameters (acceptance criteria, scope and limits) and additional controls for nuclear facility I&C equipment modifications.
DSE Framework elements access table
Title | Revision # | Description |
---|---|---|
DEG: Digital Engineering Guide: Decision Making Using Systems Engineering | 0 | Core systems engineering method synthesized from IEC-15288, IEC-12207, and IEC 15298. It includes all relevant digital system lifecycle topics. The DEG takes strategic input from the Lifecycle Strategy Guide and formulates a bounded design description. |
HAZCADS: Hazards and Consequences Analysis for Digital Systems | 1 | Risk-informed digital hazards analysis using STPA and Fault Tree Analysis (FTA). Identifies the system hazards and associated Unsafe Control Actions (UCAs). FTA and risk matrices are used to formulate Risk Reduction Targets (RRTs). Implements Process Hazards Analysis (PHA)/Layer of Protection Analysis (LOPA) from IEC-61511. |
DRAM: Digital Reliability Analysis Methodology | 0 | Random and systematic reliability analysis, synthesized from IEC-61508. Identifies loss scenarios and control measures, and forms part of LOPA. |
TAM: Cyber Security Technical Assessment Methodology: Risk Informed Exploit Sequence Identification and Mitigation | 1 | A technical cyber security assessment method. Identifies cyber security vulnerability classes and develops exploit sequences to formulate the associated control measures that protect against, detect, and respond to/recover from cyber UCAs while meeting the required RRTs. |
HFAM: Human Factors Analysis Methodology for Digital Systems: A Risk-Informed Approach to Human Factors Engineering | 0 | Proposes a risk-informed HFE process that integrates HRA methods to evaluate the effectiveness of the Human System Interface design control methods in designing out human error and meeting the required RRTs. |
EMCAM: Electromagnetic Compatibility Assessment Methodology | 0 | Identifies EMC vulnerability classes. Develops and scores protect, detect, and respond/recover control methods using the RRT. |
Digital Systems Engineering: Requirements Engineering Guideline | 0 | Provides guidance on engineering actionable, bounded, and testable requirements. Describes requirements development processes that address the unique and integrated nature of digital system hardware and software, and allow integration with the DEG. |
Digital Systems Engineering: Configuration Management Guideline | 0 | Develops the strategy and methods to identify and manage hardware and software configuration items. |
Digital Systems Engineering: Test Strategies and Methods | 0 | Provides guidance on testing digital components and systems. The methods in this guideline are intended to be used in coordination with the DEG and are extensively mapped to and aligned with the testing activities in the DEG. This ensures that each activity in the DEG with testing dependencies has extended guidance for that activity. |
Digital Systems Engineering: Network Design Guide | 0 | Expands and complements the engineering framework to enhance the design, integration, installation, and testing of both wired and wireless technologies in a nuclear industrial environment. This guide presents network use cases as examples that are typical for the nuclear industry. Each use case is presented with example options and reasoning to illustrate various tradeoffs. Interface tradeoffs between OT and IT networks are also illustrated. |
Digital Systems Engineering: Digital I&C Lifecycle Strategy Guide | 0 | Provides guidance on the overall system lifecycle and detailed guidance on elements of IEC-15288 not covered by the DEG. It helps users become familiar with processes that enable systems engineering efficiencies, such as standardization, enterprise architecture, and system architecture. It also provides templates for documents used to enable digital lifecycle management. |
DMG: Digital Maintenance and Management Guide | Under development | O&M Phase Guide on maintenance and management of digital equipment |
Advanced Assurance Guideline: a Risk Informed Approach | Under development | Provides guidance for developing compelling assurance methods (Claim, Argument and Evidence approach) that can be adopted by industry and regulators alike. |
DEG Implementation in the US
The US nuclear industry has adopted NISP-EN-04, "Standard Digital Engineering Process", as the procedure that provides direction specific to digital changes for nuclear power plants. This procedure was developed as the digital-specific addendum to IP-ENG-001, "Standard Design Change", under the same mandatory Efficiency Bulletin (EB 17-06) [2]. NISP-EN-04 includes the same process phases as IP-ENG-001, tailored with DEG-specific supplemental information for digital implementations, including cyber security. It instructs the user about "what to do". In this context, the DEG provides detailed guidance using a modern engineering process with digital design considerations, information item guidance, and division of responsibility methods to improve "skill of the craft". In summary, the DEG instructs the user about "how to do it". Completing this industry initiative, EPRI's Digital Training and Tech Transfer opportunities assist the industry in improving the quality of digital designs through a scalable and robust technical framework.
Digital Systems Engineering Users Group (DSEUG)
The Digital Systems Engineering Users Group is a community and environment where users of the DSE guidelines and the associated training products can share their experiences with one another. The group will also enable users to share common design packages, cyber security assessments, training needs, and other digital engineering output that does not contain third-party IP.
Notes
Record of Revisions
Number | Date | Description of changes |
---|---|---|
0 | August 2024 | Initial release |