IEC 61508ED2: series 05 -- System and Hardware

Source: WeChat public account “汽车安全前瞻研究”
2020-06-08

[Author]

Renhong WENG, safety, security, and RAMS investigator.


First: hardware safety integrity

There are two parts:

1. Architectural constraints on hardware safety integrity

Route 1_H: based on hardware fault tolerance and safe failure fraction concepts

Route 2_H: based on component reliability data from end-user feedback, increased confidence levels, and hardware fault tolerance for specified safety integrity levels

2. Requirements for quantifying the effect of random hardware failures

Second: systematic safety

There are three routes for systematic safety:

Route 1_S: compliance with the requirements for the avoidance of systematic faults and the requirements for the control of systematic faults

Route 2_S: compliance with the requirements for evidence that the equipment is proven in use

Route 3_S: applicable to pre-existing software elements only


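Third: introduction to fault, error, and failure types

Each term below is followed by its IEC 61508 definition and then its ISO 26262 definition; N/A means the standard does not define the term. Lines starting with * or ** are comments.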
fault

IEC 61508: abnormal condition that may cause a reduction in, or loss of, the capability of a functional unit to perform a required function

ISO 26262: abnormal condition that can cause an element or an item to fail

fault avoidance

IEC 61508: use of techniques and procedures that aim to avoid the introduction of faults during any phase of the safety lifecycle of the safety-related system

fault tolerance

IEC 61508: ability of a functional unit to continue to perform a required function in the presence of faults or errors

**Problem: when a single actuator is a single-point fault, the random hardware failure targets for the given ASIL or SIL may be met, yet the architectural constraints of Route 1_H are not fulfilled.

ISO 26262: ability to deliver a specified functionality in the presence of one or more specified faults

**This definition does not capture the essence of fault tolerance.

**Problem: when a single actuator is a single-point fault, the random hardware failure targets for the given ASIL or SIL may be met, yet the hardware safety concept cannot meet the systematic failure requirements. This is a weakness of ISO 26262.

failure

IEC 61508: termination of the ability of a functional unit to provide a required function or operation of a functional unit in any way other than as required

**Only the functional unit is used, with its different layers defined, which is easy to understand.

ISO 26262: termination of an intended behaviour of an element or an item due to a fault manifestation

**The vocabulary is not unified enough: terms such as element, component, and system inevitably lead to misuse in some scenarios.

random hardware failure

IEC 61508: failure, occurring at a random time, which results from one or more of the possible degradation mechanisms in the hardware

**Random hardware failures are also induced by design, BSR, environment, production, commissioning, operation, decommissioning, disposal, etc. Data sources such as SN 29500 and FIDES, as well as those that include production-related failures, can therefore be used. We prefer IEC 61709:2017 and no longer use IEC TR 62380.

**Preferably, the reliability analysis focuses on the period where the shape parameter beta = 1, which runs roughly from B20 to B80 of the lifetime, i.e. between 20% and 80% (personal expectation):

0% to 20%: quality issues caused by problems in design, manufacturing, product, EOL, etc.; the defect rate should stay below 100 ppm, limited by manufacturing cost

20% to 80%: constant failure rate, beta = 1, caused by random hardware faults; PFH shall be calculated for this period to check whether the functional safety design meets its reliability target

80% to disposal: beta > 1, driven by aging and environmental factors

ISO 26262: failure that can occur unpredictably during the lifetime of a hardware element and that follows a probability distribution

**ISO 26262 does not reveal the essence of random hardware failures.
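
The three lifetime phases in the comment above can be illustrated with a Weibull hazard function. This is a minimal sketch, assuming a two-parameter Weibull model; the function name `weibull_hazard`, the scale `eta`, and all numeric values are illustrative assumptions and come from neither standard.

```python
def weibull_hazard(t, beta, eta):
    """Hazard rate h(t) = (beta / eta) * (t / eta) ** (beta - 1) for t > 0."""
    if t <= 0 or beta <= 0 or eta <= 0:
        raise ValueError("t, beta and eta must be positive")
    return (beta / eta) * (t / eta) ** (beta - 1)


# Illustrative values only: scale eta = 10,000 h and three shape parameters.
ETA_HOURS = 10_000.0
for beta, phase in [(0.5, "early failures"), (1.0, "constant rate"), (3.0, "wear-out")]:
    early = weibull_hazard(1_000.0, beta, ETA_HOURS)
    late = weibull_hazard(9_000.0, beta, ETA_HOURS)
    if late < early:
        trend = "falling"
    elif late > early:
        trend = "rising"
    else:
        trend = "flat"
    print(f"beta={beta}: {phase}, hazard rate is {trend}")
```

With beta = 1 the hazard rate reduces to the constant 1/eta, which is the only phase in which a single failure rate per component is meaningful for a PFH calculation.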

systematic failure

IEC 61508: failure, related in a deterministic way to a certain cause, which can only be eliminated by a modification of the design or of the manufacturing process, operational procedures, documentation or other relevant factors

**Same as ISO 26262.

ISO 26262: failure related in a deterministic way to a certain cause, that can only be eliminated by a change of the design or of the manufacturing process, operational procedures, documentation or other relevant factors

dangerous failure

IEC 61508: failure of an element and/or subsystem and/or system that plays a part in implementing the safety function that:
a) prevents a safety function from operating when required (demand mode) or causes a safety function to fail (continuous mode) such that the EUC is put into a hazardous or potentially hazardous state; or
b) decreases the probability that the safety function operates correctly when required

ISO 26262: N/A. ISO 26262 has no equivalent of dangerous failure; by comparison, the following failures may be counted as dangerous:

- single-point failure

- residual failure

- multipoint failure

However, a safety-related failure is not necessarily a dangerous failure.


safe failure

IEC 61508: failure of an element and/or subsystem and/or system that plays a part in implementing the safety function that:
a) results in the spurious operation of the safety function to put the EUC (or part thereof) into a safe state or maintain a safe state; or
b) increases the probability of the spurious operation of the safety function to put the EUC (or part thereof) into a safe state or maintain a safe state

ISO 26262: safe failures are likewise taken to include multipoint failures and failures whose failure rate does not contribute to the violation of a safety goal.

spurious operation

IEC 61508: spurious activations are considered safe failures that result in the safe state of the EUC

ISO 26262: N/A

dependent failure

IEC 61508: failure whose probability cannot be expressed as the simple product of the unconditional probabilities of the individual events that caused it

ISO 26262: failures that are not statistically independent, i.e. the probability of the combined occurrence of the failures is not equal to the product of the probabilities of occurrence of all considered independent failures

*ISO 26262 pays much attention to this aspect.
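
Both definitions above say that the joint probability differs from the product of the individual probabilities. The sketch below shows this for two redundant channels that share one common cause. The model (each channel fails on its own with probability `p_own`, and a shared cause with probability `p_common` fails both) and all numbers are assumptions made for illustration only.

```python
def channel_failure_probability(p_own, p_common):
    """Probability that one channel fails, from its own cause or the shared one."""
    return 1.0 - (1.0 - p_own) * (1.0 - p_common)


def joint_failure_probability(p_own, p_common):
    """Probability that both channels fail.

    Either the shared cause occurs, or it does not occur and both
    channels fail independently on their own.
    """
    return p_common + (1.0 - p_common) * p_own ** 2


# Illustrative values only.
p_own, p_common = 1e-3, 1e-4
product_of_marginals = channel_failure_probability(p_own, p_common) ** 2
joint = joint_failure_probability(p_own, p_common)
print(f"product of individual probabilities: {product_of_marginals:.3e}")
print(f"actual joint probability:            {joint:.3e}")
```

With these numbers the joint probability is dominated by the shared cause and is far larger than the product, which is why redundancy claims need a dependent failure analysis.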

error

IEC 61508: discrepancy between a computed, observed or measured value or condition and the true, specified or theoretically correct value or condition

*IEC 61508 and ISO 26262 use the same definition.

ISO 26262: discrepancy between a computed, observed or measured value or condition, and the true, specified or theoretically correct value or condition

soft-error

IEC 61508: erroneous changes to data content but no changes to the physical circuit itself

*Mostly relevant to semiconductors.

ISO 26262: N/A

*Described in ISO 26262-11.

safe failure fraction (SFF)

IEC 61508: a ratio of failure rates, if simplified to a constant failure rate distribution

ISO 26262: N/A
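
The original formula for the safe failure fraction did not survive. For constant failure rates it is commonly written as below, where the symbols are assumptions introduced here: \lambda_S is the safe failure rate, \lambda_{DD} the dangerous detected failure rate, and \lambda_{DU} the dangerous undetected failure rate of the element.

```latex
\[
\mathrm{SFF} = \frac{\sum \lambda_S + \sum \lambda_{DD}}
                    {\sum \lambda_S + \sum \lambda_{DD} + \sum \lambda_{DU}}
\]
```

As a made-up example, \lambda_S = 50 FIT, \lambda_{DD} = 40 FIT and \lambda_{DU} = 10 FIT give SFF = 90 / 100 = 90%.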
SPFM (single-point fault metric)

IEC 61508: N/A

LFM (latent-fault metric)

IEC 61508: N/A
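
The ISO 26262 side of SPFM and LFM is missing above. As a hedged sketch of how the two metrics are usually computed from failure rates (following the formulas of ISO 26262-5, Annex C, as commonly stated), the code below sums over the safety-related hardware elements. The dictionary keys, the element names, and the FIT values are hypothetical.

```python
def spfm(elements):
    """Single-point fault metric: 1 - sum(SPF + RF) / sum(total)."""
    total = sum(e["total"] for e in elements)
    spf_rf = sum(e["spf"] + e["rf"] for e in elements)
    return 1.0 - spf_rf / total


def lfm(elements):
    """Latent-fault metric: 1 - sum(latent MPF) / sum(total - SPF - RF)."""
    remaining = sum(e["total"] - e["spf"] - e["rf"] for e in elements)
    latent = sum(e["mpf_latent"] for e in elements)
    return 1.0 - latent / remaining


# Hypothetical failure rates in FIT for two safety-related elements.
elements = [
    {"name": "mcu", "total": 100.0, "spf": 0.0, "rf": 1.0, "mpf_latent": 5.0},
    {"name": "sensor", "total": 50.0, "spf": 0.0, "rf": 0.5, "mpf_latent": 2.0},
]
print(f"SPFM = {spfm(elements):.2%}")
print(f"LFM  = {lfm(elements):.2%}")
```

Unlike the SFF, both metrics are evaluated per safety goal at item level rather than per element.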

PFDavg

IEC 61508: mean unavailability of an E/E/PE safety-related system to perform the specified safety function when a demand occurs from the EUC or EUC control system

*For low demand systems.

ISO 26262: N/A
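
To make the PFDavg definition above concrete, the sketch below assumes the simplified single-channel (1oo1) equations of IEC 61508-6, Annex B, and the low demand SIL bands of IEC 61508-1. The function names and input values are illustrative assumptions, and the result is only as good as the constant failure rate assumption behind it.

```python
def pfd_avg_1oo1(lambda_du, lambda_dd, proof_test_interval_h, mrt_h, mttr_h):
    """Average probability of failure on demand for a 1oo1 channel.

    lambda_du / lambda_dd: dangerous undetected / detected failure rates per hour.
    The channel equivalent mean down time t_ce weights the two repair paths.
    """
    lambda_d = lambda_du + lambda_dd
    if lambda_d == 0.0:
        return 0.0
    t_ce = (lambda_du / lambda_d) * (proof_test_interval_h / 2.0 + mrt_h) \
        + (lambda_dd / lambda_d) * mttr_h
    return lambda_d * t_ce


def sil_for_low_demand(pfd_avg):
    """Map PFDavg to its low demand SIL band; 0 means outside SIL 1 to SIL 4."""
    bands = [(1e-5, 1e-4, 4), (1e-4, 1e-3, 3), (1e-3, 1e-2, 2), (1e-2, 1e-1, 1)]
    for low, high, sil in bands:
        if low <= pfd_avg < high:
            return sil
    return 0


# Illustrative inputs: yearly proof test, 8 h repair times.
pfd = pfd_avg_1oo1(1e-7, 9e-7, 8760.0, 8.0, 8.0)
print(f"PFDavg = {pfd:.3e}, SIL band = {sil_for_low_demand(pfd)}")
```

With these inputs the result is about 4.46e-4. The band found this way is only the quantitative part of the claim; the architectural constraints and the systematic requirements still apply.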
PFH

IEC 61508: average frequency of a dangerous failure of an E/E/PE safety-related system to perform the specified safety function over a given period of time

PMHF

IEC 61508: N/A

ISO 26262: PMHF is covered in ISO 26262-5 and ISO 26262-10

MTTR

IEC 61508: expected time to achieve restoration

MTTR encompasses:
• the time to detect the failure (a); and,
• the time spent before starting the repair (b); and,
• the effective time to repair (c); and,
• the time before the component is put back into operation (d)

ISO 26262: N/A

EOTTI

IEC 61508: N/A

ISO 26262: specified time-span during which emergency operation can be maintained without an unreasonable level of risk


Fourth: introduction to the hardware routes

1. Route 1_H

a) determine the safe failure fraction of all individual elements. In the case of redundant element configurations, the SFF may be calculated by taking into consideration the additional diagnostics that may be available (e.g. by comparison of redundant elements);
b) determine the hardware fault tolerance of the subsystem;
c) determine the maximum safety integrity level that can be claimed for the subsystem if the elements are type A from Table 2 of IEC 61508-2;
d) determine the maximum safety integrity level that can be claimed for the subsystem if the elements are type B from Table 3 of IEC 61508-2.
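
Steps a) to d) amount to a table lookup. The sketch below encodes the maximum allowable SIL by safe failure fraction band and hardware fault tolerance. The values are assumed to match Tables 2 and 3 of IEC 61508-2 and should be checked against the standard before use; the names `max_sil`, `TYPE_A` and `TYPE_B` are illustrative.

```python
# Rows: SFF band; columns: hardware fault tolerance 0, 1, 2.
# 0 means "not allowed". Assumed to match IEC 61508-2 Tables 2 and 3.
TYPE_A = [
    (1, 2, 3),  # SFF < 60 %
    (2, 3, 4),  # 60 % <= SFF < 90 %
    (3, 4, 4),  # 90 % <= SFF < 99 %
    (3, 4, 4),  # SFF >= 99 %
]
TYPE_B = [
    (0, 1, 2),  # SFF < 60 %
    (1, 2, 3),  # 60 % <= SFF < 90 %
    (2, 3, 4),  # 90 % <= SFF < 99 %
    (3, 4, 4),  # SFF >= 99 %
]


def sff_band(sff):
    """Return the table row index for a safe failure fraction in [0, 1]."""
    if not 0.0 <= sff <= 1.0:
        raise ValueError("sff must be between 0 and 1")
    if sff < 0.60:
        return 0
    if sff < 0.90:
        return 1
    if sff < 0.99:
        return 2
    return 3


def max_sil(sff, hft, element_type):
    """Maximum SIL claimable for a subsystem under Route 1_H (0 = not allowed)."""
    if hft < 0:
        raise ValueError("hft must be non-negative")
    table = {"A": TYPE_A, "B": TYPE_B}[element_type]
    return table[sff_band(sff)][min(hft, 2)]


print(max_sil(0.95, 0, "B"))  # single-channel type B element with 95 % SFF
```

The lookup shows why a single type B channel with SFF below 60% cannot be used under Route 1_H, however low its failure rate.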

[Tables: maximum allowable SIL for Type A and Type B elements]

[Figures: synthesis of system SIL capability determination; manners of system SIL determination]


2. Route 2_H

The reliability data used shall be:

a) based on field feedback for elements in use in a similar application and environment; and,

b) based on data collected in accordance with international standards; and,

c) evaluated according to:
i) the amount of field feedback; and,
ii) the exercise of expert judgement; and where needed,
iii) the undertaking of specific tests


Type A: confidence level >=90%

Type B: confidence level >=60%
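
To show how a confidence level turns field feedback into a usable failure rate, the sketch below computes a one-sided upper confidence bound from the number of observed failures and the accumulated device hours. It assumes a constant failure rate (Poisson-distributed failure counts) and uses only the standard library; the function names and the field data are hypothetical.

```python
import math


def poisson_cdf(n, mu):
    """P(N <= n) for a Poisson distribution with mean mu."""
    term = math.exp(-mu)
    total = term
    for k in range(1, n + 1):
        term *= mu / k
        total += term
    return total


def failure_rate_upper_bound(failures, device_hours, confidence):
    """One-sided upper bound on the failure rate (per hour).

    Finds the mean mu at which observing `failures` or fewer events has
    probability 1 - confidence, then divides by the observation time.
    """
    if failures < 0 or device_hours <= 0 or not 0.0 < confidence < 1.0:
        raise ValueError("invalid input")
    target = 1.0 - confidence
    lo, hi = 0.0, 1.0
    while poisson_cdf(failures, hi) > target:  # bracket the solution
        hi *= 2.0
    for _ in range(200):  # bisection; the CDF decreases as mu grows
        mid = (lo + hi) / 2.0
        if poisson_cdf(failures, mid) > target:
            lo = mid
        else:
            hi = mid
    return hi / device_hours


# Hypothetical field data: 2 failures in 5 million device hours.
for level in (0.60, 0.90):
    rate = failure_rate_upper_bound(2, 5e6, level)
    print(f"{level:.0%} confidence: lambda <= {rate:.3e} per hour")
```

A higher confidence level gives a more pessimistic failure rate for the same field data, so the amount of feedback needed grows with the confidence demanded.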


  1. Online diagnostic tests are comparable to the DTTI or the safety mechanisms in ISO 26262.

  2. Proof tests are more than verification or validation; they add up all the evidence that self-proof can provide.

  3. IEC 61508 considers repair times for detected failures (MTTR), which is better than ISO 26262 ED2; ISO 26262 ED2 defines the EOTTI, but not clearly enough.

  4. IEC 61508 gives excellent consideration to the DTTI.


Fifth: Proven in use

ISO 26262 does not define proven in use clearly enough to be understood, so the IEC 61508 requirement is listed here:

An element shall only be regarded as proven in use when it has a clearly restricted and specified functionality and when there is adequate documentary evidence to demonstrate that the likelihood of any dangerous systematic faults is low enough that the required safety integrity levels of the safety functions that use the element is achieved. Evidence shall be based on analysis of operational experience of a specific configuration of the element together with suitability analysis and testing.



[REF]

IEC 61508 series

ISO 26262 series

