[NVIDIA] ARCHITECTED FOR SAFETY: A Brief Look at Safety Architecture

Source: WeChat public account “汽车功能安全” (Automotive Functional Safety)
2020-09-28

NVIDIA designs the DRIVE AGX platform to ensure that the autonomous vehicle can operate safely within the operational design domain for which it is intended. In situations where the vehicle is outside its defined operational design domain, or conditions change dynamically to fall outside it, our products enable the vehicle to return to a minimal risk condition (also known as a safe fallback state). For example, if an automated system detects a sudden change, such as heavy rainfall, that affects the sensors and therefore the driving capability within its operational design domain, the system is designed to hand off control to the driver. If significant danger is detected, the system is designed to come to a safe stop.
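The fallback behavior described above can be pictured as a small decision function. The following is only a minimal sketch, not NVIDIA's implementation: the two boolean inputs (`within_odd`, `significant_danger`) and the `Action` names are hypothetical simplifications of what would in practice be a much richer set of monitored conditions.

```python
from enum import Enum


class Action(Enum):
    CONTINUE = "continue automated driving"
    HANDOFF = "hand off control to the driver"
    SAFE_STOP = "come to a safe stop"


def select_action(within_odd: bool, significant_danger: bool) -> Action:
    """Pick a fallback action from two hypothetical boolean inputs."""
    # Significant danger takes priority over every other condition.
    if significant_danger:
        return Action.SAFE_STOP
    # Outside the operational design domain: request a driver takeover.
    if not within_odd:
        return Action.HANDOFF
    return Action.CONTINUE


# Heavy rainfall degrades the sensors, so the vehicle leaves its ODD.
print(select_action(within_odd=False, significant_danger=False).value)
```

Checking the danger condition first is a design choice of this sketch: it guarantees that a safe stop is never masked by a pending handoff request.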

NVIDIA follows the V-model (including verification and validation) at every stage of DRIVE development. We also perform detailed analyses of our products’ functionality and related hazards to develop safety goals for the product. For each identified hazard, we create safety goals to mitigate risk, each rated with an Automotive Safety Integrity Level (ASIL). ASILs A, B, C, and D indicate the level of risk mitigation needed, with ASIL D representing the most stringent level (the highest level of risk reduction). Meeting these safety goals is the top-level requirement for our design. By applying the safety goals to a functional design description, we create more detailed functional safety requirements.
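The text does not say how an ASIL is assigned. In ISO 26262, the rating is derived from three classifications of the hazardous event: severity (S1 to S3), exposure (E1 to E4), and controllability (C1 to C3). The sketch below uses the well-known shortcut that the standard's determination table is equivalent to summing the three class numbers; the function name is hypothetical, and this illustrates the standard's table rather than NVIDIA's internal process.

```python
def determine_asil(severity: int, exposure: int, controllability: int) -> str:
    """Return "QM", "A", "B", "C", or "D" for classes S1-S3, E1-E4, C1-C3."""
    if not (1 <= severity <= 3 and 1 <= exposure <= 4 and 1 <= controllability <= 3):
        raise ValueError("expected S in 1..3, E in 1..4, C in 1..3")
    # The table collapses to a sum: 10 -> D, 9 -> C, 8 -> B, 7 -> A, else QM
    # (quality management, meaning no ASIL-specific requirements apply).
    total = severity + exposure + controllability
    return {7: "A", 8: "B", 9: "C", 10: "D"}.get(total, "QM")


# A life-threatening (S3), frequently encountered (E4) hazard that is
# difficult for the driver to control (C3) receives the highest rating.
print(determine_asil(3, 4, 3))
```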


At the system-development level, we refine the safety design by applying the functional safety requirements to a specific system architecture. Technical analyses, such as failure mode and effects analysis (FMEA), fault tree analysis (FTA), and dependent failure analysis (DFA), are applied iteratively to identify weak points and improve the design. The resulting technical safety requirements are delivered to the hardware and software teams for development at the next level. We’ve also designed redundant and diverse functionality into our autonomous vehicle system to make it as resilient as possible. This ensures that, when a fault is detected, the vehicle will either continue to operate safely or reconfigure itself to compensate for the fault.
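To make the role of fault tree analysis and redundancy concrete, here is a minimal sketch of how a top-event probability is computed from AND and OR gates. The tree shape and all probabilities are invented for illustration, and the formulas assume the basic events are statistically independent, which is exactly the assumption a dependent failure analysis is meant to challenge.

```python
from math import prod
from typing import Iterable


def and_gate(probabilities: Iterable[float]) -> float:
    """All inputs must fail: multiply the (independent) probabilities."""
    return prod(probabilities)


def or_gate(probabilities: Iterable[float]) -> float:
    """Any input failing is enough: one minus the chance that none fails."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Hypothetical numbers: two redundant perception channels, each failing
# with probability 1e-3, plus a shared power supply failing with 1e-5.
both_channels_fail = and_gate([1e-3, 1e-3])
top_event = or_gate([both_channels_fail, 1e-5])

print(f"both channels fail: {both_channels_fail:.3e}")
print(f"loss of perception: {top_event:.3e}")
```

In this made-up tree, redundancy pushes the channel contribution well below the shared power supply's, so the common dependency dominates the result. That is the kind of weak point these analyses are intended to expose.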


At the hardware-development level, we refine the overall design by applying technical safety requirements to the hardware designs of the board and the chip (SoC or GPU). Technical analyses are used to identify any weak points and improve the hardware design. Analysis of the final hardware design is used to verify that risks related to hardware failures are sufficiently mitigated.


At the software-development level, we consider both software and firmware. We refine the overall design by applying technical safety requirements to the software architecture. We also perform code inspection, reviews, automated code structural testing, and code functional testing at both unit and integration levels. Software-specific failure mode and effects analyses are used to design better software. In addition, we design test cases for interface, requirements-based, fault injection, and resource usage validation methods.
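Of the validation methods listed, fault injection is perhaps the least self-explanatory: a test deliberately corrupts an input and checks that the software's safety mechanism reacts. The sketch below is a toy example under stated assumptions; the speed plausibility monitor, the fault names, and the 70 m/s limit are all hypothetical and do not come from the DRIVE software.

```python
import math


def speed_is_plausible(speed_mps: float, max_mps: float = 70.0) -> bool:
    """Monitor under test: accept only finite speeds in [0, max_mps]."""
    return math.isfinite(speed_mps) and 0.0 <= speed_mps <= max_mps


def inject_fault(speed_mps: float, fault: str) -> float:
    """Corrupt a healthy reading according to a named, hypothetical fault."""
    if fault == "nan":
        return float("nan")
    if fault == "out_of_range":
        return speed_mps + 1000.0
    if fault == "sign_flip":
        return -abs(speed_mps) - 1.0
    raise ValueError(f"unknown fault: {fault}")


def run_fault_injection(healthy_speed_mps: float = 20.0) -> dict:
    """Return, per fault, whether the monitor rejected the corrupted value."""
    return {
        fault: not speed_is_plausible(inject_fault(healthy_speed_mps, fault))
        for fault in ("nan", "out_of_range", "sign_flip")
    }


results = run_fault_injection()
assert speed_is_plausible(20.0)  # the healthy reading must still pass
assert all(results.values())     # every injected fault must be caught
print(results)
```

The healthy-reading check matters as much as the fault checks: a monitor that rejects everything would trivially pass a test that only injects faults.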


Once all necessary hardware and software components are complete, we integrate them and begin our verification and validation processes at the system level.


In addition to the autonomous vehicle simulation described under Simulation, we also perform end-to-end system testing and validation.



  • ALL IN ONE: AI TRAINING, SIMULATION, AND TESTING

NVIDIA’s infrastructure platform enables the training, simulating, and testing of autonomous driving applications. This includes a data factory to label millions of images, NVIDIA DGX SaturnV supercomputer for training DNNs, DRIVE Constellation for hardware-in-the-loop simulation, and other tools to complete our end-to-end system. 

Autonomous vehicle software development begins with collecting huge amounts of data from vehicles in globally diverse environments and situations. Multiple teams across many geographies access this data for labeling, indexing, archiving, and management before it can be used for AI model training and validation. We call this first step of the autonomous vehicle workflow the “data factory.” 

AI model training starts when the labeled data is used to train the models for perception and other self-driving functions. This is an iterative process; the initial models are used by the data factory to select the next set of data to be labeled. Deep learning engineers adjust model parameters as needed and then re-train the DNN, at which point the next set of labeled data is added to the training set. This process continues until the desired model performance and accuracy are achieved.
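The iterative loop described above can be sketched as follows. This is a toy outline only: `train`, `evaluate`, and `select` are hypothetical callables standing in for real training, validation, and data-selection steps, and appending a batch to `labeled` stands in for the labeling work itself.

```python
from typing import Any, Callable, List, Tuple


def data_factory_loop(
    pool: List[Any],
    train: Callable[[List[Any]], Any],
    evaluate: Callable[[Any], float],
    select: Callable[[Any, List[Any], int], List[Any]],
    target: float,
    batch_size: int,
) -> Tuple[Any, List[Any]]:
    """Grow the training set until the target score is reached or data runs out."""
    remaining = list(pool)
    batch, remaining = remaining[:batch_size], remaining[batch_size:]
    labeled: List[Any] = []
    model = None
    while batch:
        labeled.extend(batch)        # stands in for labeling the batch
        model = train(labeled)       # (re-)train on everything labeled so far
        if evaluate(model) >= target:
            break
        # The current model helps choose which data to label next.
        batch = select(model, remaining, batch_size)
        remaining = [item for item in remaining if item not in batch]
    return model, labeled


# Toy stand-ins: the "model" is just the number of labeled samples,
# and its score grows linearly with that number.
model, labeled = data_factory_loop(
    pool=list(range(10)),
    train=lambda data: len(data),
    evaluate=lambda m: m / 10,
    select=lambda m, rest, k: rest[:k],
    target=0.6,
    batch_size=2,
)
print(model, labeled)
```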

Self-driving technology must be evaluated again and again during development in a vast array of driving conditions to ensure that the vehicles are far safer than human-driven vehicles. Simulation runs test-drive scenarios in a virtual world, providing rendered sensor data to the driving stack and carrying out driving commands from the driving stack. Re-simulation plays back previously recorded sensor data to the driving stack. The AI model is finally validated against a large and growing collection of test data. 

  • The NVIDIA Solution – NHTSA Safety Element 


        OBJECT AND EVENT DETECTION AND RESPONSE 

Object and event detection and response refers to the detection of any circumstance that’s relevant to the immediate driving task, and the appropriate driver or system response to this circumstance. The NVIDIA DRIVE AV module is responsible for detecting and responding to environmental stimuli, both on and off the road. The NVIDIA DRIVE IX module helps monitor the driver and take mitigation actions when they’re required.


