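In-Network Computing and DPUs in the Data Center and Edge Cloud
NVIDIA, Jan 8, 2021

Computing is everywhere; data becomes the center
Figure labels: NETWORK; EDGE APPLIANCE; SUPERCOMPUTING; EXTREME IO; DATA ANALYTICS; EDGE STREAMING; CLOUD; SIMULATION; VISUALIZATION; AI

In-network computing becomes one of the three computing pillars of the data center
Data centers, supercomputing centers, edge networks
GPU; CPU; InfiniBand network

Software-defined, hardware-accelerated in-network computing
Pre-configured Engines; Programmable Engines

In-network computing accelerates Internet data centers and HPC clouds
InfiniBand: programmable, pre-configurable, as fast as light
Programmable: DPU cores; programmable datapath; data pre-processing; user-defined algorithms
Pre-configurable: SHARP (data reductions); MPI Tag-Matching; Self-Healing Network (resiliency); NVMe over Fabrics; data security and tenant isolation
As fast as light: 200G end-to-end, extremely low latency; RDMA, GPUDirect RDMA, GPUDirect Storage; enhanced adaptive routing and congestion control; smart topologies

InfiniBand DPU: the foundation for building cloud-native data centers
Requirements of a high-performance cloud-native data center: multi-tenant support; configurable service types; bare-metal performance; security and reliability
BlueField InfiniBand DPU: data center infrastructure on a chip, built for high-performance cloud-native data centers
Software stack (figure labels): Storage (SPDK); Security (DPDK); Networking (DPDK/P4); DOCA SDK; ASAP2; Crypto; RoT; RDMA; SNAP; Management; Telemetry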
Infrastructure applications: Infrastructure Management; Software-defined Storage; Software-defined Security; Software-defined Networking

BlueField DPU: built for high-performance cloud-native data centers
Figure: two architectures compared.
Traditional supercomputing center / high-performance data center: HOST (Management, Isolation, Security, Monitoring; Applications; HPC/AI Communication Frameworks; HPC/AI Storage File System Client); InfiniBand Adapter (Acceleration Engines); InfiniBand Switch (Acceleration Engines)
Cloud-native high-performance data center: HOST (Applications); InfiniBand BlueField DPU (Acceleration Engines; Management, Isolation, Security, Monitoring; HPC/AI Communication Frameworks; HPC/AI Storage File System Client; MAGNUM IO; DOCA); InfiniBand Switch (Acceleration Engines)

BlueField DPU delivers both user data security and compute performance
Figure labels: HOST; Applications; CPU/GPU; NVMe; PCIe; BLUEFIELD DPU; DOCA SECURITY; SECURITY APP; VIRTUAL PROTECTED ENCLAVE; High-Performance InfiniBand Network; Secured Data Source and Data Storage

Cloud security defense moves from the perimeter into the server
WITHOUT DPU (figure labels): Core Data Center Perimeter (IDS, NGFW, Anti-Malware); Cloud Server running Workloads plus Software Defined Networking (SDN), Encryption (Software), L4-L7 Inspection, Software Defined Storage (SDS), Firewall/Micro-segmentation; NIC; IT Ops; DevOps
WITH DPU (figure labels): Cloud Server running Workloads only; DPU running L4-L7 Inspection, µ-Segmentation, NGFW/Crypto, SDN & SDS; Isolation; Optional; IT Ops; DevOps

BlueField DPU: HPC and AI communication offload
21% higher performance
Test configuration: eight servers, each with dual-socket 16-core Intel Xeon E5-2697A v4 CPUs at 2.60 GHz (32 processes per node), NVIDIA BlueField-2 HDR100 DPUs and ConnectX-6 HDR100 adapters, an NVIDIA Mellanox HDR Quantum QM7800 40-port 200 Gb/s HDR InfiniBand switch, 256 GB of DDR4 2400 MHz RDIMM memory, and a 1 TB 7.2K
RPM SATA 2.5-inch hard drive per node. Courtesy of the Ohio State University MVAPICH team and X-ScaleSolutions.
Chart legend: BlueField DPU Accelerated vs. Not Accelerated
Figure labels: HOST; Applications; CPU/GPU; BLUEFIELD DPU; MAGNUM IO; MPI; High Performance InfiniBand Network

BlueField DPU: high-performance storage pooling and offload for HPC and AI
Figure labels: HOST; Applications; CPU/GPU; BLUEFIELD DPU; DOCA STORAGE; STORAGE APP; Virtual; High Performance InfiniBand Network; Network or Cluster Filesystem; Filesystem Device; FS-SNAP Controller; Filesystem Driver; mount points /mnt/localdir and /mnt/nfsrdma; files a.txt, b.txt, c.txt
Emulates remote storage so that it appears local to the host OS
Dynamically assigned storage, not bound by physical capacity
Inbox standard drivers
OS agnostic: supports legacy OSs

InfiniBand network: an SDN network that computes

The core of switch computing: SHARP technology
Supports multiple concurrent operations
Supported applications: HPC (MPI/SHMEM), distributed machine learning, and others
Supported operations: Barrier, Reduce, All-Reduce, Broadcast, and more
Supported computations: Sum, Min, Max, Min-loc, Max-loc, OR, XOR, AND
Supported data types: integers and 16/32/64-bit floating point
Figure: Scalable Hierarchical Aggregation and Reduction Protocol. Hosts send Data up to Switches; each Switch forwards Aggregated Data to the next level; the Aggregated Result is returned to the Hosts.

SHARP AllReduce performance gains
Delivers consistently low latency and a 7x performance improvement
Performs the gradient averaging
Replaces all physical parameter servers
Accelerates AI performance

The switch replaces the parameter server in AI training
Traditional approach: the CPU on the parameter server becomes the training bottleneck
Optimized approach: the switch becomes the parameter server

Network congestion is one of the biggest challenges in the data center
In-network congestion inside the fabric. Solution: adaptive routing
In-cast congestion inside the switch. Solution: network congestion control

Adaptive routing solves in-network congestion
Figure: mpiGraph comparison of static routing and adaptive routing

HDR InfiniBand congestion control solves in-cast congestion
Figure labels: Nodes 1 to 8; three HDR100 Switches; flows A, B, C, D; With InfiniBand Congestion Control vs. Without InfiniBand Congestion Control

InfiniBand network summary
An SDN network that computes, built for high-performance cloud-native data centers
GPU, CPU, DPU:
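One of the three core compute units in a server; built for data-centric computing
A switch that computes, with unlimited scalability; built for exascale and larger data centers
Centrally managed, secure and efficient, natively SDN; built for cloud-native data centers
Standard, open, forward and backward compatible; ready for the future, compatible with the past

Open Ethernet unlocks software and hardware: user-defined
Open Ethernet unlocks software, hardware, and vendors

The evolution of data center Ethernet
Legacy Mindset vs. Webscale Mindset
Protocols (figure labels, in source order): PIM; HSRP; LACP; VPC; OSPFv2; RIPv2; EIGRP; SNMP; TACACS; UFD; PVRST/MSTP; Private VLAN; Loop/Root/BPDU Guard; QoS; VRRP; VTP; GVRP; IGMP; LACP; TRILL; SPB; FabricPath; VCS; QFabric; BGP; FCoE; BFD; FEX; OVSDB/VTEP; MLAG; QinQ; EVPN; BGP/BFD
Visibility and telemetry (figure labels, in source order): SNMP; RMON; SPAN; CDP; SNMP; SPAN; ERSPAN; sFlow; IPFIX; SYSLOG; Packet Brokering; LLDP; Real-time Visibility Snapshots; Streaming Telemetry (GPB); Packet Brokering; Buffer Histograms; Mirror Congestion; Mirror Drops; In-Band Telemetry; In-situ OAM; RoCE Telemetry; Watermarks; SYSLOG; ERSPAN; SPAN; sFlow; LLDP; WJH

NVIDIA drives Ethernet toward openness
Open Native Linux Kernel; Open User Space Application; Open SDK; Open ASIC Driver; Programmable ASIC

NVIDIA Open Ethernet lets users choose their network operating system
Default OS: ONYX. Suited to small-scale deployments; one-click RoCE deployment.
If you use VXLAN: Cumulus is recommended. Excellent VXLAN performance; a Linux-based network operating system; hosts and network are managed the Linux way.
If you want open and free: SONiC/DENT is recommended. Open-source Ethernet operating system; no lock-in to a network vendor; free.

Health assurance for Open Ethernet: WJH (What Just Happened?)
Telemetry that is lightweight, deployable, and event-driven
Packets: 12-tuple plus metadata, a very detailed description
Layers: SDK/SAI; Network OS
1. SDK generates WJH messages
2. Agent collects data and streams it to a database
3. Presentation layer shows what just happened by answering the important questions:
WHO is being impacted
WHEN it happened
WHAT is causing the problem
WHERE the problem is
WHY it is happening: root cause and how to fix it

Performance assurance for Ethernet: RoCE
One-click RDMA deployment: the CLI command "RoCE" vs. 26+ commands in other NOSs
The best hardware design for RDMA: low forwarding latency and an excellent shared-buffer design
End-to-end management with the NEO network management software
Multiple RDMA deployment modes: Lossless, Semi-Lossless, Lossy
Mixed RDMA and TCP deployment
RoCE over VXLAN
Fast ECN
Packet format: Eth L2 Header | IP Header | UDP Header | IB BTH+ (L4 Hdr) | Payload | ICRC | FCS
EtherType indicates the packet is IP (i.e., the next header is IP)
ip.protocol_number indicates the packet is UDP
UDP dport number 4791 indicates the next header is IB.BTH
Eth L2 Header may include an 802.1Q tag with PCP and VLAN ID
IP Header may include marking for ECN
IP Header may include DSCP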
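
The SHARP slides earlier in the deck describe hosts sending data up a tree of switches, each switch aggregating what it receives, and the final result being returned to every host. The sketch below only models that data flow on a single machine; it is not the SHARP API, and the names `tree_allreduce`, `aggregate_level`, and `radix` (children per switch) are made up for illustration. Min-loc and Max-loc are left out to keep it short.

```python
from functools import reduce
import operator

# Reduction operators named on the SHARP slide (Min-loc / Max-loc omitted).
OPS = {
    "sum": operator.add,
    "min": min,
    "max": max,
    "or": operator.or_,
    "xor": operator.xor,
    "and": operator.and_,
}


def aggregate_level(values, radix, op):
    """One tree level: each switch reduces the values of up to `radix` children."""
    return [reduce(op, values[i:i + radix]) for i in range(0, len(values), radix)]


def tree_allreduce(host_values, radix=2, op_name="sum"):
    """Reduce up a switch tree, then hand the single result back to every host.

    Returns (per-host results, number of switch levels traversed upward).
    """
    if not host_values:
        raise ValueError("need at least one host value")
    if radix < 2:
        raise ValueError("radix must be at least 2")
    op = OPS[op_name]
    level, levels_up = list(host_values), 0
    while len(level) > 1:
        level = aggregate_level(level, radix, op)
        levels_up += 1
    return [level[0]] * len(host_values), levels_up


if __name__ == "__main__":
    # Gradient averaging as on the AI slide: all-reduce (sum), then divide by N.
    grads = [1.0, 2.0, 3.0, 4.0, 5.0]
    summed, levels_up = tree_allreduce(grads, radix=2)
    print("average gradient:", summed[0] / len(grads), "levels:", levels_up)
```

In this model the number of aggregation steps grows with the depth of the tree rather than with the number of hosts, which is the intuition behind replacing a single parameter server with aggregation in the switches.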
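
The RoCE packet-format slide describes a chain of checks: EtherType says IP, the IP protocol number says UDP, and UDP destination port 4791 says the next header is the InfiniBand BTH. A minimal sketch of that chain follows, assuming IPv4 only and at most one 802.1Q tag. The function names `parse_rocev2` and `build_test_frame` are hypothetical, and the test frame is synthetic (zeroed addresses and checksums, no ICRC or FCS).

```python
import struct

ETHERTYPE_VLAN = 0x8100   # 802.1Q tag present
ETHERTYPE_IPV4 = 0x0800
IPPROTO_UDP = 17
ROCEV2_UDP_DPORT = 4791   # from the slide: next header is IB BTH
BTH_LEN = 12


def parse_rocev2(frame: bytes):
    """Return a dict of header fields if `frame` is IPv4 RoCEv2, else None."""
    if len(frame) < 14:
        return None
    info = {}
    offset = 12  # skip destination and source MAC addresses
    (ethertype,) = struct.unpack_from("!H", frame, offset)
    offset += 2
    if ethertype == ETHERTYPE_VLAN:
        if len(frame) < offset + 4:
            return None
        tci, ethertype = struct.unpack_from("!HH", frame, offset)
        info["pcp"] = tci >> 13          # priority code point
        info["vlan_id"] = tci & 0x0FFF
        offset += 4
    if ethertype != ETHERTYPE_IPV4 or len(frame) < offset + 20:
        return None
    ver_ihl, tos = struct.unpack_from("!BB", frame, offset)
    ihl = (ver_ihl & 0x0F) * 4           # IPv4 header length in bytes
    if ver_ihl >> 4 != 4 or ihl < 20:
        return None
    info["dscp"] = tos >> 2
    info["ecn"] = tos & 0x03
    if frame[offset + 9] != IPPROTO_UDP:
        return None
    offset += ihl
    if len(frame) < offset + 8 + BTH_LEN:
        return None
    (dport,) = struct.unpack_from("!H", frame, offset + 2)
    if dport != ROCEV2_UDP_DPORT:
        return None
    offset += 8                          # skip the UDP header
    info["bth_opcode"] = frame[offset]
    info["dest_qp"] = int.from_bytes(frame[offset + 5:offset + 8], "big")
    info["psn"] = int.from_bytes(frame[offset + 9:offset + 12], "big")
    return info


def build_test_frame(dport=ROCEV2_UDP_DPORT, vlan=None, dscp=26, ecn=1):
    """Build a synthetic frame for the demo; all values are arbitrary."""
    eth = bytes(12)
    if vlan is not None:
        pcp, vid = vlan
        eth += struct.pack("!HH", ETHERTYPE_VLAN, (pcp << 13) | vid)
    eth += struct.pack("!H", ETHERTYPE_IPV4)
    # BTH: opcode, flags, P_Key, reserved + dest QP (18), ack/reserved + PSN (7)
    bth = bytes([0x04, 0x00]) + struct.pack("!H", 0xFFFF)
    bth += (18).to_bytes(4, "big") + (7).to_bytes(4, "big")
    udp = struct.pack("!HHHH", 49152, dport, 8 + len(bth), 0)
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, (dscp << 2) | ecn,
                     20 + len(udp) + len(bth), 0, 0, 64, IPPROTO_UDP, 0,
                     bytes(4), bytes(4))
    return eth + ip + udp + bth


if __name__ == "__main__":
    print(parse_rocev2(build_test_frame(vlan=(3, 100))))
    print(parse_rocev2(build_test_frame(dport=4790)))
```

The PCP, DSCP, and ECN fields are extracted because the slide singles them out: they are the markings a switch would use to classify RoCE traffic and signal congestion.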