
Suiyuan Technology Details the "Think Deeply" Chip Architecture at the Hot Chips Conference

2021-08-31 06:08:35 TechWeb

[TechWeb] August 25 news: Today at the annual Hot Chips conference, Suiyuan Technology chief architect Liu Yan and senior chip design director Feng Chuang presented the architecture details of the company's first-generation cloud training chip, "Think Deeply 1.0". Hot Chips is one of the world's major conferences on high-performance microprocessors and integrated circuits. Each year, the chip industry's leading companies use it to present their latest work, including processor architectures, infrastructure computing platforms, and memory processing technologies.

Package diagram of "Think Deeply 1.0", Suiyuan Technology's first-generation general-purpose AI training chip

Think Deeply 1.0 is the first-generation cloud AI training chip that Suiyuan Technology released in December 2019. It uses a multi-core design whose compute cores are the company's self-developed GCU-CARE compute engines. The SoC has 32 GCU-CARE compute engines organized into 4 compute clusters, and it supports the common AI tensor data formats (FP32/FP16/BF16 and INT8/INT16/INT32), giving broad coverage of customer workloads. CARE also reuses its tensor cores, an innovation that delivers scalar, vector, and tensor compute across these data precisions with more efficient use of transistors.

The GCU-DARE data architecture is optimized for data flow and processes data while it is in flight. It pairs 512 GB/s of HBM bandwidth with 200 GB/s of GCU-LARE interconnect bandwidth, which the company describes as several times that of traditional GPUs and CPUs. A robust distributed on-chip shared cache provides a very high 10 TB/s of bandwidth. The shared cache is programmable: data can be kept resident and shared within a thread and between threads under software control, which eliminates unnecessary I/O accesses, reducing data access latency and saving valuable I/O bandwidth. The DARE architecture also provides an asynchronous data loading interface that supports pipelined execution of data movement and computation, improving the parallelism of operations.
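The idea of overlapping data loading with computation can be illustrated in ordinary Python. The sketch below is not the DARE interface, whose API the article does not describe; the names `prefetching_loader` and `run_pipeline` are hypothetical, and a background thread with a bounded queue stands in for the asynchronous loader.

```python
import queue
import threading


def prefetching_loader(batches, depth=2):
    """Yield items from `batches` while a background thread loads ahead.

    `depth` bounds how many loaded batches may wait in the buffer, which
    is the software analogue of a small double or triple buffer.
    """
    buf = queue.Queue(maxsize=depth)
    sentinel = object()  # marks the end of the stream

    def producer():
        try:
            for batch in batches:
                buf.put(batch)  # blocks while `depth` batches are waiting
        finally:
            buf.put(sentinel)  # always signal completion to the consumer

    threading.Thread(target=producer, daemon=True).start()

    while True:
        item = buf.get()
        if item is sentinel:
            return
        yield item


def run_pipeline(batches, compute, depth=2):
    """Apply `compute` to each batch while the next batches are loading."""
    return [compute(batch) for batch in prefetching_loader(batches, depth)]


if __name__ == "__main__":
    print(run_pipeline(range(5), lambda x: x * x))
```

While `compute` works on one batch, the producer thread is already fetching the next ones, so loading time is hidden behind computation whenever loading is the faster of the two stages.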

The four-way GCU-LARE intelligent interconnect provides a 200 GB/s high-speed, low-latency chip-to-chip interface. It flexibly supports compute requirements of different scales, up to clusters of a thousand cards, so that large, medium, and small data centers can each be offered an AI training product portfolio matched to their needs.

"Think Deeply 1.0" SoC

The Think Deeply 1.0 AI accelerator chip is designed for cloud training scenarios. It supports CNN, RNN, LSTM, BERT, and other network models, and can be used for training on images, streaming data, and speech. It uses a standard PCIe 4.0 interface, is broadly compatible with mainstream AI servers, meets the needs of large-scale data center deployment, and offers a leading energy efficiency ratio.

At the end of the talk, Liu Yan also introduced the "Think Deeply 2.0" training chip, which had been released at the World Artificial Intelligence Conference the previous month. With this new iteration, Think Deeply 2.0 greatly improves on the first-generation training product in compute, memory capacity and bandwidth, and interconnect capability, and its support for very large models is significantly stronger. With this release, Suiyuan Technology became the first company in China to launch a second-generation AI training product portfolio.

Think Deeply 2.0 is a major architecture upgrade, deeply optimized for the characteristics of AI computation while strengthening the foundation for general-purpose heterogeneous computing. It supports a full range of precisions, from FP32, TF32, FP16, and BF16 to INT8. Peak single-precision FP32 performance reaches 40 TFLOPS, and peak TF32 tensor performance reaches 160 TFLOPS. The chip also carries four HBM2E memory chips, supporting up to 64 GB of memory with bandwidth of up to 1.8 TB/s. GCU-LARE has been fully upgraded as well, providing 300 GB/s of bidirectional interconnect bandwidth and supporting the interconnection of thousands of Yunsui CloudBlazer accelerator cards with excellent linear speedup.
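To put the quoted peak figures in context, a back-of-the-envelope roofline calculation shows how many floating-point operations a kernel must perform per byte of HBM traffic before compute, rather than memory bandwidth, becomes the limit. The roofline model is a common simplification and is not part of the Hot Chips talk as reported here; only the three constants come from the article, and the function names are illustrative.

```python
# Vendor-quoted peak figures for Think Deeply 2.0, as reported in the article.
FP32_PEAK_TFLOPS = 40.0
TF32_PEAK_TFLOPS = 160.0
HBM_BANDWIDTH_TBPS = 1.8


def ridge_point(peak_tflops, bandwidth_tbps):
    """FLOPs per byte at which a kernel stops being memory-bound.

    TFLOP/s divided by TB/s leaves FLOP/byte, so no unit scaling is needed.
    """
    return peak_tflops / bandwidth_tbps


def attainable_tflops(intensity, peak_tflops, bandwidth_tbps):
    """Roofline bound for a kernel with `intensity` FLOPs per byte."""
    return min(peak_tflops, intensity * bandwidth_tbps)


if __name__ == "__main__":
    for name, peak in (("FP32", FP32_PEAK_TFLOPS), ("TF32", TF32_PEAK_TFLOPS)):
        ridge = ridge_point(peak, HBM_BANDWIDTH_TBPS)
        print(f"{name}: ridge point is about {ridge:.1f} FLOP/byte")
```

Under these assumptions, a TF32 kernel needs roughly 89 FLOPs per byte of HBM traffic to approach the 160 TFLOPS peak, and an FP32 kernel needs about 22. Real kernels also benefit from on-chip cache reuse, which this estimate ignores.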

"Think Deeply 2.0", Suiyuan Technology's second-generation general-purpose AI training chip

The Yusuan TopsRider software platform, upgraded in step with the chip, is the cornerstone on which Suiyuan Technology is building its original software ecosystem. Through hardware-software co-design, it brings out the full performance of Think Deeply 2.0. Using operator generalization techniques and graph optimization strategies, it supports training of all kinds of models under mainstream deep learning frameworks. The Horovod distributed training framework works together with GCU-LARE interconnect technology to provide a solution for running large-scale clusters efficiently. An open, upgraded programming model and an extensible operator interface give customers the ability to do custom development when optimizing their models.
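Horovod is best known for averaging gradients across workers with a ring all-reduce, a pattern that depends heavily on the bandwidth between neighbouring devices. The article does not say how TopsRider maps Horovod onto GCU-LARE, so the following is only a single-process simulation of the textbook algorithm, written in plain Python with a hypothetical `ring_allreduce` function; it does not call Horovod itself.

```python
def ring_allreduce(vectors):
    """Simulate a sum all-reduce over workers arranged in a ring.

    `vectors` holds one equal-length list per worker. Every worker ends
    up with the element-wise sum, and in each step a worker exchanges
    only one chunk with its ring neighbour.
    """
    n = len(vectors)
    length = len(vectors[0])
    # Chunk c covers indices [bounds[c], bounds[c + 1]).
    bounds = [c * length // n for c in range(n + 1)]
    data = [list(v) for v in vectors]

    def chunk(c):
        return range(bounds[c], bounds[c + 1])

    def exchange(step, offset, combine):
        # Snapshot all outgoing chunks first so that the transfers of
        # one step behave as if they happened simultaneously.
        sends = []
        for rank in range(n):
            c = (rank + offset - step) % n
            sends.append((c, [data[rank][i] for i in chunk(c)]))
        for rank, (c, payload) in enumerate(sends):
            dst = (rank + 1) % n
            for i, value in zip(chunk(c), payload):
                data[dst][i] = combine(data[dst][i], value)

    # Phase 1, scatter-reduce: after n - 1 steps, worker r holds the
    # fully summed chunk (r + 1) % n.
    for step in range(n - 1):
        exchange(step, 0, lambda mine, received: mine + received)

    # Phase 2, all-gather: completed chunks travel around the ring.
    for step in range(n - 1):
        exchange(step, 1, lambda mine, received: received)

    return data


if __name__ == "__main__":
    grads = [[1, 2, 3, 4], [10, 20, 30, 40], [100, 200, 300, 400]]
    for rank, result in enumerate(ring_allreduce(grads)):
        print(rank, result)
```

Each worker sends about 2(n - 1)/n times the vector size in total, nearly independent of the number of workers, which is why the per-link interconnect bandwidth matters so much for scaling to large clusters.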

Suiyuan Technology is reported to focus on cloud compute platforms for artificial intelligence. It is committed to providing broadly accessible infrastructure solutions for the AI industry, offering programmable, general-purpose AI training and inference products with independent intellectual property rights, high compute performance, and high energy efficiency. Its innovative architecture, interconnect scheme, and distributed computing and programming platform can be applied widely in AI scenarios such as cloud data centers, supercomputing centers, the internet, finance, and smart cities.

Information from Qichacha, a company-data service, shows that Suiyuan Technology has previously completed several rounds of financing. On January 5, 2021, Suiyuan Technology announced the completion of a RMB 1.8 billion Series C round, led by CITIC Industrial Fund, a fund under CICC Capital, and Primavera Capital, with participation from Tencent, Summitview Capital, Redpoint China Ventures, and other new and existing shareholders.
