Standardization for Advancing Heterogeneous AI Computing Platforms
New specification provides current system compatibility and a framework for mixed accelerator hardware applications
Alan Chang, Vice President of Technical Operations at Inspur Systems
Today, one need only watch a key sporting event or popular television show to be inundated with commercials touting the benefits and future potential that artificial intelligence (AI) holds for humanity. Applications that could hardly be envisioned only a short time ago are now commonplace, and the AI field is expected to continue growing by leaps and bounds. That said, achieving the promise of AI requires building computing platforms that deliver high performance, robustness, and scalability while embracing openness to overcome interoperability challenges and set the stage to respond more quickly and cost-effectively to market demands.
To help assure interoperability and aid manufacturers in meeting the growing demand for AI systems with enhanced capabilities, the Open Compute Project (OCP) engages numerous partners committed to advancing AI computing technology through open specifications. Its latest project is referred to as Open Accelerator Infrastructure (OAI). Drawing on experience from previous open hardware and software projects, the organization attracts participants from all areas of the computing ecosystem, with its most recent efforts focused on advancing accelerator technologies and offering more elegant, streamlined, and accessible open specifications for AI computing platforms.
A recent roundtable discussion with leaders from OCP and Baidu involved an in-depth exploration of the development and value proposition of OAI that reached some interesting conclusions.
According to Archna Haylock, Community Director, Open Compute Foundation, “Companies today are facing numerous challenges, whether it comes to data center infrastructure, hardware acceleration, or hardware management from the facilities to the rack down to the nodes. What OCP brings to the table is an environment of collaboration to meet these challenges and find a common solution that works across the board and that provides economies of scale to achieve improved efficiencies and cost savings.”
Clearly, a key objective for OAI participants, Baidu and Inspur among them, was to simplify the design of the accelerator module. The resulting specification is in and of itself a technical solution: manufacturers can design their own products based on the OAI specification without having to start from scratch. Much as with open source software such as Hadoop, GFS, and Linux, users can download the code directly and pursue individual development efforts.
In effect, the specification promotes the convergence of different accelerator technologies, such as ASICs, GPUs, and FPGAs, overcoming incompatibility issues and enabling these technologies to perform under unified hardware standards. In this way, users can swap different chips freely, giving manufacturers more options and simplifying the supply side of the accelerator industry. These are among the key technological advantages of OAI.
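As a software analogy only (these class and function names are illustrative and are not part of the OAI specification), the interchangeability described above resembles programming against a common interface: the host-side logic stays the same no matter which vendor's accelerator is plugged in behind it.

```python
from abc import ABC, abstractmethod

class Accelerator(ABC):
    """Common interface any accelerator backend exposes (illustrative only)."""

    @abstractmethod
    def run(self, workload: str) -> str:
        ...

class GPUModule(Accelerator):
    def run(self, workload: str) -> str:
        return f"GPU executing {workload}"

class FPGAModule(Accelerator):
    def run(self, workload: str) -> str:
        return f"FPGA executing {workload}"

class ASICModule(Accelerator):
    def run(self, workload: str) -> str:
        return f"ASIC executing {workload}"

def dispatch(module: Accelerator, workload: str) -> str:
    # Host-side code is unchanged regardless of which module is installed.
    return module.run(workload)

# Any module can be swapped in without touching the dispatch logic.
for module in (GPUModule(), FPGAModule(), ASICModule()):
    print(dispatch(module, "training step"))
```

The hardware standard plays the role of the abstract interface here: as long as every module honors it, the surrounding system need not change when one chip is replaced by another.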
One of the first product offerings to benefit from the specification’s development is the Baidu X-MAN 4.0, a system jointly developed with Inspur and the latest AI computing system from a company that continues to be a leading proponent of openness in its product development. The evolution of the OAI specification started with the OAM (OCP Accelerator Module) specification, with contributions from Facebook, Microsoft, and Baidu. From that point, it became clear that there was a desire to expand the specification into an infrastructure where the whole rack and system could perform with increased interoperability. Working under the auspices of OCP, the OAI subgroup focused on how best to support diversified accelerators. As a result, manufacturers are given greater choice in an open ecosystem that will ultimately bring notable benefits to developers and end users of AI applications.
Richard Ding, AI System Architect at Baidu, also commented: “OCP is a very good platform for people, users, and system integrators, as well as chip providers, to work on one stage. For Baidu, OCP was the platform where we could better identify our requirements, discover how we could work together with our partners, even sometimes our competitors, and define a kind of standard that can benefit the entire ecosystem. Overall, it was a positive experience, resulting in the development of our latest full-rack AI computing product, X-MAN 4.0.”
The scope of the OAI subgroup’s work included defining the physical modules, covering aspects such as electrical, mechanical, thermal, management, hardware security, and physical serviceability, to produce solutions compatible with existing operating systems and to allow for the creation of frameworks for running heterogeneous accelerator applications. Moving forward, there is growing industry consensus that by encouraging the specification’s adoption and further practical application testing, ongoing advancements in the AI ecosystem can be achieved through standardization.
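A minimal sketch of what such a module definition enables, with entirely hypothetical field names and limits (the real OAI/OAM documents define these envelopes formally): if a module's electrical, mechanical, thermal, and management attributes are published in a standard form, a baseboard can accept any vendor's module that fits its envelope.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModuleDescriptor:
    # Illustrative fields only; the actual specification defines these formally.
    vendor: str
    max_power_w: int        # electrical envelope
    form_factor: str        # mechanical (e.g., "OAM")
    max_tdp_w: int          # thermal design power
    mgmt_interface: str     # management bus (e.g., "I2C")

@dataclass(frozen=True)
class BaseboardLimits:
    power_budget_w: int
    form_factor: str
    cooling_capacity_w: int
    mgmt_interface: str

def is_compatible(mod: ModuleDescriptor, board: BaseboardLimits) -> bool:
    """A baseboard accepts any vendor's module that fits its published envelope."""
    return (mod.max_power_w <= board.power_budget_w
            and mod.form_factor == board.form_factor
            and mod.max_tdp_w <= board.cooling_capacity_w
            and mod.mgmt_interface == board.mgmt_interface)

board = BaseboardLimits(power_budget_w=500, form_factor="OAM",
                        cooling_capacity_w=450, mgmt_interface="I2C")
mod = ModuleDescriptor(vendor="ExampleCo", max_power_w=400, form_factor="OAM",
                       max_tdp_w=400, mgmt_interface="I2C")
print(is_compatible(mod, board))  # True
```

The point of standardizing the descriptor, rather than each vendor publishing its own format, is that the compatibility check itself becomes vendor-neutral.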
Conclusion:
The OAI project is built around the concept of a modular architecture that can easily support different accelerators and multi-system scale-up interconnect communication. The task ahead is to promote its application and garner increased support from industry in order to achieve scale across both the high-performance computing ecosystem and vertical markets. As the standard gains practical significance, real-world application can test the strengths and weaknesses of the specifications so that the standard’s technology can be upgraded to meet the needs of real-world AI computing scenarios. Inspur is committed to the continued advancement of the OAI standard’s scalability and to supporting its broader market adoption.