Data Fabric Solutions for Real Time Analytics and Large Scale AI Applications

High Performance Object Storage (HPOS)

The High Performance Object Storage (HPOS) stack enables disaggregated storage over multiple SmartSSDs (Samsung’s computational storage drives). It is a data-centric solution that abstracts which computational function runs in which accelerator layer (i.e., CPU, GPU, XPU, or SmartSSD) and scales with ease. Samsung’s HPOS provides a reference solution for video AI applications that require in-storage video pre-processing. For data analytics use cases, it provides high performance on unstructured/object data in a data lake.

HPOS Architecture

Use Cases

• Data-intensive real-time analytics and AI
• Surveillance, Smart City, near-real-time streaming
• Data Center & IIoT Cyber Security
• Accelerate Data Lake Queries

Benefits

• Low-latency ML preprocessing
• Better throughput per watt
• Less network traffic
• Lower TCO due to reduced CPU load and network traffic

State of the Art (Baseline) vs. HPOS

Disaggregated Storage Solution (DSS)

Samsung has developed DSS, a rack-scalable, Amazon S3-compatible object storage solution optimized for very high read bandwidth.
It uses a disaggregated architecture, which enables storage and compute to scale independently. DSS is designed to make the most of system-level design and Samsung’s best SSDs to deliver maximum performance while minimizing OPEX.

Use Cases

• Large-scale, high-throughput training
• Image Analytics
• Audio/Video AI
• Metaverse

Benefits

• High-throughput storage access: 3x better than NFS
• 20% better than a leading GPU-accelerated enterprise filesystem
• Truly scalable solution
• Lower TCO with fewer storage nodes serving more clients

DSS Blog

Open Source

Publications

Whitepaper posted at MemCon 2023

As AI technology evolves into more sophisticated applications, training dataset sizes continue to grow exponentially. Scaling storage and network infrastructure commensurately, so that the required data is delivered without unbearable training cycle times, calls for a new storage concept and an innovative solution to address this technology gap. DSS Gen2 and beyond provide a potential next-generation architecture and solution to alleviate this bottleneck.
Another major impediment to data center scaling is power utilization. Since data storage is one of the key power consumers, DSS technology plans not only to optimize server power utilization but also to leverage Samsung’s new high-capacity SSDs, which have one of the highest densities in the world.

Full Whitepaper

2022 OCP Global Summit

With the advent of new application workloads related to Big Data, including AI/ML, IoT, video, security, and many other kinds of machine-generated data, companies and governments have a strong incentive to mine this treasure trove of data to extract value. Innovative companies taking on these storage challenges have tried storing Big Data in data lakes and moving it to locally attached storage for analysis, but because of the sheer size of the data sets this turns out to be too time-consuming. Traditional network data storage has also been explored, but its inherent scaling and performance bottlenecks are difficult to overcome. DSS provides an innovative solution to the problem: purpose-built storage that targets only these specific workloads.

Video

Slide

2021 OCP Global Summit

High Performance and Hardware Acceleration: With data being generated at an exponential rate, particularly in applications such as deep learning and AI, demand is increasing for storage that offers high bandwidth and great scalability and supports unstructured data formats. To meet this need, Samsung proposes the DSS storage solution, which implements an object key-value API on top of NVMe-oF SSDs. Support for remote storage access protocols facilitates disaggregation, so storage can be scaled easily. Beyond object storage support and scalability, the architecture can provision the bandwidth each application demands on each client server. This paper introduces DSS storage systems that support high bandwidth per capacity for object-format data with an effortless scale-up feature. DSS uses methods that deterministically provide bandwidth to client sessions to mitigate contention and starvation. The storage design is therefore essential for large, concurrent, multi-session workloads with intensive reads, such as AI training.
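
The abstract above does not say which methods DSS uses to provide bandwidth deterministically. As a rough illustration of the general idea only, the sketch below combines two generic techniques: admission control (a session is accepted only if its reserved rate still fits within the total read bandwidth) and a per-session token bucket (a session can never consume more than its reservation, so it cannot starve the others). Every name here (SessionBucket, BandwidthProvisioner, open_session, try_read) is hypothetical and is not part of any DSS API.

```python
from dataclasses import dataclass


@dataclass
class SessionBucket:
    """Token bucket holding the read budget of one client session (hypothetical)."""
    rate_bytes_per_s: float      # bandwidth reserved for this session
    burst_bytes: float           # upper bound on the budget that can accumulate
    tokens: float = 0.0          # bytes the session may read right now
    last_refill_s: float = 0.0   # timestamp of the last refill

    def refill(self, now_s: float) -> None:
        # Add budget in proportion to elapsed time, capped at the burst size.
        elapsed = max(0.0, now_s - self.last_refill_s)
        self.tokens = min(self.burst_bytes,
                          self.tokens + elapsed * self.rate_bytes_per_s)
        self.last_refill_s = now_s

    def try_read(self, nbytes: int, now_s: float) -> bool:
        # Grant the read only if the session still has enough budget.
        self.refill(now_s)
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False


class BandwidthProvisioner:
    """Admits sessions only while the sum of reservations fits the total."""

    def __init__(self, total_bytes_per_s: float) -> None:
        self.total_bytes_per_s = total_bytes_per_s
        self.sessions = {}  # session id -> SessionBucket

    def reserved(self) -> float:
        return sum(b.rate_bytes_per_s for b in self.sessions.values())

    def open_session(self, session_id: str, rate_bytes_per_s: float,
                     now_s: float = 0.0) -> bool:
        # Reject duplicates and reservations that would oversubscribe storage.
        if session_id in self.sessions:
            return False
        if self.reserved() + rate_bytes_per_s > self.total_bytes_per_s:
            return False
        # Assumption: a session starts with one second's worth of budget.
        self.sessions[session_id] = SessionBucket(
            rate_bytes_per_s=rate_bytes_per_s,
            burst_bytes=rate_bytes_per_s,
            tokens=rate_bytes_per_s,
            last_refill_s=now_s,
        )
        return True

    def try_read(self, session_id: str, nbytes: int, now_s: float) -> bool:
        return self.sessions[session_id].try_read(nbytes, now_s)
```

Timestamps are passed in explicitly so the behavior is reproducible; a real service would read a monotonic clock and would queue or delay a rejected read rather than simply return False.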

Video

Slide

Would you like to get connected for this project?

Memory Solutions Lab

Transforming Memory Performance

Contact Form

Do you have a question for Samsung MSL?

Once the form is submitted, MSL will contact you with details.