The volume of data produced every minute makes it challenging to store, manage, use, and analyze that data quickly. Companies like Intel are building hardware accelerators that significantly boost software application performance. However, few big data applications are aware of these accelerators or optimized to use them. We created an open source Data Analytics Reference Stack (DARS) (https://clearlinux.org/stacks/data-analytics-stack-v1), based on the open source Clear Linux OS, that makes good use of hardware-accelerated libraries in the big data service stack.
DARS comprises Apache Spark, Apache Hadoop, and OpenJDK, and uses both the MKL and OpenBLAS math libraries, all built on Clear Linux. The complete stack includes the Clear Linux OS along with containerized environment images such as Docker images. This end-to-end optimized stack delivers up to an 8x performance gain in machine learning workloads. We will also cover how to use the optimized stack and replicate it in your production environments, and we will present the performance results.