Shail Dave
Final-year PhD candidate @SCAI_ASU. Sustainable & Agile Accelerator Design Automation; CompArch; CompilerOPT; Dataflow; Sparsity; MLSys. @SRCOrg AIHW Scholar

Dec 31, 2021, 12 tweets

#Highlights2021 for me: our #survey on efficient processing of #sparse and compressed tensors of #ML/#DNN models on #hardware accelerators published in @ProceedingsIEEE.
Paper: dx.doi.org/10.1109/JPROC.…
arXiv: arxiv.org/abs/2007.00864
RT/sharing appreciated. 🧵

Context: Tensors of ML/DNN models are compressed by leveraging #sparsity, #quantization, and shape reduction. We summarize several such sources of sparsity & compression (§3). Sparsity can be induced in a structured way during pruning, while sparsity that arises inherently from various applications or sources is unstructured.
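
To make the structured-vs-unstructured distinction concrete, here is a minimal NumPy sketch (not from the survey; the 8x8 matrix, 75% pruning ratio, and 2x2 block size are arbitrary illustrative choices) that prunes the same weight matrix element-wise and block-wise:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))           # hypothetical dense weight matrix

# Unstructured (fine-grain) pruning: zero the individually smallest weights.
k = int(0.75 * W.size)                # prune ~75% of elements
thresh = np.sort(np.abs(W), axis=None)[k - 1]
W_unstructured = np.where(np.abs(W) > thresh, W, 0.0)

# Structured (coarse-grain) pruning: zero entire 2x2 blocks with the
# smallest L2 norms, yielding a regular pattern that is cheaper to encode.
blocks = W.reshape(4, 2, 4, 2)
norms = np.linalg.norm(blocks, axis=(1, 3))
keep = norms >= np.quantile(norms, 0.75)          # keep top ~25% of blocks
W_structured = (blocks * keep[:, None, :, None]).reshape(8, 8)

print("unstructured sparsity:", np.mean(W_unstructured == 0))
print("structured sparsity:  ", np.mean(W_structured == 0))
```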

Likewise, leveraging value similarity or approximate operations can yield irregularity in processing. Also, size-reduction techniques make tensors asymmetric in shape. Hence, special mechanisms can be required for efficient processing of such sparse and irregular computations.

Accelerators leverage #sparsity differently, improving only the #memory footprint, or also #energy efficiency and/or #performance. The underlying mechanisms determine an #accelerator’s ability to exploit static or dynamic sparsity of one or multiple #tensors during #inference or #learning (§4).
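
A small sketch of the static-vs-dynamic distinction, under the common assumption that pruned weights give a fixed (static) zero pattern known before deployment while ReLU activations give an input-dependent (dynamic) one; the layer size and pruning threshold below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)

# Static sparsity: a pruned weight matrix is fixed before deployment,
# so its zero pattern can be encoded offline.
W = rng.normal(size=(64, 64))
W[np.abs(W) < 0.8] = 0.0

# Dynamic sparsity: activation zeros (e.g., after ReLU) depend on the
# input, so they can only be discovered and exploited at run time.
x = rng.normal(size=(64,))
a = np.maximum(W @ x, 0.0)            # ReLU output of a hypothetical layer

print("static  (weight) sparsity:    ", np.mean(W == 0))
print("dynamic (activation) sparsity:", np.mean(a == 0))
```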

Efficiently processing sparsity needs HW/SW mechanisms to store, extract, communicate, compute on, and load-balance only the non-zeros. For each, the efficacy of different solutions varies across sparsity levels and patterns. Their analysis spans §5-11, and §4 analyzes overall speedups for recent DNNs.
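
As a rough illustration of storing and extracting only the non-zeros, here is a toy CSR-style encoding plus a matrix-vector product over it; this is a generic sketch for intuition, not any specific accelerator's format from the survey:

```python
import numpy as np

def to_csr(dense):
    """Compress a 2-D array: keep only non-zeros plus index metadata."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        nz = np.nonzero(row)[0]
        values.extend(row[nz])
        col_idx.extend(nz)
        row_ptr.append(len(values))
    return np.array(values), np.array(col_idx), np.array(row_ptr)

def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A @ x while touching only the stored non-zeros."""
    y = np.zeros(len(row_ptr) - 1)
    for r in range(len(y)):
        lo, hi = row_ptr[r], row_ptr[r + 1]
        y[r] = np.dot(values[lo:hi], x[col_idx[lo:hi]])
    return y

rng = np.random.default_rng(2)
A = rng.normal(size=(6, 8)) * (rng.random((6, 8)) < 0.3)   # ~70% zeros
x = rng.normal(size=8)
vals, cols, ptr = to_csr(A)
print(np.allclose(csr_matvec(vals, cols, ptr, x), A @ x))  # True
```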

The #survey discusses such mechanisms spanning #circuits, #computerarchitecture, mapping, and #DL model pruning; #accelerator-aware #DNN model pruning; techniques for data extraction & #loadbalancing of effectual computations; #sparsity-aware dataflows; and #compiler support.

Structured #sparsity, especially coarse-grain, can lead to simpler #hardware mechanisms and low #encoding overheads. We also analyzed how varying sparsity levels and tensor shapes of #DNN operators impact data reuse and execution metrics.
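
A back-of-the-envelope example of why coarse-grain structure lowers encoding overhead: if zeros are pruned in whole blocks, the metadata shrinks roughly by the block size. The matrix size, sparsity level, and 1x16 block below are assumed purely for illustration:

```python
# Metadata cost for encoding which elements of a 512x512 weight matrix are
# non-zero, at 87.5% sparsity (hypothetical, back-of-the-envelope numbers).
elements = 512 * 512
sparsity = 0.875

# Fine-grain (unstructured): one bitmap bit per element.
bitmap_bits = elements

# Coarse-grain structured: zeros are pruned in whole 1x16 blocks, so only
# one bit per block is needed to mark it as kept or pruned.
block = 16
block_bits = elements // block

print(f"bitmap metadata:     {bitmap_bits / 8 / 1024:.1f} KiB")
print(f"block-mask metadata: {block_bits / 8 / 1024:.1f} KiB")   # 16x smaller
```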

#Accelerators employ #approximatecomputing by leveraging the similarity of temporal and spatial data in #computervision and #NLP applications. #Reconfigurable mechanisms can enable processing a wide range of sparsity levels, precisions, and tensor shapes. #FPGA
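
One way temporal value similarity can be exploited (a generic sketch, not a mechanism quoted from the survey): if consecutive inputs differ in only a few positions, a linear layer's output can be updated from the sparse delta instead of being recomputed; the sizes below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(3)
W = rng.normal(size=(16, 256))

# Two consecutive inputs that differ in only a few positions
# (temporal similarity, e.g., neighboring video frames).
x0 = rng.random(256)
x1 = x0.copy()
changed = rng.choice(256, size=8, replace=False)
x1[changed] += rng.normal(size=8)

# Instead of recomputing W @ x1 from scratch, update the previous output
# using only the sparse delta: y1 = y0 + W[:, changed] @ delta.
y0 = W @ x0
delta = x1[changed] - x0[changed]
y1 = y0 + W[:, changed] @ delta

print(np.allclose(y1, W @ x1))   # True, using ~3% of the original MACs
```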

Trends & directions: jointly exploring compression techniques, #hardware-aware compression & #NAS/#AutoML, accelerator/model #codesign, coarse-grain structured sparsity, automating HW design modeling & implementation for compact models, #compiler support, accelerating training.

The survey also describes common techniques in accelerator design, such as balancing compute with on-/off-chip communication and approximate computing, and advances such as reconfigurable NoCs and PEs for asymmetric or variable-precision processing. Please share it with anyone who you think can benefit.
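
For intuition on balancing compute with off-chip communication, a quick roofline-style estimate; the peak throughput, bandwidth, GEMM size, and fp16 data-movement assumptions below are all hypothetical:

```python
# Back-of-the-envelope roofline check for a hypothetical accelerator:
# is a dense layer compute-bound or bound by off-chip communication?
peak_flops = 4e12      # 4 TFLOP/s peak compute (assumed)
dram_bw    = 100e9     # 100 GB/s of off-chip bandwidth (assumed)

M, K, N = 1024, 1024, 1024                 # GEMM dimensions
flops = 2 * M * K * N                      # one multiply + one add per MAC
bytes_moved = 2 * (M * K + K * N + M * N)  # fp16 operands + result, moved once

intensity = flops / bytes_moved            # FLOPs per byte from DRAM
attainable = min(peak_flops, intensity * dram_bw)
print(f"arithmetic intensity: {intensity:.1f} FLOP/byte")
print(f"attainable: {attainable / 1e12:.2f} TFLOP/s "
      f"({'compute' if attainable == peak_flops else 'bandwidth'}-bound)")
```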

Looking forward to a safe and productive 2022 for everyone. Best wishes for a happy 2022!

@SCAI_ASU @CompArchSA @PhDVoice @OpenAcademics @Underfox3 @ogawa_tter @jonmasters @ASUEngineering @AcademicChatter #PhDgenie @PhDForum @hapyresearchers
Requesting help in amplifying the visibility of this extensive literature review of recent technology, thanks! #ML #hardware #tinyML
