#Highlights2021 for me: our #survey on efficient processing of #sparse and compressed tensors of #ML/#DNN models on #hardware accelerators published in @ProceedingsIEEE.
Paper: dx.doi.org/10.1109/JPROC.…
arXiv: arxiv.org/abs/2007.00864
RT/sharing appreciated. 🧵
Context: Tensors of ML/DNN models are compressed by leveraging #sparsity, #quantization, and shape reduction. We summarize several such sources of sparsity & compression (§3). Pruning can induce sparsity with a structure, while sparsity arising inherently from various applications or sources is unstructured.
Likewise, leveraging value similarity or approximate operations can make processing irregular. Also, size-reduction techniques make tensors asymmetrically shaped. Hence, special mechanisms can be required for efficient processing of sparse and irregular computations.
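As a toy illustration (not from the paper), here is a minimal NumPy sketch contrasting unstructured magnitude pruning with coarse-grained block pruning; the threshold, block size, and keep count are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8)).astype(np.float32)   # toy weight tensor

# Unstructured pruning: zero individual weights below a magnitude threshold,
# leaving non-zeros scattered irregularly.
unstructured = np.where(np.abs(W) > 0.8, W, 0.0)

# Coarse-grained (block-structured) pruning: keep only the 2 of 4 blocks (4x4)
# with the largest L2 norm, so the surviving non-zeros follow a regular pattern.
structured = np.zeros_like(W)
blocks = [(i, j) for i in range(0, 8, 4) for j in range(0, 8, 4)]
norms = [np.linalg.norm(W[i:i + 4, j:j + 4]) for i, j in blocks]
for idx in np.argsort(norms)[-2:]:               # indices of the 2 strongest blocks
    i, j = blocks[idx]
    structured[i:i + 4, j:j + 4] = W[i:i + 4, j:j + 4]

print("unstructured sparsity:", 1 - np.count_nonzero(unstructured) / W.size)
print("structured sparsity:  ", 1 - np.count_nonzero(structured) / W.size)
```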
Accelerators leverage #sparsity differently, improving only #memory #footprint, #energy efficiency, and/or #performance. The underlying mechanisms determine an #accelerator's ability to exploit static or dynamic sparsity of one or multiple #tensors during #inference or #learning (§4).
Efficiently processing sparsity needs HW/SW mechanisms to store, extract, communicate, compute, and load-balance only the non-zeros. For each, the efficacy of different solutions varies with sparsity levels and patterns. Their analysis spans §5-11, and overall speedups for recent DNNs are analyzed in §4.
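A minimal sketch of the idea, assuming a simple bitmap encoding (the function names and encoding are illustrative, not the survey's): store only the non-zero values plus a presence mask, and multiply-accumulate only the effectual pairs.

```python
import numpy as np

def to_bitmap(vec):
    """Encode a 1-D tensor as (bitmap of non-zero positions, packed non-zero values)."""
    mask = vec != 0
    return mask, vec[mask]

def sparse_dot(mask_a, vals_a, dense_b):
    """Multiply-accumulate only the effectual (non-zero) pairs.
    A hardware data-extraction unit would do the index matching; here NumPy
    gathers the matching operands from the dense vector."""
    return float(np.dot(vals_a, dense_b[mask_a]))

a = np.array([0.0, 1.5, 0.0, 0.0, -2.0, 0.0, 3.0, 0.0])
b = np.arange(8, dtype=np.float32)

mask, vals = to_bitmap(a)          # storage: 8 metadata bits + 3 values instead of 8 values
print(sparse_dot(mask, vals, b))   # 1.5*1 + (-2.0)*4 + 3.0*6 = 11.5
```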
The #survey discusses such mechanisms spanning #circuits, #computerarchitecture, mappings, and #DL model pruning; #accelerator-aware #DNN model pruning; techniques for data extraction & #loadbalancing of effectual computations; #sparsity-aware dataflows; and #compiler support.
Structured #sparsity, especially coarse-grained, can lead to simpler #hardware mechanisms and lower #encoding overheads. We also analyze how various sparsity levels and tensor shapes of #DNN operators impact data reuse and execution metrics.
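To see why coarse-grained structure lowers encoding overhead, a rough back-of-the-envelope sketch (assuming a presence-bitmap model with one bit per element vs. one bit per block; the sizes are illustrative only):

```python
import numpy as np

def bitmap_metadata_bits(shape, block_shape):
    """Metadata cost of a presence-bitmap encoding at two granularities (assumed model):
    one bit per element (unstructured) vs. one bit per block (coarse-grained)."""
    n_elems = int(np.prod(shape))
    n_blocks = n_elems // int(np.prod(block_shape))
    return {"per-element bits": n_elems, "per-block bits": n_blocks}

# A 256x256 weight matrix with 16x16 blocks: 65536 vs. 256 bits of metadata.
print(bitmap_metadata_bits((256, 256), (16, 16)))
```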
#Accelerators employ #approximatecomputing by leveraging the similarity of temporal and spatial data in #computervision and #NLP applications. #Reconfigurable mechanisms can enable processing a wide range of sparsity levels, precisions, and tensor shapes. #FPGA
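A hypothetical sketch of exploiting temporal value similarity: compute deltas between consecutive frames and reprocess only pixels that changed beyond a tolerance (the threshold here is an assumption, not the survey's setting).

```python
import numpy as np

def delta_encode(prev_frame, frame, threshold=0.05):
    """Exploit temporal value similarity: keep only pixels that changed by more
    than `threshold` (hypothetical tolerance); the rest reuse previous results."""
    delta = frame - prev_frame
    mask = np.abs(delta) > threshold
    return mask, delta[mask]

rng = np.random.default_rng(1)
prev = rng.random((4, 4)).astype(np.float32)
curr = prev.copy()
curr[0, :2] += 0.3                        # only a few pixels change between frames
mask, deltas = delta_encode(prev, curr)
print("fraction of pixels recomputed:", mask.mean())   # 2/16 = 0.125
```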
Trends & directions: jointly exploring compression techniques, #hardware-aware compression & #NAS/#AutoML, accelerator/model #codesign, coarse-grain structured sparsity, automating HW design modeling & implementation for compact models, #compiler support, accelerating training.
The survey also describes common accelerator-design techniques, such as balancing compute with on/off-chip communication and approximate computing, and advances such as reconfigurable NoCs and PEs for asymmetric or variable-precision processing. Please share it with anyone you think could benefit.
Looking forward to a safe and productive 2022 for everyone. Best wishes for a happy 2022!
@SCAI_ASU @CompArchSA @PhDVoice @OpenAcademics @Underfox3 @ogawa_tter @jonmasters @ASUEngineering @AcademicChatter #PhDgenie @PhDForum @hapyresearchers
Requesting help amplifying the visibility of this extensive literature review of recent technology. Thanks! #ML #hardware #tinyML