Feedback on the SparseTIR project
DONE Talk @ Google MLIR Team
A collection of useful questions:
- TACO can also decompose computations and schedule each of them, so what's the advantage of SparseTIR?
- Does SparseTIR support reordering dimensions (such as the [1, 0] ordering in the MLIR Sparse Dialect)? See the sketch after this list.
- CPU results, and support for 2:4 structured sparsity on Sparse Tensor Cores.
- Preprocessing overhead: how well does SparseTIR do when the sparsity pattern is dynamic?
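On the dimension-reordering question, a small SciPy sketch of what a [1, 0] dimension order means (SciPy is used here purely for illustration; this is not SparseTIR or MLIR code): making the column dimension outermost turns CSR into CSC, so CSC of A stores the same index arrays as CSR of A's transpose.

  import numpy as np
  import scipy.sparse as sp

  # A 4x6 sparse matrix in CSR: dimension order [0, 1] (rows outermost).
  A = sp.random(4, 6, density=0.4, format="csr", random_state=0)

  # Dimension order [1, 0] (columns outermost) is exactly CSC...
  A_csc = A.tocsc()

  # ...which holds the same index structure as CSR of the transpose.
  At_csr = A.T.tocsr()
  assert np.array_equal(A_csc.indptr, At_csr.indptr)
  assert np.array_equal(A_csc.indices, At_csr.indices)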
DONE Qualifying Project Presentation @ UW
Feedback from TQ:
Some notes on your talk (for future reference):
- The three-stage compilation part can be a bit too detailed; it would be useful to have a "map" (outline) that you keep coming back to.
- Think about high-level punchlines (composability) and come back to them during your talk.
- Try to raise your voice on punchlines ("here is the key takeaway! ..."); this helps make your talk more animated.
- Good job bringing in examples of the transformations.
- Avoid discussing technical details in the experiments section; focus on the results: show them first, then briefly mention what you did.
- Come back to your key takeaway (composability) in the experiments.
At a high level: think about your overall flow and keep bringing the audience back to it. Try to be animated and emphasize the punchlines.
Q&A:
- Try to repeat the question from the audience (so you answer according to your own interpretation instead of guessing what was asked).
Other questions:
- Relation to Taichi-like frameworks.
DONE Talk @ Cornell Zhang Group
Feedbacks:
- Should explain more about the runtime vs. compile-time behavior of format decomposition (see the sketch after this list).
- Both the presentation and the paper are confusing on this point.
- Release the code soon so that other research groups can benefit from it.
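To make the runtime vs. compile-time split concrete, here is a rough NumPy/SciPy sketch (my illustration, not SparseTIR's actual API) of format decomposition: the per-format kernels (ELL, COO, ...) can be compiled ahead of time, while partitioning a concrete matrix into those formats happens at runtime, once per sparsity pattern.

  import numpy as np
  import scipy.sparse as sp

  def decompose(csr, ell_width):
      # Runtime step: split each row into an ELL slot (up to ell_width
      # nonzeros, zero-padded) plus a COO remainder for overlong rows.
      m = csr.shape[0]
      cols = np.zeros((m, ell_width), dtype=np.int64)
      vals = np.zeros((m, ell_width))
      rem_r, rem_c, rem_v = [], [], []
      for i in range(m):
          lo, hi = csr.indptr[i], csr.indptr[i + 1]
          k = min(hi - lo, ell_width)
          cols[i, :k] = csr.indices[lo:lo + k]
          vals[i, :k] = csr.data[lo:lo + k]
          rem_r.extend([i] * (hi - lo - k))
          rem_c.extend(csr.indices[lo + k:hi])
          rem_v.extend(csr.data[lo + k:hi])
      coo = sp.coo_matrix((rem_v, (rem_r, rem_c)), shape=csr.shape)
      return (cols, vals), coo

  A = sp.random(1000, 1000, density=0.01, format="csr", random_state=0)
  (ell_cols, ell_vals), coo_rest = decompose(A, ell_width=8)
  # SpMV would then run the (precompiled) ELL kernel on the padded part
  # and a COO kernel on the remainder, summing the two partial results.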
DONE Send a followup email
DONE Sync with SparseLNR folks
Digest:
- SparseLNR extends TACO to support rfactor and compute_at primitives (see the sketch after this list).
- The original idea of SparseLNR came from figuring out how to describe FusedMM in TACO.
- SparseLNR does not support vectorized intrinsics, while SparseTIR can help support them.
- It would be better to write some interactive demos of SparseTIR transformations.
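Since SparseTIR builds on TVM, the two primitives can be illustrated with TVM's dense tensor-expression API as an analogy (a minimal sketch of rfactor and compute_at on a plain row-sum reduction; this is TVM code, not SparseLNR's):

  import tvm
  from tvm import te

  n = te.var("n")
  m = te.var("m")
  A = te.placeholder((n, m), name="A")
  k = te.reduce_axis((0, m), name="k")
  B = te.compute((n,), lambda i: te.sum(A[i, k], axis=k), name="B")

  s = te.create_schedule(B.op)
  ko, ki = s[B].split(B.op.reduce_axis[0], factor=16)
  # rfactor: materialize partial sums over the inner reduction axis.
  BF = s.rfactor(B, ki)
  # compute_at: nest the partial-sum stage under the consumer's loop.
  s[BF].compute_at(s[B], s[B].op.reduce_axis[0])
  print(tvm.lower(s, [A, B], simple_mode=True))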
DONE Talk @ Tsinghua
Digest:
- Look at segment group and atomic parallelism.
- Investigate the overhead of runtime load-balancing compared to compile-time load-balancing (see the sketch after this list).
- More clarification on inputs/outputs/format decomposition.
- Future work: summarize some common sparse patterns.
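For the load-balancing overhead point, one way to start is to time the one-off runtime preprocessing against the recurring kernel it amortizes over. A rough sketch, using SciPy's SpMV as a stand-in kernel and a sort-rows-by-nnz pass as a stand-in for the runtime load-balancing step (both choices are mine, not SparseTIR's pipeline):

  import timeit
  import numpy as np
  import scipy.sparse as sp

  A = sp.random(10000, 10000, density=0.001, format="csr", random_state=0)
  x = np.random.rand(10000)

  # Stand-in load-balancing preprocessing: order rows by their nnz count.
  prep = timeit.timeit(lambda: np.argsort(np.diff(A.indptr)), number=10) / 10
  spmv = timeit.timeit(lambda: A @ x, number=100) / 100
  print(f"preprocessing costs about {prep / spmv:.1f} SpMV-equivalents")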
TODO Sync with Genghan
DONE ASPLOS Rebuttal
Digest:
- Experiments on DNN sparsity (sparse conv, etc.).
- Performance breakdown analysis.
DONE Talk @ Amazon AI
Digest:
- Questions from Minjie: tuning overhead.
TODO Feedback from Yinuo
They want HPC operators:
GAP
- Breadth-First Search
- PageRank
- Single-Source Shortest Path
- Connected Components
- Betweenness Centrality
- Triangle Counting
HPCC
- Gauss-Jordan Elimination
- RandomAccess
- Stencil
HPCG
- Sparse Matrix-Vector Multiplication
- Sparse Matrix-Matrix Multiplication
- Sampled Dense-Dense Matrix Multiplication (SDDMM)
- Sparse Triangular Solve
- Symmetric Gauss-Seidel Smoother
NAS
- Conjugate Gradient
- Integer Sort
TPC
- Hash Join
Should take some time to investigate them.
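As a concrete starting point, a minimal NumPy/SciPy sketch of SDDMM from the list above (the sddmm helper and its signature are mine, for illustration only):

  import numpy as np
  import scipy.sparse as sp

  def sddmm(S, A, B):
      # out[i, j] = S[i, j] * (A @ B)[i, j], computing dot products
      # only at the nonzero positions of S.
      S = S.tocoo()
      vals = np.einsum("ij,ij->i", A[S.row], B[:, S.col].T)
      return sp.coo_matrix((S.data * vals, (S.row, S.col)), shape=S.shape)

  S = sp.random(100, 80, density=0.05, format="csr", random_state=0)
  A = np.random.rand(100, 16)
  B = np.random.rand(16, 80)
  out = sddmm(S, A, B)
  # Matches the dense reference at the sampled positions.
  assert np.allclose(out.toarray(), (A @ B) * S.toarray())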