Feedback on the SparseTIR project
DONE Talk @ Google MLIR Team
A collection of useful questions:
- TACO can also decompose computations and schedule each of them, so what's the advantage of SparseTIR?
- Does SparseTIR support reordering dimensions (such as the [1, 0] ordering in the MLIR Sparse Dialect)? See the sketch after this list.
- CPU results, and support for 2:4 structured sparsity on Sparse Tensor Cores.
- Preprocessing overhead: how well does SparseTIR do when the sparsity pattern is dynamic?
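On the dimension-reordering question, a small SciPy sketch of what a [1, 0] dimension order means (SciPy is used here purely for illustration; this is not SparseTIR or MLIR code): making the column dimension outermost turns CSR into CSC, so CSC of A stores the same index arrays as CSR of A's transpose.

  import numpy as np
  import scipy.sparse as sp

  # A 4x6 sparse matrix in CSR: dimension order [0, 1] (rows outermost).
  A = sp.random(4, 6, density=0.4, format="csr", random_state=0)

  # Dimension order [1, 0] (columns outermost) is exactly CSC...
  A_csc = A.tocsc()

  # ...which holds the same index structure as CSR of the transpose.
  At_csr = A.T.tocsr()
  assert np.array_equal(A_csc.indptr, At_csr.indptr)
  assert np.array_equal(A_csc.indices, At_csr.indices)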
DONE Qualifying Project Presentation @ UW
Feedback from TQ:
Some notes on your talk (for future reference):
- The three-stage compilation part can be a bit too detailed; it would be useful to have a "map" (outline) that you keep coming back to.
- Think about high-level punchlines (composability) and come back to them during your talk.
- Try to raise your voice on punchlines ("here is the key takeaway! ..."); this helps make your talk more animated.
- Good job bringing in examples of the transformations.
- Avoid discussing technical details in the experiments section; focus on the results: show them first, then briefly mention what you did.
- Come back to your key takeaway (composability) in the experiments.
At a high level: think about your overall flow and keep bringing the audience back to it. Try to be animated and emphasize the punchlines.
Q&A:
- Try to repeat the question from the audience (so you answer according to your own interpretation instead of guessing what was asked).
Other questions:
- Relation to Taichi-like frameworks.
DONE Talk @ Cornell Zhang Group
Feedbacks:
- Should explain more about the runtime vs. compile-time behavior of format decomposition (see the sketch after this list).
- Both the presentation and the paper are confusing on this point.
- Release the code soon so that other research groups can benefit from it.
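To make the runtime vs. compile-time split concrete, here is a rough NumPy/SciPy sketch (my illustration, not SparseTIR's actual API) of format decomposition: the per-format kernels (ELL, COO, ...) can be compiled ahead of time, while partitioning a concrete matrix into those formats happens at runtime, once per sparsity pattern.

  import numpy as np
  import scipy.sparse as sp

  def decompose(csr, ell_width):
      # Runtime step: split each row into an ELL slot (up to ell_width
      # nonzeros, zero-padded) plus a COO remainder for overlong rows.
      m = csr.shape[0]
      cols = np.zeros((m, ell_width), dtype=np.int64)
      vals = np.zeros((m, ell_width))
      rem_r, rem_c, rem_v = [], [], []
      for i in range(m):
          lo, hi = csr.indptr[i], csr.indptr[i + 1]
          k = min(hi - lo, ell_width)
          cols[i, :k] = csr.indices[lo:lo + k]
          vals[i, :k] = csr.data[lo:lo + k]
          rem_r.extend([i] * (hi - lo - k))
          rem_c.extend(csr.indices[lo + k:hi])
          rem_v.extend(csr.data[lo + k:hi])
      coo = sp.coo_matrix((rem_v, (rem_r, rem_c)), shape=csr.shape)
      return (cols, vals), coo

  A = sp.random(1000, 1000, density=0.01, format="csr", random_state=0)
  (ell_cols, ell_vals), coo_rest = decompose(A, ell_width=8)
  # SpMV would then run the (precompiled) ELL kernel on the padded part
  # and a COO kernel on the remainder, summing the two partial results.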
DONE Send a followup email
DONE Sync with SparseLNR folks
Digest:
- SparseLNR extends TACO to support rfactor and compute_at primitives (see the sketch after this list).
- The original idea of SparseLNR came from figuring out how to describe FusedMM in TACO.
- SparseLNR does not support vectorized intrinsics, while SparseTIR can help support them.
- It would be better to write some interactive demos of SparseTIR transformations.
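Since SparseTIR builds on TVM, the two primitives can be illustrated with TVM's dense tensor-expression API as an analogy (a minimal sketch of rfactor and compute_at on a plain row-sum reduction; this is TVM code, not SparseLNR's):

  import tvm
  from tvm import te

  n = te.var("n")
  m = te.var("m")
  A = te.placeholder((n, m), name="A")
  k = te.reduce_axis((0, m), name="k")
  B = te.compute((n,), lambda i: te.sum(A[i, k], axis=k), name="B")

  s = te.create_schedule(B.op)
  ko, ki = s[B].split(B.op.reduce_axis[0], factor=16)
  # rfactor: materialize partial sums over the inner reduction axis.
  BF = s.rfactor(B, ki)
  # compute_at: nest the partial-sum stage under the consumer's loop.
  s[BF].compute_at(s[B], s[B].op.reduce_axis[0])
  print(tvm.lower(s, [A, B], simple_mode=True))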
DONE Talk @ Tsinghua
Digest:
- Look at segment group and atomic parallelism.
- Investigate the overhead of runtime load-balancing compared to compile-time load-balancing (see the sketch after this list).
- More clarification on inputs/outputs/format decomposition.
- Future work: summarize some common sparse patterns.
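For the load-balancing overhead point, one way to start is to time the one-off runtime preprocessing against the recurring kernel it amortizes over. A rough sketch, using SciPy's SpMV as a stand-in kernel and a sort-rows-by-nnz pass as a stand-in for the runtime load-balancing step (both choices are mine, not SparseTIR's pipeline):

  import timeit
  import numpy as np
  import scipy.sparse as sp

  A = sp.random(10000, 10000, density=0.001, format="csr", random_state=0)
  x = np.random.rand(10000)

  # Stand-in load-balancing preprocessing: order rows by their nnz count.
  prep = timeit.timeit(lambda: np.argsort(np.diff(A.indptr)), number=10) / 10
  spmv = timeit.timeit(lambda: A @ x, number=100) / 100
  print(f"preprocessing costs about {prep / spmv:.1f} SpMV-equivalents")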
TODO Sync with Genghan
DONE ASPLOS Rebuttal
Digest:
- Experiments on DNN sparsity (sparse conv, etc.).
- Performance breakdown analysis.
DONE Talk @ Amazon AI
Digest:
- Questions from Minjie: tuning overhead.
TODO Feedback from Yinuo
They want HPC operators:
GAP
- Breadth-First Search
- PageRank
- Single-Source Shortest Path
- Connected Components
- Betweenness Centrality
- Triangle Counting
HPCC
- Gauss-Jordan Elimination
- RandomAccess
- Stencil
HPCG
- Sparse Matrix-Vector Multiplication
- Sparse Matrix-Matrix Multiplication
- Sampled Dense-Dense Matrix Multiplication (SDDMM)
- Sparse Triangular Solve
- Symmetric Gauss-Seidel Smoother
NAS
- Conjugate Gradient
- Integer Sort
TPC
- Hash Join
Should take some time to investigate them.
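As a concrete starting point, a minimal NumPy/SciPy sketch of SDDMM from the list above (the sddmm helper and its signature are mine, for illustration only):

  import numpy as np
  import scipy.sparse as sp

  def sddmm(S, A, B):
      # out[i, j] = S[i, j] * (A @ B)[i, j], computing dot products
      # only at the nonzero positions of S.
      S = S.tocoo()
      vals = np.einsum("ij,ij->i", A[S.row], B[:, S.col].T)
      return sp.coo_matrix((S.data * vals, (S.row, S.col)), shape=S.shape)

  S = sp.random(100, 80, density=0.05, format="csr", random_state=0)
  A = np.random.rand(100, 16)
  B = np.random.rand(16, 80)
  out = sddmm(S, A, B)
  # Matches the dense reference at the sampled positions.
  assert np.allclose(out.toarray(), (A @ B) * S.toarray())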