Experience

June 2020 – November 2020
San Jose

Research Intern

NVIDIA

June 2019 – September 2019
Seattle

Research Intern

Tencent AI Lab

May 2018 – August 2019
Chengdu

Research Intern

Megvii Inc.

Recent Publications

*: Equal Contribution

(2020). Field-Configurable Multi-resolution Inference: Rethinking Quantization. The 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2021).

PDF

(2020). exBERT: Extending Pre-trained Models with Domain-specific Vocabulary Under Constrained Training Resources. Findings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020).

PDF Code

(2019). Additive Powers-of-Two Quantization: A Non-uniform Discretization for Neural Networks. The 8th International Conference on Learning Representations (ICLR 2020).

PDF Code

(2019). RTN: Reparameterized Ternary Network. The 34th AAAI Conference on Artificial Intelligence (AAAI 2020).

PDF

(2019). Full-stack Optimization for Accelerating CNNs with FPGA Validation. The 33rd ACM International Conference on Supercomputing (ICS 2019).

PDF

(2019). Maestro: A Memory-on-Logic Architecture for Coordinated Parallel Use of Many Systolic Arrays. The 30th IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP 2019).

PDF

(2017). Learning to Prune Deep Neural Networks via Layer-wise Optimal Brain Surgeon. The 31st Annual Conference on Neural Information Processing Systems (NeurIPS 2017).

PDF Code Slides

Service

Jan 2019

Reviewer

ICLR (2020), CVPR (2019, 2020, 2021), Neurocomputing, NeurIPS (2019), ICML (2019)