Zhen ZHENG

Cited by

	All	Since 2019
Citations	526	512
h-index	12	12
i10-index	14	14

Citations per year

Year	Citations
2017	4
2018	10
2019	12
2020	26
2021	58
2022	104
2023	223
2024	88

Public access

11 articles available
0 articles not available

Based on funding mandates

Co-authors

Wei Lin, Alibaba (verified email at alibaba-inc.com)
Jun Yang, NVIDIA (verified email at nvidia.com)
Xipeng Shen, Professor of Computer Science, North Carolina State University (verified email at ncsu.edu)
Jidong Zhai, Tsinghua University (verified email at tsinghua.edu.cn)
Youngmin Yi, University of Seoul (verified email at uos.ac.kr)
Chuan Wu, Professor of Computer Science, The University of Hong Kong (verified email at cs.hku.hk)
Feng Zhang, Renmin University of China (verified email at ruc.edu.cn)
Shuaiwen Leon Song, Vice President, Together.ai; Ex-Microsoft; Tenured Professor (verified email at together.ai)

Zhen ZHENG

Microsoft

Verified email at microsoft.com

Machine Learning System, High Performance Computing, Heterogeneous Computing


Title	Authors	Venue	Cited by	Year
DAPPLE: A pipelined data parallel approach for training large models	S Fan, Y Rong, C Meng, Z Cao, S Wang, Z Zheng, C Wu, G Long, J Yang, ...	Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of …, 2021	155	2021
Understanding and bridging the gaps in current GNN performance optimizations	K Huang, J Zhai, Z Zheng, Y Yi, X Shen	Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of …, 2021	69	2021
Refactoring and optimizing the Community Atmosphere Model (CAM) on the Sunway TaihuLight supercomputer	H Fu, J Liao, W Xue, L Wang, D Chen, L Gu, J Xu, N Ding, X Wang, C He, ...	SC'16: Proceedings of the International Conference for High Performance …, 2016	41	2016
AStitch: enabling a new multi-dimensional optimization space for memory-intensive ML training and inference on modern SIMT architectures	Z Zheng, X Yang, P Zhao, G Long, K Zhu, F Zhu, W Zhao, X Liu, J Yang, ...	Proceedings of the 27th ACM International Conference on Architectural …, 2022	38	2022
Versapipe: a versatile programming framework for pipelined computing on GPU	Z Zheng, C Oh, J Zhai, X Shen, Y Yi, W Chen	Proceedings of the 50th Annual IEEE/ACM International Symposium on …, 2017	37	2017
Whale: Efficient giant model training over heterogeneous GPUs	X Jia, L Jiang, A Wang, W Xiao, Z Shi, J Zhang, X Li, L Chen, Y Li, ...	2022 USENIX Annual Technical Conference (USENIX ATC 22), 673-688, 2022	30	2022
Fusionstitching: boosting memory intensive computations for deep learning workloads	Z Zheng, P Zhao, G Long, F Zhu, K Zhu, W Zhao, L Diao, J Yang, W Lin	arXiv preprint arXiv:2009.10924, 2020	28	2020
DISC: A dynamic shape compiler for machine learning workloads	K Zhu, WY Zhao, Z Zheng, TY Guo, PZ Zhao, JJ Bai, J Yang, XY Liu, ...	Proceedings of the 1st Workshop on Machine Learning and Systems, 89-95, 2021	22	2021
Optimizing distributed training deployment in heterogeneous GPU clusters	X Yi, S Zhang, Z Luo, G Long, L Diao, C Wu, Z Zheng, J Yang, W Lin	Proceedings of the 16th International Conference on emerging Networking …, 2020	20	2020
Drew: Efficient Winograd CNN inference with deep reuse	R Wu, F Zhang, J Guan, Z Zheng, X Du, X Shen	Proceedings of the ACM Web Conference 2022, 1807-1816, 2022	14	2022
Flash-LLM: Enabling cost-effective and highly-efficient large generative model inference with unstructured sparsity	H Xia, Z Zheng, Y Li, D Zhuang, Z Zhou, X Qiu, Y Li, W Lin, SL Song	arXiv preprint arXiv:2309.10285, 2023	13	2023
Gopipe: a granularity-oblivious programming framework for pipelined stencil executions on GPU	C Oh, Z Zheng, X Shen, J Zhai, Y Yi	Proceedings of the ACM International Conference on Parallel Architectures …, 2020	13	2020
Exploring deep reuse in Winograd CNN inference	R Wu, F Zhang, Z Zheng, X Du, X Shen	Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of …, 2021	11	2021
HiWayLib: A software framework for enabling high performance communications for heterogeneous pipeline computations	Z Zheng, C Oh, J Zhai, X Shen, Y Yi, W Chen	Proceedings of the Twenty-Fourth International Conference on Architectural …, 2019	10	2019
Auto-map: A DQN framework for exploring distributed execution plans for DNN workloads	S Wang, Y Rong, S Fan, Z Zheng, LS Diao, G Long, J Yang, X Liu, W Lin	arXiv preprint arXiv:2007.04069, 2020	8	2020
Optimizing DNN compilation for distributed training with joint OP and tensor fusion	X Yi, S Zhang, L Diao, C Wu, Z Zheng, S Fan, S Wang, J Yang, W Lin	IEEE Transactions on Parallel and Distributed Systems 33 (12), 4694-4706, 2022	4	2022
Whale: Scaling deep learning model training to the trillions	X Jia, AW Le Jiang, J Zhang, X Li, W Xiao, Y Li, Z Zheng, X Liu, W Lin	arXiv preprint arXiv:2011.09208, 2020	4	2020
Bladedisc: Optimizing dynamic shape machine learning workloads via compiler approach	Z Zheng, Z Pan, D Wang, K Zhu, W Zhao, T Guo, X Qiu, M Sun, J Bai, ...	Proceedings of the ACM on Management of Data 1 (3), 1-29, 2023	3	2023
Expanding the Edge: Enabling Efficient Winograd CNN Inference With Deep Reuse on Edge Device	F Zhang, R Wu, J Guan, Z Zheng, X Guo, X Zhang, X Du, X Shen	IEEE Transactions on Knowledge and Data Engineering, 2023	2	2023
Auto-parallelizing large models with Rhino: A systematic approach on production AI platform	S Zhang, L Diao, S Wang, Z Cao, Y Gu, C Si, Z Shi, Z Zheng, C Wu, W Lin	arXiv preprint arXiv:2302.08141, 2023	2	2023
