DeepSeekMoE: Towards ultimate expert specialization in mixture-of-experts language models D Dai, C Deng, C Zhao, RX Xu, H Gao, D Chen, J Li, W Zeng, X Yu, Y Wu, ... arXiv preprint arXiv:2401.06066, 2024 | 144 | 2024 |
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence Q Zhu, D Guo, Z Shao, D Yang, P Wang, R Xu, Y Wu, Y Li, H Gao, S Ma, ... arXiv preprint arXiv:2406.11931, 2024 | 106 | 2024 |
DeepSeek-V2: A strong, economical, and efficient mixture-of-experts language model A Liu, B Feng, B Wang, B Wang, B Liu, C Zhao, C Deng, C Ruan, D Dai, ... arXiv preprint arXiv:2405.04434, 2024 | 105 | 2024 |
DeepSeek LLM: Scaling open-source language models with longtermism X Bi, D Chen, G Chen, S Chen, D Dai, C Deng, H Ding, K Dong, Q Du, ... arXiv preprint arXiv:2401.02954, 2024 | 72 | 2024 |
DeepSeek-V3 technical report A Liu, B Feng, B Xue, B Wang, B Wu, C Lu, C Zhao, C Deng, C Zhang, ... arXiv preprint arXiv:2412.19437, 2024 | 28 | 2024 |
DeepSeek-R1: Incentivizing reasoning capability in LLMs via reinforcement learning D Guo, D Yang, H Zhang, J Song, R Zhang, R Xu, Q Zhu, S Ma, P Wang, ... arXiv preprint arXiv:2501.12948, 2025 | 21 | 2025 |
Auxiliary-loss-free load balancing strategy for mixture-of-experts L Wang, H Gao, C Zhao, X Sun, D Dai arXiv preprint arXiv:2408.15664, 2024 | 4 | 2024 |
Critique of Planetary Normal Mode Computation: Parallel Algorithms, Performance, and Reproducibility by SCC Team From Tsinghua University C Zhang, C Zhao, J He, S Chen, L Zheng, K Huang, W Han, J Zhai IEEE Transactions on Parallel and Distributed Systems 32 (11), 2631-2634, 2021 | 2 | 2021 |
Fire-Flyer AI-HPC: A Cost-Effective Software-Hardware Co-Design for Deep Learning W An, X Bi, G Chen, S Chen, C Deng, H Ding, K Dong, Q Du, W Gao, ... SC24: International Conference for High Performance Computing, Networking ..., 2024 | 1 | 2024 |
Canvas: End-to-End Kernel Architecture Search in Neural Networks C Zhao, G Zhang, M Gao arXiv preprint arXiv:2304.07741, 2023 | 1 | 2023 |
Student Cluster Competition 2018, Team Tsinghua University: Reproducing performance of multi-physics simulations of the Tsunamigenic 2004 Sumatra megathrust earthquake on the ... J He, C Zhao, J Yu, X Yu, L Zheng, C Lou, S Tang, W Han, J Zhai Parallel Computing 90, 102570, 2019 | | 2019 |