Swin transformer: Hierarchical vision transformer using shifted windows Z Liu, Y Lin, Y Cao, H Hu, Y Wei, Z Zhang, S Lin, B Guo Proceedings of the IEEE/CVF international conference on computer vision, 2021 | 22150 | 2021 |
Video swin transformer Z Liu, J Ning, Y Cao, Y Wei, Z Zhang, S Lin, H Hu Proceedings of the IEEE/CVF conference on computer vision and pattern, 2022 | 1711 | 2022 |
Swin transformer v2: Scaling up capacity and resolution Z Liu, H Hu, Y Lin, Z Yao, Z Xie, Y Wei, J Ning, Y Cao, Z Zhang, L Dong, ... Proceedings of the IEEE/CVF conference on computer vision and pattern, 2022 | 1708 | 2022 |
Group-free 3d object detection via transformers Z Liu, Z Zhang, Y Cao, H Hu, X Tong Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021 | 305 | 2021 |
A closer look at local aggregation operators in point cloud analysis Z Liu, H Hu, Y Cao, Z Zhang, X Tong Computer Vision – ECCV 2020: 16th European Conference, Glasgow, UK, August 23, 2020 | 195 | 2020 |
Tutel: Adaptive mixture-of-experts at scale C Hwang, W Cui, Y Xiong, Z Yang, Z Liu, H Hu, Z Wang, R Salas, J Jose, ... Proceedings of Machine Learning and Systems 5, 269-287, 2023 | 54 | 2023 |
Human Pose as Compositional Tokens Z Geng, C Wang, Y Wei, Z Liu, H Li, H Hu Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern, 2023 | 44 | 2023 |
Leveraging batch normalization for vision transformers Z Yao, Y Cao, Y Lin, Z Liu, Z Zhang, H Hu Proceedings of the IEEE/CVF International Conference on Computer Vision, 413-422, 2021 | 39 | 2021 |
FP8-LM: Training FP8 Large Language Models H Peng, K Wu, Y Wei, G Zhao, Y Yang, Z Liu, Y Xiong, Z Yang, B Ni, J Hu, ... arXiv preprint arXiv:2310.18313, 2023 | 15 | 2023 |
Improving CLIP Fine-tuning Performance Y Wei, H Hu, Z Xie, Z Liu, Z Zhang, Y Cao, J Bao, D Chen, B Guo Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023 | 10 | 2023 |
Could Giant Pre-trained Image Models Extract Universal Representations? Y Lin, Z Liu, Z Zhang, H Hu, N Zheng, S Lin, Y Cao Advances in Neural Information Processing Systems 35, 8332-8346, 2022 | 8 | 2022 |
Mixture-of-experts layer with dynamic gating Y Xiong, C Hwang, W Cui, Y Ziyue, Z Liu, H Hu, Z Wang, RO Salas, J Jose, ... US Patent App. 18/054,451, 2024 | | 2024 |
Collective communication phases at mixture-of-experts layer Y Xiong, C Hwang, W Cui, Y Ziyue, Z Liu, H Hu, Z Wang, RO Salas, J Jose, ... US Patent App. 18/054,452, 2024 | | 2024 |
Mixture-of-experts layer with switchable parallel modes Y Xiong, C Hwang, W Cui, Y Ziyue, Z Liu, H Hu, Z Wang, RO Salas, J Jose, ... US Patent App. 18/054,446, 2024 | | 2024 |
Sparse encoding and decoding at mixture-of-experts layer Y Xiong, C Hwang, W Cui, Y Ziyue, Z Liu, H Hu, Z Wang, RO Salas, J Jose, ... US Patent App. 18/318,436, 2024 | | 2024 |