Antonio J. Peña
rCUDA: Reducing the number of GPU-based accelerators in high performance clusters
J Duato, AJ Pena, F Silla, R Mayo, ES Quintana-Ortí
2010 International Conference on High Performance Computing & Simulation …, 2010
A Complete and Efficient CUDA-Sharing Solution for HPC Clusters
AJ Peña, C Reaño, F Silla, R Mayo, ES Quintana-Ortí, J Duato
Parallel Computing 40 (10), 574-588, 2014
Chai: collaborative heterogeneous applications for integrated-architectures
J Gómez-Luna, I El Hajj, LW Chang, V García-Flores, SG de Gonzalo, ...
2017 IEEE International Symposium on Performance Analysis of Systems and …, 2017
Enabling CUDA acceleration within virtual machines using rCUDA
J Duato, AJ Pena, F Silla, JC Fernandez, R Mayo, ES Quintana-Orti
High Performance Computing (HiPC), 2011 18th International Conference on, 1-10, 2011
An efficient implementation of GPU virtualization in high performance clusters
J Duato, FD Igual, R Mayo, AJ Peña, ES Quintana-Ortí, F Silla
Euro-Par 2009–Parallel Processing Workshops, 385-394, 2010
Performance evaluation of cuDNN convolution algorithms on NVIDIA Volta GPUs
M Jorda, P Valero-Lara, AJ Pena
IEEE Access 7, 70461-70473, 2019
MPICH User’s Guide
A Amer, P Balaji, W Bland, W Gropp, R Latham, H Lu, L Oden, AJ Pena, ...
Version, 2015
Performance of CUDA virtualized remote GPUs in high performance clusters
J Duato, AJ Pena, F Silla, R Mayo, ES Quintana-Orti
Parallel Processing (ICPP), 2011 International Conference on, 365-374, 2011
MT-MPI: multithreaded MPI for many-core environments
M Si, AJ Peña, P Balaji, M Takagi, Y Ishikawa
Proceedings of the 28th ACM international conference on Supercomputing, 125-134, 2014
Casper: An Asynchronous Progress Model for MPI RMA on Many-Core Architectures
M Si, AJ Pena, J Hammond, P Balaji, M Takagi, Y Ishikawa
29th IEEE International Parallel & Distributed Processing Symposium (IPDPS), 2015
Automating the Application Data Placement in Hybrid Memory Systems
H Servat, AJ Pena, G Llort, E Mercadal, HC Hoppe, J Labarta
Cluster Computing (CLUSTER), 2017 IEEE International Conference on, 126-136, 2017
Toward the Efficient Use of Multiple Explicitly Managed Memory Subsystems
AJ Pena, P Balaji
IEEE Cluster 2014, 2014
CU2rCU: Towards the complete rCUDA remote GPU virtualization and sharing solution
C Reaño, AJ Peña, F Silla, J Duato, R Mayo, ES Quintana-Orti
High Performance Computing (HiPC), 2012 19th International Conference on, 1-10, 2012
Influence of InfiniBand FDR on the Performance of Remote GPU Virtualization
C Reano, R Mayo, ES Quintana-Ortí, F Silla, J Duato, AJ Pena
IEEE Cluster 2013, 2013
Integrating blocking and non-blocking MPI primitives with task-based programming models
K Sala, X Teruel, JM Perez, AJ Peña, V Beltran, J Labarta
Parallel Computing 85, 153-166, 2019
Evaluating the effect of last-level cache sharing on integrated GPU-CPU systems with heterogeneous applications
V García, J Gomez-Luna, T Grass, A Rico, E Ayguade, AJ Pena
2016 IEEE International Symposium on Workload Characterization (IISWC), 1-10, 2016
Exploring the Vision Processing Unit as Co-Processor for Inference
S Rivas-Gomez, AJ Pena, D Moloney, E Laure, S Markidis
2018 IEEE International Parallel and Distributed Processing Symposium …, 2018
MultiCL: Enabling Automatic Scheduling for Task-Parallel Workloads in OpenCL
AM Aji, AJ Peña, P Balaji, W Feng
Parallel Computing 58, 37-55, 2016
DMRlib: easy-coding and efficient resource management for job malleability
S Iserte, R Mayo, ES Quintana-Ortí, AJ Pena
IEEE Transactions on Computers 70 (9), 1443-1457, 2020
Enabling homomorphically encrypted inference for large DNN models
G Lloret-Talavera, M Jorda, H Servat, F Boemer, C Chauhan, ...
IEEE Transactions on Computers 71 (5), 1145-1155, 2021