












-
T. Hoefler.
Principles for Coordinated Optimization of Computation and Communication in Large-Scale Parallel Systems.
PhD thesis,
Sep. 2008.
[bibtex-entry]
-
T. Hoefler,
T. Schneider,
and A. Lumsdaine.
LogGP in Theory and Practice - An In-depth Analysis of Modern Interconnection Networks and Benchmarking Methods for Collective Operations.
Elsevier Journal of Simulation Modelling Practice and Theory,
Jun. 2009.
[bibtex-entry]
-
T. Hoefler,
T. Schneider,
and A. Lumsdaine.
The Effect of Network Noise on Large-Scale Collective Communications.
Parallel Processing Letters (PPL),
Dec. 2009.
Note: Accepted for publication (in print).
[bibtex-entry]
-
T. Hoefler,
P. Gottschling,
A. Lumsdaine,
and W. Rehm.
Optimizing a Conjugate Gradient Solver with Non-Blocking Collective Operations.
Elsevier Journal of Parallel Computing (PARCO),
33(9):624-633,
9 2007.
[bibtex-entry]
-
T. Hoefler,
R. Janisch,
and W. Rehm.
A Performance Analysis of ABINIT on a Cluster System,
pages 37-51.
Springer, Lecture Notes in Computational Science and Engineering,
12 2005.
[bibtex-entry]
-
T. Hoefler,
R. Janisch,
and W. Rehm.
Improving the parallel scaling of ABINIT,
pages 551-559.
CINECA Consorzio Interuniversitario,
12 2005.
[bibtex-entry]
-
T. Hoefler,
A. Lumsdaine,
and J. Dongarra.
Towards Efficient MapReduce Using MPI.
In Recent Advances in Parallel Virtual Machine and Message Passing Interface, 16th European PVM/MPI Users' Group Meeting,
Sep. 2009.
Springer.
[bibtex-entry]
-
T. Hoefler,
T. Schneider,
and A. Lumsdaine.
Optimized Routing for Large-Scale InfiniBand Networks.
In 17th Annual IEEE Symposium on High Performance Interconnects (HOTI 2009),
Aug. 2009.
[bibtex-entry]
-
T. Hoefler,
T. Schneider,
and A. Lumsdaine.
A Power-Aware, Application-Based, Performance Study Of Modern Commodity Cluster Interconnection Networks.
In Proceedings of the 23rd IEEE International Parallel & Distributed Processing Symposium (IPDPS), CAC Workshop,
May 2009.
[bibtex-entry]
-
T. Hoefler,
T. Schneider,
and A. Lumsdaine.
The Impact of Network Noise at Large-Scale Communication Performance.
In Proceedings of the 23rd IEEE International Parallel & Distributed Processing Symposium (IPDPS), LSPP Workshop,
May 2009.
[bibtex-entry]
-
T. Hoefler,
C. Siebert,
and A. Lumsdaine.
Group Operation Assembly Language - A Flexible Way to Express Collective Communication.
In Proceedings of the 38th International Conference on Parallel Processing (ICPP-2009),
Sep. 2009.
[bibtex-entry]
-
T. Hoefler and J. L. Traeff.
Sparse Collective Operations for MPI.
In Proceedings of the 23rd IEEE International Parallel & Distributed Processing Symposium (IPDPS), HIPC Workshop,
May 2009.
[bibtex-entry]
-
C. Kaiser,
T. Hoefler,
B. Bierbaum,
and T. Bemmerl.
Implementation and Analysis of Nonblocking Collective Operations on SCI Networks.
In Proceedings of the 23rd IEEE International Parallel & Distributed Processing Symposium (IPDPS), CAC Workshop,
May 2009.
[bibtex-entry]
-
Prabhanjan Kambadur,
Torsten Hoefler,
Anshul Gupta,
and Andrew Lumsdaine.
Demand-driven Execution of Static Direct Acyclic Graphs Using Task Parallelism.
In International Conference on High Performance Computing (HiPC),
Kochi, India,
December 2009.
[bibtex-entry]
-
J. Mueller,
T. Schneider,
J. Domke,
R. Geyer,
M. Haesing,
T. Hoefler,
S. Hoehlig,
G. Juckeland,
A. Lumsdaine,
M. Mueller,
and W. Nagel.
Cluster Challenge 2008: Optimizing Cluster Configuration and Applications to Maximize Power Efficiency.
Mar. 2009.
Note: 2nd best paper at the 10th LCI International Conference on High-Performance Clustered Computing.
[bibtex-entry]
-
P. Geoffray and T. Hoefler.
Adaptive Routing Strategies for Modern High Performance Networks.
In 16th Annual IEEE Symposium on High Performance Interconnects (HOTI 2008),
pages 165-172,
Aug. 2008.
IEEE Computer Society.
[bibtex-entry]
-
T. Hoefler,
P. Gottschling,
and A. Lumsdaine.
Leveraging Non-blocking Collective Communication in High-performance Applications.
In SPAA'08, Proceedings of the Twentieth Annual Symposium on Parallelism in Algorithms and Architectures,
pages 113-115,
06 2008.
Association for Computing Machinery (ACM).
[bibtex-entry]
-
T. Hoefler,
F. Lorenzen,
and A. Lumsdaine.
Sparse Non-Blocking Collectives in Quantum Mechanical Calculations.
In Recent Advances in Parallel Virtual Machine and Message Passing Interface, 15th European PVM/MPI Users' Group Meeting,
volume LNCS 5205,
pages 55-63,
Sep. 2008.
Springer.
[bibtex-entry]
-
T. Hoefler and A. Lumsdaine.
Optimizing non-blocking Collective Operations for InfiniBand.
In Proceedings of the 22nd IEEE International Parallel & Distributed Processing Symposium (IPDPS),
04 2008.
[bibtex-entry]
-
T. Hoefler and A. Lumsdaine.
Message Progression in Parallel Computing - To Thread or not to Thread?
In Proceedings of the 2008 IEEE International Conference on Cluster Computing,
Oct. 2008.
IEEE Computer Society.
[bibtex-entry]
-
T. Hoefler and A. Lumsdaine.
Overlapping Communication and Computation with High Level Communication Routines.
In Proceedings of the 8th IEEE Symposium on Cluster Computing and the Grid (CCGrid 2008),
05 2008.
[bibtex-entry]
-
T. Hoefler,
M. Schellmann,
S. Gorlatch,
and A. Lumsdaine.
Communication Optimization for Medical Image Reconstruction Algorithms.
In Recent Advances in Parallel Virtual Machine and Message Passing Interface, 15th European PVM/MPI Users' Group Meeting,
volume LNCS 5205,
pages 75-83,
Sep. 2008.
Springer.
[bibtex-entry]
-
T. Hoefler,
T. Schneider,
and A. Lumsdaine.
Multistage Switches are not Crossbars: Effects of Static Routing in High-Performance Networks.
In Proceedings of the 2008 IEEE International Conference on Cluster Computing,
Oct. 2008.
IEEE Computer Society.
[bibtex-entry]
-
T. Hoefler,
T. Schneider,
and A. Lumsdaine.
Accurately Measuring Collective Operations at Massive Scale.
In Proceedings of the 22nd IEEE International Parallel & Distributed Processing Symposium (IPDPS),
04 2008.
[bibtex-entry]
-
T. Schneider,
T. Hoefler,
S. Wunderlich,
T. Mehlan,
and W. Rehm.
An optimized ZGEMM implementation for the Cell BE.
In Proceedings of the 9th Workshop on Parallel Systems and Algorithms (PASA),
02 2008.
[bibtex-entry]
-
A. Friedley,
T. Hoefler,
M. Leininger,
and A. Lumsdaine.
Scalable High Performance Message Passing over InfiniBand for Open MPI.
In Proceedings of 3rd KiCC Workshop 2007,
12 2007.
RWTH Aachen.
[bibtex-entry]
-
Torsten Hoefler,
Prabhanjan Kambadur,
Richard L Graham,
Galen Shipman,
and Andrew Lumsdaine.
A Case For Standard Non-blocking Collective Operations.
In Proceedings of the 14th European PVM/MPI Users' Group Meeting,
LNCS,
Paris, France,
pages 125-134,
2007.
[bibtex-entry]
-
T. Hoefler,
A. Lichei,
and W. Rehm.
Low-Overhead LogGP Parameter Assessment for Modern Interconnection Networks.
In Proceedings of the 21st IEEE International Parallel & Distributed Processing Symposium,
03 2007.
IEEE Computer Society.
[bibtex-entry]
-
T. Hoefler,
C. Siebert,
and W. Rehm.
A practically constant-time MPI Broadcast Algorithm for large-scale InfiniBand Clusters with Multicast.
In Proceedings of the 21st IEEE International Parallel & Distributed Processing Symposium,
page 232,
03 2007.
IEEE Computer Society.
[bibtex-entry]
-
Torsten Hoefler,
Peter Gottschling,
Wolfgang Rehm,
and Andrew Lumsdaine.
Optimizing a Conjugate Gradient Solver with Non-Blocking Collective Operations.
In ParSim 2006 Workshop,
September 2006.
[bibtex-entry]
-
T. Hoefler,
P. Gottschling,
W. Rehm,
and A. Lumsdaine.
Optimizing a Conjugate Gradient Solver with Non-Blocking Collective Operations.
In Recent Advances in Parallel Virtual Machine and Message Passing Interface, 13th European PVM/MPI Users' Group Meeting, Proceedings, LNCS 4192,
pages 374-382,
9 2006.
Springer.
[bibtex-entry]
-
T. Hoefler,
R. Janisch,
and W. Rehm.
Parallel scaling of Teter's minimization for Ab Initio calculations.
11 2006.
Note: Presented at the workshop HPC Nano in conjunction with SC'06.
[bibtex-entry]
-
T. Hoefler,
T. Mehlan,
F. Mietke,
and W. Rehm.
Adding Low-Cost Hardware Barrier Support to Small Commodity Clusters.
In Proceedings of 19th International Conference on Architecture and Computing Systems - ARCS'06,
pages 343-350,
3 2006.
[bibtex-entry]
-
T. Hoefler,
T. Mehlan,
F. Mietke,
and W. Rehm.
LogfP - A Model for small Messages in InfiniBand.
In Proceedings of the 20th IEEE International Parallel & Distributed Processing Symposium (IPDPS),
4 2006.
[bibtex-entry]
-
T. Hoefler,
T. Mehlan,
F. Mietke,
and W. Rehm.
Fast Barrier Synchronization for InfiniBand.
In Proceedings of the 20th IEEE International Parallel & Distributed Processing Symposium (IPDPS),
4 2006.
[bibtex-entry]
-
T. Hoefler,
J. Squyres,
G. Fagg,
G. Bosilca,
W. Rehm,
and A. Lumsdaine.
A New Approach to MPI Collective Communication Implementations.
09 2006.
Note: Accepted for Publication at the 6th Austrian-Hungarian Workshop on Distributed and Parallel Systems.
[bibtex-entry]
-
T. Hoefler,
J. Squyres,
W. Rehm,
and A. Lumsdaine.
A Case for Non-Blocking Collective Operations.
In Frontiers of High Performance Computing and Networking - ISPA 2006 Workshops,
volume 4331/2006,
pages 155-164,
12 2006.
Springer Berlin / Heidelberg.
[bibtex-entry]
-
T. Hoefler,
C. Viertel,
T. Mehlan,
F. Mietke,
and W. Rehm.
Assessing Single-Message and Multi-Node Communication Performance of InfiniBand.
In Proceedings of IEEE International Conference on Parallel Computing in Electrical Engineering, PARELEC 2006,
pages 227-232,
9 2006.
IEEE Computer Society.
[bibtex-entry]
-
T. Mehlan,
J. Strunk,
T. Hoefler,
F. Mietke,
and W. Rehm.
IRS - A portable Interface for Reconfigurable Systems.
In Proceedings of IEEE International Conference on Parallel Computing in Electrical Engineering, PARELEC 2006,
pages 187-191,
9 2006.
IEEE Computer Society.
[bibtex-entry]
-
F. Mietke,
R. Baumgartl,
R. Rex,
T. Mehlan,
T. Hoefler,
and W. Rehm.
Analysis of the Memory Registration Process in the Mellanox InfiniBand Software Stack.
In Euro-Par 2006 Parallel Processing,
pages 124-133,
8 2006.
Springer-Verlag Berlin.
[bibtex-entry]
-
T. Hoefler,
L. Cerquetti,
T. Mehlan,
F. Mietke,
and W. Rehm.
A practical approach to the rating of barrier algorithms using the LogP model and Open-MPI.
In Proceedings of the 2005 International Conference on Parallel Processing Workshops,
pages 562-569,
06 2005.
[bibtex-entry]
-
T. Hoefler and W. Rehm.
A Communication Model for Small Messages with InfiniBand.
In PARS Mitteilungen,
pages 32-41,
06 2005.
PARS.
Note: (Awarded with the PARS Junior Researcher Prize).
[bibtex-entry]
-
Torsten Hoefler,
Jeffrey M. Squyres,
Torsten Mehlan,
Frank Mietke,
and Wolfgang Rehm.
Implementing a Hardware-Based Barrier in Open MPI.
In Proceedings of KiCC'05, Chemnitzer Informatik Berichte,
November 2005.
[bibtex-entry]
-
T. Hoefler.
The Cell Processor.
In 22C3 Proceedings,
pages 286-292,
12 2005.
[bibtex-entry]
-
T. Hoefler.
Remote Network Analysis.
In 21C3 Proceedings,
pages 33-37,
12 2004.
[bibtex-entry]
-
D. Gregor,
T. Hoefler,
B. Barrett,
and A. Lumsdaine.
Fixing Probe for Multi-Threaded MPI Applications.
Technical report 674,
Indiana University,
Jan. 2009.
[bibtex-entry]
-
T. Schneider,
T. Hoefler,
and A. Lumsdaine.
ORCS: An Oblivious Routing Congestion Simulator.
Technical report 675,
Indiana University,
Feb. 2009.
[bibtex-entry]
-
T. Hoefler and A. Lumsdaine.
Design, Implementation, and Usage of LibNBC.
Technical report,
Open Systems Lab, Indiana University,
08 2006.
[bibtex-entry]
-
T. Hoefler,
M. Reinhardt,
F. Mietke,
T. Mehlan,
and W. Rehm.
Low Overhead Ethernet Communication for Open MPI on Linux Clusters.
Technical report 06,
TU Chemnitz,
7 2006.
[bibtex-entry]
-
T. Hoefler,
J. Squyres,
G. Bosilca,
G. Fagg,
A. Lumsdaine,
and W. Rehm.
Non-Blocking Collective Operations for MPI-2.
Technical report,
Open Systems Lab, Indiana University,
08 2006.
[bibtex-entry]
-
T. Hoefler and G. Zerah.
Optimization of a parallel 3d-FFT with non-blocking collective operations.
Invited presentation at the 3rd International ABINIT Developer Workshop, Liege, Belgium,
01 2007.
[bibtex-entry]
-
T. Hoefler.
Application Optimization with non-blocking Collectives.
Presentation at parallel applications group at the Commissariat a l'Energie Atomique - Direction des applications militaires (CEA-DAM), Bruyeres-le-chatel, France,
01 2007.
[bibtex-entry]
-
T. Hoefler.
Non-Blocking Collectives for MPI-2.
Presentation at parallel systems group at the Commissariat a l'Energie Atomique - Direction des applications militaires (CEA-DAM), Bruyeres-le-chatel, France,
01 2007.
[bibtex-entry]
-
T. Hoefler.
Evaluation of publicly available Barrier-Algorithms and Improvement of the Barrier-Operation for large-scale Cluster-Systems with special Attention on InfiniBand Networks,
Apr. 2005.
[bibtex-entry]
Disclaimer:
This material is presented to ensure timely dissemination of
scholarly and technical work. Copyright and all rights therein
are retained by authors or by other copyright holders.
All persons copying this information are expected to adhere to
the terms and constraints invoked by each author's copyright.
In most cases, these works may not be reposted
without the explicit permission of the copyright holder.
Last modified: Fri Nov 20 16:50:24 2009
Author: dikim.
This document was translated from BibTeX by
bibtex2html