Publications

2019

  • Sergei Shudler, Yannick Berens, Alexandru Calotoiu, Torsten Hoefler, Alexandre Strube, Felix Wolf: Engineering Algorithms for Scalability through Continuous Validation of Performance Expectations. IEEE Transactions on Parallel and Distributed Systems, 2019, (accepted). [BibTeX]

  • Rahim Mammadli, Felix Wolf, Ali Jannesari: The Art of Getting Deep Neural Networks in Shape. ACM Transactions on Architecture and Code Optimization (TACO), 15(4):62:1-62:21, January 2019. [PDF] [DOI] [BibTeX]

  • Gabriele Mencagli, Dora B. Heras, Valeria Cardellini, Emiliano Casalicchio, Emmanuel Jeannot, Felix Wolf, Antonio Salis, Claudio Schifanella, Ravi Reddy Manumachu, Laura Ricci, Marco Beccuti, Laura Antonelli, José Daniel Garcia Sanchez, Stephen L. Scott (eds.), Euro-Par 2018: Parallel Processing Workshops, volume 11339 of Lecture Notes in Computer Science, Springer International Publishing, January 2019. [URL] [BibTeX]

2018

  • Sergei Shudler, Jadran Vrabec, Felix Wolf: Understanding the Scalability of Molecular Simulation using Empirical Performance Modeling. In Proc. of the 7th Workshop on Extreme Scale Programming Tools (ESPT), held in conjunction with the Supercomputing Conference (SC18), Dallas, TX, USA. 2018, (to appear). [BibTeX]

  • Philip C. Roth, Kevin Huck, Ganesh Gopalakrishnan, Felix Wolf: Using Deep Learning for Automated Communication Pattern Characterization: Little Steps and Big Challenges. In Proc. of the 5th Workshop on Visual Performance Analysis (VPA), held in conjunction with the Supercomputing Conference (SC18), Dallas, TX, USA. 2018, (to appear). [BibTeX]

  • Michael Burger, Christian Bischof, Alexandru Calotoiu, Felix Wolf, Thomas Wunderer, Johannes Buchmann: Exploring the Performance Envelope of the LLL Algorithm. In Proc. of the 21st IEEE International Conference on Computational Science and Engineering (CSE), Faculty of Automatic Control and Computers, University Politehnica of Bucharest, Romania, IEEE Computer Society, October 2018. [DOI] [BibTeX]

  • Alexandru Calotoiu, Alexander Graf, Torsten Hoefler, Daniel Lorenz, Sebastian Rinke, Felix Wolf: Lightweight Requirements Engineering for Exascale Co-design. In Proc. of the IEEE International Conference on Cluster Computing (CLUSTER), Belfast, UK, pages 1-11, IEEE Computer Society, September 2018. [PDF] [DOI] [BibTeX]

  • Sebastian Rinke, Markus Butz-Ostendorf, Marc-André Hermanns, Mikaël Naveau, Felix Wolf: A Scalable Algorithm for Simulating the Structural Plasticity of the Brain. Journal of Parallel and Distributed Computing, 120:251-266, 2018. [PDF] [DOI] [BibTeX]

  • Aamer Shah, Matthias S. Müller, Felix Wolf: Estimating the impact of external interference on application performance. In Proc. of the 24th Euro-Par Conference, Turin, Italy, volume 11014 of Lecture Notes in Computer Science, pages 46-58, Springer, August 2018. [PDF] [DOI] [BibTeX]

  • Arya Mazaheri, Felix Wolf, Ali Jannesari: Unveiling Thread Communication Bottlenecks Using Hardware-Independent Metrics. In Proc. of the 47th International Conference on Parallel Processing (ICPP), Eugene, OR, USA, pages 1-11. ACM, August 2018. [PDF] [DOI] [BibTeX]

  • Gabriele Mencagli, Dora B. Heras, Valeria Cardellini, Emiliano Casalicchio, Emmanuel Jeannot, Felix Wolf, Antonio Salis, Claudio Schifanella, Ravi Reddy Manumachu, Laura Ricci, Marco Beccuti, Laura Antonelli, José Daniel Garcia Sanchez, Stephen L. Scott (eds.), Euro-Par 2018: Parallel Processing Workshops, volume 11339 of Lecture Notes in Computer Science, Springer International Publishing, August 2018. [BibTeX]

  • Rohit Atre, Zia Ul Huda, Felix Wolf, Ali Jannesari: Dissecting Sequential Programs for Parallelization – An Approach Based on Computational Units. Concurrency and Computation: Practice and Experience, June 2018. [PDF] [DOI] [BibTeX]

  • Sergei Shudler: Scalability Engineering for Parallel Programs Using Empirical Performance Models. PhD thesis, Technische Universität Darmstadt, Darmstadt, Germany, June 2018. [URL] [BibTeX]

  • Suraj Prabhakaran, Marcel Neumann, Felix Wolf: Efficient Fault Tolerance through Dynamic Node Replacement. In Proc. of the 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), Washington, DC, USA, pages 163-172, IEEE, May 2018. [PDF] [DOI] [BibTeX]

  • Sebastian Rinke: A Scalable Parallel Algorithm for the Simulation of Structural Plasticity in the Brain. PhD thesis, Technische Universität Darmstadt, Darmstadt, Germany, May 2018. [URL] [BibTeX]

2017

  • Alexandru Calotoiu: Automatic Empirical Performance Modeling of Parallel Programs. PhD thesis, Technische Universität Darmstadt, Darmstadt, Germany, October 2017. [URL] [BibTeX]

  • Sebastian Rinke, Mikaël Naveau, Felix Wolf, Markus Butz-Ostendorf: The Rewiring Brain: A Computational Approach to Structural Plasticity in the Adult Brain, chapter Critical Periods Emerge from Homeostatic Structural Plasticity in a Full-Scale Model of the Developing Cortical Column. Academic Press, San Diego, pages 177-202, 2017. [BibTeX]

  • Marc-André Hermanns, Markus Geimer, Bernd Mohr, Felix Wolf: Trace-based Detection of Lock Contention in MPI One-Sided Communication. In Tools for High Performance Computing 2016, Proc. of the 10th Parallel Tools Workshop, Stuttgart, Germany, October 2016, pages 97–114, Springer, 2017. [URL] [DOI] [BibTeX]

  • Bernd Mohr, Felix Wolf: The Virtual Institute – High-Productivity Supercomputing Celebrates its 10th Anniversary. Innovatives Supercomputing in Deutschland (inSiDE), 15(2):40–41, 2017. [URL] [BibTeX]

  • Patrick Reisert, Alexandru Calotoiu, Sergei Shudler, Felix Wolf: Following the Blind Seer – Creating Better Performance Models Using Less Information. In Proc. of the 23rd Euro-Par Conference, Santiago de Compostela, Spain, volume 10417 of Lecture Notes in Computer Science, pages 106–118, Springer, August 2017. [PDF] [DOI] [BibTeX]

  • Kashif Ilyas, Alexandru Calotoiu, Felix Wolf: Off-Road Performance Modeling – How to Deal with Segmented Data. In Proc. of the 23rd Euro-Par Conference, Santiago de Compostela, Spain, volume 10417 of Lecture Notes in Computer Science, pages 36–48, Springer, August 2017. [PDF] [DOI] [BibTeX]

  • Rohit Atre, Ali Jannesari, Felix Wolf: Meeting the challenges of parallelizing sequential programs. In Proc. of the 29th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), Washington, DC, USA, pages 363–365, ACM, July 2017. [PDF] [DOI] [BibTeX]

  • Rohit Atre, Zia Ul Huda, Ali Jannesari, Felix Wolf: Dissecting sequential programs for parallelization – an approach based on computational units. In 10th International Symposium on High-Level Parallel Programming and Applications, Valladolid, Spain, pages 1-18, July 2017. [BibTeX]

  • Daniel Lorenz, Christian Feld: Scaling Score-P to the next level. In Proc. of the International Conference of Computational Science Workshops, pages 2180–2189, Elsevier, June 2017. [PDF] [DOI] [BibTeX]

  • Ali Jannesari: A Software Development Methodology for Multicore Systems. Habilitation, Technische Universität Darmstadt, Darmstadt, Germany, June 2017. [URL] [BibTeX]

  • Ali Jannesari, Zia Ul Huda, Rohit Atre, Zhen Li, Felix Wolf: Parallelizing Audio Analysis Applications – A Case Study. In Proc. of the 39th International Conference on Software Engineering, Software Engineering Education and Training Track (ICSE-SEET), pages 57–66, May 2017. [PDF] [URL] [DOI] [BibTeX]

  • Ali Jannesari, Felix Wolf, Walter Tichy (eds.): Special Issue on Software Engineering for Parallel Systems. Journal of Systems and Software, 125:380–448, March 2017. [URL] [DOI] [BibTeX]

  • Sergei Shudler, Alexandru Calotoiu, Torsten Hoefler, Felix Wolf: Isoefficiency in Practice: Configuring and Understanding the Performance of Task-based Applications. In Proc. of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), Austin, TX, USA, pages 131–143, ACM, February 2017. [PDF] [DOI] [BibTeX]

2016

  • Felix Wolf, Christian Bischof, Alexandru Calotoiu, Torsten Hoefler, Christian Iwainsky, Grzegorz Kwasniewski, Bernd Mohr, Sergei Shudler, Alexandre Strube, Andreas Vogel, Gabriel Wittum: Software for Exascale Computing – SPPEXA 2013-2015, chapter Automatic Performance Modeling of HPC Applications. Springer International Publishing, pages 445–465, September 2016. [DOI] [BibTeX]

  • Andreas Vogel, Alexandru Calotoiu, Arne Nägel, Sebastian Reiter, Alexandre Strube, Gabriel Wittum, Felix Wolf: Software for Exascale Computing – SPPEXA 2013-2015, chapter Automated Performance Modeling of the UG4 Simulation Framework. Springer International Publishing, pages 467–481, September 2016. [DOI] [BibTeX]

  • Zhen Li: Discovery of Potential Parallelism in Sequential Programs. PhD thesis, Technische Universität Darmstadt, Darmstadt, Germany, November 2016. [URL] [BibTeX]

  • Sebastian Rinke, Markus Butz-Ostendorf, Marc-André Hermanns, Mikaël Naveau, Felix Wolf: A Scalable Algorithm for Simulating the Structural Plasticity of the Brain. In Proc. of the 28th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), Los Angeles, CA, USA, pages 1-8, October 2016. [PDF] [DOI] [BibTeX]

  • Suraj Prabhakaran: Dynamic Resource Management and Job Scheduling for High Performance Computing. PhD thesis, Technische Universität Darmstadt, Darmstadt, Germany, October 2016. [URL] [BibTeX]

  • Alexandru Calotoiu, David Beckingsale, Christopher W. Earl, Torsten Hoefler, Ian Karlin, Martin Schulz, Felix Wolf: Fast Multi-Parameter Performance Modeling. In Proc. of the 2016 IEEE International Conference on Cluster Computing (CLUSTER), Taipei, Taiwan, pages 172-181, IEEE Computer Society, September 2016. [PDF] [DOI] [BibTeX]

  • Zhen Li, Rohit Atre, Zia Ul Huda, Ali Jannesari, Felix Wolf: Unveiling Parallelization Opportunities in Sequential Programs. Journal of Systems and Software, 117:282–295, July 2016. [PDF] [DOI] [BibTeX]

  • David Böhme, Markus Geimer, Lukas Arnold, Felix Voigtländer, Felix Wolf: Identifying the root causes of wait states in large-scale parallel applications. ACM Transactions on Parallel Computing, 3(2):Article No. 11, 24 pages, July 2016. [PDF] [DOI] [BibTeX]

  • Zia Ul Huda, Rohit Atre, Ali Jannesari, Felix Wolf: Automatic Parallel Pattern Detection in the Algorithm Structure Design Space. In Proc. of the 30th IEEE International Parallel and Distributed Processing Symposium (IPDPS), Chicago, USA, pages 43-52, IEEE Computer Society, May 2016. [PDF] [URL] [DOI] [BibTeX]

  • Ali Jannesari, Felix Wolf: Automatic Generation of Unit Tests for Correlated Variables in Parallel Programs. International Journal of Parallel Programming (IJPP), 44(3):644–662, March 2016. [PDF] [URL] [DOI] [BibTeX]

  • Thireshan Jeyakumaran, Ehsan Atoofian, Yang Xiao, Zhen Li, Ali Jannesari: Improving Performance of Transactional Applications through Adaptive Transactional Memory. In Proc. of the 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP), Heraklion, Crete, Greece, pages 192-199, February 2016. [PDF] [DOI] [BibTeX]

  • Monika Harlacher, Alexandru Calotoiu, John Dennis, Felix Wolf: Analysing the Scalability of Climate Codes Using New Features of Scalasca. In Proc. of the John von Neumann Institute for Computing (NIC) Symposium 2016, Juelich, Germany, volume 48 of NIC Series, pages 343-352. Forschungszentrum Jülich, John von Neumann-Institut for Computing, February 2016. [BibTeX]

2015

  • Zhen Li, Rohit Atre, Zia Ul-Huda, Ali Jannesari, Felix Wolf: DiscoPoP: A Profiling Tool to Identify Parallelization Opportunities. In Tools for High Performance Computing 2014, Proc. of the 8th Parallel Tools Workshop, Stuttgart, Germany, October 2014, chapter 3, pages 37-54, Springer International Publishing, 2015. [PDF] [URL] [DOI] [BibTeX]

  • Laura von Rüden, Marc-André Hermanns, Michael Behrisch, Daniel Keim, Bernd Mohr, Felix Wolf: Separating the Wheat from the Chaff: Identifying Relevant and Similar Performance Data with Visual Analytics. In Proc. of the 2nd Workshop on Visual Performance Analysis (VPA), held in conjunction with the Supercomputing Conference (SC15), Austin, TX, USA, pages 4:1–4:8, ACM, 2015. [PDF] [URL] [DOI] [BibTeX]

  • Zhen Li, Bo Zhao, Ali Jannesari, Felix Wolf: Beyond Data Parallelism: Identifying Parallel Tasks in Sequential Programs. In Proc. of 15th International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP), Zhangjiajie, China, volume 9531 of Lecture Notes in Computer Science, pages 569-582, Springer International Publishing, November 2015. [PDF] [DOI] [BibTeX]

  • Zhen Li, Michael Beaumont, Ali Jannesari, Felix Wolf: Fast Data-Dependence Profiling by Skipping Repeatedly Executed Memory Operations. In Proc. of 15th International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP), Zhangjiajie, China, volume 9531 of Lecture Notes in Computer Science, pages 583-596, Springer International Publishing, November 2015. [PDF] [DOI] [BibTeX]

  • Yang Xiao, Zhen Li, Ehsan Atoofian, Ali Jannesari: Automatic Optimization of Software Transactional Memory through Linear Regression and Decision Tree. In Proc. of 15th International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP), Zhangjiajie, China, volume 9531 of Lecture Notes in Computer Science, pages 61-73, Springer International Publishing, November 2015. [PDF] [DOI] [BibTeX]

  • Daniel Lorenz, Sergei Shudler, Felix Wolf: Preventing the explosion of exascale profile data with smart thread-level aggregation. In Proc. of ESPT2015: Workshop on Extreme Scale Programming Tools, held in conjunction with the Supercomputing Conference (SC15), Austin, TX, USA, pages 1–10, ACM, November 2015. [PDF] [DOI] [BibTeX]

  • Ali Jannesari, Siegfried Benkner, Xinghui Zhao, Ehsan Atoofian, Yukinori Sato: Workshop Preview of the 2nd International Workshop on Software for Parallel Systems (SEPS 2015). In Companion Proceedings of the 2015 ACM SIGPLAN International Conference on Systems, Programming, Languages and Applications: Software for Humanity of SPLASH Companion 2015, pages 95–96, New York, NY, USA, ACM, October 2015. [DOI] [BibTeX]

  • Arya Mazaheri, Ali Jannesari, Abdolreza Mirzaei, Felix Wolf: Characterizing Loop-Level Communication Patterns in Shared Memory Applications. In Proc. of the 44th International Conference on Parallel Processing (ICPP), Beijing, China, pages 759-768, September 2015. [PDF] [DOI] [BibTeX]

  • Ali Jannesari: Detection of High-Level Synchronization Anomalies in Parallel Programs. International Journal of Parallel Programming (IJPP), 43(4):656-678, August 2015. [PDF] [DOI] [BibTeX]

  • Andreas Vogel, Alexandru Calotoiu, Alexandre Strube, Sebastian Reiter, Arne Nägel, Felix Wolf, Gabriel Wittum: 10,000 Performance Models per Minute – Scalability of the UG4 Simulation Framework. In Proc. of the 21st Euro-Par Conference, Vienna, Austria, volume 9233 of Lecture Notes in Computer Science, pages 519–531, Springer, August 2015. [PDF] [DOI] [BibTeX]

  • Christian Iwainsky, Sergei Shudler, Alexandru Calotoiu, Alexandre Strube, Michael Knobloch, Christian Bischof, Felix Wolf: How Many Threads will be too Many? On the Scalability of OpenMP Implementations. In Proc. of the 21st Euro-Par Conference, Vienna, Austria, volume 9233 of Lecture Notes in Computer Science, pages 451–463, Springer, August 2015. [PDF] [DOI] [BibTeX]

  • Sergei Shudler, Alexandru Calotoiu, Torsten Hoefler, Alexandre Strube, Felix Wolf: Exascaling Your Library: Will Your Implementation Meet Your Expectations?. In Proc. of the International Conference on Supercomputing (ICS), Newport Beach, CA, USA, pages 165–175, ACM, June 2015. [PDF] [DOI] [BibTeX]

  • Suraj Prabhakaran, Marcel Neumann, Sebastian Rinke, Felix Wolf, Abhishek Gupta, Laxmikant V. Kalé: A Batch System with Efficient Scheduling for Malleable and Evolving Applications. In Proc. of the 29th IEEE International Parallel and Distributed Processing Symposium (IPDPS), Hyderabad, India, pages 429-438, IEEE Computer Society, May 2015. [PDF] [URL] [DOI] [BibTeX]

  • Zhen Li, Ali Jannesari, Felix Wolf: An Efficient Data-Dependence Profiler for Sequential and Parallel Programs. In Proc. of the 29th IEEE International Parallel and Distributed Processing Symposium (IPDPS), Hyderabad, India, pages 484-493, IEEE Computer Society, May 2015. [PDF] [URL] [DOI] [BibTeX]

  • Jochen Schimmel, Korbinian Molitorisz, Ali Jannesari, Walter F. Tichy: Combining Unit Tests for Data Race Detection. In Proc. of 10th IEEE/ACM International Workshop on Automation of Software Test (AST 2015), pages 43-47, IEEE, May 2015. [PDF] [URL] [DOI] [BibTeX]

  • Mohammad Norouzi, Ali Jannesari: Resource and application-aware resource discovery in computing environments. The Journal of Supercomputing, 71(3):824-839, March 2015. [PDF] [URL] [DOI] [BibTeX]

  • Rohit Atre, Ali Jannesari, Felix Wolf: The Basic Building Blocks of Parallel Tasks. In Proc. of the International Workshop on Code Optimisation for Multi and Many Cores, San Francisco, CA, USA, pages 3:1–3:11, ACM, February 2015. [PDF] [URL] [DOI] [BibTeX]

  • Bo Zhao, Zhen Li, Ali Jannesari, Felix Wolf, Weiguo Wu: Dependence-Based Code Transformation for Coarse-Grained Parallelism. In Proc. of the International Workshop on Code Optimisation for Multi and Many Cores, San Francisco, CA, USA, pages 1:1–1:10, ACM, February 2015. [PDF] [DOI] [BibTeX]

  • Zia Ul Huda, Ali Jannesari, Felix Wolf: Using Template Matching to Infer Parallel Design Patterns. ACM Transactions on Architecture and Code Optimization, 11(4):64:1–64:21, January 2015. [PDF] [DOI] [BibTeX]