Dr. Michel Steuwer

Einsteinstr. 62
48149 Münster

External Profile
  • Research Foci

    Algorithmic skeletons

    SkelCL – A Skeleton Library for Heterogeneous Systems

    http://skelcl.uni-muenster.de

    GPU computing

    Parallel computing

  • CV

      Academic Education

      Ph.D. studies in computer science
      Computer science graduate program (Diploma degree)

      Positions

      Research associate at the University of Edinburgh
      Visiting researcher at the University of Edinburgh
      Research associate at the University of Münster
      Visiting researcher at the University of Edinburgh
      Visiting researcher at the University of Edinburgh
      Visiting researcher at the University of Edinburgh
      Student assistant at the University of Münster
  • Publications

    Research Article (Book Contribution)
    • “Skeleton Programming for Portable Many-Core Computing.” In Programming Multicore and Many-core Computing Systems, edited by Sabri Pllana and Fatos Xhafa. John Wiley & Sons.

    Research Article (Book Contribution)
    • “Skeleton Programming for Portable Many-Core Computing.” In Programming Multi-core and Many-core Computing Systems, Parallel and Distributed Computing, edited by Sabri Pllana and Fatos Xhafa. Wiley-Blackwell.

    • “SkelCL - A Portable Multi-GPU Skeleton Library.” Angewandte Mathematik und Informatik, Vol. 04/10 - I. Münster: University of Münster.
  • Doctoral Thesis

    Improving Programmability and Performance Portability on Many-Core Processors

    Supervisor: Prof. Dr. Sergei Gorlatch
    Doctoral Subject: Computer Science
    Doctoral Degree: Dr. rer. nat.
    Awarded by: Department 10 – Mathematics and Computer Science

    Computer processors have changed radically over the last 20 years, with multi- and many-core architectures emerging to address the increasing demand for performance and energy efficiency. Multi-core CPUs and Graphics Processing Units (GPUs) are currently widely programmed with low-level, ad-hoc, and unstructured programming models, like multi-threading or OpenCL/CUDA. Developing functionally correct applications using these approaches is challenging, as they do not shield programmers from complex issues of parallelism, like deadlocks or non-determinism. Developing optimized parallel programs is an even more demanding task, even for experienced programmers. Optimizations are often applied ad-hoc and exploit specific hardware features, making them non-portable.

    In this thesis we address these two challenges of programmability and performance portability for modern parallel processors.

    In the first part of the thesis, we present the SkelCL programming model which addresses the programmability challenge. SkelCL introduces three main high-level features which simplify GPU programming:
    1) parallel container data types simplify the data management in GPU systems;
    2) regular patterns of parallel programming (a.k.a. algorithmic skeletons) simplify the programming by expressing parallel computation in a structured way;
    3) data distributions simplify the programming of multi-GPU systems by automatically managing data across all the GPUs in the system.
    We present a C++ library implementation of our programming model and we demonstrate in an experimental evaluation that SkelCL greatly simplifies GPU programming without sacrificing performance.
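
To illustrate what "expressing parallel computation in a structured way" can look like, here is a minimal sequential sketch in plain C++. It is not the SkelCL API: the names zip_skeleton, reduce_skeleton, and dot_product are hypothetical, and a real skeleton library would run these patterns in parallel on a GPU and manage the data transfers itself. The sketch only shows how a dot product can be composed from two reusable patterns instead of a hand-written loop.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-ins for algorithmic skeletons (NOT the SkelCL API).

// zip: combine two vectors element-wise with f; stops at the shorter input.
template <typename T, typename F>
std::vector<T> zip_skeleton(const std::vector<T>& a,
                            const std::vector<T>& b, F f) {
    std::vector<T> out;
    out.reserve(a.size() < b.size() ? a.size() : b.size());
    for (std::size_t i = 0; i < a.size() && i < b.size(); ++i) {
        out.push_back(f(a[i], b[i]));
    }
    return out;
}

// reduce: fold all elements with a binary operator, starting from identity.
// A parallel version would additionally require op to be associative.
template <typename T, typename F>
T reduce_skeleton(const std::vector<T>& v, T identity, F op) {
    return std::accumulate(v.begin(), v.end(), identity, op);
}

// Dot product expressed as a composition of the two patterns above.
inline float dot_product(const std::vector<float>& a,
                         const std::vector<float>& b) {
    auto products = zip_skeleton(a, b, [](float x, float y) { return x * y; });
    return reduce_skeleton(products, 0.0f,
                           [](float x, float y) { return x + y; });
}
```

The user code only supplies the two small customizing functions (multiply and add); how the patterns are executed is left entirely to the implementations of zip_skeleton and reduce_skeleton.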

    In the second part of the thesis, we present a novel compilation technique which addresses the performance portability challenge. We introduce a novel set of high-level and low-level parallel patterns along with a set of rewrite rules which systematically express high-level algorithmic implementation choices as well as low-level, hardware-specific optimizations. By applying the rewrite rules, pattern-based programs are transformed from a single portable high-level representation into hardware-specific low-level expressions from which efficient OpenCL code is generated. We formally prove the soundness of our approach by showing that the rewrite rules do not change the program semantics. Furthermore, we experimentally confirm that our novel compilation technique can transform a single portable expression into highly efficient code for three different parallel processors, thus providing performance portability.
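
As an illustration of a semantics-preserving rewrite rule, the following plain C++ sketch uses map fusion, a well-known rule stating that mapping g and then f gives the same result as mapping their composition once. Whether this exact rule is part of the rule set in the thesis is an assumption made here for illustration; map_skeleton, unfused, and fused are hypothetical names, and the code runs sequentially rather than generating OpenCL.

```cpp
#include <vector>

// Hypothetical sequential map pattern: applies f to every element.
template <typename T, typename F>
std::vector<T> map_skeleton(const std::vector<T>& in, F f) {
    std::vector<T> out;
    out.reserve(in.size());
    for (const T& x : in) {
        out.push_back(f(x));
    }
    return out;
}

// Left-hand side of the rule: map f (map g xs).
// Two passes over the data and one intermediate vector.
inline std::vector<int> unfused(const std::vector<int>& xs) {
    auto g = [](int x) { return x + 1; };
    auto f = [](int x) { return x * 2; };
    return map_skeleton(map_skeleton(xs, g), f);
}

// Right-hand side of the rule: map (f . g) xs.
// One pass, no intermediate vector, same result for every input.
inline std::vector<int> fused(const std::vector<int>& xs) {
    auto g = [](int x) { return x + 1; };
    auto f = [](int x) { return x * 2; };
    return map_skeleton(xs, [=](int x) { return f(g(x)); });
}
```

Both functions compute the same values, so replacing one form with the other never changes what a program means; it only changes how the work is organized, which is the kind of choice a rewrite-based compiler can explore per target device.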
  • Supervised Theses

    Summer Semester 2014

    • Bachelor Thesis: Evaluation of the Skeleton Library FastFlow
    • Bachelor Thesis: A parallel Implementation of the T-CUP Software with the SkelCL Library

    Winter Semester 2013/14

    • Master Thesis: Development of a Divide & Conquer Skeleton for SkelCL
    • Master Thesis: A GPU-based Classification Framework for HIV-Resistance Prediction
    • Master Thesis: Extending the SkelCL Library with a Skeleton for Stencil Computations
    • Bachelor Thesis: Autotuning of the Work-Group Size of OpenCL Programs

    Summer Semester 2013

    • Master Thesis: A Model for Predicting Work Distribution in Heterogeneous Systems and its Implementation in the SkelCL Library
    • Bachelor Thesis: Implementation of the Needleman-Wunsch Algorithm and Breadth-First Search with the SkelCL Library
    • Bachelor Thesis: Evaluation of the Skeleton Library SkePU

    Winter Semester 2012/13

    • Master Thesis: Extending the Skeleton Library SkelCL with a Skeleton for All-Pairs Computations
    • Bachelor Thesis: Implementing the LU-Decomposition and the Mersenne-Twister with the SkelCL Library
    • Bachelor Thesis: Performance Analysis of SkelCL using B+-Tree Traversal and 3D Jacobi Stencil
    • Diploma Thesis: Simulation and Analysis of Two-dimensional Turbulence on Parallel Computer Architectures

    Summer Semester 2012

    • Diploma Thesis: Extending the SkelCL Library with Multidimensional Data Types

    Summer Semester 2011

    • Bachelor Thesis: Analysis of the Usage of GPUs for Implementing Radix Sort
    • Bachelor Thesis: Extending the SkelCL Library with Iterators
    • Bachelor Thesis: Improving the MapOverlap Skeleton in SkelCL
    • Bachelor Thesis: Development of a Library for Manipulating Source Code of C-based Languages