Operating-System Support for Efficient Fine-Grained Concurrency in Applications

2020 
The steadily advancing trend towards multi- and manycore computing architectures poses enormous challenges for developers of application software. To make efficient use of the raw parallelism provided by the hardware, programs must explicitly cater for it. The classic programming model of a multithreaded application process, which consists of a number of control flows (threads) managed and scheduled by the operating-system kernel within a shared address space, is being increasingly stretched to its limits: on the one hand, creating threads and switching between them is not sufficiently lightweight; on the other hand, structuring a parallel application around threads is often cumbersome and puts needless obstacles in the programmer’s way. A suitable alternative to multithreaded programming is the use of a so-called concurrency platform that supports developers in expressing applications as a collection of fine-grained concurrent activities. Concurrency platforms come with a runtime system that is responsible for dispatching the lightweight work packages to the available computing resources. Such runtime systems generally build upon the abstractions provided by an underlying commodity operating system such as Linux – that is, upon threads as abstractions of processor cores. This construction results in a number of disadvantages: for instance, the operating system’s scheduler acts without consulting the runtime system, thus making decisions that are potentially unfavourable from the application’s point of view; the coexistence of multiple parallel application processes causes problematic reciprocal interference; blocking system calls cause a temporary loss of parallelism. This thesis presents AtroPOS, the design of an atrophied parallel operating system that is specially geared towards supporting concurrency platforms on manycore systems.
AtroPOS is a derivative of the OctoPOS operating system that has undergone comprehensive further development; it rests on the paradigm of invasive computing and adopts its fundamental concepts: resource-aware programming, exclusive allocation of processor cores to applications, tailoring and dynamic reconfigurability. The operating-system kernel provides a boiled-down set of essential low-level abstractions on top of which arbitrary runtime libraries can be placed. InvRT, the invasive runtime system that supports the execution of invasive-computing applications, was developed as a reference runtime library. By default, AtroPOS makes the existing physical processor cores directly available to the application; their virtualisation is strictly optional and there is no notion of threads. The scheduling of user control flows is carried out purely on the user level by the runtime system without involving the operating-system kernel; this allows for the efficient handling of even very fine-grained concurrency within the application. System calls that may block within the kernel have asynchronous invocation semantics and return immediately upon blocking, so that loss of parallelism during the waiting time is ruled out by design. Notification of completed system operations is carried out by means of a generic mechanism that passes user-defined data structures upward to the application and can be used by the runtime system to construct arbitrary synchronisation data structures such as futures. The same versatile mechanism is harnessed on tiled computing systems to allow parts of a distributed application to communicate with one another. In addition, AtroPOS offers configurable vertical isolation: the strict separation of the operating-system kernel from the application can be enabled and disabled in a coarse- and fine-grained manner, and both statically and dynamically. This lets type-safe applications issue system calls as ordinary function calls and thus lower their direct and indirect costs.
The aforementioned concepts were implemented in the AtroPOS kernel and the InvRT runtime system in the context of this thesis; they were evaluated with the aid of micro-benchmarks and various application suites. Moreover, the runtime library of the parallel programming language Cilk Plus – an extension of C/C++ – was ported to the AtroPOS interface in order to showcase the versatility of the approach.