ANR-NRF ITrans project (2017-2021)
Overview
Large, real-world software must continually change to keep up with evolving requirements, fix bugs, and improve performance, maintainability, and security. This rate of change can pose difficulties for clients, whose code cannot always evolve at the same rate. This project will target the problems of forward porting, where one software component has to catch up to a code base with which it needs to interact, and back porting, where a more modern component must be used in a context that still relies on a legacy code base. To understand and illustrate both problems, we will focus on infrastructure software, i.e., software such as operating systems and language runtimes that underlies all computing. As our main motivating example, we will take the Linux kernel, which supports computing environments ranging from embedded systems to clouds and supercomputers. The Linux kernel evolves quickly and thus raises real challenges for users who need to use code designed for one version in an earlier or later one.
Prior work on code porting has taken a recommendation-based approach: by observing changes between the original version and the target version, it recommends a series of method calls to replace the existing implementation of a functionality. Such approaches, however, only half address the problem: they do not help the user construct the other computations, such as tests, data structure manipulations, etc., that are essential to obtain working code. In this project, we will instead realize a history-guided, source-code transformation-based approach, which automatically traverses the history of the changes made to a software system to find where changes in the code to be ported are required, gathers examples of the required changes, and generates change rules to incrementally back port or forward port the code.
We will build on existing work on automatic inference of change rules, improving its genericity and scalability, to enable comprehensively inferring and automating all of the changes required to back or forward port code between versions. Our approach will be a success if it is able to automatically back and forward port a large number of drivers for the Linux operating system to various earlier and later versions of the Linux kernel with high accuracy while requiring minimal developer effort. This objective is not achievable by existing techniques.
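The history-guided idea described above can be illustrated with a deliberately small sketch. It is not the project's tooling: the history format, the function names (old_alloc, mid_alloc, new_alloc, old_lock, new_lock), and the restriction to single-identifier renames are all assumptions made for illustration. The sketch walks a toy history version by version, infers one rename rule from the example changes recorded at each step, and applies that rule only where the code being ported actually uses the affected name. Reversing the walk gives a back port.

```python
import re
from collections import Counter

# Hypothetical history: (version, [(line before the change, line after), ...]).
HISTORY = [
    ("v2", [("x = old_alloc(n);", "x = mid_alloc(n);"),
            ("buf = old_alloc(len);", "buf = mid_alloc(len);")]),
    ("v3", [("x = mid_alloc(n);", "x = new_alloc(n);")]),
    ("v4", [("old_lock(&m);", "new_lock(&m);")]),
]

TOKEN = re.compile(r"[A-Za-z_]\w*|\S")


def infer_rename(examples):
    """Return the (old, new) rename most examples agree on, or None.

    Only examples whose token sequences differ in exactly one position
    are counted; anything more complex is ignored in this sketch.
    """
    votes = Counter()
    for old_line, new_line in examples:
        old_toks = TOKEN.findall(old_line)
        new_toks = TOKEN.findall(new_line)
        if len(old_toks) != len(new_toks):
            continue
        diffs = [(o, n) for o, n in zip(old_toks, new_toks) if o != n]
        if len(diffs) == 1:
            votes[diffs[0]] += 1
    return votes.most_common(1)[0][0] if votes else None


def port(code, history, backward=False):
    """Apply the inferred rules one version at a time.

    Forward porting walks the history oldest first; back porting walks it
    newest first and applies each rule in the opposite direction.
    """
    steps = reversed(history) if backward else history
    for _version, examples in steps:
        rule = infer_rename(examples)
        if rule is None:
            continue
        old, new = (rule[1], rule[0]) if backward else rule
        pattern = rf"\b{re.escape(old)}\b"
        if re.search(pattern, code):  # a change is required at this step
            code = re.sub(pattern, new, code)
    return code


driver = "p = old_alloc(8);\nold_lock(&l);\n"
forward = port(driver, HISTORY)
restored = port(forward, HISTORY, backward=True)
```

Real changes between kernel versions go well beyond renames (added arguments, new error handling, changed data structures), which is why the project aims at more generic change rules than this sketch can express.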
This project represents a 3-year collaboration between researchers at Inria (Whisper team) and at the School of Information Systems at Singapore Management University (SMU). The Inria researchers are world leaders in the design of tools for supporting the development of infrastructure software, including Coccinelle, which is regularly used today in Linux kernel development. The SMU researchers are world leaders in software mining techniques and have developed many techniques that analyze program history to automate software tasks.
The success of this project will benefit the software engineering research community, the developer, and the general public. For the software engineering research community, this project will improve the understanding of the kinds of changes that occur between versions in infrastructure software, and potentially motivate the design of new kinds of tools. For the developer, this project will ease and improve the reliability of the common task of porting between versions, freeing up resources for improving the code quality and adding new functionalities. The project will also raise awareness of how code changes impact the ability to back and forward port. For the general public, this project will help ensure that bug fixes for critical infrastructure software code are available immediately, even to users of older versions, reducing vulnerability to attacks. Our approach will also allow users running an older version of infrastructure software to benefit from support for the latest hardware and applications.
The partners on this project are:
Results on automatic transformation rule inference
Stefanus Agus Haryono, Ferdian Thung, David Lo, Lingxiao Jiang, Julia Lawall, Hong Jin Kang, Lucas Serrano, Gilles Muller.
ICSE (Demo track) 2021
Lucas Serrano.
PhD thesis, Sorbonne University, France, 2020
Lucas Serrano, Van-Anh Nguyen, Ferdian Thung, Lingxiao Jiang, David Lo, Julia Lawall, Gilles Muller.
USENIX Annual Technical Conference 2020: 235-248
Stefanus A. Haryono, Ferdian Thung, Hong Jin Kang, Lucas Serrano, Gilles Muller, Julia Lawall, David Lo, Lingxiao Jiang.
ICPC (ERA track) 2020: 401-405
Ferdian Thung, Stefanus A. Haryono, Lucas Serrano, Gilles Muller, Julia Lawall, David Lo, Lingxiao Jiang.
SANER (RENE track) 2020: 602-611
Muhammad Hilmi Asyrofi, Ferdian Thung, David Lo, Lingxiao Jiang.
SANER (Tool demo) 2020: 637-641
Ferdian Thung, Hong Jin Kang, Lingxiao Jiang, David Lo.
ICSME (short paper) 2019: 213-217
Julia Lawall, Derek Palinski, Lukas Gnirke, Gilles Muller.
USENIX Annual Technical Conference 2017: 15-26
Results on semantic patches for Java
Hong Jin Kang, Ferdian Thung, Julia Lawall, Gilles Muller, Lingxiao Jiang, David Lo.
Dagstuhl Artifacts Ser. 5(2): 10:1-10:3 (2019)
Hong Jin Kang, Ferdian Thung, Julia Lawall, Gilles Muller, Lingxiao Jiang, David Lo.
ECOOP 2019: 22:1-22:27
Results on automatic detection of bug-fixing patches
Thong Hoang, Hong Jin Kang, David Lo, Julia Lawall.
ICSE 2020: 518-529
Thong Hoang, Julia Lawall, Yuan Tian, Richard Jayadi Oentaryo, David Lo.
IEEE TSE, 2019 (early access)
Thong Hoang, Julia Lawall, Richard Jayadi Oentaryo, Yuan Tian, David Lo.
ICSE (Companion Volume) 2019: 83-86