Illustration from Khaled Al Sabbagh's PhD thesis
Photo: Khaled Al Sabbagh

Improving the Performance of Machine Learning-based Methods for Continuous Integration by Handling Noise

Science & IT

Khaled Al Sabbagh defends his doctoral thesis in Computer Science and Engineering, titled "Improving the Performance of Machine Learning-based Methods for Continuous Integration by Handling Noise".

Doctoral defence
Date
18 Sep 2023
Time
13:00 - 16:00
Location
Room Tesla, Lindholmen Science Park, Lindholmspiren 5, Göteborg

Organizer
Department of Computer Science and Engineering

Abstract:

The availability of large amounts of data in Continuous Integration (CI) systems allows companies to utilize machine learning (ML) methods to optimize CI processes. The predictive performance of these methods can be hindered by noise in code change data. Using design science research and controlled experiments, this thesis examines the impact of noise-handling techniques in CI. Two ML-based methods, MeBoTS and HiTTs, are developed for regression testing. A taxonomy and a class noise-handling approach (DB) are created to reduce class noise. Controlled experiments are conducted to examine the effect of class noise-handling on MeBoTS’ performance. The results show that handling class noise using DB improves test case selection and code change request predictions. Further, code changes related to memory management and complexity should be tested with performance-related tests. The “majority filter” algorithm is the most effective in improving the prediction of build outcomes and code change requests.

This thesis highlights the importance of handling class noise in code change data to improve test case selection, build outcomes, and change request predictions. It also shows that using code-to-test dependencies offers an effective way to perform regression testing. Finally, it shows that software engineers do not necessarily need to remove attribute noise to gain improvements in test selection.
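
To give a rough idea of what the "majority filter" named in the abstract refers to, the following is a minimal sketch of the general majority-filtering idea for class noise: several classifiers are trained with cross-validation, and an instance is flagged as likely mislabeled when more than half of them disagree with its recorded label. This is not the implementation used in the thesis. The three simple learners (1-NN, 3-NN, nearest centroid), the function names, the fold assignment, and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from math import dist


def knn_learner(k):
    """Return a fit function for a k-nearest-neighbour classifier (illustrative)."""
    def fit(X, y):
        def predict(x):
            nearest = sorted(range(len(X)), key=lambda i: dist(X[i], x))[:k]
            return Counter(y[i] for i in nearest).most_common(1)[0][0]
        return predict
    return fit


def centroid_learner(X, y):
    """Fit a nearest-centroid classifier (illustrative)."""
    counts = Counter(y)
    sums = {}
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
    centroids = {c: [s / counts[c] for s in acc] for c, acc in sums.items()}

    def predict(x):
        return min(centroids, key=lambda c: dist(centroids[c], x))
    return predict


def majority_filter(X, y, learners, n_folds=5):
    """Return indices of instances that most learners misclassify.

    Each instance is predicted only by models trained on the other folds.
    """
    noisy = []
    for fold in range(n_folds):
        train_idx = [i for i in range(len(X)) if i % n_folds != fold]
        test_idx = [i for i in range(len(X)) if i % n_folds == fold]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        models = [fit(X_tr, y_tr) for fit in learners]
        for i in test_idx:
            wrong = sum(model(X[i]) != y[i] for model in models)
            if wrong > len(models) / 2:  # strict majority votes "mislabeled"
                noisy.append(i)
    return sorted(noisy)


# Toy data (hypothetical): two well-separated clusters, with the last point
# placed in the class-0 cluster but labeled as class 1.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (0, 2),
     (10, 10), (10, 11), (11, 10), (11, 11), (10, 12),
     (0.5, 0.5)]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]

learners = [knn_learner(1), knn_learner(3), centroid_learner]
noisy = majority_filter(X, y, learners)
print(noisy)
```

On this toy data the sketch flags only the deliberately mislabeled last instance. Requiring a strict majority, rather than a single dissenting classifier, keeps one unstable learner from discarding correctly labeled instances.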

Full-text version of the thesis

Faculty opponent:

Professor Burak Turhan, Faculty of Information Technology and Electrical Engineering, University of Oulu, Finland

Examining committee:

  • Professor Natalia Juristo, Facultad de Informática, Universidad Politécnica de Madrid, Spain
  • Professor Darja Smite, Department of Software Engineering, Blekinge Institute of Technology
  • Senior Lecturer Markus Borg, Department of Computer Science, Lund University