MSR 2021
Mon 17 - Wed 19 May 2021
co-located with ICSE 2021

This program is tentative and subject to change.

Tue 18 May 2021 10:16 - 10:25 at MSR Room 2 - ML and Deep Learning

Code completion is one of the main features of modern Integrated Development Environments (IDEs). Its objective is to speed up code writing by predicting the next code token(s) the developer is likely to write. Research in this area has substantially improved the predictive performance of these techniques. However, the support offered to developers is still limited to the prediction of the next few tokens to type. In this work, we take a step further in this direction by presenting a large-scale empirical study aimed at exploring the capabilities of state-of-the-art deep learning (DL) models in supporting code completion at different granularity levels, including single tokens, one or multiple entire statements, up to entire code blocks (e.g., the iterated block of a for loop). To this end, we train and test several adapted variants of the recently proposed RoBERTa model, and evaluate their predictions from several perspectives, including: (i) metrics usually adopted when assessing DL generative models (i.e., BLEU score and Levenshtein distance); (ii) the percentage of perfect predictions (i.e., predicted code snippets that match those written by developers); and (iii) the “semantic” equivalence of the generated code compared to the code written by developers. The achieved results show that BERT models represent a viable solution for code completion, with perfect predictions ranging from ~7%, obtained when asking the model to guess entire blocks, up to ~58%, reached in the simpler scenario of a few tokens masked within the same code statement.
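The masked-prediction setup described in the abstract can be illustrated with a minimal sketch. This is not the authors' code or their trained checkpoints: it assumes HuggingFace's fill-mask pipeline with the public microsoft/codebert-base-mlm checkpoint (a RoBERTa-style model pre-trained on code), and the code snippet and target token are invented for illustration. The sketch masks a single token in a Python statement, asks the model to fill it, and computes two of the paper's measures, the perfect-prediction check and the Levenshtein distance; BLEU could be computed analogously, e.g., with nltk.translate.bleu_score.

    # Illustrative sketch only -- not the authors' models or evaluation code.
    from transformers import pipeline

    # Assumption: any RoBERTa-style checkpoint with a masked-language-model
    # head behaves the same way under the fill-mask pipeline.
    fill_mask = pipeline("fill-mask", model="microsoft/codebert-base-mlm")

    # Hypothetical example: hide one token of a statement and ask the model
    # to recover what the developer actually wrote.
    masked = "for i in range(len(items)): print(items[<mask>])"
    target = "i"  # the token the developer wrote

    prediction = fill_mask(masked)[0]["token_str"].strip()  # top-1 guess

    def levenshtein(a: str, b: str) -> int:
        """Edit distance between two strings (iterative dynamic programming)."""
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            curr = [i]
            for j, cb in enumerate(b, 1):
                curr.append(min(prev[j] + 1,                # deletion
                                curr[j - 1] + 1,            # insertion
                                prev[j - 1] + (ca != cb)))  # substitution
            prev = curr
        return prev[-1]

    print("prediction:          ", prediction)
    print("perfect prediction:  ", prediction == target)
    print("Levenshtein distance:", levenshtein(prediction, target))

The multi-token scenarios studied in the paper (entire statements and blocks) replace the single <mask> with a span of mask tokens and compare the whole predicted sequence against the developer-written code.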


Tue 18 May
Times are displayed in time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

10:00 - 10:50: ML and Deep Learning (MSR Room 2)
10:00
8m
Talk
Fast and Memory-Efficient Neural Code Completion
Technical Papers
Alexey Svyatkovskiy (Microsoft), Sebastian Lee (University of Oxford), Anna Hadjitofi (Alan Turing Institute), Maik Riechert (Microsoft Research), Juliana Vicente Franco (Microsoft Research), Miltiadis Allamanis (Microsoft Research, UK)
Pre-print
10:08
8m
Research paper
Comparative Study of Feature Reduction Techniques in Software Change Prediction
Technical Papers
Ruchika Malhotra (Delhi Technological University), Ritvik Kapoor (Delhi Technological University), Deepti Aggarwal (Delhi Technological University), Priya Garg (Delhi Technological University)
Pre-print
10:16
8m
Talk
An Empirical Study on the Usage of BERT Models for Code Completion
Technical Papers
Matteo Ciniselli (Università della Svizzera Italiana), Nathan Cooper (William & Mary), Luca Pascarella (Università della Svizzera italiana), Denys Poshyvanyk (College of William & Mary), Massimiliano Di Penta (University of Sannio, Italy), Gabriele Bavota (Software Institute, USI Università della Svizzera italiana)
Pre-print
10:25
8m
Talk
ManyTypes4Py: A benchmark Python dataset for machine learning-based type inference
Data Showcase
Amir Mir (Delft University of Technology), Evaldas Latoskinas (Delft University of Technology), Georgios Gousios (Facebook & Delft University of Technology)
10:33
8m
Talk
KGTorrent: A Dataset of Python Jupyter Notebooks from Kaggle
Data Showcase
Luigi Quaranta (University of Bari, Italy), Fabio Calefato (University of Bari), Filippo Lanubile (University of Bari)
10:41
8m
Talk
Exploring the relationship between performance metrics and cost saving potential of defect prediction models
Registered Reports
Steffen Herbold (University of Göttingen)
Pre-print