MSR 2021
Mon 17 - Wed 19 May 2021
co-located with ICSE 2021

The proliferation of open source software (OSS) development has created novel and massive software supply chains and ecosystems that require non-traditional approaches to software development. Even the approaches that fared well in the earlier stages of open source development are challenged by the sheer scale of the present open source ecosystem, the complexity of dependencies among projects, and the lack of effective means of establishing the trust essential for frictionless collaboration - a cornerstone of OSS.

World of Code (WoC) is an attempt to cross-reference all OSS projects. It represents over 120M git repositories from GitHub, GitLab, BitBucket, and others, with over 2B git commits (42M authors), 8B blobs, and 8.3B trees. The content of these objects is augmented via cross-referencing (a graph). Examples include all commits that created a specific blob, all repositories in which a specific blob or commit resides, all commits for a specified author ID, and other maps that are impossible to compute without an almost complete set of repositories.
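The cross-reference maps described above can be pictured with a small in-memory sketch. This is only an illustration under stated assumptions: the record layout, the function name `build_maps`, and all sample values are hypothetical, and it does not reflect WoC's actual storage format or tooling.

```python
from collections import defaultdict

# Toy records: (repository, commit id, author id, blobs created by the commit).
# All values are made up for illustration; they are not WoC data.
COMMITS = [
    ("repoA", "c1", "alice", ["b1", "b2"]),
    ("repoB", "c1", "alice", ["b1", "b2"]),  # the same commit also lives in a fork
    ("repoB", "c2", "bob", ["b2", "b3"]),
]


def build_maps(records):
    """Invert flat commit records into the kinds of maps named in the text."""
    blob_to_commits = defaultdict(set)    # blob -> commits that created it
    commit_to_repos = defaultdict(set)    # commit -> repositories containing it
    blob_to_repos = defaultdict(set)      # blob -> repositories containing it
    author_to_commits = defaultdict(set)  # author ID -> commits by that author
    for repo, commit, author, blobs in records:
        commit_to_repos[commit].add(repo)
        author_to_commits[author].add(commit)
        for blob in blobs:
            blob_to_commits[blob].add(commit)
            blob_to_repos[blob].add(repo)
    return blob_to_commits, commit_to_repos, blob_to_repos, author_to_commits


blob_to_commits, commit_to_repos, blob_to_repos, author_to_commits = build_maps(COMMITS)
print(sorted(commit_to_repos["c1"]))
```

The sketch also shows why near-complete coverage matters: if `repoB` were missing from the input, the map for commit `c1` would silently omit one of the repositories where it resides.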

Hackathons are an effective way to explore research and product ideas by teaming up with others on intense, time-limited tasks. We propose an online WoC hackathon to explore problems and solutions in open source software development that either apply at a global scale or require measurement at that scale. Examples include measuring the complete OSS activities of developers, the complete downstream dependencies of a project, or the provenance of a source code file.

The event will provide, in virtual form, the activities typical of an in-person hackathon, such as defining research questions, forming teams, and scoping problems. Organizers will provide advice on the best ways to conduct data processing and improve performance. The hackathon will also provide the opportunity for participants to work with world-class researchers on relevant problems and research questions.

Please join if you are concerned about the continued health of open source software and would like to make a difference! This applies to anyone trying to support industry and educational use of open source, such as assessing risks, effectiveness and spread of tutorials, frameworks, tools, and practices. Questions related to obtaining large representative samples of data for software engineering research are equally welcome.

The descriptions of the projects selected by the PC will be published at the Hackathon track of MSR’2021. Previous WoC hackathons resulted in four publications at MSR’2020.

Any topic related to doing research, building tools, or improving infrastructure that supports global OSS development, helps industry use OSS, or addresses educational or training aspects of work in this giant network is within the scope of the hackathon. For example:

  • Applications that support finding suitable code, people, projects, or bugs and/or model the social and technical networks and their evolution.

  • Applications that increase transparency by making it easier to become a contributor or that help maintainers zero in on the most relevant contributions.

  • Applications that increase understanding of software supply chains and ecosystems: how and why they function and how to manage risks, especially as related to industry use of OSS.

  • Any infrastructure work that does data fusion or data quality improvements, such as leveraging all open source data sources in the WoC resource and beyond.

  • Approaches to better collect data, increase coverage, or encourage outside contributions.

Key Dates

  • Intent to participate (including potential project ideas) will be collected until October 30, 2020. It should be submitted via email to the organizers (audris@utk.edu, jdh@cs.cmu.edu, alexander.nolte@ut.ee).

  • November 14: One-day online training session covering defining research questions, scoping problems, and team formation. People who did not submit an intent by October 30 can still join at this point. From October 30 to November 14, organizers will help participants formulate their ideas and prepare the project pitches presented on November 14.

  • November 15 - December 5: multiple “hacking” days with dedicated hackathon times and checkpoints to share and assess each team’s progress and provide support if necessary.

  • December 5: Team presentations to the PC. The PC will provide feedback to the presenters on the originality of the idea, the potential impact of the proposed solution, and how to communicate the project ideas.

  • January 19: Submission of a description of each team project (up to two pages) to the MSR’2021 Hackathon track.

  • February 22: Notification of acceptance to the MSR Hackathon track, which is published in the MSR proceedings. The PC will judge submissions based on the clarity of the description, the originality of the idea, the potential impact of the proposed solution, and the sophistication of the artifacts produced during the hackathon.

Organizers will provide support in the form of mentors who can help with technical issues.

A dedicated issue tracker on GitHub will also be available to answer questions and resolve issues for participants of the MSR Hackathon. Slack and Zoom are the suggested main means of communication between teams, mentors, and organizers during the hackathon. A dedicated Zoom room will be provided for each team for the entire duration of the event.


Mon 17 May

Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

03:10 - 04:00
Welcome Event Technical Papers / Tutorials / MIP Award / FOSS Award / content / Mining Challenge / Hackathon / MSR Awards / Registered Reports / Data Showcase / Shadow PC / Keynotes at MSR Room 1

The MSR welcoming sessions will feature informal networking opportunities for newcomers to meet each other, learn about the MSR conference series, and interact with some established MSR veterans. All are welcome!

10:00 - 10:50
Resources for MSR Research Technical Papers / Data Showcase at MSR Room 1
Chair(s): Felipe Ebert Eindhoven University of Technology
10:01
3m
Talk
PSIMiner: A Tool for Mining Rich Abstract Syntax Trees from Code
Technical Papers
Egor Spirin JetBrains Research; National Research University Higher School of Economics, Egor Bogomolov JetBrains Research, Vladimir Kovalenko JetBrains Research, Timofey Bryksin JetBrains Research, Saint Petersburg State University
Pre-print
10:04
3m
Talk
Mining DEV for social and technical insights about software development
Technical Papers
Maria Papoutsoglou Aristotle University of Thessaloniki, Johannes Wachs Vienna University of Economics and Business & Complexity Science Hub Vienna, Georgia Kapitsaki University of Cyprus
Pre-print
10:07
3m
Talk
TNM: A Tool for Mining of Socio-Technical Data from Git Repositories
Technical Papers
Nikolai Sviridov ITMO University, Mikhail Evtikhiev JetBrains Research, Vladimir Kovalenko JetBrains Research
Pre-print
10:10
3m
Talk
Identifying Versions of Libraries used in Stack Overflow Code Snippets
Technical Papers
Ahmed Zerouali Vrije Universiteit Brussel, Camilo Velázquez-Rodríguez Vrije Universiteit Brussel, Coen De Roover Vrije Universiteit Brussel
Pre-print Media Attached
10:13
3m
Talk
Sampling Projects in GitHub for MSR Studies
Data Showcase
Ozren Dabic Software Institute, Università della Svizzera italiana (USI), Switzerland, Emad Aghajani Software Institute, USI Università della Svizzera italiana, Gabriele Bavota Software Institute, USI Università della Svizzera italiana
Pre-print
10:16
3m
Talk
gambit – An Open Source Name Disambiguation Tool for Version Control Systems
Technical Papers
Christoph Gote Chair of Systems Design, ETH Zurich, Christian Zingg Chair of Systems Design, ETH Zurich
Pre-print Media Attached
10:19
31m
Live Q&A
Discussions and Q&A
Technical Papers

10:00 - 10:50
Testing and code review Technical Papers / Data Showcase / Registered Reports at MSR Room 2
Chair(s): Jürgen Cito TU Wien and Facebook
10:01
3m
Talk
A Traceability Dataset for Open Source Systems
Data Showcase
Mouna Hammoudi Johannes Kepler University Linz, Christoph Mayr-Dorn Johannes Kepler University, Linz, Atif Mashkoor Johannes Kepler University Linz, Alexander Egyed Johannes Kepler University
Media Attached
10:04
4m
Talk
How Java Programmers Test Exceptional Behavior
Technical Papers
Diego Marcilio USI Università della Svizzera italiana, Carlo A. Furia Università della Svizzera italiana (USI)
Pre-print
10:08
4m
Talk
An Exploratory Study of Log Placement Recommendation in an Enterprise System
Technical Papers
Jeanderson Cândido Delft University of Technology, Jan Haesen Adyen N.V., Maurício Aniche Delft University of Technology, Arie van Deursen Delft University of Technology, Netherlands
Pre-print Media Attached
10:12
3m
Talk
Does Code Review Promote Conformance? A Study of OpenStack Patches
Technical Papers
Panyawut Sri-iesaranusorn Nara Institute of Science and Technology, Raula Gaikovina Kula NAIST, Takashi Ishio Nara Institute of Science and Technology
Pre-print
10:15
4m
Talk
A Replication Study on the Usability of Code Vocabulary in Predicting Flaky Tests
Technical Papers
Guillaume Haben University of Luxembourg, Sarra Habchi University of Luxembourg, Luxembourg, Mike Papadakis University of Luxembourg, Luxembourg, Maxime Cordy University of Luxembourg, Luxembourg, Yves Le Traon University of Luxembourg, Luxembourg
Pre-print Media Attached
10:19
3m
Talk
On the Use of Mutation in Injecting Test Order-Dependency
Registered Reports
Sarra Habchi University of Luxembourg, Luxembourg, Maxime Cordy University of Luxembourg, Luxembourg, Mike Papadakis University of Luxembourg, Luxembourg, Yves Le Traon University of Luxembourg, Luxembourg
Pre-print Media Attached
10:22
28m
Live Q&A
Discussions and Q&A
Technical Papers

11:10 - 12:00
Welcome Event Technical Papers / Tutorials / MIP Award / FOSS Award / content / Mining Challenge / Hackathon / MSR Awards / Registered Reports / Data Showcase / Shadow PC / Keynotes at MSR Room 1

The MSR welcoming sessions will feature informal networking opportunities for newcomers to meet each other, learn about the MSR conference series, and interact with some established MSR veterans. All are welcome!

17:00 - 17:50
Mining Challenge Session Mining Challenge / Technical Papers at MSR Room 1
Chair(s): Miltiadis Allamanis Microsoft Research, UK, Rafael-Michael Karampatsis The University of Edinburgh, Charles Sutton Google Research
17:01
2m
Welcome by the Mining Challenge Co-chairs
Mining Challenge
Miltiadis Allamanis Microsoft Research, UK, Rafael-Michael Karampatsis The University of Edinburgh, Charles Sutton Google Research
17:03
3m
Talk
A large-scale study on human-cloned changes for automated program repair
Mining Challenge
Fernanda Madeiral KTH Royal Institute of Technology, Thomas Durieux KTH Royal Institute of Technology, Sweden
Link to publication Pre-print
17:06
3m
Talk
Applying CodeBERT for Automated Program Repair of Java Simple Bugs
Mining Challenge
Ehsan Mashhadi University of Calgary, Hadi Hemmati University of Calgary
Pre-print Media Attached
17:09
3m
Talk
PySStuBs: Characterizing Single-Statement Bugs in Popular Open-Source Python Projects
Mining Challenge
Arthur Veloso Kamienski University of Alberta, Luisa Palechor University of Alberta, Abram Hindle University of Alberta, Cor-Paul Bezemer University of Alberta
Pre-print
17:12
3m
Talk
How Effective is Continuous Integration in Indicating Single-Statement Bugs?
Mining Challenge
Jasmine Latendresse Concordia University, Rabe Abdalkareem Queens University, Kingston, Canada, Diego Costa Concordia University, Canada, Emad Shihab Concordia University
Pre-print
17:15
3m
Talk
Mea culpa: How developers fix their own simple bugs differently from other developers
Mining Challenge
Wenhan Zhu University of Waterloo, Michael W. Godfrey University of Waterloo, Canada
Pre-print
17:18
3m
Talk
On the Distribution of "Simple Stupid Bugs" in Unit Test Files: An Exploratory Study
Mining Challenge
Anthony Peruma Rochester Institute of Technology, Christian D. Newman Rochester Institute of Technology
Pre-print Media Attached
17:21
3m
Talk
On the Rise and Fall of Simple Stupid Bugs: a Life-Cycle Analysis of SStuBs
Mining Challenge
Balázs Mosolygó University of Szeged, Norbert Vándor University of Szeged, Gabor Antal University of Szeged, Peter Hegedus University of Szeged
Pre-print
17:24
3m
Talk
On the Effectiveness of Deep Vulnerability Detectors to Simple Stupid Bug Detection
Mining Challenge
Jiayi Hua Beijing University of Posts and Telecommunications, Haoyu Wang Beijing University of Posts and Telecommunications
Pre-print
17:27
23m
Live Q&A
Discussions and Q&A
Technical Papers

18:10 - 19:00
Keynote: Nicole Forsgren Technical Papers at MSR Room 1

Tue 18 May

02:00 - 02:50
Keynote: Leslie Miley Technical Papers at MSR Room 1
03:10 - 04:00
Technical Debt and Smells Technical Papers / Data Showcase at MSR Room 1
Chair(s): Gema Rodríguez-Pérez University of Waterloo
03:11
4m
Talk
Technical Debt in the Peer-Review Documentation of R Packages: a rOpenSci Case Study
Technical Papers
Zadia Codabux University of Saskatchewan, Melina Vidoni RMIT University, Fatemeh Hendijani Fard University of British Columbia
Pre-print
03:15
3m
Talk
QScored: A Large Dataset of Code Smells and Quality Metrics
Data Showcase
Tushar Sharma Siemens Research, Marouane Kessentini University of Michigan
Pre-print
03:18
3m
Talk
Architecture Smells and Pareto Principle: A Preliminary Empirical Exploration
Technical Papers
Pre-print
03:21
4m
Talk
Self-Admitted Technical Debt in R Packages: An Exploratory Study
Technical Papers
Melina Vidoni RMIT University
Pre-print
03:25
4m
Full-paper
An Empirical Study of Developer Discussions on Low Code Software Development Challenges
Technical Papers
Md Abdullah Al Alamin University of Calgary, Sanjay Malakar Bangladesh University of Engineering and Technology, Gias Uddin University of Calgary, Canada, Sadia Afroz Bangladesh University of Engineering and Technology, Tameem Bin Haider Bangladesh University of Engineering and Technology, Anindya Iqbal Bangladesh University of Engineering and Technology Dhaka, Bangladesh
Pre-print
03:29
31m
Live Q&A
Discussions and Q&A
Technical Papers

03:10 - 04:00
Time series data Data Showcase / Technical Papers at MSR Room 2
Chair(s): Shane McIntosh University of Waterloo
03:11
3m
Talk
AndroCT: Ten Years of App Call Traces in Android
Data Showcase
Wen Li, Xiaoqin Fu Washington State University, Haipeng Cai Washington State University, USA
Pre-print Media Attached
03:14
4m
Talk
Mining Workflows for Anomalous Data Transfers
Technical Papers
Huy Tu North Carolina State University, USA, George Papadimitriou University of Southern California, Mariam Kiran ESnet, LBNL, Cong Wang Renaissance Computing Institute, Anirban Mandal Renaissance Computing Institute, Ewa Deelman University of Southern California, Tim Menzies North Carolina State University, USA
Pre-print
03:18
4m
Talk
Escaping the Time Pit: Pitfalls and Guidelines for Using Time-Based Git Data
Technical Papers
Samuel W. Flint University of Nebraska-Lincoln, Jigyasa Chauhan University of Nebraska-Lincoln, Robert Dyer University of Nebraska-Lincoln
Pre-print Media Attached
03:22
4m
Paper
On the Naturalness and Localness of Software Logs
Technical Papers
Sina Gholamian University of Waterloo, Paul A. S. Ward University of Waterloo
Pre-print
03:26
4m
Talk
How Do Software Developers Use GitHub Actions to Automate Their Workflows?
Technical Papers
Timothy Kinsman University of Adelaide, Mairieli Wessel University of Sao Paulo, Marco Gerosa Northern Arizona University, USA, Christoph Treude University of Adelaide
Pre-print
03:30
30m
Live Q&A
Discussions and Q&A
Technical Papers

10:00 - 10:50
Developer communications Technical Papers / Data Showcase at MSR Room 1
Chair(s): Hourieh Khalajzadeh Monash University, Australia
10:01
3m
Talk
Waiting around or job half-done? Sentiment in self-admitted technical debt
Technical Papers
Gianmarco Fucci University of Sannio, Nathan Cassee Eindhoven University of Technology, Fiorella Zampetti University of Sannio, Italy, Nicole Novielli University of Bari, Alexander Serebrenik Eindhoven University of Technology, Massimiliano Di Penta University of Sannio, Italy
Pre-print Media Attached
10:04
4m
Research paper
Automatically Selecting Follow-up Questions for Deficient Bug Reports
Technical Papers
Mia Mohammad Imran Virginia Commonwealth University, Agnieszka Ciborowska Virginia Commonwealth University, Kostadin Damevski Virginia Commonwealth University
Pre-print
10:08
4m
Talk
Challenges in Developing Desktop Web Apps: a Study of Stack Overflow and GitHub
Technical Papers
Gian Luca Scoccia University of L'Aquila, Patrizio Migliarini DISIM, University of L'Aquila, Marco Autili University of L'Aquila, Italy
Pre-print
10:12
3m
Talk
Search4Code: Code Search Intent Classification Using Weak Supervision
Data Showcase
Nikitha Rao Microsoft Research, Chetan Bansal Microsoft Research, Joe Guan Microsoft
Pre-print
10:15
35m
Live Q&A
Discussions and Q&A
Technical Papers

10:00 - 10:50
ML and Deep Learning Technical Papers / Data Showcase / Registered Reports at MSR Room 2
Chair(s): Hongyu Zhang The University of Newcastle
10:01
4m
Talk
Fast and Memory-Efficient Neural Code Completion
Technical Papers
Alexey Svyatkovskiy Microsoft, Sebastian Lee University of Oxford, Anna Hadjitofi Alan Turing Institute, Maik Riechert Microsoft Research, Juliana Franco Microsoft Research, Miltiadis Allamanis Microsoft Research, UK
Pre-print Media Attached
10:05
4m
Research paper
Comparative Study of Feature Reduction Techniques in Software Change Prediction
Technical Papers
Ruchika Malhotra Delhi Technological University, Ritvik Kapoor Delhi Technological University, Deepti Aggarwal Delhi Technological University, Priya Garg Delhi Technological University
Pre-print
10:09
4m
Talk
An Empirical Study on the Usage of BERT Models for Code Completion
Technical Papers
Matteo Ciniselli Università della Svizzera Italiana, Nathan Cooper William & Mary, Luca Pascarella Delft University of Technology, Denys Poshyvanyk College of William & Mary, Massimiliano Di Penta University of Sannio, Italy, Gabriele Bavota Software Institute, USI Università della Svizzera italiana
Pre-print
10:13
3m
Talk
ManyTypes4Py: A benchmark Python dataset for machine learning-based type inference
Data Showcase
Amir Mir Delft University of Technology, Evaldas Latoskinas Delft University of Technology, Georgios Gousios Facebook & Delft University of Technology
Pre-print
10:16
3m
Talk
KGTorrent: A Dataset of Python Jupyter Notebooks from Kaggle
Data Showcase
Luigi Quaranta University of Bari, Italy, Fabio Calefato University of Bari, Filippo Lanubile University of Bari
10:19
3m
Talk
Exploring the relationship between performance metrics and cost saving potential of defect prediction models
Registered Reports
Steffen Herbold University of Göttingen
Pre-print
10:22
28m
Live Q&A
Discussions and Q&A
Technical Papers

11:10 - 12:00
11:10
50m
Tutorial
PyDriller 1.0 -- Ready to grow together
Tutorials
Alberto Bacchelli University of Zurich, Maurício Aniche Delft University of Technology
Pre-print
17:00 - 17:50
Hackathon Technical Papers / Hackathon at MSR Room 1
Chair(s): Jim Herbsleb Carnegie Mellon University, Audris Mockus The University of Tennessee, Alexander Nolte University of Tartu
17:01
2m
Welcome by the MSR Hackathon Co-Chairs
Hackathon
Jim Herbsleb Carnegie Mellon University, Audris Mockus The University of Tennessee, Alexander Nolte University of Tartu
17:03
3m
Talk
An Exploratory Study of Project Activity Changepoints in Open Source Software Evolution
Hackathon
James Walden Northern Kentucky University, Noah Burgin, Kuljit Kaur Chahal Kaur
17:06
3m
Paper
The Diversity-Innovation Paradox in Open-Source Software
Hackathon
Mengchen Sam Yong Carnegie Mellon University, Pittsburgh, Pennsylvania, United States, Lavinia Francesca Paganini Federal University of Pernambuco, Huilian Sophie Qiu Carnegie Mellon University, Pittsburgh, Pennsylvania, United States, José Bayoán Santiago Calderón University of Virginia, USA
DOI Pre-print
17:09
4m
Talk
The Secret Life of Hackathon Code
Technical Papers
Ahmed Samir Imam Mahmoud University of Tartu, Tapajit Dey Lero - The Irish Software Research Centre and University of Limerick, Alexander Nolte University of Tartu, Audris Mockus The University of Tennessee, Jim Herbsleb Carnegie Mellon University
Pre-print
17:13
3m
Talk
Tracing Vulnerable Code Lineage
Hackathon
David Reid University of Tennessee, Kalvin Eng University of Alberta, Chris Bogart Carnegie Mellon University, Adam Tutko University of Tennessee - Knoxville
Pre-print
17:16
3m
Talk
Building the Collaboration Graph of Open-Source Software Ecosystem
Hackathon
Elena Lyulina JetBrains Research, Mahmoud Jahanshahi
Pre-print
17:19
1m
Talk
The Secret Life of Hackathon Code
Hackathon
Ahmed Samir Imam Mahmoud University of Tartu, Tapajit Dey Lero - The Irish Software Research Centre and University of Limerick
Pre-print
17:20
30m
Live Q&A
Discussions and Q&A
Technical Papers

17:00 - 17:50
Testing Technical Papers / Data Showcase at MSR Room 2
Chair(s): Abram Hindle University of Alberta
17:01
4m
Talk
What Code Is Deliberately Excluded from Test Coverage and Why?
Technical Papers
Pre-print Media Attached
17:05
3m
Talk
AndroR2: A Dataset of Manually-Reproduced Bug Reports for Android apps
Data Showcase
Tyler Wendland University of Minnesota, Jingyang Sun University of British Columbia, Junayed Mahmud George Mason University, S M Hasan Mansur George Mason University, Steven Huang University of British Columbia, Kevin Moran George Mason University, Julia Rubin University of British Columbia, Canada, Mattia Fazzini University of Minnesota
17:08
3m
Talk
Apache Software Foundation Incubator Project Sustainability Dataset
Data Showcase
Likang Yin University of California, Davis, Zhiyuan Zhang University of California, Davis, Qi Xuan Institute of Cyberspace Security, Zhejiang University of Technology, Hangzhou 310023, China, Vladimir Filkov University of California at Davis, USA
17:11
4m
Talk
Leveraging Models to Reduce Test Cases in Software Repositories
Technical Papers
Golnaz Gharachorlu Simon Fraser University, Nick Sumner Simon Fraser University
Pre-print Media Attached
17:15
4m
Talk
Which contributions count? Analysis of attribution in open source
Technical Papers
Jean-Gabriel Young University of Vermont, amanda casari Open Source Programs Office, Google, Katie McLaughlin Open Source Programs Office, Google, Milo Trujillo University of Vermont, Laurent Hébert-Dufresne University of Vermont, James P. Bagrow University of Vermont
Pre-print Media Attached
17:19
4m
Talk
On Improving Deep Learning Trace Analysis with System Call Arguments
Technical Papers
Quentin Fournier Polytechnique Montréal, Daniel Aloise Polytechnique Montréal, Seyed Vahid Azhari Ciena, François Tetreault Ciena
Pre-print
17:23
27m
Live Q&A
Discussions and Q&A
Technical Papers

Wed 19 May

02:00 - 02:50
02:01
4m
Talk
Practitioners' Perceptions of the Goals and Visual Explanations of Defect Prediction Models
Technical Papers
Jirayus Jiarpakdee Monash University, Australia, Kla Tantithamthavorn Monash University, John Grundy Monash University
Pre-print
02:05
3m
Talk
On the Effectiveness of Deep Vulnerability Detectors to Simple Stupid Bug Detection
Mining Challenge
Jiayi Hua Beijing University of Posts and Telecommunications, Haoyu Wang Beijing University of Posts and Telecommunications
Pre-print
02:08
4m
Talk
An Empirical Study of OSS-Fuzz Bugs
Technical Papers
Zhen Yu Ding Motional, Claire Le Goues Carnegie Mellon University
Pre-print
02:12
3m
Talk
Denchmark: A Bug Benchmark of Deep Learning-related Software
Data Showcase
Misoo Kim Sungkyunkwan University, Youngkyoung Kim Sungkyunkwan University, Eunseok Lee Sungkyunkwan University
02:15
4m
Talk
JITLine: A Simpler, Better, Faster, Finer-grained Just-In-Time Defect Prediction
Technical Papers
Chanathip Pornprasit Monash University, Kla Tantithamthavorn Monash University
Pre-print
02:19
31m
Live Q&A
Discussions and Q&A
Technical Papers

02:00 - 02:50
NLP Registered Reports / Technical Papers at MSR Room 2
Chair(s): Chunyang Chen Monash University
02:01
4m
Talk
Automatic Part-of-Speech Tagging for Security Vulnerability Descriptions
Technical Papers
Sofonias Yitagesu Tianjin University, Xiaowang Zhang Tianjin University, Zhiyong Feng Tianjin University, Xiaohong Li TianJin University, Zhenchang Xing Australian National University
Pre-print
02:05
4m
Talk
Attention-based model for predicting question relatedness on Stack Overflow
Technical Papers
Jiayan Pei South China University of Technology, Yimin Wu South China University of Technology, Research Institute of SCUT in Yangjiang, Zishan Qin South China University of Technology, Yao Cong South China University of Technology, Jingtao Guan Research Institute of SCUT in Yangjiang
Pre-print
02:09
4m
Talk
Characterising the Knowledge about Primitive Variables in Java Code Comments
Technical Papers
Mahfouth Alghamdi The University of Adelaide, Shinpei Hayashi Tokyo Institute of Technology, Takashi Kobayashi Tokyo Institute of Technology, Christoph Treude University of Adelaide
Pre-print
02:13
4m
Talk
Googling for Software Development: What Developers Search For and What They Find
Technical Papers
Pre-print Media Attached
02:17
3m
Talk
Evaluating Pre-Trained Models for User Feedback Analysis in Software Engineering: A Study on Classification of App-Reviews
Registered Reports
Mohammad Abdul Hadi University of British Columbia, Fatemeh Hendijani Fard University of British Columbia
Pre-print
02:20
3m
Talk
Cross-status Communication and Project Outcomes in OSS Development–A Language Style Matching Perspective
Registered Reports
Yisi Han Nanjing University, Zhendong Wang University of California, Irvine, Yang Feng State Key Laboratory for Novel Software Technology, Nanjing University, Zhihong Zhao Nanjing Tech University, Yi Wang Beijing University of Posts and Telecommunications
Pre-print
02:23
27m
Live Q&A
Discussions and Q&A
Technical Papers

03:10 - 04:00
03:10
50m
Tutorial
Elasticsearch Full-Text Search Internals
Tutorials
10:00 - 10:50
Datasets Data Showcase / Technical Papers at MSR Room 1
Chair(s): Sridhar Chimalakonda Indian Institute of Technology Tirupati
10:01
3m
Talk
AndroidCompass: A Dataset of Android Compatibility Checks in Code Repositories
Data Showcase
Sebastian Nielebock Otto-von-Guericke University Magdeburg, Germany, Paul Blockhaus Otto-von-Guericke-University Magdeburg, Germany, Jacob Krüger Otto von Guericke University Magdeburg, Frank Ortmeier Otto-von-Guericke-University Magdeburg, Faculty of Computer Science, Chair of Software Engineering
Pre-print Media Attached
10:04
3m
Talk
GE526: A Dataset of Open Source Game Engines
Data Showcase
Dheeraj Vagavolu Indian Institute of Technology Tirupati, Vartika Agrahari Indian Institute of Technology Tirupati, Sridhar Chimalakonda Indian Institute of Technology Tirupati, Akhila Sri Manasa Venigalla IIT Tirupati, India
10:07
3m
Talk
Andromeda: A Dataset of Ansible Galaxy Roles and Their Evolution
Data Showcase
Ruben Opdebeeck Vrije Universiteit Brussel, Ahmed Zerouali Vrije Universiteit Brussel, Coen De Roover Vrije Universiteit Brussel
10:10
3m
Talk
The Wonderless Dataset for Serverless Computing
Data Showcase
Nafise Eskandani TU Darmstadt, Guido Salvaneschi University of St. Gallen
Pre-print
10:13
3m
Talk
DUETS: A Dataset of Reproducible Pairs of Java Library-Clients
Data Showcase
Thomas Durieux KTH Royal Institute of Technology, Sweden, César Soto-Valero KTH Royal Institute of Technology, Benoit Baudry KTH Royal Institute of Technology
Pre-print
10:16
3m
Talk
EQBENCH: A Dataset of Equivalent and Non-equivalent Program Pairs
Data Showcase
Sahar Badihi University of British Columbia, Canada, Yi Li Nanyang Technological University, Julia Rubin University of British Columbia, Canada
10:19
31m
Live Q&A
Discussions and Q&A
Technical Papers

10:00 - 10:50
Dependencies and OSS Technical Papers / Registered Reports at MSR Room 2
Chair(s): Luca Pascarella Delft University of Technology
10:01
3m
Talk
Identifying Critical Projects via PageRank and Truck Factor
Technical Papers
Rolf-Helge Pfeiffer IT University of Copenhagen
Pre-print
10:04
4m
Talk
Revisiting Dockerfiles in Open Source Software Over Time
Technical Papers
Kalvin Eng University of Alberta, Abram Hindle University of Alberta
Pre-print
10:08
3m
Talk
Does the First-Response Matter for Future Contributions? A Study of First Contributions
Registered Reports
Noppadol Assavakamhaenghan Nara Institute of Science and Technology, Supatsara Wattanakriengkrai Nara Institute of Science and Technology, Naomichi Shimada Nara Institute of Science and Technology, Raula Gaikovina Kula NAIST, Takashi Ishio Nara Institute of Science and Technology, Kenichi Matsumoto Nara Institute of Science and Technology
Pre-print
10:11
4m
Talk
Data Balancing Improves Self-Admitted Technical Debt Detection
Technical Papers
Murali Sridharan University of Oulu, Leevi Rantala University of Oulu, Maëlick Claes University of Oulu, Mika Mäntylä University of Oulu
Pre-print
10:15
35m
Live Q&A
Discussions and Q&A
Technical Papers

17:00 - 17:50
Energy, logging, and APIs Technical Papers at MSR Room 1
Chair(s): Akond Rahman Tennessee Tech University
17:01
3m
Talk
S3M: Siamese Stack (Trace) Similarity Measure
Technical Papers
Aleksandr Khvorov JetBrains, ITMO University, Roman Vasiliev JetBrains, George Chernishev Saint-Petersburg State University, Irving Muller Rodrigues Polytechnique Montreal, Montreal, Canada, Dmitrij Koznov Saint-Petersburg State University, Nikita Povarov JetBrains
Pre-print
17:04
4m
Talk
Mining the ROS ecosystem for Green Architectural Tactics in Robotics and an Empirical Evaluation
Technical Papers
Ivano Malavolta Vrije Universiteit Amsterdam, Katerina Chinnappan Vrije Universiteit Amsterdam, Stan Swanborn Vrije Universiteit Amsterdam, The Netherlands, Grace Lewis Carnegie Mellon Software Engineering Institute, Patricia Lago Vrije Universiteit Amsterdam
Pre-print Media Attached
17:08
4m
Talk
Mining Energy-Related Practices in Robotics Software
Technical Papers
Michel Albonico UTFPR, Ivano Malavolta Vrije Universiteit Amsterdam, Gustavo Pinto Federal University of Pará, Emitzá Guzmán Vrije Universiteit Amsterdam, Katerina Chinnappan Vrije Universiteit Amsterdam, Patricia Lago Vrije Universiteit Amsterdam
Pre-print Media Attached
17:12
3m
Talk
Mining API Interactions to Analyze Software Revisions for the Evolution of Energy Consumption
Technical Papers
Andreas Schuler University of Applied Sciences Upper Austria, Gabriele Anderst-Kotsis Johannes Kepler University, Linz, Austria
Pre-print
17:15
4m
Talk
Can I Solve it? Identifying the APIs required to complete OSS tasks
Technical Papers
Fabio Marcos De Abreu Santos Northern Arizona University, USA, Igor Scaliante Wiese Federal University of Technology – Paraná - UTFPR, Bianca Trinkenreich Northern Arizona University, Igor Steinmacher Northern Arizona University, USA, Anita Sarma Oregon State University, Marco Gerosa Northern Arizona University, USA
Pre-print
17:19
31m
Live Q&A
Discussions and Q&A
Technical Papers

17:00 - 17:50
Change Management and Analysis Technical Papers / Registered Reports at MSR Room 2
Chair(s): Sarah Nadi University of Alberta
17:01
4m
Talk
Studying the Change Histories of Stack Overflow and GitHub Snippets
Technical Papers
Saraj Singh Manes Carleton University, Olga Baysal Carleton University
Pre-print Media Attached
17:05
4m
Talk
Learning Off-By-One Mistakes: An Empirical Study
Technical Papers
Hendrig Sellik Delft University of Technology, Onno van Paridon Adyen N.V., Georgios Gousios Facebook & Delft University of Technology, Maurício Aniche Delft University of Technology
Pre-print
17:09
4m
Talk
Predicting Design Impactful Changes in Modern Code Review: A Large-Scale Empirical Study
Technical Papers
Anderson Uchôa Pontifical Catholic University of Rio de Janeiro (PUC-Rio), Caio Barbosa Pontifical Catholic University of Rio de Janeiro (PUC-Rio), Daniel Coutinho Pontifical Catholic University of Rio de Janeiro (PUC-Rio), Willian Oizumi Pontifical Catholic University of Rio de Janeiro (PUC-Rio), Wesley Assunção Pontifical Catholic University of Rio de Janeiro (PUC-Rio), Silvia Regina Vergilio Federal University of Paraná, Juliana Alves Pereira PUC-Rio, Anderson Oliveira PUC-Rio, Alessandro Garcia PUC-Rio
Pre-print
17:13
4m
Talk
Rollback Edit Inconsistencies in Developer Forum
Technical Papers
Saikat Mondal University of Saskatchewan, Gias Uddin University of Calgary, Canada, Chanchal K. Roy University of Saskatchewan
Pre-print
17:17
3m
Talk
Assessing the Exposure of Software Changes: The DiPiDi Approach
Registered Reports
Mehran Meidani University of Waterloo, Maxime Lamothe University of Waterloo, Shane McIntosh
Pre-print
17:20
4m
Talk
On the Use of Dependabot Security Pull Requests
Technical Papers
Mahmoud Alfadel Concordia University, Diego Costa Concordia University, Canada, Emad Shihab Concordia University, Mouafak Mkhallalati Concordia University
Pre-print
17:24
26m
Live Q&A
Discussions and Q&A
Technical Papers

Call for Papers

Please join if you are concerned about the continued health of open source software and would like to make a difference! This applies to anyone trying to support industry and educational use of open source, such as assessing risks, effectiveness and spread of tutorials, frameworks, tools, and practices. Questions related to obtaining large representative samples of data for software engineering research are equally welcome.

The descriptions of the projects selected by the PC will be published at the Hackathon track of MSR’2021. Previous WoC hackathons resulted in four publications at MSR’2020.

Any topic related to doing research, building tools, or improving infrastructure that supports global OSS development, helps industry use OSS, or addresses educational and training aspects of work in this giant network is within the scope of the hackathon. For example:

  • Applications that support finding suitable code, people, projects, or bugs and/or model the social and technical networks and their evolution.

  • Applications that increase transparency by making it easier to become a contributor or that help maintainers zero in on the most relevant contributions.

  • Applications that increase understanding of software supply chains and ecosystems: how and why they function and how to manage risks, especially as related to industry use of OSS.

  • Any infrastructure work that does data fusion or data quality improvement, such as leveraging all open source data sources in the WoC resource and beyond.

  • Approaches to better collect data, increase coverage, or encourage outside contributions.

Key Dates

  • Intents to participate (including potential project ideas) will be collected until October 30, 2020. Intents should be submitted via email to the organizers (audris@utk.edu, jdh@cs.cmu.edu, alexander.nolte@ut.ee).

  • November 14: One-day online training session covering defining research questions, scoping problems, and team formation. People who did not submit an intent beforehand can still join at this point. From Oct 30 to Nov 14, organizers will help participants formulate their ideas in preparation for the project pitches presented on Nov 14.

  • November 15 - December 5: multiple “hacking” days that include dedicated hackathon times and checkpoints to share and assess each team’s progress and to provide support if necessary.

  • December 5: team presentations to the PC. The PC will provide feedback to the presenters on the originality of the idea, the potential impact of the proposed solution, and how to communicate the project ideas.

  • January 19: Submission of the team project descriptions (up to two pages) to the MSR 2021 Hackathon track.

  • February 22: Notification of acceptance to the MSR Hackathon track, which is published in the MSR proceedings. The PC will judge submissions based on the clarity of the description, the originality of the idea, the potential impact of the proposed solution, and the sophistication of the artifacts produced during the hackathon.

Organizers will provide support in the form of mentors who can help with technical issues. The hackathon will also provide the opportunity for participants to work with world-class researchers on relevant problems and research questions.

A dedicated issue tracker on GitHub will also be available to answer questions and resolve issues for participants of the MSR Hackathon. Slack and Zoom will be suggested as the main means of communication among teams, mentors, and organizers during the hackathon. A dedicated Zoom room will be provided for each team for the entire duration of the event.