MSR 2021
Sun 23 - Mon 24 May 2021 Location to be announced
co-located with ICSE 2021

The proliferation of open source software (OSS) development has created novel and massive software supply chains and ecosystems that require non-traditional approaches to software development. Even the approaches that fared well in the earlier stages of open source development are challenged by the sheer scale of the present open source ecosystem, the complexity of dependencies among projects, and the lack of effective means of establishing the trust essential for frictionless collaboration, a cornerstone of OSS.

World of Code (WoC) is an attempt to cross-reference all OSS projects. It represents over 120M git repositories from GitHub, GitLab, BitBucket, etc., with over 2B git commits (42M authors), 8B blobs, and 8.3B trees. The content of these objects is augmented via cross-referencing (a graph). For example, WoC provides maps to all commits that have created a specific blob, to all repositories in which a specific blob or a specific commit resides, and to all commits for a specified author ID, along with other maps that are impossible to compute without having an almost complete set of repositories.
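To make the idea of cross-referencing concrete, here is a minimal sketch of how a forward map (commit to the blobs it created) can be inverted into the reverse map (blob to commits). The IDs, the toy data, and the `invert` helper are hypothetical and for illustration only; this is not the WoC interface, and the real maps are far too large to hold in memory this way.

```python
from collections import defaultdict

# Hypothetical toy data: commit ID -> blobs it created, and commit ID -> author.
commit_to_blobs = {
    "c1": ["b1", "b2"],
    "c2": ["b2"],
    "c3": ["b3"],
}
commit_to_author = {"c1": "alice", "c2": "bob", "c3": "alice"}


def invert(mapping):
    """Turn a key -> list-of-values map into a value -> sorted-list-of-keys map."""
    inverse = defaultdict(set)
    for key, values in mapping.items():
        for value in values:
            inverse[value].add(key)
    return {value: sorted(keys) for value, keys in inverse.items()}


# Reverse maps: which commits created a blob, which commits belong to an author.
blob_to_commits = invert(commit_to_blobs)
author_to_commits = invert({c: [a] for c, a in commit_to_author.items()})

print(blob_to_commits["b2"])       # commits that created blob b2
print(author_to_commits["alice"])  # commits by author alice
```

The reverse map is only complete if the forward map covers every repository, which is why such maps depend on having an almost complete collection.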

Hackathons are an effective way to explore research and product ideas by teaming up with others on intense, time-limited tasks. We propose a WoC online hackathon to explore problems and solutions in open source software development that either apply at a global scale or require measurement at that scale. Examples include measuring the complete OSS activities of developers, the complete downstream dependencies of a project, or the provenance of a source code file.
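As an illustration of one such measurement, the sketch below computes the complete (transitive) set of downstream dependents of a project with a breadth-first search. The project names, the `depends_on` data, and the `downstream` function are hypothetical; they only show why the answer depends on seeing every project that might declare a dependency.

```python
from collections import deque

# Hypothetical toy data: project -> projects it depends on (its upstreams).
depends_on = {
    "app": ["web", "log"],
    "web": ["http"],
    "cli": ["log"],
    "http": [],
    "log": [],
}


def downstream(project, depends_on):
    """Return every project that depends on `project`, directly or transitively."""
    # Build reverse edges: upstream project -> projects that depend on it.
    dependents = {}
    for proj, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(proj)

    # Breadth-first search over the reverse edges.
    seen = {project}
    queue = deque([project])
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen - {project}


print(sorted(downstream("http", depends_on)))
print(sorted(downstream("log", depends_on)))
```

A project missing from `depends_on` silently drops out of the result, so an incomplete collection undercounts downstream impact.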

The event will provide, virtually, the activities typical of an in-person hackathon, such as defining research questions, forming teams, and scoping problems. Organizers will provide advice on the best ways to conduct data processing and improve performance. The hackathon will also provide the opportunity for participants to work with world-class researchers on relevant problems and research questions.

Please join if you are concerned about the continued health of open source software and would like to make a difference! This applies to anyone trying to support industry and educational use of open source, such as assessing risks, effectiveness and spread of tutorials, frameworks, tools, and practices. Questions related to obtaining large representative samples of data for software engineering research are equally welcome.

The descriptions of the projects selected by the PC will be published at the Hackathon track of MSR’2021. Previous WoC hackathons resulted in four publications at MSR’2020.

Any topic related to doing research, building tools, or improving infrastructure that supports global OSS development, helps industry use OSS, or addresses educational and training aspects of work in this giant network is within the scope of the hackathon. For example:

  • Applications that support finding suitable code, people, projects, or bugs and/or model the social and technical networks and their evolution.

  • Applications that increase transparency by making it easier to become a contributor or that help maintainers zero in on the most relevant contributions.

  • Applications that increase understanding of software supply chains and ecosystems: how and why they function and how to manage risks, especially as related to industry use of OSS.

  • Any infrastructure work that does data fusion or data quality improvements, such as leveraging all open source data sources in the WoC resource and beyond.

  • Approaches to better collect data, increase coverage, or encourage outside contributions.

Key Dates

  • Intent to participate (including potential project ideas) will be collected until October 30, 2020. The intent should be submitted via email to the organizers (audris@utk.edu, jdh@cs.cmu.edu, alexander.nolte@ut.ee).

  • November 14: One-day online training session covering the definition of research questions, problem scoping, and team formation. People who did not submit an intent by October 30 can still join at this point. Between October 30 and November 14, organizers will help participants formulate their ideas and prepare the project pitches presented on November 14.

  • November 15 - December 5: Multiple “hacking” days with dedicated hackathon times and checkpoints to share and assess each team’s progress and to provide support if necessary.

  • December 5: Team presentations to the PC. The PC will provide feedback to the presenters on the originality of the idea, the potential impact of the proposed solution, and how to communicate the project ideas.

  • January 19: Submission of the team project descriptions (up to two pages) to the MSR 2021 Hackathon track.

  • February 22: Notification of acceptance to the MSR Hackathon track, which is published in the MSR proceedings. The PC will judge submissions based on the clarity of the description, the originality of the idea, the potential impact of the proposed solution, and the sophistication of the artifacts produced during the hackathon.

Organizers will provide support in the form of mentors that can help with technical issues.

A dedicated issue tracker on GitHub will also be available to answer questions and resolve issues for participants of the MSR Hackathon. Slack and Zoom are suggested as the main means of communication among teams, mentors, and organizers during the hackathon. A dedicated Zoom room will be provided for each team for the entire duration of the event.