Apache Software Foundation Incubator Project Sustainability Dataset
The success and sustainability of Open Source Software (OSS) are critically important for digital infrastructure: OSS is used broadly, yet more than 83% of such projects fail. To increase their chances of success, many projects join established software communities, e.g., the Apache Software Foundation (ASF), which offer clearly established rules and support. At ASF, nascent projects that aspire to join the foundation are housed in the ASF Incubator (ASFI), which provides a mature governance environment and expert help toward long-term sustainability. Projects in ASFI eventually conclude their incubation either by graduating, if they are on a successful path to sustainability, or by being retired. Digital traces of developer activities for projects in ASFI are publicly available, together with monthly project status.
Here we present a longitudinal dataset of developer coding and communication activities for 269 projects from the Apache Software Foundation Incubator (ASFI). Each project in ASFI is evaluated while in incubation and is eventually labeled “graduated” or “retired”, a label indicating the project's sustainability promise with respect to its technical development and community diversity. This extrinsically labeled dataset offers previously unavailable sustainability data on OSS project development under ASF regulations and governance. We hope its availability will foster more research on sustainability in OSS projects.