MSR 2021
Mon 17 - Wed 19 May 2021
co-located with ICSE 2021

This program is tentative and subject to change.

Tue 18 May 2021 03:10 - 03:20 at MSR Room 2 - Time series data

Data-driven approaches have proven promising in mobile software analysis, yet they rely on sizable, high-quality datasets. For Android app analysis in particular, several well-known datasets are widely used by the community. However, datasets that represent the run-time behaviors of apps are still lacking: existing datasets are largely static, whereas run-time datasets are essential for data-driven dynamic and hybrid analysis of apps. In this paper, we present AndroCT, a large-scale dataset of run-time function call traces from 35,974 benign and malicious Android apps spanning ten years (2010 through 2019). These call traces were produced by running each sample app against automatically generated test inputs for ten minutes. Moreover, each app was exercised both on an emulator and on a real device, and the traces were curated separately. AndroCT has been used to build a novel dynamic profile of Android apps that has enabled several effective techniques and informative empirical studies concerning Android app security. We describe what this dataset includes, how it was created and stored, and how it has been used in the past and could be used in the future.

Tue 18 May
Times are displayed in time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

03:10 - 04:00
03:10
10m
Talk
AndroCT: Ten Years of App Call Traces in Android
Data Showcase
Wen Li; Xiaoqin Fu (Washington State University); Haipeng Cai (Washington State University, USA)
Pre-print Media Attached
03:20
10m
Talk
Mining Workflows for Anomalous Data Transfers
Technical Papers
Huy Tu (North Carolina State University, USA); George Papadimitriou (University of Southern California); Mariam Kiran (ESnet, LBNL); Cong Wang (Renaissance Computing Institute); Anirban Mandal (Renaissance Computing Institute); Ewa Deelman (University of Southern California); Tim Menzies (North Carolina State University, USA)
Pre-print
03:30
10m
Talk
Escaping the Time Pit: Pitfalls and Guidelines for Using Time-Based Git Data
Technical Papers
Samuel W. Flint (University of Nebraska-Lincoln); Jigyasa Chauhan (University of Nebraska-Lincoln); Robert Dyer (University of Nebraska-Lincoln)
Pre-print
03:40
10m
Paper
On the Naturalness and Localness of Software Logs
Technical Papers
Sina Gholamian (University of Waterloo); Paul A. S. Ward (University of Waterloo)
Pre-print
03:50
10m
Talk
How Do Software Developers Use GitHub Actions to Automate Their Workflows?
Technical Papers
Timothy Kinsman (University of Adelaide); Mairieli Wessel (University of Sao Paulo); Marco Gerosa (Northern Arizona University, USA); Christoph Treude (University of Adelaide)
Pre-print