On Improving Deep Learning Trace Analysis with System Call Arguments (MSR 2021 - Technical Papers)

Who

Quentin Fournier, Daniel Aloise, Seyed Vahid Azhari, François Tetreault

Track

MSR 2021 Technical Papers

Time Zone

The program is currently displayed in (GMT+02:00) Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Tue 18 May 2021 17:19 - 17:23 at MSR Room 2 - Testing. Chair(s): Abram Hindle

Abstract

Kernel traces are sequences of low-level events comprising a name and multiple arguments including a timestamp, a process id, and a return value, depending on the event. Their analysis helps uncover intrusions, identify bugs, and find latency causes. However, their effectiveness is hindered by omitting the event arguments. To remedy this limitation, we introduce a general approach to learn a representation of the event names along with their arguments using both embedding and encoding. The proposed method is readily applicable to most neural networks and is task-agnostic. The benefit is quantified by conducting an ablation study on three groups of arguments: call-related, process-related, and time-related. Experiments were conducted on a novel web request dataset and validated on a second dataset collected on pre-production servers by Ciena. By leveraging additional information, we were able to increase the performance of two widely-used neural networks, an LSTM and a Transformer, by up to 11.3% on two unsupervised language modelling tasks. Such tasks may be used to detect anomalies, pre-train neural networks to improve their performance, and extract a contextual representation of the events.
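
As a rough illustration of what "embedding and encoding" could look like for a single event, the sketch below represents an event name with a lookup-table embedding and a numeric argument with a fixed sinusoidal encoding, then concatenates the two. It is a minimal sketch under stated assumptions, not the authors' implementation: the vocabulary, the dimensions, the `elapsed_ns` argument, the choice of a sinusoidal encoding, and the use of concatenation are all hypothetical, and the random table merely stands in for embeddings that a network would learn.

```python
import math
import random

# Hypothetical vocabulary of event names; not taken from the paper.
EVENT_VOCAB = {"open": 0, "read": 1, "write": 2, "close": 3}
EMBED_DIM = 4   # size of the name embedding (assumed)
ENCODE_DIM = 4  # size of the numeric encoding (assumed, must be even)


def make_embedding_table(vocab_size, dim, seed=0):
    """Randomly initialised lookup table standing in for trained embeddings."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
            for _ in range(vocab_size)]


def sinusoidal_encoding(value, dim):
    """Fixed sin/cos encoding of a scalar, in the style of
    Transformer positional encodings."""
    out = []
    for i in range(0, dim, 2):
        freq = 1.0 / (10000 ** (i / dim))
        out.append(math.sin(value * freq))
        out.append(math.cos(value * freq))
    return out


def represent_event(name, elapsed_ns, table):
    """Concatenate the name embedding with the encoded numeric argument."""
    return table[EVENT_VOCAB[name]] + sinusoidal_encoding(elapsed_ns, ENCODE_DIM)


TABLE = make_embedding_table(len(EVENT_VOCAB), EMBED_DIM)
vector = represent_event("read", 1500, TABLE)
print(len(vector))  # EMBED_DIM + ENCODE_DIM values per event
```

The resulting per-event vectors could then be fed, one per time step, to a sequence model such as an LSTM or a Transformer. Other ways of combining the parts (for example, summing vectors of equal size) are equally plausible under these assumptions.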

Link to Preprint

https://arxiv.org/abs/2103.06915

Quentin Fournier

Polytechnique Montréal

Canada

Daniel Aloise

Polytechnique Montréal

Seyed Vahid Azhari

Ciena

François Tetreault

Ciena

Session Program

Tue 18 May
Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

17:00 - 17:50	Testing (Technical Papers / Data Showcase) at MSR Room 2. Chair(s): Abram Hindle, University of Alberta

17:01 4m Talk		What Code Is Deliberately Excluded from Test Coverage and Why? Technical Papers Andre Hora UFMG
17:05 3m Talk		AndroR2: A Dataset of Manually-Reproduced Bug Reports for Android apps Data Showcase Tyler Wendland University of Minnesota, Jingyang Sun University of British Columbia, Junayed Mahmud George Mason University, S M Hasan Mansur George Mason University, Steven Huang University of British Columbia, Kevin Moran George Mason University, Julia Rubin University of British Columbia, Canada, Mattia Fazzini University of Minnesota
17:08 3m Talk		Apache Software Foundation Incubator Project Sustainability Dataset Data Showcase Likang Yin University of California, Davis, Zhiyuan Zhang University of California, Davis, Qi Xuan Institute of Cyberspace Security, Zhejiang University of Technology, Hangzhou 310023, China, Vladimir Filkov University of California at Davis, USA
17:11 4m Talk		Leveraging Models to Reduce Test Cases in Software Repositories Technical Papers Golnaz Gharachorlu Simon Fraser University, Nick Sumner Simon Fraser University
17:15 4m Talk		Which contributions count? Analysis of attribution in open source Technical Papers Jean-Gabriel Young University of Vermont, amanda casari Open Source Programs Office, Google, Katie McLaughlin Open Source Programs Office, Google, Milo Trujillo University of Vermont, Laurent Hébert-Dufresne University of Vermont, James P. Bagrow University of Vermont
17:19 4m Talk		On Improving Deep Learning Trace Analysis with System Call Arguments Technical Papers Quentin Fournier Polytechnique Montréal, Daniel Aloise Polytechnique Montréal, Seyed Vahid Azhari Ciena, François Tetreault Ciena
17:23 27m Live Q&A		Discussions and Q&A Technical Papers
