Date: February 4 at 11 am
Speaker: Daniel Gibert Llaurado – University of Lleida
Title: Going Deep into the Cat and Mouse Game: a Review of Machine Learning Approaches for Malware Detection and Classification
The fight against malware has never stopped since the dawn of computing. This fight has turned out to be a never-ending and cyclical arms race: as security analysts and researchers improve their defenses, malware developers continue to innovate, find new infection vectors and enhance their obfuscation techniques. At the dawn of the antivirus industry, signature-based and heuristic-based methods were sufficient to identify particular malware files. These methods require the manual creation of rules and heuristics, via the careful selection of a representative sequence of bytes or other features indicating the presence of malicious code. However, the use of obfuscation techniques such as polymorphism and metamorphism resulted in a flow of hundreds of thousands of malicious samples being discovered every day. Consequently, previous approaches became ineffective because (1) the manual creation of rules couldn’t keep up with the huge flows of malware, and (2) they couldn’t detect new malware until analysts manually created a detection rule. Thus, new methods have to be devised to complement them and secure computer systems from attacks. Given the aforementioned circumstances, machine learning has become an appealing signature-less approach to detect malware because of its ability to handle large volumes of data and to generalize to never-before-seen malware. In this presentation, I will review the approaches developed during the past decade to detect and classify malware, and I will present the recent trends and developments in the field, with special emphasis on deep learning.
Zoom link: https://ucd-ie.zoom.us/j/67896866407?pwd=eUV2dGNvcW1HL2JZdWtuU1pKMWdTdz09
Online Recording: https://youtu.be/yorvgUhbEbI
Date: February 18 at 11 am
Speaker: Dr. Denis Shields – University College Dublin
Title: Deriving peptide research reagents and drugs from protein sequences
Computational modelling can enter into many phases of drug discovery. Peptides are short sequences of protein, a tiny proportion of which are used as clinically approved drugs. We mine the proteomes (the entire set of protein sequences of an organism) of humans and pathogens in order to find interesting potential peptides. While there are elements of machine learning in this work, in other respects, in particular the combination of peptide features, it has more in common with synthetic biology, and computational pipelines need to be validated by experimental pipelines. Computational modelling includes regular expression over-enrichment analysis within a rigorous statistical framework; machine learning to predict motifs and bioactivities including cell penetration; analysis of mass spectrometric identifications of peptide distributions in samples; and 3D structural modelling of peptides in relation to the known models of the target proteins that they bind to. We are currently investigating peptides with potential to stop SARS-CoV-2 from invading human cells.
Zoom link: https://ucd-ie.zoom.us/j/65061703141?pwd=Q1BEUThuVHQrTEU4Tm5ibjlUNFJPUT09
Online Recording: https://youtu.be/PIxbhrepdrI
Date: February 25 at 11 am
Speaker: Bertrand Le Saux – ESA
Title: Beyond Labels: Weakly-supervised, Continual and Semi-supervised Learning for Earth Observation
More and more data (and their corresponding meta-data) have allowed the wide adoption of automatic machine learning approaches for Earth observation. These methods, often relying on supervised learning, are designed (and succeed!) to obtain high performance on numerous and ever larger carefully prepared benchmarks. But what happens when you go in the wild? When you cannot trust the labels, or worse, when no labels exist? Domain adaptation and generalisation issues appear, leading to unpredictable results. In this talk I will present several approaches which learn beyond labels. First, I will present a weak supervision method which allows training a neural network model when labels are inadequate or noisy. Second, I will talk about continual learning for adapting models with the help of a human-in-the-loop. Finally, I will address semi-supervised learning, or how to train models from both labelled and unlabelled data, using feed-forward networks or energy-based models. All those approaches pave the way to today’s great challenge of Earth observation: how to develop generic models able to handle the plethora of data now available?
Zoom link: https://ucd-ie.zoom.us/j/62053990310?pwd=TUo3MVBnNDZ3NU8rS2wwQjUrSE9SZz09
Online Recording: https://youtu.be/jMLNCcn-6n4