Autonomous Cyber Analytics and Anomaly Detection
Machine learning (ML) plays a central role in cyber defense, given the growing scale of networks, the proliferation of software and malware, and the volume of data they generate. One of the main challenges for cyber defenders is distinguishing malicious anomalies from benign but uncommon activity. This task becomes more important as the attack surfaces of large enterprise networks continue to expand.
In this context, anomaly detection systems based on statistical, large-scale analysis and modeling of user and device behavior have become essential tools for identifying and mitigating malicious activity. The objective of this project is to develop new ML modeling techniques that focus on specific facets or attributes of netflow and/or host activity, as well as software/malware data, in order to better understand how users and computers typically operate within these networks. By the end of the project, the developed tool is expected to improve network awareness and support the integration of state-of-the-art autonomous anomaly and intrusion detection capabilities.
Selected Past Student Projects:
- Semi-supervised Classification of Malware Families Under Extreme Class Imbalance via Hierarchical Non-Negative Matrix Factorization with Automatic Model Selection [Paper]
- MalwareDNA: Simultaneous Classification of Malware, Malware Families, and Novel Malware [Paper]
- General-Purpose Unsupervised Cyber Anomaly Detection via Non-Negative Tensor Factorization [Paper]
- Electrical Grid Anomaly Detection via Tensor Decomposition [Paper]
- Detecting Anomalies using Overlapping Electrical Measurements in Smart Power Grids [Paper]
- pyCP_APR, a Python library for tensor decomposition and anomaly detection, developed as part of the R&D 100 award-winning SmartTensors project [Code]
- Multi-Dimensional Anomalous Entity Detection via Poisson Tensor Factorization [Paper]
- Graph link prediction in computer networks using Poisson matrix factorisation [Paper]
- Hyperspectral Anomaly Detection using Neural Networks
Desired Qualifications:
Projects can range from statistical and machine learning research to tool development, depending on the candidate's skill set and interests.
- Strong programming and software development experience in Python or R.
- Strong proficiency in machine learning (ML) concepts and the entire ML pipeline, including exploratory data analysis, data pre-processing, and model development. For candidates working in Python, for example, this means demonstrated experience in ML pipeline development with key packages such as pandas, PyTorch, TensorFlow, Matplotlib, NumPy, and CuPy.
- Strong working knowledge of Linux operating systems, including advanced proficiency with the terminal.
- Background in large-scale data analysis pipelines such as distributed and parallel data processing. Example Python packages include mpi4py and Joblib.
- Proficiency in cybersecurity principles, covering both hardware and software aspects.
- Willingness and ambition to learn advanced research-level topics at a fast pace and immediately contribute to the research and national security communities.
- Strong verbal and written communication skills.
- (Bonus) Background in Statistics.
- (Bonus) Experience in using High Performance Computing (HPC) systems.
- (Bonus) Experience in writing scientific papers in LaTeX.
- (Bonus) Experience in malware reverse engineering.
Your Role in the Project:
- Create innovative concepts and craft solutions for anomaly detection that address national security challenges.
- Develop, test, and benchmark proof-of-concept code.
- Develop ML and data analysis pipelines for large-scale data using high-performance computing resources and emerging computing technologies.
- Create well-designed presentations that are informative to a wide audience and succinctly communicate the value of the technology.
- Perform extensive literature reviews of cutting-edge theoretical and applied anomaly detection work.
- Write a paper and/or prepare a poster publication at the end of the project.
LA-UR-23-30455