Course: Network Archaeology

Use mathematics and information theory to write decoding software for malware network protocols.


Create malware protocol decoding software using mathematics, computer science, and information theory.

What to expect

We do this by introducing several computer science concepts: binary network protocols, elementary cryptanalysis, and back engineering (reverse engineering and clean-room design). The class also has an intense but occasional focus on mathematics and information theory.

People who are not deep into mathematics and computer science will still benefit from this class, but those with a passion for theoretical CS, information theory, or mathematics will gain the most.

The class is presented as a series of technical challenges, each teaching a concept by allowing students to invent different approaches and try them out, seeing which work well and which do not. Challenges build on each other, until, by the final challenge, students are writing custom binary protocol decoder software from scratch.

Network Archaeology is a self-paced lab class, with intermittent instructor lectures. Some participants may find the first dozen or so labs easy: they are encouraged to proceed through as quickly as they like. The instructors lead occasional “how-to” lectures, starting with the first lab, eventually bringing the class to the same point. Between lectures, instructors traverse the room helping people with labs.

This class is taught using the Linux command line. The instructor will use command-line tools to build increasingly powerful tools, but participants can make decent progress using Wireshark (local install) and CyberChef (web-based tool).

Each lab exercise either introduces new concepts or builds on previously presented concepts. Very few people make it through every exercise, and there is no expectation that anyone will “finish” in two days. Many Network Archaeology attendees come back to Cyber Fire to take this class a second or even third time.

Who should attend?

  • Computer scientists
  • Mathematicians
  • Information theorists
  • Incident investigators
  • Software engineers
  • Applied mathematicians
  • System administrators
  • Site reliability engineers

Is this the right class for me?

Network Archaeology teaches students how to approach unknown data that no existing tool can handle. People expecting to walk away with a recipe book will be disappointed. Our goal is for you to gain insight into how network protocols work, how encryption works, and what common techniques can be used to “break” malware protocols.

Network Archaeology is broadly interesting to anyone who wants a better understanding of network packet forensics. Even if you don’t intend to do this kind of work in your job, going through the instructor-led exercises will provide insight into challenges facing your organization.

Day 1

Topics and durations:

  • Base arithmetic
  • Introduction to Network Protocols
  Duration: 90m

  • Byte structure of TCP/IP
  • Encoding schemes
  Duration: 90m

  • Examining packet captures
  • Extracting transferred data from packet captures
  Duration: 90m

  • Attack techniques against weak encryption
  • Helpful tools for Network Archaeology
  Duration: 90m
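
To give a flavor of the "attack techniques against weak encryption" topic, here is a minimal Python sketch. It assumes, purely for illustration, that the weak encryption is a single-byte XOR applied to English text; the course labs may use different schemes. The names `xor_bytes`, `english_score`, and `guess_single_byte_xor` are hypothetical helpers, not part of any course tooling.

```python
# Hypothetical illustration: single-byte XOR stands in for "weak encryption".
# Frequent English letters plus the space character, as byte values.
COMMON = frozenset(b"etaoin shrdlu")


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with the same one-byte key."""
    return bytes(b ^ key for b in data)


def english_score(data: bytes) -> int:
    """Count how many bytes look like common English characters."""
    return sum(b in COMMON for b in data)


def guess_single_byte_xor(ciphertext: bytes) -> tuple[int, bytes]:
    """Try all 256 keys and keep the one whose output looks most like English."""
    best_key = max(
        range(256),
        key=lambda k: english_score(xor_bytes(ciphertext, k)),
    )
    return best_key, xor_bytes(ciphertext, best_key)


if __name__ == "__main__":
    secret = xor_bytes(b"the quick brown fox jumps over the lazy dog", 0x5A)
    key, plaintext = guess_single_byte_xor(secret)
    print(hex(key), plaintext)
```

The scoring function is deliberately crude. It works when the plaintext is ordinary English with spaces, and it would need a different heuristic for binary payloads.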

Day 2

Topics and durations:

  • Entropy as it relates to cryptography
  • Application-layer protocol tunneling
  Duration: 90m

  • Using sequencing meta-information to reconstruct transferred information
  Duration: 90m

  • Analysis and decoding of novel binary protocols with no prior knowledge
  Duration: 90m

  • Attacking novel compression with no prior knowledge
  • Attacking novel weak cryptography with no prior knowledge
  Duration: 90m
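
As a small illustration of the entropy topic, the sketch below computes the Shannon entropy of a byte string in bits per byte. As a rough heuristic, values near 8 suggest compressed or well-encrypted data, while much lower values suggest plain text or a weak encoding. The function name `shannon_entropy` is a hypothetical helper chosen for this example, not something taken from the course materials.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    # Sum p * log2(p) over the observed byte values, then negate.
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(data).values()
    )


if __name__ == "__main__":
    print(shannon_entropy(b"aaaaaaaa"))        # a single repeated byte
    print(shannon_entropy(bytes(range(256))))  # every byte value exactly once
```

Short inputs give noisy estimates, so a measurement like this is more meaningful on a whole reassembled stream than on a single small packet.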

Day 3

Topics and durations:

  • SUNBURST Domain-Generation Algorithm
  • Base64 vs. Base32
  • Endianness issues (Esab32)
  • SUNBURST monoalphabetic substitution cipher
  Duration: 90m

  • Parsing of domain lists
  • First-pass decode of SUNBURST DGA
  Duration: 90m

  • Accurate prediction of Esab32 vs substitution
  • Decoding GUID
  Duration: 90m

  • Chaining Esab32 fragments
  • Dealing with errors
  • Correlating DGA domains for final decode
  Duration: 90m

Laptop Configuration

You will need a computer with a modern web browser and a Linux command line. We recommend Ubuntu, either as your native OS or in a virtual machine.

You should have the following packages pre-installed:

  • wireshark
  • tcpflow
  • tcpdump
  • python3
  • A C build toolchain:
    • Ubuntu / Debian / Mint: apt install build-essential
    • Red Hat / CentOS: yum groupinstall 'Development Tools'
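
One way to sanity-check a machine before class is a short Python script that looks for each tool on your PATH. This is only a sketch under some assumptions: the executable names are assumed to match the package names listed above, and `cc` is assumed to be the command the C toolchain provides. On some systems tcpdump is installed in a directory that is not on a regular user's PATH, so a "missing" report may be a false alarm.

```python
import shutil

# Assumed executable names; adjust if your distribution names them differently.
REQUIRED_TOOLS = ["wireshark", "tcpflow", "tcpdump", "python3", "cc"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from the list that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All listed tools were found.")
```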

We will not be able to help anyone configure their computer, so please arrive with a properly set up machine.

Other Operating Systems

If you really know what you’re doing, you can complete this class with macOS or Windows. Be prepared to figure out your OS quirks on your own, however. Windows users should be prepared to write a lot of code, as our command-line recipes won’t work at all on Windows.