Case Study

NeuralAV — AI Antivirus

A Rust-based antivirus engine that uses a BERT model fine-tuned in PyTorch to classify executables as malicious or benign. It extends my published malware-detection research into a working tool.

Rust · PyTorch · BERT · Neural Networks · Binary Analysis

Overview

NeuralAV pairs a Rust scanning engine with a PyTorch-fine-tuned BERT model to classify executables as malicious or benign directly from binary data. It’s the practical follow-up to my published book chapter on neural-network malware detection — taking the research from paper to runnable code.

Source: github.com/5uf/neuralav

Problem

  • Signature-based AV misses zero-day and polymorphic samples.
  • Heavy behavioral sandboxing is too slow for endpoint use.
  • My prior research showed BERT-style models can learn directly from binary token sequences — but only as a paper, not a tool.

Approach

Model

  • Fine-tuned a pre-trained BERT on tokenized executable bytes from labeled benign/malicious corpora.
  • Exported the trained weights for inference.
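
To make "tokenized executable bytes" concrete, here is a minimal sketch of one way raw bytes could be turned into BERT-style inputs. The vocabulary layout (four reserved special ids, then one id per byte value), the constant names, and the `encode_bytes` function are all assumptions for illustration; the tokenizer in the repo may work differently.

```rust
// Hypothetical byte-level vocabulary (an assumption, not the repo's scheme):
// ids 0..=3 are reserved for special tokens, and byte value b maps to id b + 4.
const PAD_ID: u32 = 0;
const CLS_ID: u32 = 1;
const SEP_ID: u32 = 2;
const BYTE_OFFSET: u32 = 4;

/// Model-ready input for one sequence.
#[derive(Debug, PartialEq)]
pub struct Encoded {
    /// Token ids, always exactly `max_len` long.
    pub input_ids: Vec<u32>,
    /// 1 for real tokens, 0 for padding.
    pub attention_mask: Vec<u8>,
}

/// Encode raw bytes as `[CLS] b0 b1 ... [SEP]`, truncated and padded to `max_len`.
pub fn encode_bytes(bytes: &[u8], max_len: usize) -> Encoded {
    assert!(max_len >= 2, "max_len must leave room for [CLS] and [SEP]");

    // Keep only as many bytes as fit between the two special tokens.
    let body_len = bytes.len().min(max_len - 2);

    let mut input_ids = Vec::with_capacity(max_len);
    input_ids.push(CLS_ID);
    input_ids.extend(bytes[..body_len].iter().map(|&b| u32::from(b) + BYTE_OFFSET));
    input_ids.push(SEP_ID);

    // Real tokens get mask 1; everything added by padding gets mask 0.
    let mut attention_mask = vec![1u8; input_ids.len()];
    input_ids.resize(max_len, PAD_ID);
    attention_mask.resize(max_len, 0);

    Encoded { input_ids, attention_mask }
}
```

Calling `encode_bytes(&[0x4D, 0x5A], 6)` on the two-byte "MZ" header would, under this hypothetical vocabulary, produce four real tokens followed by two padding positions. The same encoding has to be used at training time and at scan time, which is one reason to keep it small and deterministic.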

Engine (Rust)

  • Rust host reads the candidate file, tokenizes the bytes into the model’s input format, and runs inference.
  • Chose Rust for the scanning layer because memory safety matters when touching untrusted binaries, and with no garbage collector, per-scan latency is more predictable.
  • Kept the Python training path and the Rust inference path separate — train offline, ship weights, score fast.
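
The scan path described above can be sketched roughly as follows. This is not the repo's code: the `WindowScorer` trait, the `HighByteRatio` stand-in, the `Verdict` enum, and the choice to split a file into windows and keep the maximum window score are all assumptions made for illustration. In a real build, the trait would be implemented by a wrapper around the exported model.

```rust
use std::{fs, io, path::Path};

/// Anything that can score one window of bytes (hypothetical interface).
pub trait WindowScorer {
    /// Returns a maliciousness score in [0.0, 1.0].
    fn score(&self, window: &[u8]) -> f32;
}

/// Placeholder scorer, NOT a detector: the fraction of bytes >= 0x80.
/// It only exists so the pipeline can run without model weights.
pub struct HighByteRatio;

impl WindowScorer for HighByteRatio {
    fn score(&self, window: &[u8]) -> f32 {
        if window.is_empty() {
            return 0.0;
        }
        let high = window.iter().filter(|&&b| b >= 0x80).count();
        high as f32 / window.len() as f32
    }
}

#[derive(Debug, PartialEq)]
pub enum Verdict {
    Benign,
    Malicious,
}

/// Score each `window`-sized slice (advancing by `stride`) and keep the maximum.
/// Max aggregation is an assumption; averaging is another plausible choice.
pub fn scan_bytes<S: WindowScorer>(
    scorer: &S,
    data: &[u8],
    window: usize,
    stride: usize,
    threshold: f32,
) -> (Verdict, f32) {
    assert!(window > 0 && stride > 0, "window and stride must be non-zero");
    let mut max_score = 0.0f32;
    let mut start = 0;
    while start < data.len() {
        let end = (start + window).min(data.len());
        max_score = max_score.max(scorer.score(&data[start..end]));
        if end == data.len() {
            break; // the last window reached the end of the file
        }
        start += stride;
    }
    let verdict = if max_score >= threshold {
        Verdict::Malicious
    } else {
        Verdict::Benign
    };
    (verdict, max_score)
}

/// Read the candidate file into memory and scan it.
pub fn scan_file<S: WindowScorer>(
    scorer: &S,
    path: &Path,
    window: usize,
    stride: usize,
    threshold: f32,
) -> io::Result<(Verdict, f32)> {
    let data = fs::read(path)?;
    Ok(scan_bytes(scorer, &data, window, stride, threshold))
}
```

Putting the model behind a trait keeps the file handling and aggregation logic testable without weights, and lets the inference backend change without touching the scanner. A stride smaller than the window gives overlapping windows, at the cost of more inference calls per file.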

What It Demonstrates

  • End-to-end ML: data prep → training → deployment, not just notebook experiments.
  • Cross-language engineering: Python for training, Rust for the runtime.
  • Research-to-product translation: the same binary-data thesis from the Atlantis Press chapter, running as a tool you can point at a file.

Status

Personal research project. Model and engine live in the repo; detection accuracy is bounded by the training corpus, and the tool is a demonstration rather than a shipping AV product.