Case Study

NeuralAV — AI Antivirus

Rust-based antivirus engine that uses a BERT model fine-tuned in PyTorch to classify binaries as malicious or benign. It extends my published malware-detection research into a working tool.

Rust, PyTorch, BERT, Neural Networks, Binary Analysis

Overview

NeuralAV pairs a Rust scanning engine with a BERT model fine-tuned in PyTorch to classify executables as malicious or benign directly from their raw bytes. It’s the practical follow-up to my published book chapter on neural-network malware detection, taking the research from paper to runnable code.

Source: github.com/5uf/neuralav

Problem

Signature-based AV misses zero-day and polymorphic samples.
Heavy behavioral sandboxing is too slow for endpoint use.
My prior research showed that BERT-style models can learn directly from binary token sequences, but that result existed only as a paper, not a tool.

Approach

Model

Fine-tuned a pre-trained BERT on tokenized executable bytes from labeled benign/malicious corpora.
Exported the trained weights for inference.

Engine (Rust)

Rust host reads the candidate file, tokenizes the bytes into the model’s input format, and runs inference.
Chose Rust for the scanning layer because memory safety matters when parsing untrusted binaries, and the absence of garbage-collection pauses keeps per-scan latency predictable.
Kept the Python training path and the Rust inference path separate — train offline, ship weights, score fast.
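
The engine's read-and-tokenize step can be sketched roughly as follows. This is a minimal illustration using only the Rust standard library, not the code in the repo. The token layout (special-token ids 0 to 2, byte values shifted by 3), the names encode_bytes and encode_file, and the fixed max_len window are all assumptions made for the example. The vocabulary and sequence length NeuralAV actually uses may differ, and inference itself is left out.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

// Hypothetical token layout; the real model vocabulary may differ.
const PAD_ID: u32 = 0; // padding
const CLS_ID: u32 = 1; // start-of-sequence marker
const SEP_ID: u32 = 2; // end-of-sequence marker
const BYTE_OFFSET: u32 = 3; // byte value b maps to token id b + 3

/// Model-ready input for one file: token ids plus an attention mask.
#[derive(Debug, PartialEq)]
pub struct EncodedInput {
    pub input_ids: Vec<u32>,
    pub attention_mask: Vec<u8>,
}

/// Encode raw bytes as [CLS] b0 b1 ... [SEP], truncated and padded to `max_len`.
pub fn encode_bytes(bytes: &[u8], max_len: usize) -> EncodedInput {
    assert!(max_len >= 2, "max_len must leave room for [CLS] and [SEP]");
    // Keep only as many bytes as fit between the two special tokens.
    let payload_len = bytes.len().min(max_len - 2);

    let mut input_ids = Vec::with_capacity(max_len);
    input_ids.push(CLS_ID);
    input_ids.extend(
        bytes[..payload_len]
            .iter()
            .map(|&b| u32::from(b) + BYTE_OFFSET),
    );
    input_ids.push(SEP_ID);

    // Real tokens get mask 1; padding added afterwards gets mask 0.
    let mut attention_mask = vec![1u8; input_ids.len()];
    input_ids.resize(max_len, PAD_ID);
    attention_mask.resize(max_len, 0);

    EncodedInput { input_ids, attention_mask }
}

/// Read at most the bytes that fit in one window from `path`, then encode them.
/// Bounding the read avoids loading a large untrusted file into memory.
pub fn encode_file(path: &Path, max_len: usize) -> io::Result<EncodedInput> {
    let limit = (max_len as u64).saturating_sub(2);
    let mut buf = Vec::new();
    let mut reader = File::open(path)?.take(limit);
    reader.read_to_end(&mut buf)?;
    Ok(encode_bytes(&buf, max_len))
}
```

Under these assumptions, a call such as encode_file(Path::new("sample.exe"), 512) would produce 512 token ids and a matching mask for whatever inference backend the engine uses. Scoring only the leading window is a simplification; a real scanner might slide the window across the file and aggregate the scores.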

What It Demonstrates

End-to-end ML: data prep → training → deployment, not just notebook experiments.
Cross-language engineering: Python for training, Rust for the runtime.
Research-to-product translation: the same binary-data thesis from the Atlantis Press chapter, running as a tool you can point at a file.

Status

Personal research project. Model and engine live in the repo; detection accuracy is bounded by the training corpus, and the tool is a demonstration rather than a shipping AV product.