Case Study
NeuralAV — AI Antivirus
Rust-based antivirus engine that uses a BERT model fine-tuned in PyTorch to classify binaries as malicious or benign. It extends my published malware-detection research into a working tool.
Rust · PyTorch · BERT · Neural Networks · Binary Analysis
Overview
NeuralAV pairs a Rust scanning engine with a PyTorch-fine-tuned BERT model to classify executables as malicious or benign directly from binary data. It’s the practical follow-up to my published book chapter on neural-network malware detection — taking the research from paper to runnable code.
Source: github.com/5uf/neuralav
Problem
- Signature-based AV misses zero-day and polymorphic samples.
- Heavy behavioral sandboxing is too slow for endpoint use.
- My prior research showed BERT-style models can learn directly from binary token sequences, but that result existed only as a paper, not as a tool.
Approach
Model
- Fine-tuned a pre-trained BERT on tokenized executable bytes from labeled benign/malicious corpora.
- Exported the trained weights for inference.
Engine (Rust)
- Rust host reads the candidate file, tokenizes the bytes into the model’s input format, and runs inference.
- Chose Rust for the scanning layer because memory safety matters when parsing untrusted binaries, and the absence of a garbage collector keeps per-scan latency predictable.
- Kept the Python training path and the Rust inference path separate — train offline, ship weights, score fast.
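The read, tokenize, score flow described above can be sketched in plain Rust. This is a minimal illustration under stated assumptions, not the code in the repo: the token IDs (`CLS_ID`, `SEP_ID`, `PAD_ID`), the one-token-per-byte vocabulary, the 512-token limit, the truncate-to-first-bytes policy, and the `Classifier` trait are all hypothetical names and choices made for the example. The trait stands in for whatever inference backend loads the exported weights.

```rust
use std::fs;
use std::io;
use std::path::Path;

// Hypothetical vocabulary: IDs 0..=255 are raw byte values, followed by
// three special tokens. These values are assumptions, not the repo's.
const CLS_ID: u32 = 256;
const SEP_ID: u32 = 257;
const PAD_ID: u32 = 258;
const MAX_LEN: usize = 512; // assumed sequence length for the model

/// Map raw bytes to a fixed-length sequence: [CLS] b0 b1 ... [SEP] [PAD]...
/// Input longer than `max_len - 2` bytes is truncated (a simplification;
/// a real scanner might score several windows instead).
fn tokenize_bytes(bytes: &[u8], max_len: usize) -> Vec<u32> {
    assert!(max_len >= 2, "need room for the CLS and SEP tokens");
    let mut ids = Vec::with_capacity(max_len);
    ids.push(CLS_ID);
    ids.extend(bytes.iter().take(max_len - 2).map(|&b| u32::from(b)));
    ids.push(SEP_ID);
    ids.resize(max_len, PAD_ID);
    ids
}

/// Attention mask: 1 for real tokens, 0 for padding.
fn attention_mask(ids: &[u32]) -> Vec<u8> {
    ids.iter().map(|&id| u8::from(id != PAD_ID)).collect()
}

/// Hypothetical interface to the inference backend. An implementation
/// would hold the exported weights and return P(malicious) in [0, 1].
trait Classifier {
    fn score(&self, ids: &[u32], mask: &[u8]) -> f32;
}

#[derive(Debug, PartialEq)]
enum Verdict {
    Benign,
    Malicious,
}

/// Tokenize a byte buffer and turn the model score into a verdict.
fn classify<C: Classifier>(bytes: &[u8], model: &C, threshold: f32) -> Verdict {
    let ids = tokenize_bytes(bytes, MAX_LEN);
    let mask = attention_mask(&ids);
    if model.score(&ids, &mask) >= threshold {
        Verdict::Malicious
    } else {
        Verdict::Benign
    }
}

/// Read a candidate file from disk and classify its contents.
fn scan_file<C: Classifier>(path: &Path, model: &C, threshold: f32) -> io::Result<Verdict> {
    let bytes = fs::read(path)?;
    Ok(classify(&bytes, model, threshold))
}
```

Keeping the model behind a trait mirrors the train-offline, score-fast split: the scanning code only depends on token IDs and a score, so the backend that runs the exported weights can change without touching the file-handling path.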
What It Demonstrates
- End-to-end ML: data prep → training → deployment, not just notebook experiments.
- Cross-language engineering: Python for training, Rust for the runtime.
- Research-to-product translation: the same binary-data thesis from the Atlantis Press chapter, running as a tool you can point at a file.
Status
Personal research project. Model and engine live in the repo; detection accuracy is bounded by the training corpus, and the tool is a demonstration rather than a shipping AV product.