Research Papers Simplified -- Sufi Afifi

Research Feed

Latest papers, simplified

Peer-reviewed papers on AI, security, and software engineering. Abstracts trimmed to the key insight.

NLP / LLMs Jul 9, 2026

Accurate, Interdisciplinary and Transparent Structure-property Understanding with Deep Native Structural Reasoning

Chen Tang, Yizhou Wang, Jianyu Wu et al.

Structure-property relationships are foundational to biology, chemistry and materials science, where function, reactivity and physical response emerge from spatial, chemical and periodic...

NLP / LLMs Jul 9, 2026

Co-LMLM: Continuous-Query Limited Memory Language Models

Yair Feldman, Linxi Zhao, Nathan Godey et al.

Limited memory language models (LMLMs) externalize factual knowledge during pretraining to a knowledge base (KB), rather than memorizing it in their weights. During generation, the model then fetches...

NLP / LLMs Jul 9, 2026

From Noisy Traces to Root Causes: Structural Trajectory Analysis and Causal Extraction for Agent Optimization

Ying Chang, Jiahang Xu, Xuan Feng et al.

The optimization of long-horizon agents increasingly relies on reflection-based mechanisms, where a large language model (LLM) acts as an optimizer to diagnose agent failures and improve agent...

AI / ML Jul 9, 2026

Breaking Database Lock-in: Agentic Regeneration of High Performance Storage Readers for Database Bypass

Victor Giannakouris, Immanuel Trummer

Analytical workloads operating on data stored in external database systems face a fundamental bottleneck: data access is guarded entirely by the database driver, like JDBC or ODBC, forcing all reads...

AI / ML Jul 9, 2026

Institutional Red-Teaming: Deployment Rules, Not Just Models, Causally Shape Multi-Agent AI Safety

Yujiao Chen

We introduce institutional red-teaming, an evaluation methodology for testing deployment rules in multi-agent AI: hold the agents, objectives, and task state fixed, vary only one rule, and attribute...

Machine Learning Jul 9, 2026

Selective Timestep Weighting and Advantage-Based Replay for Sample-Efficient Diffusion RLHF

Eric Zhu, Abhinav Shrivastava, Soumik Mukhopadhyay et al.

Reinforcement learning from human feedback (RLHF) has emerged as a powerful paradigm for aligning generative models with human preferences. However, applying RLHF to diffusion models remains highly...

Machine Learning Jul 9, 2026

Agon: Competitive Cross-Model RL with Implicit Rival Grading of Reasoning

Vladislav Beliaev

Reinforcement learning from verifiable rewards (e.g. GRPO) is the engine behind today's reasoning models, yet it grades only the final answer. On hard problems this trains models to write more rather...

Distributed Systems Jul 9, 2026

Scaling WaterLily.jl with MPI and an improved geometric multigrid solver

Bernat Font, Marin Lauber, Tzu-Yao Huang et al.

We present recent performance-oriented developments in WaterLily.jl, a scale-resolving incompressible flow solver written in pure Julia that runs seamlessly on CPUs and GPUs of any vendor. Supported...

AI / ML Jul 9, 2026

SkillCenter: A Large-Scale Source-Grounded Skill Library for Autonomous AI Agents

Tianming Sha, Yue Zhao, Lichao Sun et al.

Autonomous AI agents can execute complex tasks with limited human review, yet they often lack the grounded operational knowledge to make their outputs not just executable but correct, secure, and...

Machine Learning Jul 9, 2026

Max Out GRPO Signal: Adaptive Trace Prefix Control for Hard Reasoning Problems

Vladislav Beliaev

Group Relative Policy Optimization (GRPO) stalls on a model's hardest problems: when no rollout in a group succeeds, the group-relative advantages vanish and the problem contributes no gradient,...
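The stall described in this abstract can be illustrated with a minimal sketch of the commonly used group-normalized advantage, where each rollout's reward is centered on the group mean and scaled by the group standard deviation. This is not the paper's method; the function name `group_relative_advantages`, the `eps` stabilizer, and the 0/1 reward values are assumptions for illustration only.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-6):
    """Center each reward on the group mean and scale by the group std.

    eps is a hypothetical stabilizer that avoids division by zero
    when every rollout in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Hard problem: no rollout succeeds, so every advantage is zero
# and the problem contributes no learning signal.
all_failed = group_relative_advantages([0.0, 0.0, 0.0, 0.0])

# One rollout succeeds: the group now yields non-zero advantages.
one_success = group_relative_advantages([0.0, 0.0, 0.0, 1.0])

print(all_failed)
print(one_success)
```

With identical rewards the numerator is zero for every rollout, which is why a group with no successes produces no gradient under this formulation.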

NLP / LLMs Jul 9, 2026

Does Bielik Know What It Doesn't Know? Activation Dispersion Separates Entity Familiarity from Factual Reliability Across Model Scale

Grzegorz Brzezinka

Large language models hallucinate most about entities they have never seen. We ask whether a model's activations betray entity familiarity before a single answer token is generated, and whether that...

NLP / LLMs Jul 9, 2026

DiaLLM: An Investigation into the Robustness-Generation Gap in English Dialect Adaptation

Jordan Painter, Dipankar Srirag, Adarsh Kappiyath et al.

Large language models increasingly understand dialectal English, yet still produce only standard, US-leaning English, leaving dialectal generation, the harder half of the problem,...

AI / ML Jul 9, 2026

Recursive Self-Improvement in AI: From Bounded Self-Refinement to Autonomous Research Loops

Mingguang Chen, Licheng Wang, Bo Qu et al.

AI systems increasingly participate in their own improvement: revising their outputs, adapting their own harnesses during deployment, training on data they generate, and, increasingly, conducting AI...

Security Jul 9, 2026

Modeling Failure Dynamics in Time-Constrained Authentication Systems: Evidence of a Success Cliff in USSD Workflows

Aklile Seyoum Mamo, Amanuel Kebede, Anny Christelle Irakoze et al.

Time-constrained interactive systems such as USSD (Unstructured Supplementary Service Data)-based financial services operate under strict session limits and sequential user interaction. While...

AI / ML Jul 9, 2026

RL Post-Training Builds Compositional Reasoning Strategies

Azwar Abdulsalam, Nishil Patel, Andrew Saxe et al.

Does RL post-training merely amplify primitive skills already latent in a base model, or can it compose primitive skills into new higher-level strategies? We study this question in a fully observable...

Machine Learning Jul 9, 2026

ALER-TI: Aligned Latent Embedding Retrieval for Time Series Imputation

Xuan-Thong Truong, Trung-Kien Le, Tung Kieu et al.

Deep learning has significantly advanced time series imputation, yet most existing architectures primarily rely on localized temporal context within the corrupted input sequence. This reliance can be...

Security Jul 9, 2026

Unlearning to Protect: A Distilled Reinforcement Learning Framework with Privacy-Preserving Feature Unlearning and XAI for IoT Security

Md. Nahid Hasan, Golam Rabiul Alam

Botnets pose a significant cybersecurity threat, enabling attacks such as DDoS, data theft, and service disruptions on IoT devices. These devices often lack built-in botnet traffic filtering, leaving...

AI / ML Jul 9, 2026

QCNN with Rough Path Signature Kernels

Leonardo Nogueira Falabella, Vasily Sazonov

Time series analysis plays a vital role across a wide range of scientific and engineering domains but poses substantial computational challenges. A major difficulty arises from the time...

NLP / LLMs Jul 9, 2026

Future Confidence Distillation in Large Language Models

Sahil Kale

Reliable confidence estimation is essential for deploying large language models (LLMs) in confidence-aware systems, where downstream decisions such as retrieval, tool use, and adaptive computation...

Security Jul 9, 2026

Embedded Blockchain Infrastructure Management (eBIM): A RISC-V-Empowered Hardware-Software Co-Design Framework Towards Trustworthy Blockchain

Qinglin Yang, Yuan Liu, Yaoyao Zhang et al.

Blockchain systems are undergoing a fundamental transition from decentralized ledgers for digital assets to general-purpose trust infrastructures for verifiable computation, decentralized physical...

Software Eng Jul 9, 2026

Rethinking Code Performance Benchmarks for LLMs

Nhat Minh Le, Yisen Xu, Zhijie Wang et al.

Many function-level performance benchmarks have been proposed to evaluate whether large language models (LLMs) can generate efficient programs. However, results on these benchmarks often show that...

AI / ML Jul 9, 2026

Towards Agentic AI Governance: A Preliminary Assessment

Mubarak Raji, Masooda Bashir

Artificial intelligence is rapidly evolving from generative systems to agentic AI capable of autonomously planning and executing tasks. Widely characterized as the Year of Agentic AI, 2025 marked...

AI / ML Jul 9, 2026

CARLA-GS: Decoupling Representation, Reasoning, and Physics Simulation for Autonomous Driving Corner-Case Synthesis

Kaicong Huang, Meng Ma, Ruimin Ke et al.

Safety evaluation for autonomous driving is dominated by rare, safety-critical interactions, motivating simulators that can deliberately synthesize corner cases with photorealistic observations....

Software Eng Jul 9, 2026

Quantum Software Engineering in Practice: FPGA and AI Integration for Quantum Certification

Marcos Guillermo Lammers, José Manuel Suárez, Adrián Pousa et al.

The emergence of Quantum Software Engineering (QSE) responds to the need for systematic, disciplined, and quantifiable approaches to the development, operation, and maintenance of quantum software....

Security Jul 9, 2026

NARAD: Non-colluding Aggregator-oblivious Record-And-Decrypt

Akshit Vakati Venkata, Rajat Dugar, Ayush Adarsh et al.

Electronic voting must keep individual ballots private while letting anyone verify the final tally. This paper presents an architecture that meets both goals without a trusted key dealer: each voter...

Software Eng Jul 9, 2026

What Makes a Good Bug Report for an AI Agent?

Lara Khatib, Noble Saji Mathews, Meiyappan Nagappan et al.

Automated program repair (APR) agents are transitioning from research benchmarks to developer workflows, yet they still begin with bug reports written for human developers. While decades of research...

Machine Learning Jul 9, 2026

Multi-Class vs. Multi-Label BERT for CVE-to-CWE Mapping: How Taxonomy Structure Shapes the Errors

Ana Schwengber Kelm, Christian Bockermann, Jörg Frochte et al.

Assigning Common Weakness Enumeration (CWE) categories to Common Vulnerabilities and Exposures (CVE) records remains an important but largely manual step in vulnerability analysis. We study this task...

Machine Learning Jul 8, 2026

Collaborative Synthetic Data Generation for Knowledge Transfer in Federated Learning

Maximilian Andreas Hoefler, Karsten Mueller, Wojciech Samek et al.

One-shot federated learning (OSFL) addresses the communication overhead of federated learning by limiting training to a single round, but doing so without sacrificing model quality is non-trivial,...

NLP / LLMs Jul 8, 2026

PALS: Percentile-Aware Layerwise Sparsity for LLM Pruning

Yazdan Jamshidi, Alexey Shvets

One-shot pruning methods like Wanda and SparseGPT apply the same sparsity ratio to every layer of a transformer, ignoring known variation in layer importance. We propose PALS (Percentile-Aware...

NLP / LLMs Jul 8, 2026

Think Big, Search Small: Where Capacity Matters in Hierarchical Search Agents?

Qinnan Cai, Yibo Zhao, Xiang Li et al.

Large language model based search agents increasingly adopt multi-agent architectures in which a main agent decomposes a complex question into sub-queries and dispatches them to parallel sub-agents....

My Publication

Published research

Book Chapter · 2024

Enhancing AI Malware Detection Using Neural Network with Binary Data Analysis

A peer-reviewed book chapter that applies feedforward neural networks to raw binary executable data for malware classification, bypassing traditional signature-based detection. The approach achieved competitive detection accuracy without manual feature engineering, and the neural network outperformed baseline classifiers on precision, recall, and F1-score.
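For readers unfamiliar with the three metrics named above, here is a minimal sketch of how precision, recall, and F1-score are computed for a binary malware/benign classifier. The helper `precision_recall_f1` and the toy labels are hypothetical and are not taken from the chapter; they only illustrate the standard definitions.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Toy labels for illustration only: 1 = malware, 0 = benign.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

Precision penalizes benign files flagged as malware, recall penalizes missed malware, and F1 is their harmonic mean, so reporting all three gives a fuller picture than accuracy alone.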

Published in: Atlantis Press, 2024

DOI: 10.2991/978-94-6463-589-8_7

Type: Book Chapter

Python · Neural Networks · Binary Analysis · Cybersecurity · Scikit-learn

BibTeX citation

@inbook{sufi2024malware,
  title     = {Enhancing AI Malware Detection Using Neural Network
               with Binary Data Analysis},
  booktitle = {Proceedings of Atlantis Press},
  year      = {2024},
  doi       = {10.2991/978-94-6463-589-8_7},
  url       = {https://doi.org/10.2991/978-94-6463-589-8_7}
}

Read publication
