Library

A collection of papers (and code) that were at one time or another deemed interesting enough to hang on to…

Deep Learning

LLMs and Agents

  • Instruction-Following Pruning for Large Language Models
    Bairu Hou, Qibin Chen, Jianyu Wang, Guoli Yin, Chong Wang, Nan Du, Ruoming Pang, Shiyu Chang, Tao Lei
    cs.CL on arXiv 2025
     The paper proposes “instruction-following pruning,” a dynamic structured pruning method for large language models (LLMs). It utilizes a sparse mask predictor that adapts based on user instructions, optimizing both the predictor and the LLM using instruction-following data. Results show that a 3B activated model outperforms a 3B dense model by 5-8 points in specific domains, matching a 9B model’s performance.

  • Inconsistency of LLMs in Molecular Representations
    Bing Yan, Angelica Chen, Kyunghyun Cho
    Theoretical and Computational Chemistry on ChemRxiv 2024
     The paper investigates the consistency of large language models (LLMs) in molecular representations like SMILES and IUPAC names. Despite finetuning with a dual representation dataset and applying a Kullback-Leibler divergence loss for training, the models exhibited less than 1% consistency and no improvement in accuracy. Findings highlight the limitations of LLMs in understanding chemistry.

  • MemGPT: Towards LLMs as Operating Systems + GitHub Repo
    Charles Packer, Sarah Wooders, Kevin Lin, Vivian Fang, Shishir G. Patil, Ion Stoica, and Joseph E. Gonzalez
    arXiv 2024
      Infinite context for language models. Now packaged as part of Letta.

  • AvaTaR: Optimizing LLM Agents for Tool Usage via Contrastive Reasoning + GitHub Repo
    Shirley Wu, Shiyu Zhao, Qian Huang, Kexin Huang, Michihiro Yasunaga, Kaidi Cao, Vassilis N. Ioannidis, Karthik Subbian, Jure Leskovec, and James Zou
    NeurIPS 2024

  • ReAct: Synergizing Reasoning and Acting in Language Models + Project Site
    Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, and Yuan Cao
    ICLR 2023
      ReAct integrates reasoning and acting in LLMs by interleaving reasoning traces with task-specific actions. It reduces hallucinations in QA (HotpotQA) and fact verification (Fever) via Wikipedia API interactions and outperforms imitation and RL methods in ALFWorld (+34%) and WebShop (+10%). ReAct enhances interpretability and decision-making with minimal in-context examples.
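    The interleaved thought→action→observation loop can be sketched in a few lines; this is a toy illustration with stub functions standing in for the LLM and the Wikipedia API (all names and canned strings here are invented):

    ```python
    # Minimal ReAct-style loop: the model alternates Thought/Action, and tool
    # observations are appended back into the prompt. fake_llm and
    # wikipedia_search are hypothetical stubs, not the paper's components.
    def fake_llm(prompt):
        # Canned policy for illustration: search first, then finish.
        if "Observation" not in prompt:
            return "Thought: I should look this up.\nAction: Search[France]"
        return "Thought: The observation answers it.\nAction: Finish[Paris]"

    def wikipedia_search(query):
        # Stub tool standing in for the Wikipedia API used in the paper.
        pages = {"France": "France is a country in Europe. Its capital is Paris."}
        return pages.get(query, "No result.")

    def react(question, max_steps=5):
        prompt = f"Question: {question}"
        for _ in range(max_steps):
            reply = fake_llm(prompt)
            prompt += "\n" + reply
            action = reply.split("Action:")[-1].strip()
            if action.startswith("Finish["):
                return action[len("Finish["):-1]      # final answer
            if action.startswith("Search["):
                obs = wikipedia_search(action[len("Search["):-1])
                prompt += f"\nObservation: {obs}"     # feed tool result back in
        return None

    print(react("What is the capital of France?"))
    ```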

  • LoRA: Low-Rank Adaptation of Large Language Models + GitHub Repo
    Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen
    ICLR 2022
      LoRA is now wrapped into the 🤗 PEFT library
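    The core idea is a frozen weight matrix W plus a trainable low-rank update, W' = W + (α/r)·BA. A toy numerical illustration with made-up 2×2 values (real LoRA applies this to frozen transformer weight matrices):

    ```python
    # LoRA update W' = W + (alpha/r) * B @ A with rank r << d.
    # All numbers below are invented for illustration.
    def matmul(X, Y):
        return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
                for row in X]

    d, r, alpha = 2, 1, 2.0
    W = [[1.0, 0.0], [0.0, 1.0]]   # frozen pretrained weight (identity here)
    B = [[0.5], [0.0]]             # d x r (initialized to zero in practice)
    A = [[1.0, 1.0]]               # r x d
    scale = alpha / r

    delta = matmul(B, A)           # rank-r update, only B and A are trained
    W_adapted = [[W[i][j] + scale * delta[i][j] for j in range(d)]
                 for i in range(d)]
    print(W_adapted)  # [[2.0, 1.0], [0.0, 1.0]]
    ```

    Only B and A (2·d·r parameters) are trained, which is why the adapter checkpoints are tiny compared with the base model.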

  • Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
    Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, and Douwe Kiela
    NeurIPS 2020

Neural Reasoning & Decision Making

Contrastive Learning

Recommender Systems

Molecular Generation

Generative models for molecules. Inputs are most typically text-based (SMILES/SELFIES) or graph representations (parallel models over atom and bond matrices). These models usually have some property-optimization ability (latent-space search/interpolation, reinforcement learning, guided genetic exploration). Most methods are autoregressive, but non-autoregressive molecular generation methods have recently begun to emerge.
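The autoregressive case reduces to sampling one token at a time from a learned conditional distribution; a toy character-level sketch with a hand-written conditional table standing in for the trained model (the grammar here is invented and only produces trivial C/O chains):

```python
import random

# Toy autoregressive "SMILES" sampler: next-token choices come from a
# hand-written table; a real model would use an RNN/transformer.
COND = {
    "^": ["C"],            # start token: always begin with carbon
    "C": ["C", "O", "$"],  # after C: extend chain, add oxygen, or stop
    "O": ["$"],            # after O: stop (keeps the toy strings valid)
}

def sample_smiles(rng):
    tok, out = "^", []
    while True:
        tok = rng.choice(COND[tok])   # sample next token given previous
        if tok == "$":
            return "".join(out)
        out.append(tok)

rng = random.Random(0)
mols = [sample_smiles(rng) for _ in range(5)]
print(mols)
```

Property optimization then amounts to steering this sampling loop (RL reward on the finished string, or search in a latent space that decodes through it).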

Reviews

Diffusion / Flow Matching Models

Normalizing Flows

  • MolGrow: A Graph Normalizing Flow for Hierarchical Molecular Generation (No implementation available)
    Maksim Kuznetsov and Daniil Polykovskiy
    Proceedings of the AAAI Conference on Artificial Intelligence 2021, 35 (9), 8226-8234
      Hierarchical normalizing flow for molecular graphs, autoregressive. Builds graphs either BFS- or fragment-based (the latter works better). The model is composed of “plug-and-play” modules. Trained on MOSES, QM9, and ZINC250k. Property-constrained optimization is based on a genetic algorithm.

  • FastFlows: Flow-Based Models for Molecular Graph Generation
    Nathan C. Frey, Vijay Gadepally, and Bharath Ramsundar
    ELLIS Machine Learning for Molecule Discovery Workshop 2021
      Framework for normalizing flows from SELFIES. Uses substructure filtering to speed up training and enable learning from small training sets. Built-in MPO functionality.
    TDS article

  • MoFlow: An Invertible Flow Model for Generating Molecular Graphs + GitHub Repo
    Chengxi Zang and Fei Wang
    in Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining 2020
      Non-autoregressive normalizing flow for molecular graphs; two-stage flow (a Glow-based flow for bonds, then a bond-conditioned flow for atoms). Similar to GraphNVP. Trained (NLL) on QM9 and ZINC250k. Developed a new architecture. Excellent results.

  • GraphAF: a Flow-based Autoregressive Model for Molecular Graph Generation + GitHub repo
    Chence Shi, Minkai Xu, Zhaocheng Zhu, Weinan Zhang, Ming Zhang, and Jian Tang
    ICLR 2020
      How to explain this better than reviewer #1…

"This paper proposes a generative model architecture for molecular graph generation based on autoregressive flows. The main contribution of this paper is to combine existing techniques (auto-regressive BFS-ordered generation of graphs, normalizing flows, dequantization by Gaussian noise, fine-tuning based on reinforcement learning for molecular property optimization, and validity constrained sampling). Most of these techniques are well-established either for data generation with normalizing flows or for molecular graph generation and the novelty lies in the combination of these building blocks into a framework."

GANs

Other

Reaction Informatics

These models predict mechanisms for chemical reactions, ideally similar to how we teach 2nd years to push arrows. There are relatively few examples of this task, but they fall into three major categories: electron flows, graph edits, and reaction networks. At inference these models are used for forward synthesis prediction, and potentially for prediction of chemo-/regioselectivity. Largely trained on pattern recognition from atom-mapped inputs (USPTO), though there are exceptions (e.g., the Baldi papers below).

Electron Flow Prediction

Sources and Sinks

The Baldi papers map electron sources and sinks, then combinatorially generate a probability distribution over electron flows. The described classifiers are used to filter source-sink pairs before evaluation. Trained on in-house (unavailable) data. The papers don’t include source code, but ready-to-use programs are available on ChemDB.
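The enumerate-filter-rank scheme can be sketched as follows; the sites, scores, and filter threshold below are all invented placeholders, not the papers' trained models:

```python
from itertools import product

# Hypothetical source/sink pipeline: enumerate candidate pairs, filter with
# a stub classifier, rank surviving electron flows by a toy score.
sources = [("N:", 0.9), ("C=C", 0.6)]    # (site, source-likeness score)
sinks   = [("C=O", 0.8), ("C-Br", 0.7)]  # (site, sink-likeness score)

def plausible(src, snk):
    # Stub standing in for the trained filtering classifiers.
    return src[1] * snk[1] > 0.5

flows = [(s[0], k[0], s[1] * k[1])
         for s, k in product(sources, sinks) if plausible(s, k)]
flows.sort(key=lambda f: f[2], reverse=True)  # probability-ranked arrow pushes
print(flows)
```

Filtering before the combinatorial scoring step is what keeps the pair enumeration tractable as the number of candidate sites grows.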

Reaction Network Graphs

Other

Atom Mapping

Computer-Aided Retrosynthesis Planning

Publication Parsing

ML Driven Drug Design

Property/Activity Prediction

Active Learning Methods

Synthetic Accessibility

Molecular Optimization

  • A Zero-Shot Single-point Molecule Optimization Model: Mimicking Medicinal Chemists’ Expertise
    Peng Gao, Jie Zhang, Zhilian Dai, Yangyang Deng, Dan Zhang, Jiawei Fu, Songyou Zhong, Yichao Liu
    Theoretical and Computational Chemistry on ChemRxiv 2024
     The paper presents the Single-point Chemical Language Model (SpCLM), a framework for molecular design that mimics medicinal chemists’ expertise. Using a few hundred generated compounds, SpCLM predicts 60%-80% of active compounds in tests, correlating well with experimental data. This method reduces the need for extensive screening, offering a data-driven approach to optimize drug activity and selectivity.

  • Projecting Molecules into Synthesizable Chemical Spaces
    Shitong Luo, Wenhao Gao, Zuofan Wu, Jian Peng, Connor W. Coley, and Jianzhu Ma
    ArXiv Preprint, 2024
      Interesting new approach to making generated virtual hits more synthesizable. Describes a new postfix notation (A B +) for synthetic transformations. Transformer-based model translates graphs into this postfix notation. The model is capable of synthesis planning, generating similar but more synthesizable analogues, and exploring chemical space along the synthesizability dimension.
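    Postfix notation makes a synthesis tree evaluable with a simple stack; a toy evaluator where the reaction operator is an invented placeholder (string concatenation), not the paper's chemistry:

    ```python
    # Stack evaluation of postfix synthesis strings: "A B +" means combine
    # building blocks A and B with a reaction. The react() default is a
    # made-up placeholder for illustration.
    def eval_postfix(tokens, react=lambda a, b: f"({a}+{b})"):
        stack = []
        for tok in tokens:
            if tok == "+":                    # reaction: pop two reactants
                b, a = stack.pop(), stack.pop()
                stack.append(react(a, b))
            else:                             # building block: push it
                stack.append(tok)
        return stack[-1]

    print(eval_postfix(["A", "B", "+", "C", "+"]))  # -> ((A+B)+C)
    ```

    Because every prefix of a valid postfix string is itself a partial synthesis state, the notation is convenient for left-to-right autoregressive decoding.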

  • Evolutionary Multiobjective Molecule Optimization in an Implicit Chemical Space + GitHub Repo
    Xin Xia, Yiping Liu, Chunhou Zheng, Xingyi Zhang, Qingwen Wu, Xin Gao, Xiangxiang Zeng, and Yansen Su
    J. Chem. Inf. Model. 2024, 64, (13), 5161
      Multiobjective molecule optimization framework (MOMO) is a Pareto-based MPO tool that evolves molecules into better molecules. Genetic/evolutionary algorithm in the latent (implicit) space encoded by a VAE.

Large-scale Virtual Screening

Cheminformatics

Reviews

General

Δ-machine learning

Protein Structure Prediction

Chemistry

Med Chem

My papers