Resources | nguyen le

Papers

1000 Layer Networks for Self-Supervised RL: Scaling Depth Can Enable New Goal-Reaching Capabilities [reinforcement-learning, deep-learning, neurips-2025, oral]
The Value Equivalence Principle for Model-Based Reinforcement Learning [reinforcement-learning, world-model, neurips-2020]
Learning Awareness Models [reinforcement-learning, world-model, iclr-2018]
Embedded Agency [reinforcement-learning, world-model, arxiv]
Maximum Likelihood Reinforcement Learning [reinforcement-learning, arxiv, theory]
Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free [llm, attention, neurips-2025, oral]
Does RL Really Incentivize Reasoning Capacity in LLMs? [llm, reinforcement-learning, reasoning, neurips-2025, oral]
Learning Dynamics of LLM Finetuning [llm, training, theory, iclr-2025, oral]
Code Foundation Models to Agents: Survey [llm, code, agents, survey, arxiv]
Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models [llm, interpretability, survey, arxiv]
Bridging the Attention Gap: Complete Replacement Models for Complete Circuit Tracing [llm, interpretability, arxiv]
Pretraining Large Language Models with NVFP4 [llm, training, arxiv]
Why Diffusion Models Don't Memorize: The Role of Implicit Dynamical Regularization in Training [diffusion, theory, neurips-2025, oral]
Foundations of Diffusion Models [diffusion, theory, survey, arxiv]
Superposition Yields Robust Neural Scaling [interpretability, scaling, theory, neurips-2025, oral]
Understanding Integer Overflow in C++ [systems, cpp, compiler]
What Every Computer Scientist Should Know About Floating-Point Arithmetic [systems, numerical-computing, fundamentals]
Demystifying NCCL: An In-depth Analysis of GPU Communication Protocols and Algorithms [systems, gpu, arxiv]

Books

Optimization for Machine Learning (Princeton) [mathematics, optimization, machine-learning]
Introduction to Online Convex Optimization [mathematics, optimization, theory]
Introduction to Online Control [mathematics, control-theory, optimization]
Optimization for Machine Learning (Lecture Notes) [machine-learning, optimization]
Learning Theory from First Principles: Francis Bach [deep-learning, theory]
Theory of Deep Learning: Sanjeev Arora [deep-learning, theory]
An Infinite Descent into Pure Mathematics [mathematics, pure-math, foundations]
Introduction to Homotopy Type Theory [mathematics, type-theory, logic]
Probabilities (Jean-Yves Ouvrard) [mathematics, probability]
Quantum Theory, Groups and Representations: An Introduction [mathematics, quantum-mechanics]
STAT 201A - Introduction to Probability at an advanced level (Berkeley) [mathematics, probability]
Data Stream Algorithms (Lecture Notes) [algorithms]
The Little Book of Semaphores [concurrent-programming, algorithms]
Notes on Theory of Distributed Systems [distributed-systems]
Is Parallel Programming Hard, And, If So, What Can You Do About It? [parallel-programming, algorithms]
Algorithms for Modern Hardware [algorithms, systems, hpc, performance]
Quant Trading Guide [quantitative-finance, trading, practical]
QUANT BIBLE (MIT Sloan Business) [quantitative-finance, theory, comprehensive]

Tutorials

Smol Training Playbook [llm, training, practical]
Ultrascale Playbook [llm, training, scaling]
Tiny LLM Serving in a week [llm, inference]
Machine Learning Compilation [deep-learning, compiler]
How to Scale Your Model [llm, training, scaling]
Understand Tinygrad [gpu, compiler, deep-learning]
Modal GPU Glossary [gpu]
torch.compile Manual [pytorch]
CPython Internals [python, internals, systems]
Write Your Own Virtual Machine [systems, vm, compiler, hands-on]
Dive into Systems [systems, fundamentals, comprehensive]
Software Optimization Resources [systems, performance]
How to Vulkan in 2026 [graphics, vulkan, gpu]
Vulkan Guide [graphics, vulkan, gpu]
Vulkan Guide (Khronos) [graphics, vulkan, gpu]
GPU Optimization for Game Dev [graphics, gpu]

Blog Posts

What Would Non-Linear Features Actually Look Like? [interpretability, theory, llm]
On neural scaling and the quanta hypothesis [interpretability, theory, llm]
Dummy Guide to LLM Sampling [llm, inference, practical]
Diffusion Language Models Deep Dive [llm, diffusion]
Linear Attention: Does Attention Have a Softmax? [llm, attention]
Understanding CUDA Compiler & PTX With a Top-K Kernel [gpu, cuda, compiler, kernels]
Deep Dive into Triton Internals (Part 1) [gpu, triton, compiler]
Inside NVIDIA GPUs: Anatomy of high performance matmul kernels [gpu, cuda, performance, kernels]
Triton Flash Attention Kernel Walkthrough [gpu, triton, attention, llm]
Triton Linear Layout Concept [gpu, triton, memory]
Flash Attention for 5090 in CUDA C++ [gpu, cuda, attention, performance]
Inside Nvidia GPU: Discussing Blackwell's Limitations and Predicting Rubin's Microarchitecture [gpu, hardware]
Worklog: Optimising GEMM on NVIDIA H100 for cuBLAS-like Performance (WIP) [gpu, cuda, performance, kernels]
CUTLASS CUTE tutorial [gpu, cuda, cutlass]
NCCL from scratch: Writing my own communications library [gpu, distributed-systems]
Field Notes on Scaling MoE Expert Parallelism with DeepEP [llm, moe, scaling, distributed-systems]
NVFP4 Pretraining: From Theory to Implementation (Part 1) [llm, quantization, training, gpu]
Explorations of RDMA in LLM Systems [llm, systems, networking, distributed-systems]
Dissecting FlashInfer - A Systems Perspective on High-Performance LLM Inference [llm, inference, systems, performance]
the bug that taught me more about PyTorch than years of using it [pytorch, deep-learning, debugging]
PagedAttention from first principles [llm, inference, vllm]
Understanding LLM Inference Engines: Inside Nano-vLLM [llm, inference, vllm]
Inside vLLM: Anatomy of a High-Throughput LLM Inference System (accompanying notebook: https://modal.com/notebooks/modal-labs/charles-dev/nb-x2wXrLH7aqi7HGVQ8Fosh2) [llm, inference, vllm]
Distributed GPT [llm, training, distributed-systems]
Defeating Nondeterminism in LLM Inference [llm, inference]
Demystifying Reasoning Models [llm, reasoning]
Mixture-of-Experts (MoE) LLMs [llm, moe]
nanoMoE: Mixture-of-Experts (MoE) LLMs from Scratch in PyTorch [llm, moe]
No Graphics API [graphics, systems, architecture]
Allocators from C to Zig [systems]
Beginner's Guide to Linkers [systems]

Courses

INF6953PE: Deep Learning Dynamics (Montreal) [deep-learning, theory]
CSC2541: Topics in Machine Learning: Neural Net Training Dynamics (UofT) [deep-learning, theory]
Advanced Topics of Deep Generative Models [diffusion, generative-models, deep-learning]
Stanford CS 228 - Probabilistic Graphical Models [deep-learning, theory, probabilistic, stanford]
Stanford CS 229M - Machine Learning Theory [deep-learning, theory, stanford]
Stanford CS336: Language Modelling From Scratch [llm, stanford]
CMU 15-440/640 Distributed Systems [distributed-systems, systems, cmu]
MIT 6.5840: Distributed Systems [distributed-systems, systems, mit]
Harvard CS121: Introduction to TCS [theory, computer-science, harvard]
Harvard CS127: Cryptography [cryptography, theory, harvard]
MIT 6.8210: Underactuated Robotics [robotics, mit]

Videos

ML Performance Reading Group (Eleuther AI) [llm, cuda]
Reinforcement Learning from the book [reinforcement-learning]

Repositories

simple-llm: Focuses on training and implementing an MoE model (GPT-OSS) [llm, educational, implementation]
picotron [llm, educational, training, implementation]
slime [llm, post-training, implementation]
mini-sglang [llm, educational, inference, implementation]
nano-vllm [llm, educational, inference, implementation]
llm.q [llm, quantization, implementation]
Course on Flash Attention in Triton [llm, attention, triton]
CUTLASS tutorial [cuda, gpu, cutlass]
MLIR Tutorial [compiler, mlir, tutorial]
Prediction Market Analysis: A cool repo containing a huge dataset for market analysis [quantitative-finance, data]
Data Structures in Practice [algorithms, data-structures, educational]

Theses

Decomposing Deep Neural Network Minds into Parts: Eric J. Michaud [thesis, interpretability]
Generalization Through the Lens of Learning Dynamics: Clare Lyle [thesis, machine-learning]
VARIATIONAL ALGORITHMS FOR APPROXIMATE BAYESIAN INFERENCE: Matthew J. Beal [thesis, bayesian]
Learning to Learn with Gradients: Chelsea B. Finn [thesis, reinforcement-learning]
Reinforcement Learning and Simulation-Based Search in Computer Go: David Silver [thesis, reinforcement-learning]
APPRENTICESHIP LEARNING AND REINFORCEMENT LEARNING WITH APPLICATION TO ROBOTIC CONTROL: Pieter Abbeel [thesis, reinforcement-learning]
Optimizing Expectations: From Deep Reinforcement Learning to Stochastic Computation Graphs: John Schulman [thesis, reinforcement-learning]
Deep Multi-Agent Reinforcement Learning: Jakob N. Foerster [thesis, reinforcement-learning]
On the Sample Complexity of Reinforcement Learning: Sham Machandranath Kakade [thesis, reinforcement-learning]
Applied Optimal Control for Dynamically Stable Legged Locomotion: Russell L. Tedrake [thesis, robotics]

Robotics

BFM-Zero: A Promptable Behavioral Foundation Model for Humanoid Control Using Unsupervised Reinforcement Learning [robotics, reinforcement-learning, humanoid]
VIRAL: Visual Sim-to-Real at Scale for Humanoid Loco-Manipulation [robotics, humanoid, simulation]
BAAI Thor Towards Human-level WhOle-body Reactions under Intense Contact-Rich Environments [robotics, humanoid, control]
GentleHumanoid: Learning Upper-body Compliance for Contact-rich Human and Object Interaction [robotics, humanoid, manipulation]
CHIP: Learning Adaptive Compliance for Humanoid Control through Hindsight Perturbation [robotics, humanoid]
RoboStriker: Hierarchical Decision-Making for Autonomous Humanoid Boxing [robotics, humanoid, motion-planning]
HumanX: Toward Agile and Generalizable Humanoid Interaction Skills from Human Videos [robotics, humanoid, motion-planning]
RPL: Learning Robust Humanoid Perceptive Locomotion on Challenging Terrains [robotics, humanoid, perception]
APEX: Learning Adaptive High-Platform Traversal for Humanoid Robots [robotics, humanoid]
ExtremeControl: Low-Latency Humanoid Teleoperation with Direct Extremity Control [robotics, humanoid, teleoperation]
InterPrior: Scaling Generative Control for Physics-Based Human-Object Interactions [robotics, humanoid, physics-simulation]
Humanoid Locomotion as Next Token Prediction [robotics, humanoid, llm]
Uncertainty-Aware Robotic World Model Makes Offline Model-Based Reinforcement Learning Work on Real Robots [robotics, world-model, reinforcement-learning]