AI/ML

So you want to be a data scientist… If this is your first time here and you’ve come to learn a new skill, sit down and buckle up: this should be a long journey. Don’t go it alone; watch these great talks on learning.

General Coding

These are mostly Python (or language-agnostic) resources. Resources specific to non-Python languages are listed below.

Quick References

Big-O Cheat Sheet

Resources

The Zen of Python and The Rules of Extreme Programming
PEP 8 - Style Guide for Python Code
RealPython Tutorials
Calmcode.io - tools for Python
Coding Interview University - A long course on programming basics (data structures, sorting algorithms, graphs, recursion & dynamic programming, sets, etc…)

Learn Python Courses

PyNative
RealPython’s Object Oriented Programming in Python 3 + video
Python Programming Exercises, Gently Explained
The Big Book of Small Python Projects
  When you’re done, try to code the problems in this book yourself.
Beyond the Basic Stuff with Python
Automate the Boring Stuff with Python

Coding Practice

The almighty LeetCode
Coding Bat (Java + Python)
Project Euler

YouTube Channels & Series

CS Dojo - shorts for general programming tips

Non-Python resources

Git Documentation
Java Hypertext from Cornell’s CS 2110: OOP and Data Structures with David Gries
MIT’s 18.S191/6.S083/22.S092: Introduction to Computational Thinking (Julia)
Cornell’s ORIE 6125: Computational Methods in Operations Research (Shell+Julia) with Vasileios Charisopoulos

AI and Machine Learning (AI/ML)

Tools

Data Version Control (DVC)
Keras Core (now wrapped into Keras 3)
Debugging PyTorch
Google Colab
Paperspace Gradient

Online References

ML Interview Bible by Chip Huyen
AI Summer

AI/ML Courses

If it’s been a while since your last math course, brush up on the math basics before going any further.
Andrew Ng’s Machine Learning Series + Coursera
Cornell’s CS 4/5780: Intro to Machine Learning with Anil Damle and Kilian Weinberger + Online lecture videos

Software Tutorials

GitHub repo for Cornell’s ChemE 6880: Industrial Big Data Analytics & Machine Learning with Fengqi You

Deep Learning Courses

FastAI’s Practical Deep Learning for Coders + textbook and resources
MIT’s Introduction to Deep Learning
Yann LeCun’s Deep Learning Course at NYU CDS (PyTorch)
Cornell’s CS 4787/5777: Principles of Large Scale Machine Learning with Christopher De Sa
Stanford’s CS 230: Deep Learning

Software Tutorials

Google’s ML Crash Course (TensorFlow) + TensorFlow Tutorials
PyTorch Beginner Course + YouTube Series
GitHub for Cornell’s ChemE 6888: Deep Learning with Fengqi You

Courses in Deep Learning Applications

Stanford’s CS 231n: Deep Learning for Computer Vision
Stanford’s CS 236: Deep Generative Models with Stefano Ermon
CMU’s CS 11-747: Neural Networks for NLP

Reinforcement Learning Courses

Cornell CS 4/5789: Introduction to Reinforcement Learning + Online lecture videos
Spinning Up in Deep RL from OpenAI + GitHub Repo

AI/ML Practice

Kaggle

Blog Posts

Oren Etzioni’s “How to get up to speed on Machine Learning and AI”
Faizan Shaikh’s “Simple Beginner’s Guide to Reinforcement Learning & Its Implementation”
Harsh Sikka’s “The Blunt Guide to Mathematically Rigorous Machine Learning”
Harsh Sikka’s “The Math Required for Machine Learning”

YouTube Channels & Series

3Blue1Brown - shorts for maths
  Especially the series on Neural Networks
CS 4/5780 online lecture videos by Kilian Weinberger

Textbooks

Alice’s Adventures in a Differentiable Wonderland by Simone Scardapane
Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville
Machine Learning: A Probabilistic Perspective by Kevin Patrick Murphy
The Elements of Statistical Learning by Trevor Hastie, Robert Tibshirani, and Jerome Friedman
Grokking Deep Learning by Andrew W. Trask

Cheminformatics

Quick Resources

Daylight Theory Manual (pdf)

Tools

RDKit + Cheat Sheet
RDChiral - Wrapper for RDKit’s RunReactants to improve stereochemistry handling (Paper)
Open Babel - Chemical Format Conversion
PyTorch Geometric - easy GNNs for PyTorch
Generative Toolkit for Scientific Discovery (GT4SD) - IBM Zurich project, lots of built-in training workflows
MolFlux - Molecular modeling toolkit from Exs
PhysicsML - Toolkit for physics-based modelling from Exs
Chainer Chemistry GitHub - A deep learning framework for Biology and Chemistry + Docs (Chainer is out of support as of Dec-2019)

Blogs

Practical Cheminformatics by Pat Walters
Cheminformania by Esben Jannik Bjerrum

Online References

Scientific Computing for Chemists with Python by Charles J. Weiss
Deep Learning for Molecules & Materials by Andrew White

Math Basics

Quick References

No Bullshit Guide to Linear Algebra in 4 Pages
Linear Algebra Review from Stanford’s CS 229
Probability and Statistics Review from Stanford’s CS 229 + short version
Calc III Study Guide

Online Math Courses

Gilbert Strang’s Linear Algebra - OCW
Denis Auroux’s Multivariable Calculus - OCW
Tom Leighton’s Discrete Mathematics for Computer Science - OCW