
Data Engineering & MLOps

This section is for data scientists who want to get familiar with data engineering and MLOps. The resources are introductory and do not cover either field in full.

For an extensive list of data engineering resources, refer to Reddit's Data Engineering Community.

For an extensive list of MLOps resources, refer to Awesome MLOps.

Books

Designing Machine Learning Systems | Learn a holistic approach to designing ML systems | Chip Huyen

Programming

Title | Description & Context | Source
What every programmer absolutely, positively needs to know about encodings and character sets to work with text | Everything you need to learn about encodings |
Python Practice Problems for Beginner Coders | Practice problems for people who find LeetCode/HackerRank too complicated | Berkeley
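As a quick taste of why encodings matter, here is a minimal sketch (not taken from the linked article; the sample string and variable names are illustrative assumptions) showing how the same text produces different byte counts per encoding, and how decoding with the wrong one garbles it:

```python
# Illustrative sample text containing two non-ASCII characters.
text = "naïve café"

# The same characters become different byte sequences per encoding.
utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("latin-1")

print(len(text))          # number of characters
print(len(utf8_bytes))    # "ï" and "é" take two bytes each in UTF-8
print(len(latin1_bytes))  # Latin-1 uses one byte per character

# Decoding UTF-8 bytes as Latin-1 does not raise an error; it silently
# yields garbled text (often called mojibake).
mojibake = utf8_bytes.decode("latin-1")
print(mojibake)

# Decoding with the encoding that was used to write the bytes
# recovers the original text.
assert utf8_bytes.decode("utf-8") == text
```

The practical takeaway is to always state the encoding explicitly when reading or writing text files, for example `open(path, encoding="utf-8")`.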

Git

Title | Description & Context | Source
Introduction to Git | Good beginner blog post on the basics of Git | Made with ML
How to Contribute to an Open Source Project on GitHub | Videos detailing the process of contributing to an open source project on GitHub | Egg Head
git exercises: navigate a repository | Exercises to help you learn Git | Julia Evans

Jupyter Notebooks

Title | Description & Context | Source
Jupyter Notebook Tips and Improvements | Notebook extensions and hotkeys to work more efficiently | Beginner

Packaging

Title | Description & Context | Source
Packaging a Python Codebase | Guide on how to package your code with Python | Made with ML

Testing

Title | Description & Context | Source
Testing in Python | Learn how to write unit & integration tests for your code | Real Python
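For readers who have never written a test, the following is a minimal sketch of what unit tests look like in the pytest style (plain functions whose names start with `test_` and that use `assert`). The helpers `normalize_column_name` and `normalize_header` are hypothetical examples invented for illustration; they do not come from the linked guide.

```python
def normalize_column_name(name):
    """Lowercase a column name and join its words with underscores."""
    return "_".join(name.strip().lower().split())


def normalize_header(columns):
    """Apply normalize_column_name to every column, preserving order."""
    return [normalize_column_name(column) for column in columns]


# Unit tests: each one checks a single behaviour of a single function.
def test_normalize_column_name_strips_and_lowercases():
    assert normalize_column_name("  Order ID ") == "order_id"


def test_normalize_column_name_collapses_whitespace():
    assert normalize_column_name("Unit   Price") == "unit_price"


# A slightly broader test: checks that the two helpers work together.
def test_normalize_header_keeps_column_order():
    assert normalize_header(["Order ID", "Unit Price"]) == ["order_id", "unit_price"]
```

Assuming pytest is installed and the code is saved in a file whose name starts with `test_`, running `pytest` in that directory should discover and run all three tests.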

Docker

Title | Description & Context | Source
Docker for data scientists — Part 1 | Blog series about creating a first project with Docker | Towards Data Science

SQL

Title | Description & Context | Source
How To Create a SQL Practice Database with Python | Practice creating & querying a SQL database with fake data | Towards Data Science
Don’t Make These 5 Mistakes with SQL | Tips and examples on common SQL features | Intermediate
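To start practicing right away, a throwaway database can be built with nothing but Python's standard library. The sketch below is an assumption-laden stand-in, not the approach from the article above: it uses an in-memory SQLite database and the `random` module for fake data, and the table names, column names, and the `build_practice_db` helper are all hypothetical.

```python
import random
import sqlite3


def build_practice_db(n_customers=5, n_orders=20, seed=0):
    """Create an in-memory SQLite database filled with fake customers and orders."""
    rng = random.Random(seed)  # fixed seed so the fake data is reproducible
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, "
        "customer_id INTEGER NOT NULL REFERENCES customers(id), "
        "amount REAL NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO customers (id, name) VALUES (?, ?)",
        [(i, f"customer_{i}") for i in range(1, n_customers + 1)],
    )
    conn.executemany(
        "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
        [
            (rng.randint(1, n_customers), round(rng.uniform(5, 500), 2))
            for _ in range(n_orders)
        ],
    )
    conn.commit()
    return conn


conn = build_practice_db()

# Practice query: order count and total spend per customer.
# LEFT JOIN keeps customers that happen to have no orders.
rows = conn.execute(
    """
    SELECT c.name,
           COUNT(o.id) AS n_orders,
           ROUND(COALESCE(SUM(o.amount), 0), 2) AS total
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY total DESC
    """
).fetchall()

for name, n_orders, total in rows:
    print(name, n_orders, total)
```

Because the database lives in memory, it disappears when the script exits; passing a file path to `sqlite3.connect` instead of `":memory:"` would persist it.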