Beyond the hype around Machine Learning and Artificial Intelligence that we have witnessed over the past decade lies something truly remarkable: the ability of machines to think, or rather our ability to devise techniques that make them ‘think’. To make them think fast and at scale, serving us in ways that human insight alone might not be able to reach or comprehend.
Working in this field has been deemed the “hottest job of the 21st century”. I am sure a lot of enthusiasts out there would want to work for one of the big FAANG companies (Facebook, Amazon, Apple, Netflix, Google), each of them a major player and pioneer in its domain, wouldn’t they? And more importantly… who has never thought ‘If only I had that MSc in Artificial Intelligence from -insert university’s name here-… that would pave the way for me to land a job in one of those firms’?
Well… I, for one, have definitely thought about it. It is no secret that some of those companies, if not all, no longer consider an MSc a major requirement for adding you to their roster; however, it is no secret either that many aspiring ML/AI engineers would happily acquire an MSc to boost their confidence when applying for a job like this.
Here comes the catch, however… not everyone has the resources, financial or otherwise, to chase that goal. Being unable to commit to a formal university course due to time and funding limitations, I thought I should give it a try by ‘copying’ the curriculum of a very prestigious university programme on AI. Being a resident of the UK, what better candidate than UCL’s (University College London) Data Science and Machine Learning MSc course? UCL is one of the most renowned and prestigious institutions (see rankings) and, more importantly, this specific programme seems, at least to my eyes, to take a deep and broad approach to a plethora of subfields pertinent to classical and modern AI.
So, why not try and follow their curriculum then? Well, I am in no way affiliated with UCL, hence I have no access to their exact class material, so I thought I would build up my own curriculum with theirs as a guide. This in itself is a disclaimer that stresses the fact that unavoidably the curriculum I’ll be presenting is ONLY INSPIRED BY UCL and by no means strictly equivalent.
This will be part of a post series revolving around the huge domain of Artificial Intelligence. We will attempt to build a complete Master’s level curriculum that would surely boost the confidence of an AI enthusiast or prospective engineer/scientist. Roughly speaking, part 1 will be all about the core ML, part 2 about Deep Learning, part 3 about Reinforcement Learning, part 4 Miscellaneous, part 5 Robotics and modern applications.
We will try to provide a time schedule and cost of all the recommended resources as well as a rough outline for various scenarios.
As an extra disclaimer, which will probably act as a yardstick for what I feel is an adequate background: I hold an MEng in Electrical Engineering and Computer Science, with a fairly strong background in mathematics and an intermediate grasp of programming and computer science. That translates into a good grasp of Linear Algebra, Calculus and Probability, and an intermediate exposure to Python and C++. After all, who can claim to be an expert? What defines perfection? It’s all a constant struggle…
First things first. I will assume a beginner/intermediate level of knowledge in Python. If you are an absolute beginner and in need of an introduction to the language, I would definitely recommend freeCodeCamp (an excellent endeavour for which we should all be grateful), whose videos can be found online.
Without further ado…
Goals:
Starting with the basics of ML, and with a good understanding of Python under our belt (OOP and so on), we can move to Applied Machine Learning. At this stage, a deep understanding of the mechanics of sophisticated algorithms is not required at all. It is sufficient to understand, at a high level, how 5 to 10 major algorithms work and how to tune their hyperparameters. Knowing how to set up a simple ML pipeline for your personal projects will come in very handy, as will understanding the basic but foundational methods of data pre-processing.
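To give a flavour of what such a simple pipeline can look like, here is a minimal sketch using scikit-learn. The dataset (scikit-learn's built-in breast cancer data), the choice of logistic regression, the grid of `C` values and the helper name `fit_simple_pipeline` are all my own assumptions for illustration; none of them comes from the courses listed below.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def fit_simple_pipeline(random_state=0):
    """Pre-process, fit and tune a classifier (illustrative helper, not from any course)."""
    X, y = load_breast_cancer(return_X_y=True)

    # Hold out a test set so the final score is measured on unseen data.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=random_state
    )

    # Pre-processing and model live in one object, so the scaler is
    # re-fitted on the training part of every cross-validation fold.
    pipeline = Pipeline([
        ("scale", StandardScaler()),
        ("model", LogisticRegression(max_iter=1000)),
    ])

    # Hyperparameter tuning: try a few regularisation strengths.
    # (These candidate values are an arbitrary assumption.)
    param_grid = {"model__C": [0.01, 0.1, 1.0, 10.0]}
    search = GridSearchCV(pipeline, param_grid, cv=5)
    search.fit(X_train, y_train)

    return search, search.score(X_test, y_test)


search, test_accuracy = fit_simple_pipeline()
print("Best hyperparameters:", search.best_params_)
print("Test accuracy:", round(test_accuracy, 3))
```

Wrapping the scaler and the model in a single `Pipeline` is a design choice worth picking up early: it keeps the pre-processing from 'seeing' the validation data during cross-validation, which is one of the most common beginner mistakes.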
Resources (increasing difficulty):
1. ML Zoomcamp: a very well-structured and broad online bootcamp-style course, available for free, developed by Alexey Grigorev and based on his book Machine Learning Bookcamp (mlbookcamp.com).
2. Mike Gelbart’s lectures for various UBC (University of British Columbia) courses. The undergraduate courses are exactly what you need to get started in Applied ML, while their graduate counterparts (the UBC Master of Data Science courses) will take you to the next level. A lot of the materials are available on GitHub, while most of the lectures are available on YouTube.
3. Applied ML by Andreas C. Müller: a master’s level course from Columbia University, taught by one of scikit-learn’s core developers. A very interesting course with a lot of insights, and a tad more advanced than the previous entry in the list.
4. Sebastian Raschka’s lectures contain a bit more mathematical theory regarding the inner workings of ML algorithms but are definitely a joy to watch. Still at the undergraduate level, they can be a great help in getting you to the starting MSc level. Sebastian has made both the lectures and his notes available online, and we couldn’t be more grateful for that!
Bonus: Andrew Ng’s Machine Learning specialization on Coursera (this will be analysed in a future post). This couldn’t be absent from the list. The course has received a full make-over recently and has become even more accessible to beginners. You will learn next to one of the ‘celebrities’ of the ML community and one of the first that started making ML education accessible to the masses back in 2012. If you feel you’d like it more rigorous then the next one is for you.
5. Stanford’s CS229. The original course that Andrew Ng taught (this instance is taught by another instructor) including detailed notes and lectures. With the mathematical rigor you’d expect from a Stanford graduate course this makes it a very strong candidate for bringing you to that coveted MSc level.
Bits and pieces of the above would be in close alignment to UCL’s ‘Introduction to Machine Learning (COMP0088)’. Of course, 100% overlap is not to be expected and perhaps at this level the mathematical rigor with which the concepts are presented is not sufficient to be considered a Master’s level approach. However, it is a major first step into Applied ML and surely, the resources presented above set a strong foundation to delve further into the theory at a later stage.
Concepts that are covered in the resources above include:
########## Introduction to Supervised Learning
– Linear models for regression and classification: least squares, logistic regression.
– Concepts of overfitting and regularisation, L1 and L2 regularisation.
– Boosting, Decision Trees, Random Forests.
– Support Vector Machines.
– Deep Learning: Neural Networks for regression and classification, Convolutional Neural Networks, Recurrent Neural Networks.
########## Introduction to Unsupervised Learning
– K-means, Principal Components Analysis, Embeddings
– Deep Autoencoders, Generative Adversarial Networks.
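To make one of the concepts above a bit more concrete, here is a small sketch of L2 (ridge) regularisation applied to least squares, written with NumPy only. It uses the standard closed-form solution w = (XᵀX + λI)⁻¹ Xᵀy; the synthetic data, the absence of an intercept term and the function name `ridge_fit` are assumptions of mine for illustration.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form L2-regularised least squares (no intercept, for simplicity)."""
    n_features = X.shape[1]
    # Adding lam * I to X^T X is what penalises large weights.
    A = X.T @ X + lam * np.eye(n_features)
    return np.linalg.solve(A, X.T @ y)


# Synthetic regression problem (made-up weights, small Gaussian noise).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
true_w = np.array([2.0, -1.0, 0.0, 0.5, 3.0])
y = X @ true_w + 0.1 * rng.normal(size=50)

# lam = 0 is ordinary least squares; larger lam shrinks the weights.
lambdas = [0.0, 1.0, 10.0, 100.0]
weight_norms = [float(np.linalg.norm(ridge_fit(X, y, lam))) for lam in lambdas]

for lam, norm in zip(lambdas, weight_norms):
    print(f"lambda={lam:>6}: ||w|| = {norm:.3f}")
```

The norm of the fitted weights shrinks as λ grows, which is exactly the mechanism regularisation uses to fight overfitting. L1 regularisation has no such closed form, which is one reason the courses above introduce iterative solvers alongside it.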
When we get a good grasp of how to apply various ML algorithms and how to set-up our problems and data, we can dive a little bit more into the mechanics of the algorithms. Before starting reading research papers (we’ll get there) you can have a look at some free books that introduce a more mathematical approach to various algorithms:
1. Pattern Recognition and Machine Learning by Christopher Bishop: an ‘oldie’ but a valuable addition to any aspiring ML engineer’s library. Definitely one of the best expositions of the subject out there, heavy in mathematics but… purely awesome!
2. An Introduction to Statistical Learning (statlearning.com): a more easily digestible text than the previous item in the list. Now in its 2nd edition, it has also become a classic. It provides a smoother introduction and perhaps less mathematical rigor than Bishop’s text, but remains enjoyable and explains some of the most important ML algorithms with a lot of clarity.
3. Probabilistic ML by Kevin Murphy: if you feel the above items are light on mathematics and want to see a Bayesian approach to ML, then Kevin Murphy’s book might be exactly what you were searching for. A note of caution: it requires a pretty good grasp of mathematics at the graduate level (something that we’ll come back to in a future post).
But, because you don’t learn how to ride a bike just by reading a book or watching your friends doing it…
Datacamp and Kaggle:
These are the go-to places to practise your newly acquired skills on toy datasets or even (why not?) in competitions. Additionally, Kaggle provides free mini-courses that give an introduction and the necessary tooling to dive into some practical projects.
Personal Projects
Just get out there and start analysing datasets, foraging for insights, or building projects on these datasets that a layperson could use. The projects could be simple models that predict a person’s susceptibility to a disease or the probability that they would like a particular movie… Or, if you feel adventurous, build some models that predict stock prices (although that is believed to be next to impossible!) or tackle other business-related problems.
At this stage, it is not about building the next high-performing model that will achieve 99.99% accuracy and save a company billions… the idea is rather to start developing projects according to your interests. This will prove invaluable in your next career interview! Just make it happen: build an end-to-end system that performs well, even if the approach or code would never find its way to production.
A plethora of interesting datasets can be found both on Kaggle and in the UCI Machine Learning Repository. The additional advantage of Kaggle is that you’ll often be able to find notebooks with other people’s solutions or approaches to specific datasets. This can be a powerful tool in your development as an AI/ML Engineer.
Whether you are after applied knowledge or a more mathematical approach (or even philosophical!) to Machine Learning, the resources presented here will surely keep you busy and entertained!
Get out there and enjoy what ML has to offer!
Stay tuned for the next parts of the Autodidact’s curriculum…
I would be happy to know your thoughts below and whether you’ve used these resources yourself!