Learning data science is like learning how to play a musical instrument — you must develop good habits and get the foundations straight to succeed.
Just as a musician practices scales, arpeggios, and rhythm exercises before attempting concertos, a data scientist needs to ingrain key practices to develop their potential.
Avoiding detrimental habits and cultivating productive ones allows you to shift your mental focus from the mechanics to the artistry of your work.
Developing data science habits like using virtual environments and tracking experiments transforms your workflow from a struggle to a smooth-flowing creative process.
In this article, we’ll explore six everyday bad habits that can secretly destroy your effectiveness as a data scientist and provide tips to help boost your productivity.
A virtual environment is an isolated Python installation, separate from your system environment. It lets you install packages and libraries for a specific project without affecting your system Python setup. Neglecting virtual environments can lead to dependency hell, where different projects need conflicting versions of the same package.
For example, in one of my first data science projects, I was building a machine learning model for image classification. I installed TensorFlow 2.0 globally to get started. A few weeks later, a colleague gave me some code that required TensorFlow 1.x. Installing the older version caused all kinds of conflicts with my first project’s dependencies! I spent hours debugging before realizing I should have used virtual environments from the start. I couldn’t get the inherited code working until I set up a virtual environment that matched my colleague’s original setup.
A virtual environment neatly sidesteps this issue by giving each project its own sandboxed space. Each environment has a dedicated Python interpreter, its own pip, and its own set of installed libraries.
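To make this concrete, here is a minimal sketch that uses Python's standard-library `venv` module to create a per-project environment. The function names `create_project_env` and `in_virtual_env`, and the `.venv` directory name, are my own choices for illustration; treat this as one reasonable way to do it rather than the only one.

```python
import os
import sys
import venv
from pathlib import Path


def create_project_env(env_dir, with_pip=True):
    """Create a virtual environment in env_dir and return its interpreter path."""
    env_dir = Path(env_dir)
    # venv.create builds the environment; with_pip=True also bootstraps pip into it.
    venv.create(env_dir, with_pip=with_pip)
    # The interpreter lives in Scripts\ on Windows and in bin/ elsewhere.
    if os.name == "nt":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def in_virtual_env():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment,
    # while sys.base_prefix points at the base installation.
    return sys.prefix != sys.base_prefix


# Example usage (hypothetical project layout):
# python_path = create_project_env(".venv")
# print(python_path)
```

From a terminal, the equivalent is `python -m venv .venv`, followed by activating the environment and installing the project's packages with that environment's own pip.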