Environment Management
This documentation provides a comprehensive guide to managing environments using Conda and Mamba. Provided Jupyter notebook images should have both Conda and Mamba pre-installed. You will learn how to create, list, activate, and manage environments, as well as how to install and maintain packages within them.
Environment Management vs. Package Management
- Environment Management: Environment management allows users to create isolated environments that contain specific versions of languages (such as Python) and packages, ensuring reproducibility and dependency control. Environments are usually created and maintained per project.
- Package Management: Package management is concerned with installation, updates, and removal of packages, ensuring software dependencies are met within a given environment.
Understanding Conda and Mamba
Conda and Mamba are package and environment management tools primarily used in data science and software development.
- Conda is an open-source package and environment manager that helps users install, run, and update packages and dependencies efficiently. Conda comes with the Anaconda distribution of Python.
- Mamba is a faster, drop-in replacement for Conda, designed to improve performance when solving package dependencies and installing software. We recommend using mamba over conda whenever possible.
Mamba vs. Pip
While both Mamba and Pip are used for package management, they have key differences:
- Mamba (or Conda) vs. Pip:
  - Mamba is an environment manager that handles binary packages and dependencies using Conda environments. It ensures package compatibility across different platforms and Python versions.
  - Pip is the default Python package manager, primarily used to install Python packages from the Python Package Index (PyPI). Pip does not natively support environment management, but instead relies on virtual environments (e.g., created with the venv or virtualenv packages) for isolation. Pip does not manage dependencies as efficiently as Mamba.
- Dependency Resolution: Mamba provides more robust dependency resolution compared to Pip, which may run into conflicts when installing packages.
- Speed: Mamba is significantly faster than Pip when solving complex dependencies since it uses a more optimized dependency solver.
Ensure the Base Environment is Activated
To start, activate the base environment:
source activate base
You should see (base) added at the start of your terminal prompt, like so:
(base) jovyan@jupyter-maztec-40sdsu-2eedu:~$
Note: You may modify the base environment, but any changes will be reverted when your notebook server is restarted. If you need a persistent environment, read the section Creating a New Conda Environment.
Searching for Packages on Anaconda.org
Anaconda.org is a repository where users can find and download Conda packages.
The conda-forge channel is a community-driven collection of packages that are regularly updated and maintained.
We recommend using the conda-forge channel whenever possible.
Searching for Packages via Terminal
To search for packages from the terminal, you can use the following command:
mamba search -c conda-forge [package-name]
Example:
mamba search -c conda-forge matplotlib
Using the Conda-Forge Channel
Many open-source packages are hosted on conda-forge, and you may need to specify this channel when installing a package:
mamba install -c conda-forge [package-name]
Example:
mamba install -c conda-forge r-base r-gdistance
You can also add conda-forge as a default channel to avoid specifying it every time:
conda config --add channels conda-forge
conda config --set channel_priority strict
Note: Modifying the config for an environment MUST be done using conda rather than mamba.
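As a quick check on the channel setup above, the following sketch reads the channel list back and prints its first (highest-priority) entry. It assumes the output of conda config --show channels is a YAML-style list with one "- channel" entry per line; top_channel is a hypothetical helper name, not part of conda.

```shell
# top_channel: hypothetical helper that prints the first channel listed
# in the output of `conda config --show channels` (read from stdin).
top_channel() {
    awk '$1 == "-" { print $2; exit }'
}

# Only query conda if it is available on PATH.
if command -v conda >/dev/null 2>&1; then
    conda config --show channels | top_channel
fi
```

If conda-forge was added as described, it would be expected to appear first, since conda config --add places a channel at the top of the list.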
Creating a New Conda Environment
You may create persistent conda environments stored on your persistent storage.
To do so, you must use the --prefix option rather than specifying a name.
If you specify a name, the environment will be created under /opt/conda, which will be reverted upon a notebook server restart.
You should keep in mind that additional conda environments will consume your persistent storage, so make sure to periodically check your storage consumption.
Conda environments must not be created on shared storage as this will negatively impact storage performance.
If you need to share an environment, you can export it to a file and share that file via shared storage.
Create an Empty Environment:
mamba create --prefix [file-path-here]
Example:
mamba create --prefix ~/my-env
Create an Environment with One or More Packages:
mamba create --prefix [file-path-here] [package1] [package2] [package3]
Example:
mamba create --prefix ~/my-env r-base r-gdistance r-lme4
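When the create command is part of a setup script that may be run more than once, it can help to skip creation if the environment is already there. The sketch below is one way to do that, assuming that a directory containing a conda-meta subdirectory is an existing Conda environment; is_conda_env and ensure_env are hypothetical helper names used only for illustration.

```shell
# is_conda_env: hypothetical helper; succeeds if the directory looks like a
# Conda environment (such directories contain a conda-meta subdirectory).
is_conda_env() {
    [ -d "$1/conda-meta" ]
}

# ensure_env: hypothetical helper; creates the environment at the given
# prefix with the given packages unless one already exists there.
ensure_env() {
    env_path="$1"
    shift
    if is_conda_env "$env_path"; then
        echo "Environment already exists at $env_path"
    else
        mamba create -y --prefix "$env_path" "$@"
    fi
}
```

For example, ensure_env ~/my-env r-base r-gdistance r-lme4 would create the environment from the example above on the first call and leave it untouched on later calls.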
Listing Your Environments
To list all your environments:
mamba env list
You should see your new environment along with the base environment:
/home/jovyan/my-env
base * /opt/conda
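Since each additional environment consumes persistent storage, it can be useful to check how much space one occupies. A minimal sketch using the standard du utility follows; env_size is a hypothetical helper name, and ~/my-env is the example path used in this guide.

```shell
# env_size: hypothetical helper that prints the total disk usage of a
# directory in human-readable form (for example, 1.2G).
env_size() {
    du -sh "$1" | cut -f1
}

# Report the size of the example environment if it exists.
if [ -d "$HOME/my-env" ]; then
    env_size "$HOME/my-env"
fi
```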
Activating Your Desired Environment
To activate an environment:
conda activate [name or /path/to/environment]
Example:
conda activate /home/jovyan/my-env
Note: Activating an environment MUST be done using conda rather than mamba.
You should see your environment change in the terminal:
(/home/jovyan/my-env) jovyan@jupyter-maztec-40sdsu-2eedu:~$
Installing or Building Additional Packages
Once the environment is activated, you can install additional packages:
mamba install -y [package1] [package2] [package3]
Example:
mamba install -y r-base r-gdistance
Checking Installed Packages
To check what packages are installed in the activated environment:
mamba list
To check the version of a specific software package:
mamba list [package]
Example:
mamba list r-base
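To pull out just the version of one package in a script, the package listing can be filtered by exact name. This is a sketch that assumes the usual column layout of mamba list (name, version, build, channel, with header lines starting with #); pkg_version is a hypothetical helper name.

```shell
# pkg_version: hypothetical helper that reads `mamba list` output on stdin
# and prints the version column for an exact package-name match.
pkg_version() {
    awk -v pkg="$1" '$1 == pkg { print $2; exit }'
}

# Only query mamba if it is available on PATH.
if command -v mamba >/dev/null 2>&1; then
    mamba list | pkg_version r-base
fi
```

Matching on the exact name avoids also picking up packages whose names merely contain the search term.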
Now, you can run code from within this terminal session, and it will execute inside the activated environment.
Note: If you open a second terminal window, it will revert to the default state with no environment activated:
jovyan@jupyter-maztec-40sdsu-2eedu:~$
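Because a new terminal starts with no environment activated, a script can guard against being run in the wrong one. The sketch below relies on the CONDA_PREFIX variable, which conda sets to the path of the active environment; require_env is a hypothetical helper name.

```shell
# require_env: hypothetical helper that fails with a message unless the
# active environment's path (CONDA_PREFIX) matches the expected path.
require_env() {
    if [ "${CONDA_PREFIX:-}" != "$1" ]; then
        echo "Expected environment $1 but found ${CONDA_PREFIX:-none}" >&2
        return 1
    fi
}

# Example use at the top of a script:
#   require_env /home/jovyan/my-env || exit 1
```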
Reusing an Existing Environment
Activate Base Environment
Ensure the base environment is activated:
source activate base
List Your Environments
mamba env list
Activate Your Desired Environment
Assuming you are in your base environment, you can activate your desired environment:
conda activate [name or /path/to/environment]
Example:
conda activate /home/jovyan/my-env
Note: Activating an environment MUST be done using conda rather than mamba.
You should see your environment change in the terminal:
(/home/jovyan/my-env) jovyan@jupyter-maztec-40sdsu-2eedu:~$
Now, you can run your code in your environment from this terminal.
Exporting an Environment
Exporting an environment allows you to save its configuration in a YAML file. This is useful for sharing environments with others or reproducing the same setup on another machine.
Exporting an Environment to a YAML File
To export the currently activated environment:
mamba env export > environment.yaml
This will create a file named environment.yaml that contains a list of all installed packages and their versions.
Exporting an Environment Without Build Information
If you want a cleaner export without build-specific details, use:
mamba env export --no-builds > environment.yaml
Recreating an Environment from a YAML File
To recreate an environment from an exported YAML file:
mamba env create -f environment.yaml
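This will create a new environment with the same package versions as the exported one. To keep the new environment on persistent storage, add the --prefix option as described in Creating a New Conda Environment; otherwise it may be created under /opt/conda and reverted when your notebook server is restarted.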
Sharing an Environment
You can share the environment.yaml file with collaborators to ensure they use the same dependencies and configurations.
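An exported file typically ends with a prefix: line recording the path of the environment on the machine where it was exported, which is not meaningful to collaborators. A minimal sketch for removing that line before sharing follows; strip_env_prefix and the output name environment.shared.yaml are hypothetical choices for illustration.

```shell
# strip_env_prefix: hypothetical helper that prints an exported environment
# file without its machine-specific "prefix:" line.
strip_env_prefix() {
    grep -v '^prefix:' "$1"
}

# Write a shareable copy if an export exists in the current directory.
if [ -f environment.yaml ]; then
    strip_env_prefix environment.yaml > environment.shared.yaml
fi
```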