Preparation

Read this section before moving on to the part that works with the dataset.

You need to be comfortable with the terminal. A Linux or macOS machine is recommended, because all commands in this documentation are written for a Linux/macOS terminal by default. PowerShell on Windows can do all of these jobs, but the commands are sometimes entirely different, so keep that difference in mind.

Before starting any task, you need to prepare your local environment:

  • An isolated Python environment manager such as conda. Other solutions such as virtualenv also work. I highly recommend Miniforge, which supports Apple Silicon natively.

  • Python 3.9 or later (3.9 is fine; 3.8 might not work). I suggest installing a fresh Python 3.9 in a new conda environment.

  • Access to a Slurm cluster (e.g. UMass Swarm, UMass Unity) is a plus, but not required.
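
To confirm that the interpreter in your active environment meets the Python requirement above, a small check like the following can help. This is only a sketch: the names `MIN_VERSION` and `python_is_supported` are made up for illustration and are not part of any package used in this documentation.

```python
import sys

# Minimum version stated in this guide (3.9 is fine, 3.8 might not work).
MIN_VERSION = (3, 9)


def python_is_supported(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if python_is_supported():
        print(f"Python {current} meets the 3.9 minimum.")
    else:
        print(f"Python {current} is older than 3.9; create a new environment.")
```

Run it with the `python` from the environment you just created, so that it reports that environment's interpreter rather than the system one.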

Install the following packages

We will need these packages (at specific versions) for our tasks. Feel free to try the latest version! But if you run into an issue, please roll back to the version specified here. To see what has changed between versions, check the package's documentation site.

bloark==2.3.3
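
The line above uses pip's requirement syntax, so installing it would typically look like `pip install bloark==2.3.3` inside your environment (assuming you install with pip). After installing, you may want to verify that the installed version matches the pin. The sketch below does this with the standard-library `importlib.metadata`; the names `PINNED` and `check_pins` are hypothetical helpers written for this example.

```python
from importlib import metadata

# Versions listed in this guide; extend this mapping if you pin more packages.
PINNED = {"bloark": "2.3.3"}


def check_pins(pins=PINNED):
    """Map each package name to (installed_version_or_None, matches_pin)."""
    report = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (installed, installed == wanted)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_pins().items():
        if installed is None:
            print(f"{name}: not installed (expected {PINNED[name]})")
        elif ok:
            print(f"{name}: {installed} matches the pinned version")
        else:
            print(f"{name}: {installed} installed, but {PINNED[name]} is specified")
```

If the check reports a mismatch and you run into problems, reinstalling the pinned version is the rollback described above.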
