ML Batch Jobs
Running ML Code in a Batch Job
Batch jobs let you automate model training without human intervention. They are also useful when you need to run a large number of simulations on the clusters. The example below runs a simple tensor_hello.py script in a batch job.
Using a Custom Installation
Save the following code as tensor_hello.sub in the same directory as tensor_hello.py.
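As a minimal sketch of what such a submission file might contain, the block below writes tensor_hello.sub with a here-document; everything between the EOF markers ends up in the file. The job name, node and task counts, and time limit are assumptions chosen for illustration, and the commented-out environment name my_tf_env is hypothetical. Your cluster may also require an account or partition option, so check the local documentation before using it as is.

```shell
# Write a minimal Slurm submission script next to tensor_hello.py.
# All #SBATCH values below are illustrative assumptions, not site defaults.
cat > tensor_hello.sub <<'EOF'
#!/bin/bash
# FILENAME: tensor_hello.sub
#SBATCH --job-name=tensor_hello
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --time=00:05:00

# Activate your custom installation here before Python is called.
# The environment name is hypothetical; replace it with your own.
# conda activate my_tf_env

python tensor_hello.py
EOF
```

Slurm reads the #SBATCH lines as job options, while the shell treats them as ordinary comments, so the same file works both as a job description and as a plain bash script.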
Running a Job
Now you can submit the batch job using the sbatch command.
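The submission step might look like the sketch below, run from the directory that holds tensor_hello.sub. The check for sbatch is an addition for illustration only, so that the snippet fails gently on a machine without Slurm; on a cluster login node you would normally type the sbatch line directly.

```shell
job_script=tensor_hello.sub

if command -v sbatch >/dev/null 2>&1; then
    # On success sbatch prints a line like "Submitted batch job <jobid>".
    sbatch "$job_script"
    # List your queued and running jobs to confirm the submission.
    squeue -u "$USER"
else
    echo "sbatch not found: run this on a cluster login node."
fi
```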
Once the job finishes, you will find an output file named slurm-xxxxx.out, where xxxxx is the job ID, in the directory you submitted the job from.
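To inspect the result without typing the job ID, a small helper along these lines can print the most recently modified output file. The function name show_latest_slurm_output is hypothetical, and the sketch assumes the default slurm-<jobid>.out naming in the current directory.

```shell
# Print the newest slurm-*.out file in the current directory, if any.
show_latest_slurm_output() {
    latest_out=""
    for f in slurm-*.out; do
        # Skip the unexpanded pattern when no file matches.
        [ -e "$f" ] || continue
        if [ -z "$latest_out" ] || [ "$f" -nt "$latest_out" ]; then
            latest_out=$f
        fi
    done

    if [ -n "$latest_out" ]; then
        echo "== $latest_out =="
        cat "$latest_out"
    else
        echo "No slurm-*.out file found; the job may still be queued or running."
    fi
}

show_latest_slurm_output
```

If the file is missing long after submission, checking the queue with squeue is a reasonable next step.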