Installation & Licensing Center
HOW TO - Fix a Stuck Job (ARC, Linux)
Authored by Erdoğan Gökbulut, February 12th, 2024, KB2403245
Description
When using RSM with the ARC scheduler, jobs can occasionally get stuck in a Submitted or Running state. Restarting the master or compute node does not clear them.
Solution
- Attempt to terminate the job using the arckill command, from the headnode/arcmaster:
/ansys_inc/v###/RSM/ARC/tools/linx64/arckill (jobID)
- If arckill fails to terminate the stuck job, delete the job database files on both the arcmaster node and the compute node. The database files are located in /home/rsmadmin/.ansys/v###/ARC/ on both nodes.
- Restart the ARC services on the arcmaster node. The databases are recreated automatically.
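The steps above can be laid out as a small shell sketch that only prints the commands for review and executes nothing. The function names and the ARC_VERSION and JOB_ID variables are hypothetical, introduced here for illustration. The v### version string and the job ID remain placeholders that you must supply. The sketch assumes the rsmadmin home directory quoted above, and it does not name specific database files or a service restart command, since the article does not specify them.

```shell
#!/bin/sh
# Sketch only: prints the recovery steps instead of running them.
# ARC_VERSION and JOB_ID are hypothetical variable names; when unset,
# the placeholders from the article are printed unchanged.

# Build the arckill path for a given version string (the ### in v###).
arckill_path() {
    printf '/ansys_inc/v%s/RSM/ARC/tools/linx64/arckill\n' "$1"
}

# Build the ARC database directory for a given version string.
# Assumes the rsmadmin home directory quoted in the article.
arc_db_dir() {
    printf '/home/rsmadmin/.ansys/v%s/ARC/\n' "$1"
}

# Print the steps in order; nothing is executed or deleted.
print_plan() {
    version=$1
    job_id=$2
    echo "1. On the arcmaster: $(arckill_path "$version") $job_id"
    echo "2. If the job is still stuck, delete the job database files in $(arc_db_dir "$version") on the arcmaster and compute nodes"
    echo "3. Restart the ARC services on the arcmaster node"
}

version_default='###'
job_default='(jobID)'
print_plan "${ARC_VERSION:-$version_default}" "${JOB_ID:-$job_default}"
```

Printing the plan first lets you confirm the resolved paths on the arcmaster before deleting anything by hand.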