High performance computing (HPC) accelerates the process of research discovery. Most domains now use HPC, and research is increasingly compute- and data-intensive as more scientists and engineers employ machine learning, neural networks and other artificial intelligence (AI) workflows.
In collaboration with other units on campus, ITS-Research Services supports the Argon HPC system, which features ~16,000 CPU processors and more than 300 GPU accelerators.
One-on-one user consultation is also available by emailing research-computing@uiowa.edu.
Alert History
The maintenance for Argon has been extended until Friday, August 16th at 6:00 AM due to unexpected technical issues.
The maintenance of Argon HPC service has been adjusted to start earlier than originally scheduled, in order to accommodate work that will be done to update network infrastructure supporting research.
Running jobs will be killed at the start of the maintenance, and the service will be unavailable until the maintenance is complete.
The issue has been resolved.
There are currently issues scheduling or submitting jobs to Argon. We apologize for the inconvenience and hope to have things resolved quickly.
The Argon High Performance Computing (HPC) system will be down for maintenance.
Any jobs running at the beginning of the maintenance window will be stopped. Please plan your work accordingly for minimal disruption.
This issue has been resolved.
Some users are currently unable to access their home drives on Argon. Affected users may receive an error message like: “Could not chdir to home directory /Users/HawkID: No such file or directory” or "kex_exchange_identification: Connection closed by remote host" when trying to access their Argon home directory.
IDAS users might receive errors starting an IDAS session when requesting their hpchome be mounted.
This issue has been resolved. Please contact research-computing@uiowa.edu if you are continuing to see issues.
Some users may be unable to mount their home directory in Argon and receive an error message similar to "cannot access /Users/HawkID: No such file or directory". Support staff are investigating.
The Argon High Performance Computing (HPC) system will be down for maintenance. Any jobs running at the beginning of the maintenance window will be stopped but will resume after the maintenance.
The Argon High Performance Computing (HPC) system will be down for maintenance.
Any jobs running at the beginning of the maintenance window will be stopped. Please plan your work accordingly for minimal disruption.
This issue has been resolved. Please contact research-computing@uiowa.edu if you are continuing to see issues.
Some users are experiencing slowness or issues accessing Argon. ITS is currently investigating the issue.
The issue has been resolved.
Some users are experiencing issues accessing Argon. ITS is currently investigating the issue.
This issue has been resolved.
Some users are experiencing issues accessing Argon. ITS is currently investigating the issue.
The Argon High Performance Computing (HPC) system will be down for maintenance.
Any jobs running at the beginning of the maintenance window will be stopped. Please plan your work accordingly for minimal disruption.
This issue has been resolved.
The Argon HPC environment is currently experiencing issues with job submission. Jobs that are already running are not impacted, but you may see errors when submitting new jobs or requesting job status. We are investigating the issue. Thank you for your patience.
This issue has been resolved.
There are intermittent issues with itf-rs-store20, an Argon homes system. ITS is investigating.
This issue has been resolved.
Some users may be unable to mount their home directory and receive an error message similar to "cannot access /Users/HawkID: No such file or directory". Support staff are investigating.
The issue has been resolved.
The Argon HPC /nfsscratch filesystem is performing slowly for some jobs. The Research Services team is investigating and monitoring the situation.
The Argon High Performance Computing (HPC) system will be down for maintenance.
Any jobs running at the beginning of the maintenance window will be stopped. Please plan your work accordingly for minimal disruption.
The retention period of files on the Argon HPC system-wide /nfsscratch filesystem has been reduced from 60 days to 40 days after they are created.
The retention period was temporarily reduced to 40 days this fall, when usage of /nfsscratch increased significantly. Since restoring the retention period to 60 days, /nfsscratch has consistently reached capacity, which prevents jobs from running.
The retention policy for node-specific /localscratch filesystems is not affected by this change.
All scratch filesystem policies are available as part of the Argon Scratch Filesystems documentation.
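To see which of your files are approaching the 40-day limit described above, a small script along these lines may help. This is a minimal sketch, not an ITS-provided tool: the function name `files_near_expiry` and the `warn_days` parameter are illustrative, and file age is approximated by modification time, which is an assumption because creation time is not portably available on Linux.

```python
import os
import time

RETENTION_DAYS = 40  # retention period stated in the notice above


def files_near_expiry(root, warn_days=5, now=None):
    """Return (path, age_in_days) for files within warn_days of the limit.

    Assumption: age is measured from st_mtime (last modification), which
    may differ from how the automated cleaning actually measures age.
    """
    now = time.time() if now is None else now
    cutoff_days = RETENTION_DAYS - warn_days
    results = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.lstat(path).st_mtime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            age_days = (now - mtime) / 86400
            if age_days >= cutoff_days:
                results.append((path, age_days))
    # Oldest files first, since they are closest to deletion.
    return sorted(results, key=lambda item: item[1], reverse=True)
```

Calling `files_near_expiry` with the path of your own directory under /nfsscratch returns the files that are at least 35 days old by this measure, oldest first, so anything you still need can be copied elsewhere before it is removed.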
This issue is resolved.
Some users may be unable to mount their home directory and receive an error message similar to "cannot access /Users/HawkID: No such file or directory". Support staff are investigating.
This issue has been resolved.
Users may experience degraded performance as the Argon HPC /nfsscratch filesystem has reached capacity.
To alleviate performance issues and avoid further impact on running jobs, we are reducing the retention period on the /nfsscratch system so that data is deleted after 40 days instead of 50 days.
You can help by deleting data that is no longer needed from /nfsscratch, or by contacting research-computing@uiowa.edu to have your entire /nfsscratch directory deleted.
Please note that job launches will resume automatically once the share drops below the 99% threshold. However, if the share fills up again, job launches will once again be paused.
We are continuing to monitor the situation and may suspend job launches if it does not improve. We will communicate if additional actions are necessary.
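To get a rough idea of how close a filesystem is to the 99% threshold mentioned above, you can query its usage from Python. This is a minimal sketch: the function names are illustrative, and the figure your session reports may not match the value the scheduler uses when deciding whether to pause job launches.

```python
import shutil

PAUSE_THRESHOLD = 0.99  # job launches pause at 99% full, per the notice above


def fraction_used(path):
    """Return the used fraction (0.0 to 1.0) of the filesystem holding path."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0  # some pseudo-filesystems report no capacity
    return usage.used / usage.total


def launches_likely_paused(used_fraction, threshold=PAUSE_THRESHOLD):
    """True when usage is at or above the threshold."""
    return used_fraction >= threshold
```

For example, `launches_likely_paused(fraction_used("/nfsscratch"))` gives a quick yes/no answer from a login session before you submit new work.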
Free space on /nfsscratch is decreasing rapidly, and at the current pace we expect it to fill in the next 1-2 days. If /nfsscratch fills, HPC jobs will fail and no new data can be written.
To help avoid filling /nfsscratch, please ensure you are removing data in /nfsscratch as you finish with it rather than waiting for the normal automated cleaning mechanisms.
In the event that /nfsscratch becomes full, you can use the /scratch file system.
ITS-Research Services is taking steps to slow the filling:
- HPC users writing to /nfsscratch with large data consumption have been contacted.
- Data is actively being removed on the backend.
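One way to remove scratch data as soon as you finish with it, rather than waiting for automated cleaning, is to have each job work in its own temporary directory that is deleted when the job's work completes. The following is a minimal sketch using Python's standard `tempfile` module; `run_with_scratch` and `example_work` are hypothetical names, and the scratch location you pass in is up to you.

```python
import os
import tempfile


def run_with_scratch(scratch_root, work):
    """Run work(workdir) in a fresh directory under scratch_root, then delete it.

    The directory and its contents are removed when the with-block exits,
    including when work() raises an exception.
    """
    with tempfile.TemporaryDirectory(dir=scratch_root, prefix="job-") as workdir:
        return work(workdir)


def example_work(workdir):
    """Hypothetical job step: write an intermediate file and report its size."""
    path = os.path.join(workdir, "intermediate.dat")
    with open(path, "w") as handle:
        handle.write("temporary data\n")
    return os.path.getsize(path)
```

Any results you want to keep must be copied out of the working directory before `work` returns. Cleanup does not run if the process is killed outright, so this complements the automated cleaning rather than replacing it.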
This issue has been resolved.
ITF lost utility power today around 9:08 AM, which has impacted ITF HPC compute nodes. The affected rows/systems lost power during the brief period when power switched over to the generator, causing jobs running on any ITF HPC nodes to fail. These systems will run while ITF is using generator power, but they do not restart automatically. Research Services is in the process of bringing the compute nodes back up.
Quarterly maintenance for the Argon HPC system.
See Fall 2022 HPC Maintenance for more information.
Quarterly maintenance for High Performance Computing and related services.
See Summer 2022 HPC maintenance for more information.
This issue has been resolved.
Some users are currently unable to access their home drives on Argon. Affected users may receive an error message like “Could not chdir to home directory /Users/hawkid: No such file or directory” when trying to access their Argon home directory.
IDAS users might receive errors starting an IDAS session when requesting their hpchome be mounted.
ITS is investigating this issue and updates will be posted on the HPC System News support page.
This issue is resolved.
Some users are currently unable to access their home drives on Argon. Affected users may receive an error message like “Could not chdir to home directory /Users/hawkid: No such file or directory” when trying to access their Argon home directory.
ITS is investigating this issue and updates will be posted on the HPC System News support page.
System updates will be installed during this maintenance window. Please refer to https://hpc.uiowa.edu/system-news/spring-2022-hpc-maintenance for more information.
This issue has been resolved.
Argon HPC is experiencing a service degradation which may cause jobs to fail with errors indicating inability to create a lock for an input or output file. It may also cause apps to freeze or become non-responsive.
For timely updates, please see: https://hpc.uiowa.edu/recent-news
See Winter 2022 HPC maintenance for more information and updates.
This issue has been resolved.
Argon and IDAS are currently unavailable. Users are seeing failed or stalled logins when attempting to connect to Argon. IDAS will continue to function if Argon Homes or LSS are not being used. IDAS class instances are down if they have shared libraries or shared storage.
Research Services is investigating this issue. Further updates can be found at https://hpc.uiowa.edu/system-news/argon-login-issues
Logins to the Argon HPC system may periodically fail due to an unresponsive login node. ITS is currently investigating the issue.
Users experiencing issues connecting to argon.hpc.uiowa.edu should connect to a login node directly instead:
- argon-login-1.hpc.uiowa.edu
- argon-login-3.hpc.uiowa.edu
- argon-login-4.hpc.uiowa.edu
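If you connect from scripts, the fallback described above can be automated by trying each hostname in turn. This is a minimal sketch under two assumptions: that SSH listens on the default port 22, and that a successful TCP connection is a good enough sign of health. A login node can accept connections and still be unresponsive, so treat the result as a hint only. The function names are illustrative.

```python
import socket

# Hostnames taken from the notice above; the general address is tried first.
LOGIN_HOSTS = [
    "argon.hpc.uiowa.edu",
    "argon-login-1.hpc.uiowa.edu",
    "argon-login-3.hpc.uiowa.edu",
    "argon-login-4.hpc.uiowa.edu",
]


def accepts_connections(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Assumption: SSH is served on the default port 22.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False  # covers DNS failures, refusals and timeouts


def first_reachable(hosts, probe=accepts_connections):
    """Return the first host for which probe(host) is true, or None."""
    for host in hosts:
        if probe(host):
            return host
    return None
```

`first_reachable(LOGIN_HOSTS)` returns the hostname to pass to your SSH client, or `None` if none of the listed nodes answered. The `probe` parameter exists so the selection logic can be exercised without touching the network.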
See Fall 2021 HPC maintenance for more information.
The Argon HPC system will undergo regularly scheduled quarterly maintenance. For more details, see HPC Summer 2021 Maintenance.
The issue has been resolved.
One of the servers for the Argon Home directories is having an issue with file locks. If your home account is on this server, you may experience processes that get stuck shortly after launching. The problem is being investigated and will be mitigated as soon as possible.
High Performance Computing Contact Information
ITS-Research Services - research-computing@uiowa.edu