References and Acknowledgements

Dear IDUN/EPIC GPU users,

EPIC/IDUN is a fantastic research tool. Since 2017, more than 28 international publications from NTNU/SINTEF, across a wide range of fields, have involved the use of the IDUN/EPIC GPGPU resources. To keep IDUN/EPIC up to date, and hopefully add resources, all of us (users and NTNU IT) need to be able to document what a fantastic research tool we have.
To keep track of scientific use of the machine, we (IDI and NTNU IT) have an arXiv paper that describes the research infrastructure and lists the scientific contributions that have used IDUN/EPIC. However, to be able to keep track, we need you all to include the reference (in the paper or its acknowledgements). Such a reference enables us to document the use of the machine, and thereby argue for support for maintenance and new investments in our research tool.
Please help ensure that a reference is included in papers, PhD theses and master's projects.

Reference to IDUN/EPIC:
@article{Epic2019,
  title = {{{EPIC}}: {{An Energy}}-{{Efficient}}, {{High}}-{{Performance GPGPU Computing Research Infrastructure}}},
  shorttitle = {{{EPIC}}},
  author = {Sj{\"a}lander, Magnus and Jahre, Magnus and Tufte, Gunnar and Reissmann, Nico},
  year = {2019},
  month = dec,
  archivePrefix = {arXiv},
  eprint = {1912.05848},
  eprinttype = {arxiv},
  journal = {arXiv:1912.05848 [cs]},
  keywords = {Computer Science - Distributed, Parallel, and Cluster Computing},
  primaryClass = {cs}
}  

Link to paper: https://arxiv.org/abs/1912.05848

Thanks
Gunnar Tufte
(Project coordinator for EPIC/IDUN at IE/IDI)

Unplanned Idun Downtime

Dear Idun User.

We experienced a file system crash during the last 12 hours (Saturday 21st March, 2021). This made logins to the system unresponsive, and some of your jobs may have been lost.

If you had any jobs running between Friday afternoon and Saturday afternoon, we urge you to check the output and general state of your files, jobs and data.
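One possible way to check which jobs were affected is to query Slurm's accounting for the outage window. The following is a minimal sketch, not an official Idun tool: it assumes that `sacct` accounting is available to ordinary users on the login node, the function names `suspicious_jobs` and `query_sacct` are made up for illustration, and the start and end times must be supplied by you.

```python
import subprocess
import sys

# States that ended normally or are still in progress; anything else deserves a look.
OK_STATES = {"COMPLETED", "RUNNING", "PENDING"}


def suspicious_jobs(sacct_text):
    """Parse `sacct --parsable2 --noheader` output with the columns
    JobID|JobName|State|ExitCode and return the rows not in OK_STATES."""
    rows = []
    for line in sacct_text.splitlines():
        if not line.strip():
            continue
        parts = line.split("|")
        job_id, state, exit_code = parts[0], parts[-2], parts[-1]
        name = "|".join(parts[1:-2])  # tolerate a "|" inside the job name
        # sacct can report e.g. "CANCELLED by 1234"; compare the first word only.
        if state.split()[0] not in OK_STATES:
            rows.append((job_id, name, state, exit_code))
    return rows


def query_sacct(start, end):
    """Run sacct for the calling user's jobs between start and end."""
    cmd = [
        "sacct", "-X", "--parsable2", "--noheader",
        "--starttime=" + start, "--endtime=" + end,
        "--format=JobID,JobName,State,ExitCode",
    ]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


if __name__ == "__main__":
    if len(sys.argv) == 3:
        for row in suspicious_jobs(query_sacct(sys.argv[1], sys.argv[2])):
            print("\t".join(row))
    else:
        print("usage: check_jobs.py START END   (e.g. YYYY-MM-DDTHH:MM:SS)")
```

Saved under a name of your choice (here `check_jobs.py`), the script takes the start and end of the outage window as its two arguments and prints every job that did not complete normally. A job listed as COMPLETED may still have written incomplete output, so inspect the files themselves as well.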

We also take the opportunity to remind you that Idun is not a place for storing data. Always back up valuable data elsewhere.

Idun Upgrades and Changes

Dear Idun User.

We have started, and are well underway with, the transition of Idun to a new platform.

The new system is available on:
idun-login4.hpc.ntnu.no.
This node does not have any GPUs, so compilation for GPUs is not possible on this node.

For you this means:
1: Upgrade of the OS to CentOS 8. This operating system comes with significant improvements and also gives us a path forward, as the older CentOS platform will reach end of life in the not so distant future.
2: Upgrade from Slurm 16 to Slurm 20. This will give us better management of the queueing system and more features.
3: Change from a hierarchical modules/software structure to a flat structure. This might create some issues for your job scripts, and you may have to change your setup, depending on which modules you are currently using.
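To see which of your job scripts could be affected by the module change, it may help to list every `module load` line they contain and compare the names with what is available on the new login node. The following is a minimal sketch under stated assumptions: the function name `find_module_loads` is made up for illustration, only the plain `module load ...` form is recognised, and the script makes no claim about how module names actually differ between the old and the new structure.

```python
import sys
from pathlib import Path


def find_module_loads(script_text):
    """Return the module names passed to `module load` in a job script.

    Anything after a "#" on a line is treated as a comment and ignored.
    """
    modules = []
    for line in script_text.splitlines():
        tokens = line.split("#", 1)[0].split()
        if tokens[:2] == ["module", "load"]:
            modules.extend(tokens[2:])
    return modules


if __name__ == "__main__":
    # Usage: python list_modules.py job1.slurm job2.slurm ...
    for name in sys.argv[1:]:
        loads = find_module_loads(Path(name).read_text())
        print(name + ": " + (", ".join(loads) if loads else "(no module load lines)"))
```

Each name it prints can then be checked by hand on the new system, for example with `module avail`.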

Your files will stay the same.


We have already moved at least 25% of the nodes to the new setup, and some users have been testing it.
We encourage you to start using this login node and test your setup as well.

We hope to move the current login nodes, idun-login[1-3], in the coming weeks, unless there are major showstoppers.

epic.hpc.ntnu.no is currently pointing to a GPU-enabled login node. Next Tuesday, this node will be migrated to the new system.
idun.hpc.ntnu.no is currently pointing to the old system. Next Tuesday, it will start pointing to the new system.

The old Idun login will continue to be available on idun-login1.hpc.ntnu.no for a while after the transition, but will probably be transferred to the new system some time before Easter.

Please report any troubles to: help@hpc.ntnu.no



In other news:
* We have a new partner in Idun: Jan Torgersen of the Department of Mechanical and Industrial Engineering. Welcome aboard!
* The Kavli Institute has decided to expand their share with 6 compute nodes and to buy into a compute node.
* IDI is currently considering expanding their share with FPGA node(s).
* We are planning to replace the current storage system on Idun later this year, hopefully in late autumn.

Unplanned Idun Downtime

We experienced a few hours of unavailability on parts of the cluster between late Wednesday 6th January and early afternoon on 7th January.


We are currently planning to upgrade the Idun cluster to a newer CentOS version, and in preparation we need to change the internal IP addresses of all nodes (including login nodes). As part of this work we made a configuration error, which left some nodes with the wrong IP address configuration.

We will send out an update later this month about possible planned downtime for the transition to an upgraded CentOS version.

In other news: The IT department has expanded their share with 8 new compute nodes. This means that there are more "free" CPU hours available in addition to the individual stakeholders' shares. We are also expecting new shareholders shortly.