Changes to Iridis Filestore and Login Node
On the morning of July 14 2008 the old /scratch filesystems on Iridis were migrated to new hardware. This page describes the changes to the /scratch filesystems and also outlines how users might have been affected by similar changes to the /rhome filestore and by a change of the default login node, both in November 2007.
The original Iridis /scratch filestore hardware is now out of maintenance, so the Iridis /scratch and /scratch1-5 filesystems on this hardware have been moved to more up-to-date and extensible hardware so that we can provide a reliable service in future. Note that we will be keeping the old filestore hardware in commission for some time after the migration so that we can revert to it in the unlikely event of any unforeseen difficulties.
Note also that the path for your /scratch directory as reported by the pwd command has changed after the migration. A user, fred, originally in /scratch4/fred would have seen something like "/import/blue17/scratch4/fred" as the output of pwd. After the migration the part of the path preceding "scratch" has changed, in this case to /import/blue21/scratch12/fred (/scratch4/fred and /scratch12/fred will both work for now, as /scratch4 and /scratch12 both point to /import/blue21/scratch12). We will be making further changes after the transition so that you do not have to worry about which particular filesystem you are using for your /scratch directory, so there is no need for now to change all your scripts to use, say, /scratch12 rather than /scratch4.
Note: the old "/import/blue16|blue17" form of referencing file and directory names will fail for jobs run after the migration, so users running jobs in /scratch directories must check all job scripts and other relevant code to ensure that this form is not used and make changes if necessary.
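As an aid to checking job scripts, the following is a minimal Python sketch that reports lines using the old "/import/blue16" or "/import/blue17" form. The function name find_old_paths is hypothetical and not part of any Iridis tooling; the sketch only inspects text that you pass to it.

```python
import re

# Matches the old prefixes named in the note above: /import/blue16 or /import/blue17.
OLD_PREFIX = re.compile(r"/import/blue1[67]\b")


def find_old_paths(script_text):
    """Return (line_number, stripped_line) pairs for lines that use an old prefix."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        if OLD_PREFIX.search(line):
            hits.append((number, line.strip()))
    return hits
```

To check a script, read its contents (for example with open(path).read()) and pass the text to find_old_paths. Any line reported should be changed to use the short /scratchN/username form instead.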
New Naming system for /scratch directories
We intend to rename the scratch directories to /work soon after the data migration. This name is felt to be more representative of the nature of these filesystems than the misleading term "scratch". The change will also mean that users do not need to worry which particular /scratch filesystem they are on, and will make it easier to move users between systems to accommodate changes in the space needed and the filestore available. Files will just be accessed as /work/username/subdir/file rather than /scratchN/username/subdir/file. More details will follow soon, but note that all the current /scratch names will remain valid for some time, although they will not be the preferred names.
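The intended mapping from the old names to the new one can be sketched in Python as follows. This is only an illustration of the naming scheme described above. It assumes that every path has the form /scratch/username/... or /scratchN/username/..., and the helper name to_work_path is hypothetical.

```python
import re

# Assumed layout: /scratch or /scratchN, then the username and the rest of the path.
SCRATCH_PATH = re.compile(r"^/scratch\d*/(?P<rest>.+)$")


def to_work_path(path):
    """Rewrite /scratchN/username/... as /work/username/...; other paths are returned unchanged."""
    match = SCRATCH_PATH.match(path)
    if match is None:
        return path
    return "/work/" + match.group("rest")
```

For example, to_work_path("/scratch4/fred/subdir/file") gives "/work/fred/subdir/file". Do not use the /work form until we have announced that it is available.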
Change of Default Login Node
On November 19 the default Iridis login was changed to point to a different node.
Why are we making the change?
The change is necessary to ensure that the default login node is running the same version of Red Hat Enterprise Linux (RHEL 4.0) as the compute nodes. Until now, the default Iridis2 login node and the alternative login node, blue14.iridis.soton.ac.uk, were still built with RHEL 3.0. Most users will not be affected by the RHEL version, but there are differences in the versions of the glibc libraries and of the Tcl/Tk libraries that can be significant for a few, so the change will ensure that the software build and runtime environments are more consistent. In addition, newer versions of applications such as Matlab and the latest installed versions of the Portland Compilers will no longer run on RHEL 3.0, so the change is necessary if we wish to continue to update our software portfolio. Finally, if we are lucky, the upgrade to RHEL 4.0 might help with the recurrent problem where the login node hangs at times.
What Changes will Users See?
Rebuilding the O/S on the node causes the SSH hostid to change.
Dealing with changes of SSH hostid
The most obvious thing that is likely to concern users is that the first time you connect to iridis.soton.ac.uk after the change, SSH may report that the hostid has changed and issue potentially worrying warnings similar to the following:
"WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that the RSA host key has just been changed.
The fingerprint for the RSA key sent by the remote host is" ....
This message can be ignored in this instance (though not in general, unless we have warned you in advance of a likely change). Normally ssh will let you ignore the warning and then connect to the new default node. However, it is possible with some configurations of SSH that a connection is not allowed. In this case you will probably need to delete the key entry for iridis.soton.ac.uk from the SSH key file. The procedure to do this depends on whether you are connecting from a Unix/Linux system or a Windows system.
On a Windows SSH client, select the "settings ... " option from the "edit" menu on the SSH window. This should fire up a separate Settings window. In this window select "Host Keys" from the "Server Authentication" sub-menu on the left to get a list of public keys for various hosts. Select any lines with iridis.soton.ac.uk and use the delete button to remove the entry.
On Unix/Linux (and presumably Macs) use your favorite editor to modify the ~/.ssh/known_hosts file on your local system. You will need to search for any lines containing "iridis.soton.ac.uk" and delete them.
If you still have a problem connecting to the new default login node using the alias "iridis.soton.ac.uk", you should still be able to connect using the full name "blue18.iridis.soton.ac.uk".
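If you prefer not to edit the file by hand, the same clean-up can be sketched in Python. This is a minimal sketch, not a supported tool, and the function names are hypothetical. It assumes that host names are stored in plain text in known_hosts, so it will not find entries in a file that uses hashed host names. Because it matches any line containing the host name, it also removes entries for individual nodes such as blue14.iridis.soton.ac.uk, which is what the manual procedure above does as well.

```python
import shutil


def drop_host_entries(known_hosts_text, hostname="iridis.soton.ac.uk"):
    """Return the known_hosts content without any line that mentions hostname."""
    kept = [
        line
        for line in known_hosts_text.splitlines(keepends=True)
        if hostname not in line
    ]
    return "".join(kept)


def clean_known_hosts(path, hostname="iridis.soton.ac.uk"):
    """Back up the file at path to path + '.bak', then rewrite it without hostname entries."""
    shutil.copyfile(path, path + ".bak")
    with open(path) as handle:
        original = handle.read()
    with open(path, "w") as handle:
        handle.write(drop_host_entries(original, hostname))
```

To use it, call clean_known_hosts with the full path to your own known_hosts file. Recent versions of OpenSSH also provide "ssh-keygen -R iridis.soton.ac.uk", which removes the entries for a single host name and copes with hashed files.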
Blue14 upgrade
Note that the other alternative login node "blue14.iridis.soton.ac.uk" which has been running RHEL 3.0 will also be upgraded to RHEL 4.0, possibly before Mon 19 Nov, possibly afterwards. This upgrade will also have the effect of changing the SSH hostid for blue14, so the notes above will also be relevant when logging in to the rebuilt blue14 for the first time.
Problems compiling or running
A RHEL 4.0 login node, blue18.iridis.soton.ac.uk, has been provided for over a year and users have been advised to check that their codes will compile in the RHEL 4.0 environment on blue18 and that the resulting code runs satisfactorily on the compute nodes. The blue18 node will now become the default login node via the alias iridis.soton.ac.uk. We anticipate that no new problems should be identified at this stage. As there is one group of users who still have problems compiling on RHEL 4.0, the original login node will still be available via the full DNS address blue15.iridis.soton.ac.uk until a full solution valid for RHEL 4.0 can be found. If other users find that they do have unexpected problems compiling in the RHEL 4.0 environment, please let us know so that we are aware that you still have a need for a RHEL 3.0 login node.
Migration of Iridis /rhome Filestore to new Hardware
The original Iridis filestore hardware is now out of maintenance, so the Iridis /rhome and /local filesystems on this hardware need to be moved to more up-to-date and extensible hardware so that we can provide a reliable service in future. The new hardware has been installed and tested and we are now in a position to finalise the migration of all user data to the new hardware. For this final phase we need to shut down all user access to systems which mount the /rhome filesystem. This means all Iridis compute nodes, all login nodes, Linuxresearch and Sky1.
An initial copy of the current /rhome filestore has been made on the new hardware, but to allow for any changes made since this initial copy we need to do a final synchronisation on the morning of Mon 19 Nov, whilst the original /rhome is not being modified in any way - hence the need to shut down all systems that allow users to modify files in /rhome. We would expect this to take most of the morning but hope that we will be able to restore service by the afternoon.
We do not anticipate any problems but please let us know if you spot anything odd either with your data or with application performance after the migration. Note that we will be keeping the old filestore hardware in commission for some time after the migration so that we can revert to it in the unlikely event of any unforeseen difficulties.
Note also that the path for your /rhome directory as reported by the pwd command will change after the migration. A user, fred, in /rhome/fred using the pwd command would see something like "/import/raid1-LG1/rhome/c/fred" as the output of the command. After the migration the part of the path preceding "rhome" will change. In general we strongly recommend the use of the simple "/rhome/fred" form for access to your /rhome directory, but we suspect that there may be some users who use the "import" form in their scripts. We will create extra symbolic links that should ensure that scripts using the old "import" form still work, but if you do encounter any problems we advise that you change to using the "/rhome/fred" form (as returned by the command "echo $cwd"). The /local filesystem will be affected in a similar way.
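For users who need to convert the "import" form in their scripts, the rewrite to the recommended form can be sketched in Python as below. The sketch assumes that old paths always look like /import/device/rhome/x/username/..., where x is a single-character directory as in the example above. That layout is inferred from the one example and may not hold for every account. The helper name to_simple_rhome is hypothetical.

```python
import re

# Assumed old layout: /import/<device>/rhome/<one character>/<username>/...
IMPORT_RHOME = re.compile(r"^/import/[^/]+/rhome/[^/]/(?P<rest>.+)$")


def to_simple_rhome(path):
    """Rewrite the 'import' form of an /rhome path as /rhome/username/...; other paths are unchanged."""
    match = IMPORT_RHOME.match(path)
    if match is None:
        return path
    return "/rhome/" + match.group("rest")
```

For example, to_simple_rhome("/import/raid1-LG1/rhome/c/fred") gives "/rhome/fred", and paths that are already in the simple form are returned as they are.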
