Environment from the Molecular Level A NERC eScience testbed project
This page is the place to store questions that have been answered in the Forum section and are of sufficient interest to record.
This is most likely due to slight differences in time between your client machine and the server machine. GSI, the Globus Grid Security Infrastructure, is very time sensitive. This sensitivity protects, for example, against users backdating their computer to make an expired proxy valid again.
All our resources are synchronized using the NTP (Network Time Protocol) server ntp2.ja.net. Please configure your client machine to use this NTP server to ensure synchronization with the servers (email helpdesk@eminerals.org for more information). Clovis
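A rough way to check for clock skew from the client is sketched below. It assumes ntpdate is installed (ntpdate -q only queries the server and does not set the clock). The helper names abs_skew and skew_ok and the 300-second default tolerance are illustrative assumptions, not project settings.

```shell
#!/bin/sh
# Query the offset against the project NTP server without changing the clock
# (assumes ntpdate is installed on the client):
#   ntpdate -q ntp2.ja.net

# abs_skew A B: print |A - B| for two epoch timestamps (hypothetical helper).
abs_skew() {
    d=$(( $1 - $2 ))
    [ "$d" -lt 0 ] && d=$(( 0 - d ))
    echo "$d"
}

# skew_ok A B [TOLERANCE]: succeed if the two clocks agree to within
# TOLERANCE seconds. The 300 s default is an assumed value for illustration.
skew_ok() {
    [ "$(abs_skew "$1" "$2")" -le "${3:-300}" ]
}
```

For example, skew_ok "$(date +%s)" "$server_time" compares the local clock with a server timestamp obtained by whatever means you have available.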
Here's something not everyone knows: ssh, scp and sftp, and their GSI counterparts, each use a different syntax to target a port:
gsissh -p 2222 lake.esc.cam.ac.uk (note difference in case)
gsiscp -P 2222 lake.esc.cam.ac.uk
gsisftp -oPort=2222 lake.esc.cam.ac.uk (actually -o passes any option through to ssh)
Depending on how you've set up your gsis* commands, you may be able to type
ssh -p 2222 lake.esc.cam.ac.uk
scp -P 2222 lake.esc.cam.ac.uk
sftp -oPort=2222 lake.esc.cam.ac.uk
to access lake. The gsis* commands work as conventional ssh/scp/sftp clients too:
gsissh jwak02@hartree.hpcf.cam.ac.uk
works fine. (Jon)
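To avoid having to remember which tool takes which flag, the three syntaxes can be wrapped in a small shell function. This is only a sketch; port_flag is a hypothetical helper name, not part of any of these tools.

```shell
#!/bin/sh
# port_flag TOOL PORT: print the port option in the syntax TOOL expects
# (hypothetical helper covering the cases listed above).
port_flag() {
    case "$1" in
        ssh|gsissh)   printf '%s\n' "-p $2" ;;
        scp|gsiscp)   printf '%s\n' "-P $2" ;;
        sftp|gsisftp) printf '%s\n' "-oPort=$2" ;;
        *)
            echo "unknown tool: $1" >&2
            return 1
            ;;
    esac
}

# Example use (left unquoted on purpose so "-p 2222" splits into two words):
#   gsissh $(port_flag gsissh 2222) lake.esc.cam.ac.uk
```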
First use gsissh to get onto the target cluster (note that you'll have to do this for each cluster). Then follow these instructions:
mkdir ~/.srb
cd ~/.srb
Create the file .MdasEnv and add the following lines (do not include the <> brackets):
mdasCollectionHome '/home/<your SRB user name>.eminerals'
mdasDomainHome 'eminerals'
srbUser '<your SRB user name>'
srbHost 'eminerals.dl.ac.uk'
srbPort '5544'
defaultResource '<a valid vault,e.g. LakeUCLVault, CCLRCFS or CambsLake>'
AUTH_SCHEME 'ENCRYPT1'
Also create the file .MdasAuth and add the following line:
<your SRB password>
Then
chmod 600 .M*
(Mark)
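If you have to repeat this on several clusters, the .MdasEnv step can be scripted. The sketch below assumes a POSIX shell; write_srb_config is a hypothetical helper, and the user name and vault are whatever you pass in. The password file is deliberately left for you to create by hand so that the password never appears in a script or in your shell history.

```shell
#!/bin/sh
# write_srb_config DIR USER RESOURCE: hypothetical helper that writes
# DIR/.MdasEnv using the settings listed above and restricts its permissions.
write_srb_config() {
    dir=$1
    user=$2
    resource=$3
    mkdir -p "$dir" || return 1
    cat > "$dir/.MdasEnv" <<EOF
mdasCollectionHome '/home/${user}.eminerals'
mdasDomainHome 'eminerals'
srbUser '${user}'
srbHost 'eminerals.dl.ac.uk'
srbPort '5544'
defaultResource '${resource}'
AUTH_SCHEME 'ENCRYPT1'
EOF
    chmod 600 "$dir/.MdasEnv"
}

# Example use (replace the placeholders with your own values):
#   write_srb_config "$HOME/.srb" <your SRB user name> CambsLake
# Then create ~/.srb/.MdasAuth by hand and chmod 600 it as described above.
```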
See this instructions page for full instructions.
The Adobe plugin for Windows and Apple computers can be obtained here.
Mozilla no longer uses the Adobe SVG plugin; instead you must build Mozilla from source with SVG support. Download the Mozilla source and do this:
./configure --enable-svg --enable-svg-renderer-libart --prefix=/usr/local/moz-svg
make
make test
make install
if you are using Linux. However, the web page containing the SVG will also need to be changed, because the plugin uses the <embed> tag while Mozilla uses namespaces to distinguish SVG. (Jon)
It depends which libraries you want to use. Currently there are LAM and MPICH ones, each built with the GNU and Intel compilers. To use the MPICH ones, add the following to your .bashrc:
export MPICH=/opt/mpich-intel   # or /opt/mpich-gnu
export PATH=$MPICH/bin:$PATH
export RSHCOMMAND=ssh
To use the LAM ones, add this instead:
export LAMHOME=/opt/lam-intel   # or /opt/lam-gnu
export PATH=$LAMHOME/bin:$PATH
export LAMRSH="ssh -x"
(Mark 26/05/2004)
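The four combinations can also be generated rather than typed. The following is a sketch only: mpi_env is a hypothetical helper that prints the export lines for a given flavour, using the install paths quoted above.

```shell
#!/bin/sh
# mpi_env FLAVOUR: print the .bashrc lines for one of the four builds
# (hypothetical helper; the /opt paths are the ones listed above).
mpi_env() {
    case "$1" in
        mpich-intel|mpich-gnu)
            echo "export MPICH=/opt/$1"
            echo 'export PATH=$MPICH/bin:$PATH'
            echo 'export RSHCOMMAND=ssh'
            ;;
        lam-intel|lam-gnu)
            echo "export LAMHOME=/opt/$1"
            echo 'export PATH=$LAMHOME/bin:$PATH'
            echo 'export LAMRSH="ssh -x"'
            ;;
        *)
            echo "unknown MPI flavour: $1" >&2
            return 1
            ;;
    esac
}

# Example use: append the settings to .bashrc, or apply them to this shell.
#   mpi_env lam-gnu >> ~/.bashrc
#   eval "$(mpi_env lam-gnu)"
```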
There are BLACS and SCALAPACK libs built with the Intel compilers for both the LAM and MPICH flavours. These are both built using the ATLAS versions of BLAS. Hence, find the relevant version of BLACS in /usr/local/BLACS/[lam or mpich]-intel. The SCALAPACK libs are in /usr/local/lib as libscalapack-[lam or mpich]-intel.a, together with the BLAS libs.
(Mark 26/05/2004)
First, make sure you understand how to submit a simple, single-node, non-MPI job (see the example I've put in the SRB at mcal00.eminerals/dag_srb_pbs). Now consider submitting a VASP job that requires four CPUs. Then the corresponding Condor submit script will look something like:
======================
Universe = globus
Globusscheduler = lake.esc.cam.ac.uk/jobmanager-pbs
Executable = vasp-lam-intel
Notification = NEVER
Environment = LAMRSH=ssh -x
GlobusRSL = (job_type=mpi)(count=4)(queue=workq)(mpi_type=lam-intel)(directory=/home/mcal00/Test)
transfer_files = ALWAYS
stream_output = false
stream_error = false
Output = job.out
Log = job.log
Error = job.error
Queue
======================
A few comments. The executable needs to be built for a specific MPI flavour, so here we've built it for the LAM distribution using the Intel compilers. Next we have to pass an environment variable (setting it in your .bashrc on the lakes is not enough). This one is needed for using the LAM libs; if you want to use MPICH instead then replace that line with:
Environment = RSHCOMMAND=ssh
Next comes the GlobusRSL. I'm assuming here that we're using the SRB to extract input files into the working directory /home/mcal00/Test (if you use the example for a non-MPI, single node job mentioned above then this value will be set for you), so I'm not going to bother to transfer input/output files. Note that I've asked for four CPUs: "count=4". The other tag of interest in the RSL is the non-standard one mpi_type. This allows you to select which version of MPI you want to use; the allowed values on the lakes currently are lam-intel, lam-gnu, mpich-intel or mpich-gnu.
(Mark 31/05/2004)
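Since the Environment line and the mpi_type tag have to agree, it can help to generate the submit script from one set of parameters. The sketch below follows the example above; make_submit is a hypothetical helper, and the scheduler and queue names are simply copied from that example.

```shell
#!/bin/sh
# make_submit EXE MPI_TYPE COUNT DIR: print a Condor submit description
# modelled on the example above (hypothetical helper).
make_submit() {
    exe=$1
    mpi_type=$2
    count=$3
    dir=$4
    # Pick the environment variable that matches the MPI flavour.
    case "$mpi_type" in
        lam-intel|lam-gnu)     env_line='LAMRSH=ssh -x' ;;
        mpich-intel|mpich-gnu) env_line='RSHCOMMAND=ssh' ;;
        *)
            echo "unknown mpi_type: $mpi_type" >&2
            return 1
            ;;
    esac
    cat <<EOF
Universe = globus
Globusscheduler = lake.esc.cam.ac.uk/jobmanager-pbs
Executable = $exe
Notification = NEVER
Environment = $env_line
GlobusRSL = (job_type=mpi)(count=$count)(queue=workq)(mpi_type=$mpi_type)(directory=$dir)
transfer_files = ALWAYS
stream_output = false
stream_error = false
Output = job.out
Log = job.log
Error = job.error
Queue
EOF
}

# Example use:
#   make_submit vasp-lam-intel lam-intel 4 /home/mcal00/Test > job.sub
#   condor_submit job.sub
```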
At the moment you need to use the Globus Alliance's own documentation; unfortunately this now only has documentation of GT 3, the only supported version.
(Mark 18/05/04)
Assuming you're using the eMinerals VV, you'll need the following five holes for UDP traffic (these are port numbers): 47000 and the range 50480-50483.
(Mark and Rik, 19/05/04)
Go to a directory in my area on the SRB (mcal00.eminerals/bin) and get the tool "lakes". Then chmod +x it and after starting a proxy (e.g. on fried or silica) you can use it as:
lakes arg where arg = bath, cam, rdg, ucl or all
Note that the bath option won't work until that machine is fully configured, but the others will. Further note that if you're using this tool on silica, you should change the first line in the script from #!/bin/sh to #!/sbin/sh due to a silly IRIX convention.
(Mark 24/06/2004)
1) I found that the name of the job cannot start with a number; it has to start with a letter. For example, "2nodes" does NOT work, but "nodes2" WORKS.
2) It looks like when a job is running on the lakes it copies the files the program needs to a temporary directory, and after the job has finished the files are copied back to the 'initial' directory. In this process it looks like hidden characters (e.g. ^M) are added to the files, as happens with Windows/PC; it is a PC cluster we are running on. For example, I need a wavefunction file to run my quantum Monte Carlo calculations. I can use this wavefunction file once, and the program runs without problem. When I use the same wavefunction file to restart my job, the program complains that it cannot read the wavefunction file, or gives some other error. If I recreate the file (or ftp the same file from my desktop to the lakes), the program runs again without problems.
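If the cause really is stray carriage returns (the ^M characters), which is only the guess made above, they can be detected and stripped from a text file with standard tools. This is a sketch: has_cr and strip_cr are hypothetical helper names, and neither should be pointed at a binary file.

```shell
#!/bin/sh
# has_cr FILE: succeed if FILE contains at least one carriage return.
has_cr() {
    [ $(tr -cd '\r' < "$1" | wc -c) -gt 0 ]
}

# strip_cr IN OUT: copy IN to OUT with every carriage return removed.
# IN and OUT must be different files.
strip_cr() {
    tr -d '\r' < "$1" > "$2"
}

# Example use (the file names are placeholders):
#   has_cr wavefunction.dat && strip_cr wavefunction.dat wavefunction.clean
```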
It is often hard to estimate the duration of jobs, especially since they might be queued on the remote resource for a long period of time. However it is often best when submitting long-running jobs to generate a proxy that will be valid for a longer period of time as follows:
%> grid-proxy-init -hours 72
Use whatever number of hours will cover the duration of the job. If the proxy still expires before the end of the job, this will not stop the job. Simply re-generating a proxy (grid-proxy-init) will allow you to retrieve the results.
Clovis
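Before submitting, you can check whether the current proxy will last long enough. The sketch below relies on grid-proxy-info -timeleft, which prints the remaining lifetime in seconds; proxy_covers is a hypothetical helper name.

```shell
#!/bin/sh
# proxy_covers SECONDS_LEFT HOURS_NEEDED: succeed if the remaining proxy
# lifetime is at least HOURS_NEEDED hours (hypothetical helper).
proxy_covers() {
    [ "$1" -ge $(( $2 * 3600 )) ]
}

# Example use: renew the proxy only if less than 72 hours remain.
#   if ! proxy_covers "$(grid-proxy-info -timeleft)" 72; then
#       grid-proxy-init -hours 72
#   fi
```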
First a warning: these are only recognised by our project. Don't expect the NGS or HPCx to honour them. That said, to get one, first ssh to silica.esc.cam.ac.uk (if you haven't got a silica account then get in touch). Next run the command grid-cert-request. This will produce a file usercert_request.pem, which you should email to me. It will also produce a userkey.pem file, which is the corresponding private key. Keep this safe and don't email it. I will sign your request and send you back the correct usercert.pem to use for our CA.
Mark (03/06/2004)
Page maintained by Martin Dove