The "Pizzeria": CTIE's Linux Cluster
The systems are Sun LX50 and Sun Fire V65x.
Accounts are generally created with your usual Authcate username.
The system uses the ITS Linux Kerberos password, so to change it you will have to use the ITS web pages (www.its) for changing passwords for specific hosts and/or Authcate.
The pizza cluster has a bit of a split personality at present, with the latest OS installed on:
marinara
margarita
supersupreme
supreme
hawaiian
mexicana
hotnspicy is still running the old Fedora Core 2 and is not recommended for new work.
At present, contact Daniel Grimm, who will create an account using your standard username and the Kerberos password system. This allows you to use the same password on all systems.
For non-Monash people (official visitors etc.) an account will be created using a local password, which you will need to synchronise manually between all the machines.
(tba)
Matlab is designed as a single-user system: it writes command history and the like to a .matlab directory, so don't run it on two boxes from the same home directory simultaneously (get another account on the cluster to run more jobs). Matlab graphics over ssh/X Windows are slow, but the processing on the cluster is fast. Multiple versions are installed, so have a look in the /usr/local/bin directory for matlab. (As of March 2006, matlab runs R14sp2 and you need to explicitly specify matlabR14sp3.)
2007 update: /usr/local/matlab2007b/bin/matlab runs the latest release matlab.
More updates (2008 after boxes moved from B31 to B35):
- matlab on the 2008 SL51 systems runs matlab2007b by default.
- matlab2008a on the 2008 SL51 systems runs the matlab2008a release.
To run matlab tasks without interactivity, see RunMatlabNodisplay.
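As a rough sketch of what a non-interactive run can look like (the RunMatlabNodisplay page remains the reference), the helper below only builds and prints a command line. The function name matlab_batch_cmd and the script name myjob are made up for illustration, and it assumes the installed release accepts the standard -nodisplay, -nosplash and -r startup options.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) a command line that starts MATLAB
# without a display, executes one script, and then exits.
# $1 = script name without the .m extension (illustrative)
# $2 = matlab launcher, defaults to "matlab" (e.g. matlab2008a on the SL51 boxes)
matlab_batch_cmd() {
    script=$1
    launcher=${2:-matlab}
    # -r runs the given statements; the trailing "exit" stops MATLAB from
    # sitting at its prompt once the script has finished.
    printf '%s -nodisplay -nosplash -r "%s; exit"\n' "$launcher" "$script"
}

# Example: show the command for a script called myjob.m (assumed name).
matlab_batch_cmd myjob
```

Printing the command first makes it easy to check which launcher will be used before pasting it into a shell or a batch job.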
Connect with ssh, either via PuTTY or from an X Windows/Linux ssh session.
E.g.: ssh margarita.ctie.monash.edu.au
If your .m file accesses lots of data on disk (read/write), make a directory in /local and copy your files there so it runs quicker. /local is a disk local to each system and is quicker than your home directory.
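The copy step can be sketched as a small shell function. This is only an illustration: the name stage_to_local and the example paths are assumptions, and how you organise directories under /local is up to you.

```shell
#!/bin/sh
# Hypothetical helper: copy the contents of a data directory to a directory
# on node-local disk and print the destination path.
# $1 = source directory (for example under your home directory)
# $2 = destination directory (for example somewhere under /local)
stage_to_local() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # The trailing "/." copies the contents of src, including dotfiles.
    cp -R "$src"/. "$dest"/ || return 1
    printf '%s\n' "$dest"
}

# Example (assumed paths):
#   workdir=$(stage_to_local "$HOME/mydata" /local/myname/mydata)
#   cd "$workdir" && matlab
```

Because /local is a separate disk on each machine, anything written there is only visible on the node where the job ran, so copy results back to your home directory afterwards.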
Quick Start with OMNET++
1. Log in to any of these:
* supreme.ctie.monash.edu.au
* hawaiian.ctie.monash.edu.au
* mexicana.ctie.monash.edu.au
* capriciosa.ctie.monash.edu.au
* supersupreme.ctie.monash.edu.au
* marinara.ctie.monash.edu.au
* margarita.ctie.monash.edu.au
* hotnspicy.ctie.monash.edu.au
2. mkdir ~/oppsim
3. cd ~/oppsim
4. cp -R /usr/share/doc/omnetpp-2.3_20031009/samples/hcube .
5. cd hcube
6. opp_makemake -f
7. make
8. ./hcube
Check that the hypercube simulation runs OK. Now you can start building your own simulation models. I suggest you work in a new directory under ~/oppsim for your simulation work.
Note: ignore any messages about being unable to copy files with names like gensink_n.cc as they are created by the make compilations.
You can prepare a list of simulations to run and submit it to the PBS (Portable Batch System) queue of the pizzeria. PBS monitors each host and submits a job whenever it judges that a host can run one more process.
Here is a simple example to try. We have a C++ program which accepts two natural numbers as command line arguments and counts how many prime numbers exist between them. Suppose that you have prepared a list of such commands in a file called exeList. You can submit these jobs one by one to the PBS queue by calling the submit_pbs_jobs.pl script like this: `cat exeList | ./submit_pbs_jobs.pl`
OK, enough said. Download the Makefile, sources and scripts and try it yourself.
See http://bc727.eng.monash.edu.au/userguide.html for more PBS script examples (this is the engineering faculty's cluster).
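The command list for the prime-counting example could be generated with a short loop like the one below. This is a sketch under assumptions: the program name ./primes is invented (the real name comes from the downloaded sources), and whether the range bounds are inclusive depends on that program.

```shell
#!/bin/sh
# Print one command per line, each covering a sub-range of [first, last].
# $1 = first number, $2 = last number, $3 = width of each sub-range
# $4 = program to run, defaults to ./primes (an assumed name)
make_exe_list() {
    lo=$1
    end=$2
    step=$3
    prog=${4:-./primes}
    while [ "$lo" -le "$end" ]; do
        hi=$((lo + step - 1))
        # Clamp the final sub-range to the requested upper bound.
        if [ "$hi" -gt "$end" ]; then
            hi=$end
        fi
        printf '%s %d %d\n' "$prog" "$lo" "$hi"
        lo=$((hi + 1))
    done
}

# Example: split 1..1000 into four jobs and print them.
make_exe_list 1 1000 250
# On the cluster, redirect the output to a file and submit it:
#   make_exe_list 1 1000 250 > exeList
#   cat exeList | ./submit_pbs_jobs.pl
```

Splitting the range into equal-width pieces is the simplest choice; since larger numbers take longer to test, narrower ranges near the top would balance the jobs better.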
Notes on PBS runs
- Be very careful not to fill the disk space!
- Use the storage area called /short_term, which is available from all the nodes of the pizzeria. It is currently a logical volume of 20GB, is not backed up, and each night all files not accessed in the last 7 days are deleted (empty directories are deleted too).
- You can check the progress of the jobs with the qstat -a command (see the screenshot below). Note that some jobs are running while others are queued and waiting for their turn. Also note that even though you see "supreme" in the job IDs, this does not mean that they are all running on supreme. Supreme is only the head node that the PBS manager runs on.
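To see which of your files in /short_term are close to the nightly cleanup, something like the following can help. It is a sketch only: the function name list_stale_files is made up, and the exact rule used by the cleanup job is not documented here, so treat the output as an approximation.

```shell
#!/bin/sh
# List your own regular files under a directory whose last access time is
# more than N days old.
# $1 = directory to scan, defaults to /short_term
# $2 = age in days, defaults to 7
list_stale_files() {
    dir=${1:-/short_term}
    days=${2:-7}
    # find rounds file age down to whole days, so "-atime +7" matches files
    # last accessed at least 8 days ago. The nightly job may use a slightly
    # different rule.
    find "$dir" -type f -user "$(id -u)" -atime +"$days" -print
}

# Example, to be run on the cluster:
#   list_stale_files /short_term 7
#   df -h /short_term      # how full the shared volume currently is
```

Running it with a smaller age, such as 5, gives a couple of days of warning before files become candidates for deletion.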
Quick N Dirty LAM/MPI start guide (from T Ramdas) - Dan thought it might be useful
The following extremely brief information is a rough quickstart for doing LAM/MPI work on the CTIE pizzeria cluster.
1. I boot LAM like this:
`lamboot -ssi boot-rsh-agent ssh lamhosts`
with ssh keys installed, and a lamhosts text file looking something like this:
marinara
supersupreme
...
hotnspicy
Note that using short names for the computers is best, as this uses the internal network for the communications.
2. I run stuff with mpirun.
3. I stop LAM like this:
`lamclean -v`
`lamhalt`
If anything breaks, you didn't learn the above from Tirath or this page;-)
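The lamhosts file from step 1 can be produced with a small helper that strips any domain suffix, so that only short names end up in the file. This is a sketch; the function name short_lamhosts and the program name ./my_mpi_prog are assumptions.

```shell
#!/bin/sh
# Print one short host name per line, dropping everything after the first
# dot so that fully qualified names become short names.
short_lamhosts() {
    for host in "$@"; do
        printf '%s\n' "${host%%.*}"
    done
}

# Example with three of the nodes named on this page.
short_lamhosts marinara.ctie.monash.edu.au supersupreme hotnspicy

# On the cluster, write the list to a file, boot and halt LAM as in steps 1
# and 3 above, and run the program in between, for example:
#   short_lamhosts marinara supersupreme hotnspicy > lamhosts
#   mpirun -np 4 ./my_mpi_prog
```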
----- Revision r1.22 - 22 Jul 2008 - 00:41 GMT - DanGrimm
Copyright © 1999-2003 by the contributing authors.
All material on this collaboration platform is the property of the contributing authors.