= Before you ask us to install software..
by Harry Mangalam
v1.20 - Feb 12, 2019
:icons:

//Harry Mangalam mailto:harry.mangalam@uci.edu[harry.mangalam@uci.edu]
// this file is converted to the HTML via the command:
// fileroot="/home/hjm/nacs/Installing_SW_on_HPC"; asciidoc -a icons -a toc2 -a toclevels=3 -b html5 -a numbered ${fileroot}.txt; scp ${fileroot}.html ${fileroot}.txt moo:~/public_html; ssh moo 'moofr=Installing_SW_on_HPC; cd public_html; scp ${moofr}.html ${moofr}.txt hmangala@hpcs:/data/hpc/www'
// or in-place
// fileroot="/data/hpc/www/HOWTO_Ask_a_question"; asciidoc -a icons -a toc2 -a numbered -b html5 ${fileroot}.txt
// don't forget that the HTML equiv of '~' = '%7e'
// asciidoc cheatsheet: http://powerman.name/doc/asciidoc
// asciidoc user guide: http://www.methods.co.nz/asciidoc/userguide.html

If you want us to install software for you on HPC, please be aware of the
different kinds of software and how difficult they are to install.
Ask yourself the following questions to see whether it's worthwhile to install it.

== First, do we already have it on HPC?

James Walker has written a script called *searchmodules* that checks this for you.
Use it like this:

--------------------------------------------------------------
$ searchmodules tensorflow
Module to load            Command or function
anaconda/2-2.3.0          tensorflow (1.1.0)
anaconda/2.7-4.3.1        tensorflow (1.2.0)
anaconda/2.7-4.3.1        tensorflow-gpu (1.2.0)
anaconda/3.5-2.4.0        tensorflow-gpu (0.12.1)
anaconda/3.6-4.3.1        tensorflow (1.2.0)
enthought_python/7.3.2    tensorflow (0.9.0)

# NB: The search is case InSenSiTIve
$ searchmodules petsc
Module to load            Command or function
PETSc/3.3
PETSc/3.4
PETSc/3.4.4
PETSc/3.5.2
PETSc/3.5.3
PETSc/3.6.1
PETSc/3.7.5
PETSc-dev/3.5.2
PETSc-dev/3.6.3
PETSc-dev/3.7.3
--------------------------------------------------------------

// - *Modules:* Check our http://moo.nac.uci.edu/~hjm/biolinux/Linux_Tutorial_12.html#_modules[module system]
// to see whether it exists already.
// You can also view a https://hpc.oit.uci.edu/all.modules[recent listing of our modules here].
// - *Python-based:* check if the software you want has already been installed in one of our Python distributions:
// * *Enthought Python:* 'module load enthought_python; pip list --format=columns | grep -i your-target'
// * *Anaconda Python:* 'module load anaconda/; pip list --format=columns | grep -i your-target'
// * *Generic Python:* 'module load python/; pip list --format=columns | grep -i your-target'
// - *R-based:* the R interpreter has a https://stat.ethz.ch/R-manual/R-devel/library/utils/html/installed.packages.html[number of ways to try to find the package].
// * 'module load R/' Then try the mechanisms described above.
// - *Perl-based:*
// * Try the Perl FAQ: http://perldoc.perl.org/perlfaq3.html#How-do-I-find-which-modules-are-installed-on-my-system%3f[How do I find which modules are installed on my system?]
// * See also, the included Perldocs: 'module load perl/; perldoc -q installed'

== How complex or difficult is it to install?

There are several types of software, at least from our point of view, and
each of them can be more or less difficult to install.

- https://en.wikipedia.org/wiki/Compiler[Compiled software] (C/C++, Fortran, Go, etc)
  * sometimes simple, but almost always more complicated than an interpreted language.
  * sometimes downloadable as a binary executable package, but these often have http://ftp.ntu.edu.tw/software/libs/glibc/hjl/compat/[GLIBC dependency issues].
  * sometimes has many dependencies in terms of libraries, versions, etc.
- https://en.wikipedia.org/wiki/Interpreter_(computing)[Interpreted Software] (Python, Perl, R, Java, Julia, Ruby, etc)
  * almost always very easy to install. No compilation needed, just download and run.
  * however, they often need to be installed in conjunction with their parent packages.
- Mixed packages (R and Python packages are interpreted but often include compiled routines)
  * often come with a scripted build routine so that they can be easily installed.
  * typically easier to install than pure compiled packages.
- Graphical vs Commandline. Graphical User Interface (GUI) apps have additional
dependencies that often make them more difficult to install, but that
difficulty is often more than paid off by the ease of use. However, GUI software cannot
run in batch mode unless the GUI is optional or the software is designed to be run headless
(MATLAB/Octave or R vs RStudio).
- Parallel vs Serial
  * parallel software varies in difficulty, but is almost always harder to install than
serial applications due to the additional requirements of https://en.wikipedia.org/wiki/Message_Passing_Interface[MPI]
or other support libraries and their dependencies.
- Distribution mechanism.
  * Is the software distributed as part of a Linux repository? If so, it's easy to install
on one machine, but storage-wasteful to install on all machines. It's usually
possible (but time-consuming) to unpack it into its components and install it into an alternative path.
  * Is it a regular release of a popular package? If it's a Python or R package that's
designed to be easily installed with the usual mechanisms like 'pip' or 'setup.py' or 'R CMD INSTALL',
then it's much easier for us than a hairball installation.

== Widespread use or utility

- Will it run on HPC? ie: It must be written to run on Linux (and preferably the CentOS
distribution that HPC uses). It must also be able to run without a GUI if it needs to
run on multiple nodes via the scheduler.
- Can you run it on HPC the way it needs to run? If the package was designed to run in a
special, stand-alone system with a dedicated web frontend under a single user with no security,
it's probably not going to co-exist well on HPC.
- What is the maturity of the software? How long has this package been out? Has it just
been published? Has it been validated by others?
- How widely usable is it? What other labs or users could make use of this software?
If it's something that only you can use, perhaps you should try to install it yourself.
- Will it need to be installed repeatedly?
- Ease of Compilation
  * How easy is it to build? Does it use the standard GNU autoconf/configure/make? Or does
it use some wild-haired, self-written build files with hardcoded paths, variables, and a tangle of
platform-specific code?
  * How many dependencies does it require? Is it a stand-alone program or does it require a
long list of other libraries and packages?

== Install your own software

Please see the doc "How to install your own software on HPC" [in progress - anyone want to write this?]

== If you still want us to install it

Please make the request to 'hpc-support@uci.edu', including the following information:

- the particular version you want. Don't just say 'the latest version' - there may be
various 'latest' versions depending on distribution, repository, etc.
- a link to the specific download site. Alternatively, download the version you want us to install
and include the path to that archive.

== Commercial Software

If it is commercial software, please contact us first for a quick sanity check.

- We may decline to install it if there are security issues, unacceptable constraints on use
(node-locking, for example), etc.
- We can't install software that requires Web services, since we're separating all web services from HPC.
- If the software requires significant customization of our scheduler (or a different scheduler),
we may not be able to make it work.
- If the software expects or requires hardware graphics (a dedicated graphics card), it may be
pointless to install it since we have no direct-connected graphics workstation. (If it requires
Nvidia GPUs for (nonvisual) computational processing, we can probably accommodate this.)
- It's YOUR responsibility to find out whether it can be licensed for a cluster, what constraints
there are on the software, how much it costs, and how registration and payment are handled.
- It is also YOUR responsibility to register the software, download it, place it on HPC,
and then inform us of the location of the archive and any licensing terms and files.
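
Whether or not the software is commercial, a request that points at an archive you have already placed on HPC is easier to act on if it gives the exact file location. As a minimal sketch only: the function name `describe_archive` and its output layout are hypothetical (it is not a tool we provide), and it assumes the GNU versions of `readlink`, `stat`, and `sha256sum` found on Linux. It prints the absolute path, size, and checksum of an archive so you can paste them into your email.

```shell
#!/bin/bash
# Hypothetical helper (not an HPC-provided tool): print the details of a
# downloaded archive that are worth including in an install request.
# Assumes GNU coreutils (readlink -f, stat -c, sha256sum).
describe_archive() {
    local archive="$1"
    # Refuse anything that is not an existing regular file.
    if [ ! -f "$archive" ]; then
        echo "describe_archive: no such file: $archive" >&2
        return 1
    fi
    # Absolute path, so the location is unambiguous to the person installing it.
    echo "path:   $(readlink -f "$archive")"
    # Size in bytes, a quick check against truncated downloads.
    echo "size:   $(stat -c %s "$archive") bytes"
    # SHA-256 checksum, to confirm the file matches the vendor's copy.
    echo "sha256: $(sha256sum "$archive" | cut -d' ' -f1)"
}

# Example call (the file name is made up):
#   describe_archive ~/downloads/sometool-1.2.3.tar.gz
```

The checksum is optional, but it lets both sides confirm that the file on HPC is the same one the download site offers.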