openmpi-1.10.2 missing liboshmem [and cuda support]
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
openmpi (Ubuntu) | Confirmed | Undecided | Unassigned |
Bug Description
A) Since Ubuntu 16.04 has cuda-7.5 packages, openmpi 1.10.2 COULD be
configured --with-cuda. I need this myself, but I'm still running
Ubuntu 14.04 with CUDA installed under /usr/local/, so my mods are not
quite correct for Ubuntu 16.04.
B) liboshmem components were not being installed correctly. At least for
Ubuntu 14.04, by the time debian/rules tests for the linux-only components
they have been transformed into symlinks, so the first change is
"if test -f FOO; then \" ---> "if test -f FOO -o -h FOO; then \"
Then, there should be a few additional "mkdir -p .../man1" lines
because those directories might not exist yet under debian/PKGNAME/.
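To illustrate the two changes in (B), here is a minimal sketch, not the actual debian/rules code. It assumes the failing case is a symlink whose target does not resolve when the check runs (that is one reading of the report, not something confirmed here). All file and directory names below are throwaway placeholders under a temp dir, and "some/prefix" stands in for the elided part of the man1 path.

```shell
#!/bin/sh
# Sketch only: placeholder names, not real package paths.
tmp=$(mktemp -d)

touch "$tmp/real_file"
ln -s "$tmp/no_such_target" "$tmp/dangling_link"   # target does not exist

# Original form of the check: true only if the path resolves to a regular file.
check_old() { if test -f "$1"; then echo yes; else echo no; fi; }

# Proposed form: also true when the path itself is a symlink, dangling or not.
check_new() { if test -f "$1" -o -h "$1"; then echo yes; else echo no; fi; }

old_on_file=$(check_old "$tmp/real_file")
old_on_link=$(check_old "$tmp/dangling_link")
new_on_link=$(check_new "$tmp/dangling_link")
echo "old check, regular file:  $old_on_file"
echo "old check, dangling link: $old_on_link"
echo "new check, dangling link: $new_on_link"

# mkdir -p creates missing parent directories and does not fail
# if the directory already exists, so running it twice is harmless.
man_dir="$tmp/debian/PKGNAME/some/prefix/man1"   # hypothetical layout
mkdir -p "$man_dir" && mkdir -p "$man_dir"
mkdir_status=$?
echo "mkdir -p exit status: $mkdir_status"

rm -rf "$tmp"
```

The point of the sketch is that "test -f" follows symlinks, so a link with an unresolved target is skipped by the old check but caught once "-o -h" is added.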
Ubuntu 16.04+ SHOULD fix the liboshmem install issues
Ubuntu 16.04+ MIGHT CONSIDER doing a --with-cuda configuration.
(Unfortunately --with-cuda might be best as a new/separate package, uggh)
Here's the "we also need" list:
1. Release: Ubuntu 14.04, but doing a backport from the 16.04 debian/ directory
2. Package: openmpi1.10 and related packages from openmpi-1.10.2 sources in Xenial,
with backport mods for 14.04
3. Expected: the backport should (at least) have installed liboshmem.so correctly
4. Actual: the backport left dangling symlinks for liboshmem.so
This issue should also be present in upstream 16.04 LTS,
and is easy to fix.
Summary of my backport journey:
openmpi (1.10.2-8ubuntu4) UNRELEASED; urgency=medium
.
* --with-cuda, install so.1 --> .so symlinks for new cuda libs
* tweak library links, esp for libmca_
* fix liboshmem typo in rules file
* rules file should test for oshrun etc. being symlinks for liboshmem stuff
Can someone with a working openmpi install on 16.04 check for dangling liboshmem.so symlinks and
verify the "minor" part of this bug report?
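For anyone willing to check, something like the following would list dangling symlinks in a directory. This is a sketch: list_dangling is a helper name made up here, and the demo uses placeholder library names in a temp dir. On a real system you would pass whichever directory your install put liboshmem.so in, which I am not assuming here.

```shell
#!/bin/sh
# Print every symlink directly inside directory $1 whose target is missing.
list_dangling() {
    for f in "$1"/*; do
        # -L: the path is a symlink; ! -e: following it finds nothing
        if [ -L "$f" ] && [ ! -e "$f" ]; then
            echo "$f"
        fi
    done
}

# Self-contained demonstration with placeholder names.
demo=$(mktemp -d)
touch "$demo/libexample.so.1"
ln -s "$demo/libexample.so.1" "$demo/libexample.so"   # healthy link
ln -s "$demo/libmissing.so.1" "$demo/libbroken.so"    # dangling link

dangling=$(list_dangling "$demo")
echo "dangling: $dangling"

expected="$demo/libbroken.so"
rm -rf "$demo"
```

If the function prints nothing for the directory holding liboshmem.so, the links there resolve and the "minor" part does not reproduce on that system.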
The --with-cuda "feature request" needs careful thought/discussion,
esp. if it is to go into 16.04, since it should be coordinated with the
cuda-7.5 packages.