Reproduce

PyTorch Geometric

Traditionally, libraries that depend on CUDA don’t do what’s called “vendoring” (bundling their own copy of CUDA), i.e. they expect you to have CUDA installed on your system.

In our case, PyTorch Geometric doesn’t vendor CUDA, but PyTorch vendors its own CUDA. So when you install PyTorch 1.11 with CUDA 11.3, PyTorch calls its own CUDA 11.3, not the system-wide one. In the case of Great Lakes, the system-wide one is 11.8.

(PyTorch does this because it’s a lot easier to install for users, especially researchers who are not as familiar with computer systems as engineers.)

But notice that PyTorch Geometric doesn’t vendor, so it calls the system-wide one.

PyTorch Geometric still expects a certain version of CUDA to be installed on the system. That’s why their website tells you how to install builds of PyTorch Geometric that target different versions of CUDA.

Most likely, the segfault you’re seeing comes from PyTorch producing some tensors using one version of CUDA while PyTorch Geometric processes those tensors using another version of CUDA.

So you need to ensure that PyTorch uses the same version of CUDA as PyTorch Geometric.

The system-wide CUDA just has to be compatible with the version PyTorch Geometric targets.
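
To see which versions are actually in play, something like the sketch below might help. It assumes the system CUDA toolkit exposes nvcc on the PATH (on a cluster this may need a module load first), and the helper names are made up for illustration. torch.version.cuda is the CUDA version PyTorch was built with; it is None on CPU-only builds.

```python
import importlib.util
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output):
    """Extract the 'major.minor' release from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def system_cuda_version():
    """CUDA toolkit version according to nvcc, or None if nvcc is not on PATH."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
    return parse_nvcc_release(result.stdout)


def torch_cuda_version():
    """CUDA version PyTorch was built with, or None (torch missing or CPU-only)."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch  # imported lazily so the script still runs without torch
    return torch.version.cuda


if __name__ == "__main__":
    print("PyTorch CUDA:", torch_cuda_version())
    print("System CUDA: ", system_cuda_version())
```

On the setup described above, the first line would be expected to report 11.3 and the second 11.8, but that depends on the environment the script runs in.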

Alright, I just said a lot of things. Do you follow? lol

Make sure

PyG Installation

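Using pip:

pip install pyg-lib torch-scatter torch-sparse torch-cluster torch-spline-conv torch-geometric -f https://data.pyg.org/whl/torch-{torch_version}+cu{cuda_version}.html

Here {torch_version} and {cuda_version} are placeholders for the installed PyTorch version and its CUDA version without the dot, e.g. 1.11.0 and 113.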
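
A small sketch of how the wheel index URL could be built from the two versions, so the PyG packages match the installed PyTorch. The function names are hypothetical, and it assumes the CUDA version is given as "11.3" and written as "113" in the URL.

```python
PYG_PACKAGES = [
    "pyg-lib",
    "torch-scatter",
    "torch-sparse",
    "torch-cluster",
    "torch-spline-conv",
    "torch-geometric",
]


def pyg_wheel_index(torch_version, cuda_version):
    """Build the wheel index URL for a given PyTorch / CUDA pair."""
    # "11.3" -> "cu113"; a value already written as "113" passes through unchanged.
    cuda_tag = "cu" + cuda_version.replace(".", "")
    return f"https://data.pyg.org/whl/torch-{torch_version}+{cuda_tag}.html"


def pip_install_command(torch_version, cuda_version):
    """Return the pip command as an argument list (it is not executed here)."""
    index_url = pyg_wheel_index(torch_version, cuda_version)
    return ["pip", "install", *PYG_PACKAGES, "-f", index_url]


if __name__ == "__main__":
    # Example values taken from the PyTorch 1.11 / CUDA 11.3 setup discussed above.
    print(" ".join(pip_install_command("1.11.0", "11.3")))
```

Printing the command instead of running it leaves room to check the URL in a browser first and see which wheels are actually listed there.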

Important: Sometimes you need to pin specific torch-sparse and torch-scatter versions from the PyG wheel website!


Reference: my mentor Yu