Helsinki-NLP throws StopIteration within DataParallel
Hoping this is the correct forum for this:
Python 3.12.3, torch==2.10.0, transformers==5.3.0
My aim is to use Helsinki-NLP/opus-mt-ru-en and Helsinki-NLP/opus-mt-zh-en with two GPUs in batches, which I’ve been able to do for a couple years until I upgraded to Ubuntu 24, hence Python 3.12.
It now fails with a StopIteration exception within the Module. For instance, at this line in the Module in the forward method:
outputs = self.model.generate(**inputs)
At that point, any attempt to interrogate self.model.device throws the exception. That is what happens downstream at line 2502 within GenerationMixin.generate() in transformers/generation/utils.py.
Dr. Google recommends converting to DistributedDataParallel, but that seems to me a good bit of work for a process running on a single machine, without any guarantee of success.
I may have to resort to hand-jamming threads, which I had to do for the ModernBERT models some time back.
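For reference, a minimal sketch of what that thread-based fallback could look like: one independent model copy per GPU, with each batch split across the copies by a thread pool, so nothing is wrapped in DataParallel. The names split_batch, translate_on_devices and make_worker are hypothetical helpers invented for this illustration, and the use of AutoTokenizer / AutoModelForSeq2SeqLM is an assumption about how the model gets loaded. This is untested against the versions listed above.

```python
from concurrent.futures import ThreadPoolExecutor


def split_batch(items, n_parts):
    """Split items into n_parts contiguous chunks (trailing chunks may be empty)."""
    size = -(-len(items) // n_parts)  # ceiling division
    return [items[i * size:(i + 1) * size] for i in range(n_parts)]


def translate_on_devices(texts, workers):
    """Run each chunk on its own worker thread and return results in input order.

    workers is a list of callables (one per GPU), each mapping list[str] -> list[str].
    """
    chunks = split_batch(texts, len(workers))
    with ThreadPoolExecutor(max_workers=len(workers)) as pool:
        futures = [pool.submit(w, c) for w, c in zip(workers, chunks) if c]
        # Futures are collected in submission order, so output order matches input.
        return [t for f in futures for t in f.result()]


def make_worker(model_name, device):
    """Hypothetical helper: load one model copy onto one device."""
    # Imported lazily so the splitting logic above can run without torch installed.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name).to(device).eval()

    def run(texts):
        inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True).to(device)
        with torch.no_grad():
            outputs = model.generate(**inputs)
        return tokenizer.batch_decode(outputs, skip_special_tokens=True)

    return run


# Usage (assumes two visible GPUs):
# workers = [make_worker("Helsinki-NLP/opus-mt-ru-en", f"cuda:{i}") for i in range(2)]
# print(translate_on_devices(["Привет, мир", "Как дела?"], workers))
```

Each worker owns a full copy of the model, so self.model.device is always well defined inside generate(). The cost is loading the weights once per GPU and managing the batch split by hand.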