I’ve ported to sotabench several translation models for the English-German and English-French language pairs, taken from fairseq or from repositories based on it. Since the fairseq models are available on torch.hub, the initial implementation of sotabench.py was quite straightforward. However, it wasn’t too efficient: the torch.hub example translates sentences one at a time, with no batching. Adding simple batching gave a 40-50x speed-up, at least for single models with a small beam width, where a batch size of 128 fits in memory.
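The batching idea can be sketched roughly like this: sort sentences by length so each batch pads to about the same width, translate batch by batch, and restore the original order at the end. This is a minimal sketch, not the actual sotabench.py code; it assumes a `translate_fn` like the fairseq hub interface's `model.translate`, which accepts a list of strings and returns a list of translations.

```python
def batched_translate(translate_fn, sentences, batch_size=128):
    """Translate a list of sentences in batches, preserving input order."""
    # Sort indices by (whitespace-token) length to minimize padding waste.
    order = sorted(range(len(sentences)), key=lambda i: len(sentences[i].split()))
    results = [None] * len(sentences)
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        batch = [sentences[i] for i in idx]
        outputs = translate_fn(batch)  # e.g. model.translate(batch) for fairseq hub models
        for i, out in zip(idx, outputs):
            results[i] = out
    return results
```

With a model loaded via `torch.hub.load('pytorch/fairseq', ...)`, this would be called as `batched_translate(model.translate, source_sentences)`.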
Local Joint Self-attention wasn’t available on torch.hub, but since the model is based on fairseq it was fairly easy to add torch.hub support (apart from wrangling split weights and scouring the web for missing files).
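Adding torch.hub support to a repository mostly comes down to placing a `hubconf.py` at its root with a `dependencies` list and one function per entry point. A minimal sketch under assumptions: the entry-point name and checkpoint paths here are hypothetical, not the actual Local Joint Self-attention ones.

```python
# hubconf.py -- hypothetical sketch; the entry-point name and paths are made up.
dependencies = ['torch', 'fairseq']

def local_joint_attention_wmt_en_de(checkpoint_dir, checkpoint_file='model.pt', **kwargs):
    """Load a fairseq translation model so torch.hub.load() can find it."""
    # Imported lazily so torch.hub can enumerate entry points cheaply.
    from fairseq.models.transformer import TransformerModel
    return TransformerModel.from_pretrained(checkpoint_dir, checkpoint_file, **kwargs)
```

Users could then do `torch.hub.load('<user>/<repo>', 'local_joint_attention_wmt_en_de', ...)` (repository name hypothetical).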
After all this effort it’s really nice to see the speed-score trade-off plot of state-of-the-art models.
What do you think? Let me know if you need help porting models to sotabench.