


This repository contains the code for preparing the data and training the zeroshot models described in the paper "Building Efficient Universal Classifiers with Natural Language Inference". The models can be downloaded via my Hugging Face Zeroshot Classifiers Collection.

The model can do one universal classification task: determine whether a hypothesis is "true" or "not true" given a text (entailment vs. not_entailment). This task format is based on the Natural Language Inference task (NLI). The task is universal in the sense that any classification task can be reformulated into it: each candidate class is verbalized as a hypothesis (for example, "This text is about politics"), and the model checks each hypothesis against the text.
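To make the reformulation concrete, here is a minimal sketch of how a topic classification task could be turned into a set of hypothesis checks. It is an illustration, not the repository's code: the names `build_hypotheses`, `classify`, and `toy_entailment_score`, as well as the hypothesis template, are hypothetical. The toy scorer is a word-overlap stand-in so the example runs without downloading a model; in practice it would be replaced by the entailment probability from one of the zeroshot models.

```python
from typing import Callable, Dict, List, Tuple


def build_hypotheses(labels: List[str],
                     template: str = "This text is about {}.") -> Dict[str, str]:
    """Verbalize each class label as one hypothesis (template is an assumption)."""
    return {label: template.format(label) for label in labels}


def toy_entailment_score(text: str, hypothesis: str) -> float:
    """Hypothetical stand-in for an NLI model: share of hypothesis words found in the text."""
    text_words = {w.strip(".,!?").lower() for w in text.split()}
    hyp_words = [w.strip(".,!?").lower() for w in hypothesis.split()]
    return sum(w in text_words for w in hyp_words) / len(hyp_words)


def classify(text: str,
             labels: List[str],
             entailment_score: Callable[[str, str], float],
             template: str = "This text is about {}.") -> Tuple[str, Dict[str, float]]:
    """Score every (text, hypothesis) pair and return the best-scoring label."""
    hypotheses = build_hypotheses(labels, template)
    scores = {label: entailment_score(text, hyp) for label, hyp in hypotheses.items()}
    best_label = max(scores, key=scores.get)
    return best_label, scores


best, scores = classify(
    "The new sports stadium opened yesterday.",
    ["politics", "sports"],
    toy_entailment_score,
)
```

Because the scorer is passed in as a function, the same loop works for any label set: only the hypotheses change, not the model.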

Note that, unlike other NLI models, these models predict two classes (entailment vs. not_entailment) rather than three (entailment/neutral/contradiction).
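The difference between the two label schemes can be sketched as a simple mapping. The snippet below assumes that "neutral" and "contradiction" are both folded into "not_entailment"; that mapping, and the names `THREE_TO_TWO`, `to_binary_label`, and `collapse_dataset`, are assumptions for illustration rather than something this README states.

```python
from typing import Dict, List

# Assumed mapping from the standard three NLI classes to the two classes used here.
THREE_TO_TWO: Dict[str, str] = {
    "entailment": "entailment",
    "neutral": "not_entailment",
    "contradiction": "not_entailment",
}


def to_binary_label(label: str) -> str:
    """Map a three-class NLI label to the binary scheme; raises KeyError on unknown labels."""
    return THREE_TO_TWO[label]


def collapse_dataset(rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return copies of the rows with the 'label' field converted to the binary scheme."""
    return [{**row, "label": to_binary_label(row["label"])} for row in rows]


# Hypothetical example rows in a premise/hypothesis/label layout.
example_rows = [
    {"premise": "A man is cooking.", "hypothesis": "A person prepares food.", "label": "entailment"},
    {"premise": "A man is cooking.", "hypothesis": "The man is a chef.", "label": "neutral"},
    {"premise": "A man is cooking.", "hypothesis": "Nobody is cooking.", "label": "contradiction"},
]
binary_rows = collapse_dataset(example_rows)
```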

The model was trained only on English data. For multilingual use cases, I recommend machine-translating texts to English with libraries like EasyNMT. English-only models tend to perform better than multilingual models, and validation with English data can be easier if you don't speak all the languages in your corpus.
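A translate-then-classify step could look roughly like the sketch below. The helper names `make_easynmt_translator` and `translate_corpus`, the batch size, and the choice of the `opus-mt` model are assumptions for illustration. The EasyNMT import is kept inside the factory function so the batching helper can be tried with any translation function, without installing the library or downloading a translation model.

```python
from typing import Callable, List, Sequence


def make_easynmt_translator(model_name: str = "opus-mt") -> Callable[[Sequence[str]], List[str]]:
    """Build a translate-to-English function backed by EasyNMT (requires `pip install easynmt`)."""
    from easynmt import EasyNMT  # imported lazily; downloads model weights on first use

    model = EasyNMT(model_name)

    def translate(texts: Sequence[str]) -> List[str]:
        # With no source_lang given, EasyNMT detects the source language itself.
        return model.translate(list(texts), target_lang="en")

    return translate


def translate_corpus(texts: Sequence[str],
                     translate_fn: Callable[[Sequence[str]], List[str]],
                     batch_size: int = 32) -> List[str]:
    """Translate a corpus in fixed-size batches, preserving the original order."""
    translated: List[str] = []
    for start in range(0, len(texts), batch_size):
        translated.extend(translate_fn(texts[start:start + batch_size]))
    return translated
```

With EasyNMT installed, usage would be `translate_corpus(texts, make_easynmt_translator())`; the English output can then be passed to the zeroshot model like any other English text.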
