Usage with [Infinity](https://github.com/michaelfeil/infinity), an MIT-licensed inference REST API server:

```bash
docker run --gpus all -v $PWD/data:/app/.cache -p "7997":"7997" \
  michaelf34/infinity:0.0.68 \
  v2 --model-id Alibaba-NLP/gte-multilingual-reranker-base --revision "main" --dtype bfloat16 --batch-size 32 --device cuda --engine torch --port 7997
```
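Once the container is up, the server can be queried from Python. Below is a minimal sketch using only the standard library; it assumes Infinity's `/rerank` route accepts a JSON body with `model`, `query`, and `documents` fields and returns a `results` list of `{"index", "relevance_score"}` objects (verify against the `/docs` OpenAPI page of your deployed version):

```python
import json
import urllib.request

def build_payload(query, documents, model="Alibaba-NLP/gte-multilingual-reranker-base"):
    """JSON body for Infinity's /rerank route (field names assumed from its OpenAPI)."""
    return {"model": model, "query": query, "documents": documents}

def rerank(query, documents, base_url="http://localhost:7997"):
    """POST a rerank request and return (document, relevance_score) pairs, best first."""
    data = json.dumps(build_payload(query, documents)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/rerank",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        results = json.load(resp)["results"]
    # Each result carries the index of the input document and its relevance score;
    # sort best-first and map indices back to the original documents.
    ranked = sorted(results, key=lambda r: r["relevance_score"], reverse=True)
    return [(documents[r["index"]], r["relevance_score"]) for r in ranked]
```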
Usage with [Text Embeddings Inference (TEI)](https://github.com/huggingface/text-embeddings-inference):
- CPU:

```bash
docker run --platform linux/amd64 \
 -p 8080:80 \
 -v $PWD/data:/data \
 --pull always \
 ghcr.io/huggingface/text-embeddings-inference:cpu-1.7 \
 --model-id Alibaba-NLP/gte-multilingual-reranker-base
```

- GPU:

```bash
docker run --gpus all \
 -p 8080:80 \
 -v $PWD/data:/data \
 --pull always \
 ghcr.io/huggingface/text-embeddings-inference:1.7 \
 --model-id Alibaba-NLP/gte-multilingual-reranker-base
```
Then you can send requests to the deployed API via the `/rerank` route (see the Text Embeddings Inference OpenAPI Specification for more details):

```bash
curl http://0.0.0.0:8080/rerank \
 -H "Content-Type: application/json" \
 -d '{
 "query": "中国的首都在哪儿",
 "raw_scores": false,
 "return_text": false,
 "texts": [ "北京" ],
 "truncate": true,
 "truncation_direction": "right"
 }'
```
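The same request can be issued from Python with the standard library alone. The payload below mirrors the curl call above; the response is assumed to be TEI's documented `/rerank` output, a JSON list of `{"index", "score"}` objects:

```python
import json
import urllib.request

def rank_texts(scores, texts):
    """Pair each returned score with its input text and sort best-first.

    `scores` is assumed to be TEI's /rerank response: a list of
    {"index": int, "score": float} objects.
    """
    ordered = sorted(scores, key=lambda s: s["score"], reverse=True)
    return [(texts[s["index"]], s["score"]) for s in ordered]

def rerank(query, texts, base_url="http://0.0.0.0:8080"):
    """Send the same request as the curl example and return ranked (text, score) pairs."""
    body = json.dumps({
        "query": query,
        "texts": texts,
        "raw_scores": False,
        "return_text": False,
        "truncate": True,
        "truncation_direction": "right",
    }).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/rerank",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return rank_texts(json.load(resp), texts)
```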
Results of reranking based on multiple text retrieval datasets

More detailed experimental results can be found in the paper.
In addition to the open-source models, the GTE series is also available as commercial API services on Alibaba Cloud.
Note that the models behind the commercial APIs are not entirely identical to the open-source ones.
If you find our paper or models helpful, please consider citing:

```bibtex
@inproceedings{zhang2024mgte,
  title={mGTE: Generalized Long-Context Text Representation and Reranking Models for Multilingual Text Retrieval},
  author={Zhang, Xin and Zhang, Yanzhao and Long, Dingkun and Xie, Wen and Dai, Ziqi and Tang, Jialong and Lin, Huan and Yang, Baosong and Xie, Pengjun and Huang, Fei and others},
  booktitle={Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track},
  pages={1393--1412},
  year={2024}
}
```