vi_sts = load_dataset("doanhieung/vi-stsbenchmark")["train"] df_dev = vi_sts.filter(lambda example: example['split'] == 'dev') df_test = vi_sts.filter(lambda example: example['split'] == 'test')
dev_samples = convert_dataset(df_dev) val_evaluator = EmbeddingSimilarityEvaluator.from_input_examples(dev_samples, name='sts-dev') val_evaluator(model, output_path="./")
test_samples = convert_dataset(df_test) test_evaluator = EmbeddingSimilarityEvaluator.from_input_examples(test_samples, name='sts-test') test_evaluator(model, output_path="./")
### Metric for all dataset of [Semantic Textual Similarity on STS Benchmark](https://huggingface.co/datasets/anti-ai/ViSTS)
**Spearman score**
| Model | [STSB] | [STS12]| [STS13] | [STS14] | [STS15] | [STS16] | [SICK] | Mean |
|-----------------------------------------------------------|---------|----------|----------|----------|----------|----------|---------|--------|
| [dangvantuan/vietnamese-embedding](https://huggingface.co/dangvantuan/vietnamese-embedding) |84.84| 79.04| 85.30| 81.38| 87.06| 79.95| 79.58| 82.45|
| [dangvantuan/vietnamese-embedding-LongContext](https://huggingface.co/dangvantuan/vietnamese-embedding-LongContext) |85.25| 75.77| 83.82| 81.69| 88.48| 81.5| 78.2| 82.10|
## Citation
```bibtex
@article{reimers2019sentence,
title={Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks}
, author={Nils Reimers, Iryna Gurevych}, journal={https://arxiv.org/abs/1908.10084}, year={2019} }
@article{zhang2024mgte,
title={mGTE: Generalized Long-Context Text Representation and Reranking Models for Multilingual Text Retrieval}
, author={Zhang, Xin and Zhang, Yanzhao and Long, Dingkun and Xie, Wen and Dai, Ziqi and Tang, Jialong and Lin, Huan and Yang, Baosong and Xie, Pengjun and Huang, Fei and others}, journal={arXiv preprint arXiv:2407.19669}, year={2024} }
@article{li2023towards,
title={Towards general text embeddings with multi-stage contrastive learning}
, author={Li, Zehan and Zhang, Xin and Zhang, Yanzhao and Long, Dingkun and Xie, Pengjun and Zhang, Meishan}, journal={arXiv preprint arXiv:2308.03281}, year={2023} }
@article{li20242d,
title={2d matryoshka sentence embeddings}
, author={Li, Xianming and Li, Zongxi and Li, Jing and Xie, Haoran and Li, Qing}, journal={arXiv preprint arXiv:2402.14776}, year={2024} }