Snowflake/snowflake-arctic-embed-m


```python
# Compute token embeddings, keeping the first ([CLS]) token of each sequence
with torch.no_grad():
    query_embeddings = model(**query_tokens)[0][:, 0]
    document_embeddings = model(**document_tokens)[0][:, 0]

# normalize embeddings
query_embeddings = torch.nn.functional.normalize(query_embeddings, p=2, dim=1)
document_embeddings = torch.nn.functional.normalize(document_embeddings, p=2, dim=1)

scores = torch.mm(query_embeddings, document_embeddings.transpose(0, 1))
for query, query_scores in zip(queries, scores):
    doc_score_pairs = list(zip(documents, query_scores))
    doc_score_pairs = sorted(doc_score_pairs, key=lambda x: x[1], reverse=True)
    # Output passages & scores
    print("Query:", query)
    for document, score in doc_score_pairs:
        print(score, document)
```
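The scoring step relies on the fact that the dot product of two L2-normalized vectors equals their cosine similarity. The following is a minimal, dependency-free sketch of that idea. The 3-dimensional vectors are invented for illustration (they are not model output), and the helper names `l2_normalize`, `dot`, and `rank_documents` are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (L2 norm of 1)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def rank_documents(query_vec, doc_vecs):
    """Return (doc_index, score) pairs sorted by descending cosine similarity."""
    q = l2_normalize(query_vec)
    scored = [(i, dot(q, l2_normalize(d))) for i, d in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors, invented for illustration only (not real model embeddings).
toy_query = [1.0, 0.0, 1.0]
toy_docs = [[0.0, 1.0, 0.0], [2.0, 0.0, 2.0], [1.0, 1.0, 0.0]]

ranking = rank_documents(toy_query, toy_docs)
for doc_index, score in ranking:
    print(doc_index, round(score, 4))
```

Because both sides are normalized first, the second toy document (a scaled copy of the query) ranks highest regardless of its magnitude.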


### Using Transformers.js

If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@xenova/transformers) by running:
```bash
npm i @xenova/transformers
```

You can then use the model to compute embeddings as follows:

```js
import { pipeline, dot } from '@xenova/transformers';

// Create feature extraction pipeline
const extractor = await pipeline('feature-extraction', 'Snowflake/snowflake-arctic-embed-m', {
    quantized: false, // Comment out this line to use the quantized version
});

// Generate sentence embeddings
const sentences = [
    'Represent this sentence for searching relevant passages: Where can I get the best tacos?',
    'The Data Cloud!',
    'Mexico City of Course!',
];
const output = await extractor(sentences, { normalize: true, pooling: 'cls' });

// Compute similarity scores
const [source_embeddings, ...document_embeddings ] = output.tolist();
const similarities = document_embeddings.map(x => dot(source_embeddings, x));
console.log(similarities); // [0.15664823859882132, 0.24481869975470627]
```
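In the example, only the first sentence (the query) carries the prefix `Represent this sentence for searching relevant passages: `, while the documents are embedded as-is. A small Python sketch of that input convention follows. The prefix string is copied from the example; the helper name `prepare_inputs` is hypothetical and not part of any library.

```python
# Prefix copied from the query sentence in the example.
QUERY_PREFIX = "Represent this sentence for searching relevant passages: "


def prepare_inputs(queries, documents):
    """Prefix queries only; documents are passed through unchanged."""
    prefixed_queries = [QUERY_PREFIX + q for q in queries]
    return prefixed_queries + list(documents)


inputs = prepare_inputs(
    ["Where can I get the best tacos?"],
    ["The Data Cloud!", "Mexico City of Course!"],
)
for text in inputs:
    print(text)
```

The resulting list matches the `sentences` array in the example, with the query first and the documents after it.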

### Using Infinity

OpenAI-compatible API deployment with Infinity and Docker.

```bash
docker run --gpus all -v $PWD/data:/app/.cache -p "7997":"7997" \
michaelf34/infinity:0.0.70 \
v2 --model-id Snowflake/snowflake-arctic-embed-m --dtype float16 --batch-size 32 --engine torch --port 7997
```
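Once the container is running, a client can request embeddings over HTTP. The sketch below uses only the Python standard library and makes several assumptions that should be checked against the running server: that it is reachable at `localhost:7997`, that it exposes an OpenAI-style `POST /embeddings` route, and that responses list results under `data[i].embedding`. The helper names `build_embedding_request` and `fetch_embeddings` are hypothetical.

```python
import json
import urllib.request

# Assumptions: the server is listening on localhost:7997 (the port mapped in the
# docker command) and accepts OpenAI-style embedding requests at /embeddings.
BASE_URL = "http://localhost:7997"
MODEL_ID = "Snowflake/snowflake-arctic-embed-m"


def build_embedding_request(texts, base_url=BASE_URL, model_id=MODEL_ID):
    """Build (but do not send) an OpenAI-style embeddings request."""
    payload = json.dumps({"model": model_id, "input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_embeddings(texts):
    """Send the request and return one embedding (list of floats) per input text."""
    with urllib.request.urlopen(build_embedding_request(texts), timeout=30) as resp:
        body = json.load(resp)
    # Assumed OpenAI-style response shape: {"data": [{"embedding": [...]}, ...]}
    return [item["embedding"] for item in body["data"]]


# Building the request needs no network access; fetch_embeddings() does.
request = build_embedding_request(["The Data Cloud!"])
print(request.get_method(), request.full_url)
```

Calling `fetch_embeddings(["The Data Cloud!"])` only succeeds while the container is up; if the route or response shape differs, adjust the two marked assumptions.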

## FAQ

TBD

## Contact

Feel free to open an issue or pull request if you have any questions or suggestions about this project. You can also email Daniel Campos (daniel.campos@snowflake.com).

## License

Arctic is licensed under the Apache License 2.0. The released models can be used for commercial purposes free of charge.

## Acknowledgement

We want to thank the open-source community, which has provided the great building blocks upon which we could make our models. We thank our modeling engineers, Danmei Xu, Luke Merrick, Gaurav Nuti, and Daniel Campos, for making these great models possible. We thank our leadership, Himabindu Pucha, Kelvin So, Vivek Raghunathan, and Sridhar Ramaswamy, for supporting this work. We also thank the open-source community for producing the great models we could build on top of and making these releases possible. Finally, we thank the researchers who created the BEIR and MTEB benchmarks. It is largely thanks to their tireless work to define what "better" looks like that we could improve model performance.
