Instructions to use jinaai/jina-embeddings-v2-base-code with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- sentence-transformers

How to use jinaai/jina-embeddings-v2-base-code with sentence-transformers:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("jinaai/jina-embeddings-v2-base-code", trust_remote_code=True)

sentences = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium."
]
embeddings = model.encode(sentences)

similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```

- Transformers

How to use jinaai/jina-embeddings-v2-base-code with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("feature-extraction", model="jinaai/jina-embeddings-v2-base-code", trust_remote_code=True)
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("jinaai/jina-embeddings-v2-base-code", trust_remote_code=True)
model = AutoModelForMaskedLM.from_pretrained("jinaai/jina-embeddings-v2-base-code", trust_remote_code=True)
```

- Transformers.js

How to use jinaai/jina-embeddings-v2-base-code with Transformers.js:

```javascript
// npm i @huggingface/transformers
import { pipeline } from '@huggingface/transformers';

// Allocate pipeline
const pipe = await pipeline('feature-extraction', 'jinaai/jina-embeddings-v2-base-code');
```

- Notebooks
- Google Colab
- Kaggle
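The `model.similarity(embeddings, embeddings)` call in the sentence-transformers snippet returns one score per pair of inputs. As a rough illustration of what such a score matrix contains, here is a dependency-free sketch that assumes cosine similarity, the library's usual default; the model's own configuration could select a different function. The helper names `cosine_similarity` and `similarity_matrix` are hypothetical and are not part of sentence-transformers.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_matrix(vectors):
    """Pairwise scores: entry [i][j] compares vectors[i] with vectors[j]."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Toy 2-dimensional "embeddings" standing in for real model output (assumption).
toy_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = similarity_matrix(toy_embeddings)
print(len(scores), len(scores[0]))
```

Three inputs give a 3 x 3 matrix, which matches the `[3, 3]` shape printed by the snippet above.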
Update README.md
README.md
CHANGED
```diff
@@ -22,7 +22,7 @@ tags:
 ## Intended Usage & Model Info
 
 `jina-embeddings-v2-base-code` is an multilingual **embedding model** speaks **English and 30 widely used programming languages**.
-
+Same as other jina-embeddings-v2 series, it supports **8192** sequence length.
 
 `jina-embeddings-v2-base-code` is based on a Bert architecture (JinaBert) that supports the symmetric bidirectional variant of [ALiBi](https://arxiv.org/abs/2108.12409) to allow longer sequence length.
 The backbone `jina-bert-v2-base-code` is pretrained on the [github-code](https://huggingface.co/datasets/codeparrot/github-code) dataset.
```
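The README text in this diff credits the long sequence length to a symmetric bidirectional variant of ALiBi. The following is a minimal sketch of that idea, not the model's actual implementation. It assumes the bias added to each attention score is the negative absolute token distance scaled by a per-head slope, and that the slopes follow the geometric schedule from the ALiBi paper for a power-of-two head count. The function names `alibi_slopes` and `symmetric_alibi_bias` are hypothetical.

```python
def alibi_slopes(num_heads):
    """Per-head slopes 2^(-8/n), 2^(-16/n), ... for a power-of-two head count n."""
    if num_heads <= 0 or num_heads & (num_heads - 1):
        raise ValueError("this sketch only handles a power-of-two number of heads")
    start = 2.0 ** (-8.0 / num_heads)
    return [start ** (h + 1) for h in range(num_heads)]


def symmetric_alibi_bias(seq_len, num_heads):
    """Return bias[h][i][j] = -slope_h * |i - j|.

    Using |i - j| instead of a causal (i - j) makes the penalty identical in
    both directions, which suits a bidirectional encoder (assumption).
    """
    return [
        [[-slope * abs(i - j) for j in range(seq_len)] for i in range(seq_len)]
        for slope in alibi_slopes(num_heads)
    ]


bias = symmetric_alibi_bias(seq_len=4, num_heads=8)
print(bias[0][0])  # penalties from token 0 to tokens 0..3 for the first head
```

Because the bias depends only on the distance between positions rather than on learned position embeddings, the same formula applies to any sequence length, which is the property the README points to when it mentions longer sequences.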