
Pulkit Mehta

pulkitmehtawork

AI & ML interests

None yet

Recent Activity



reacted to merve's post with 🚀 23 days ago

Post

6601

deepseek-ai/DeepSeek-OCR is out! 🔥 my take ⤵️
> pretty insane it can parse and re-render charts in HTML
> it uses CLIP and SAM features concatenated, so better grounding
> very efficient vision-token-to-performance ratio
> covers 100 languages

4 replies

updated a model 29 days ago

pulkitmehtawork/bart_summarizer

Updated 29 days ago

published a model 29 days ago

pulkitmehtawork/bart_summarizer

Updated 29 days ago

liked a model 4 months ago

PhysicsWallahAI/Aryabhata-1.0

Text Generation • 8B • Updated Aug 13 • 6.64k • 104

updated a model 5 months ago

pulkitmehtawork/sparse-distilbert-base-uncased-python-code-lightening

Feature Extraction • 67M • Updated Jul 4 • 5

published a model 5 months ago

pulkitmehtawork/sparse-distilbert-base-uncased-python-code-lightening

Feature Extraction • 67M • Updated Jul 4 • 5

liked a model 5 months ago

prithivida/Splade_PP_en_v1

Feature Extraction • Updated Jun 30 • 30.5k • 29

reacted to tomaarsen's post with 🔥 5 months ago

Post

3106

‼️Sentence Transformers v5.0 is out! The biggest update yet introduces Sparse Embedding models, encode methods improvements, Router module for asymmetric models & much more. Sparse + Dense = 🔥 hybrid search performance! Details:

1️⃣ Sparse Encoder Models
Brand new support for sparse embedding models that generate high-dimensional embeddings (30,000+ dims) where <1% are non-zero:

- Full SPLADE, Inference-free SPLADE, and CSR architecture support
- 4 new modules, 12 new losses, 9 new evaluators
- Integration with @elastic-co , @opensearch-project , @NAVER LABS Europe, @qdrant , @IBM , etc.
- Decode interpretable embeddings to understand token importance
- Hybrid search integration to get the best of both worlds
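To make the "30,000+ dims where <1% are non-zero" idea concrete, here is a minimal, library-free sketch. It is not the Sentence Transformers API: the `SparseVec` type, the helper names, the dimension indices, and the weights are all hypothetical and chosen only for illustration. A sparse embedding is stored as a mapping from dimension index to weight, and relevance is a dot product over the dimensions two vectors share.

```python
from typing import Dict

# Hypothetical representation: dimension index -> non-zero weight.
SparseVec = Dict[int, float]

VOCAB_SIZE = 30_000  # assumed dimensionality, roughly a vocabulary size


def sparse_dot(a: SparseVec, b: SparseVec) -> float:
    """Dot product that only visits dimensions present in both vectors."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller vector
    return sum(w * b[i] for i, w in a.items() if i in b)


def density(vec: SparseVec, dims: int = VOCAB_SIZE) -> float:
    """Fraction of dimensions that are non-zero."""
    return len(vec) / dims


# Made-up embeddings for one query and two documents.
query = {101: 1.5, 2054: 0.5, 7592: 2.0}
doc_a = {101: 1.0, 7592: 1.0, 12000: 0.25}
doc_b = {42: 3.0, 2054: 1.0}

score_a = sparse_dot(query, doc_a)  # overlaps on dims 101 and 7592
score_b = sparse_dot(query, doc_b)  # overlaps on dim 2054 only
```

Because each non-zero dimension typically corresponds to a vocabulary token in SPLADE-style models, inspecting the largest weights is what makes such embeddings interpretable.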

2️⃣ Enhanced Encode Methods & Multi-Processing
- New encode_query & encode_document methods that automatically use predefined prompts
- No more manual pool management - just pass device list directly to encode()
- Much cleaner and easier to use than the old multi-process approach

3️⃣ Router Module & Advanced Training
- Router module with different processing paths for queries vs documents
- Custom learning rates for different parameter groups
- Composite loss logging - see individual loss components
- Perfect for two-tower architectures

4️⃣ Comprehensive Documentation & Training
- New Training Overview, Loss Overview, API Reference docs
- 6 new training example documentation pages
- Full integration examples with major search engines
- Extensive blogpost on training sparse models

Read the comprehensive blogpost about training sparse embedding models: https://huggingface.co/blog/train-sparse-encoder

See the full release notes here: https://github.com/UKPLab/sentence-transformers/releases/v5.0.0

What's next? We would love to hear from the community! What sparse encoder models would you like to see? And what new capabilities should Sentence Transformers handle - multimodal embeddings, late interaction models, or something else? Your feedback shapes our roadmap!

commented on Training and Finetuning Sparse Embedding Models with Sentence Transformers v5 5 months ago

Great work. The best parts are the interpretability and the speed. @tomaarsen, I am planning to fine-tune a model for text-to-code retrieval with the setup below. Please advise whether this is a reasonable starting point or whether there is anything I can tune to do better. The goal is decent text-to-code performance, evaluated on CoIR (https://github.com/CoIR-team/coir).
Training dataset: claudios/code_search_net, filtered to Python code. The query is the code's docstring and the passage is the code. Loss: SparseMultipleNegativesRankingLoss. I can't think of a decent dev evaluation; should I use SparseTripletEvaluator? Also, is a query plus a positive passage enough, since I believe the negatives will be all the other passages in the batch, or do I have to prepare the data explicitly (mine negatives)? Please guide.
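The in-batch negatives idea raised in the comment can be sketched without any library. The code below is an illustration of the general multiple-negatives ranking objective, not the actual `SparseMultipleNegativesRankingLoss` implementation: the function names, the toy 2-dimensional vectors, and the `scale` default are assumptions. For each query, its paired passage is the positive and every other passage in the batch is treated as a negative, so (query, positive) pairs alone are enough to compute the loss.

```python
import math
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(u, v))


def in_batch_negatives_loss(
    queries: List[Sequence[float]],
    passages: List[Sequence[float]],
    scale: float = 1.0,  # assumed value; the library default may differ
) -> float:
    """Mean cross-entropy where passage i is the positive for query i
    and all other passages in the batch act as negatives."""
    assert len(queries) == len(passages)
    losses = []
    for i, q in enumerate(queries):
        logits = [scale * dot(q, p) for p in passages]
        m = max(logits)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy batch: each query matches its own passage.
queries = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]
# Same passages in the wrong order, so every "positive" is a mismatch.
misaligned = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = in_batch_negatives_loss(queries, aligned)
loss_misaligned = in_batch_negatives_loss(queries, misaligned)
```

One caveat with this objective on code data: if two docstrings in the same batch describe near-identical functions, one pair's positive becomes a false negative for the other, which is one reason explicitly mined hard negatives are sometimes added later.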