UniXcoder ONNX for Code Search
Converted by VibeAtlas - AI Context Optimization for Developers
This is Microsoft's UniXcoder converted to ONNX format for use with Transformers.js in browser and Node.js environments.
Why UniXcoder?
UniXcoder understands code semantically, not just as text:
- Trained on 6 programming languages (Python, Java, JavaScript, PHP, Ruby, Go)
- Understands AST structure and data flow
- 20-30% better code search accuracy than generic text embedding models
Quick Start
Transformers.js (Browser/Node.js)
import { pipeline } from '@huggingface/transformers';

// Load the feature-extraction pipeline with the ONNX model
const embedder = await pipeline(
  'feature-extraction',
  'sailesh27/unixcoder-base-onnx'
);

const code = `function authenticate(user) {
  return user.isValid && user.hasPermission;
}`;

// Mean-pool the token embeddings and L2-normalize the result
const embedding = await embedder(code, {
  pooling: 'mean',
  normalize: true
});

console.log(embedding.dims); // [1, 768]
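The pipeline returns a Transformers.js Tensor rather than a plain array. If you need the raw 768-dimensional vector (for example, to store it in a vector database), a minimal sketch of one way to extract it, assuming the Tensor's tolist() and data accessors:
// Nested-array view: [[v0, v1, ..., v767]] -> take the first (and only) row
const vector = embedding.tolist()[0];
// Alternatively, copy the underlying flat typed array into a plain array
const floats = Array.from(embedding.data);
console.log(vector.length); // 768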
Semantic Code Search
import { pipeline, cos_sim } from '@huggingface/transformers';

const embedder = await pipeline('feature-extraction', 'sailesh27/unixcoder-base-onnx');

// Index your code
const codeSnippets = [
  'function login(user, pass) { ... }',
  'function formatDate(date) { ... }',
  'function validateEmail(email) { ... }'
];
const codeEmbeddings = await embedder(codeSnippets, { pooling: 'mean', normalize: true });

// Search with natural language
const query = 'user authentication';
const queryEmbedding = await embedder(query, { pooling: 'mean', normalize: true });

// Find most similar
const similarities = codeEmbeddings.tolist().map((emb, i) => ({
  code: codeSnippets[i],
  score: cos_sim(queryEmbedding.tolist()[0], emb)
}));
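The map above only scores each snippet against the query. To surface the best match, sort by score in descending order (with normalized embeddings, a higher cosine similarity means a closer match):
// Rank snippets from most to least similar to the query
similarities.sort((a, b) => b.score - a.score);
console.log(similarities[0]); // { code, score } of the best match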
Technical Details
- Architecture: RoBERTa-based encoder
- Hidden Size: 768
- Max Sequence Length: 512 tokens (longer inputs should be split; see the sketch after this list)
- Output Dimensions: 768
- ONNX Opset: 14
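Because the maximum sequence length is 512 tokens, a long source file cannot be represented faithfully by a single embedding; a common workaround is to split the file into chunks and embed each chunk separately. A minimal sketch of that approach, where the 40-line chunk size and the longFileContents variable are illustrative assumptions rather than part of the model's API:
// Split a long source file into fixed-size line chunks before embedding
function chunkByLines(source, linesPerChunk = 40) {
  const lines = source.split('\n');
  const chunks = [];
  for (let i = 0; i < lines.length; i += linesPerChunk) {
    chunks.push(lines.slice(i, i + linesPerChunk).join('\n'));
  }
  return chunks;
}

const chunks = chunkByLines(longFileContents); // longFileContents: the file's text
const chunkEmbeddings = await embedder(chunks, { pooling: 'mean', normalize: true });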
About VibeAtlas
VibeAtlas is the reliability infrastructure for AI coding:
- Reduce AI token costs by 40-60%
- Improve code search accuracy with semantic understanding
- Add governance guardrails to AI workflows
Citation
@misc{unixcoder-onnx-2025,
  title={UniXcoder ONNX: Code Embeddings for JavaScript},
  author={VibeAtlas Team},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/sailesh27/unixcoder-base-onnx}
}
Original UniXcoder Paper
@inproceedings{guo2022unixcoder,
  title={UniXcoder: Unified Cross-Modal Pre-training for Code Representation},
  author={Guo, Daya and Lu, Shuai and Duan, Nan and Wang, Yanlin and Zhou, Ming and Yin, Jian},
  booktitle={ACL},
  year={2022}
}
License
Apache 2.0 (same as original UniXcoder)