ONNX (Open Neural Network Exchange)
An open standard that represents machine learning models as portable computational graphs, enabling deployment across different frameworks and hardware without retraining. ONNX Runtime, the primary execution engine, optimizes inference through hardware specific execution providers supporting CPU, GPU and specialized accelerators like NPUs. Exporting embedding models to ONNX is a common production optimization, often reducing inference latency significantly compared to native framework execution.
Out-of-Domain
Data or evaluation scenarios that differ from a model's training distribution. Out-of-domain performance is a stronger measure of generalization and more representative of real-world usage, where queries, documents and content types are rarely anticipated during training. A model that degrades significantly out-of-domain is likely overfitting to its training distribution.