SQL2025 has the ability to use local ONNX model on the server to create embeddings. Following are steps for configuration.
1. If this feature is still developer preview feature, run following SQL scrit to enable preview features
ALTER DATABASE SCOPED CONFIGURATION SET PREVIEW_FEATURES = ON;
2. Enable AI runtime on SQL2025
EXEC sp_execute 'external AI runtime enabled',1;
RECONFIGURE WITH OVERRIDE;
3. Set up the ONNX runtime library
Create a local folder, say C:\onnx_runtime, to store the runtime libarary files. Then download the ONNX runtime, and unzip and copy the onnxruntime.dll (located in the lib directory) to C:\onnx_runtime foder.
4. Set up the tokenization library
Download and build the tokenizers-cpp libarary from GitHub. Once the dll is created, copy the tokenizer into C:\onnx_runtime foder. Ensure the dll is named tokenizers_cpp.dll. Or directly downloaded the compiled dll from here.
5. Download the ONNX model
Create folder C:\onnx_runtime\model. Clone the all-MiniLM-L6-v2-onnx model from Hugging Face to the model folder. You can use any other model.
The all-MiniLM-L6-v2-onnx model is a sentence-transformers model that maps sentences and paragraphs to a 384-dimensional dense vector space. It has a max sequence length of 256 tokens.
cd C:\onnx_runtime\model
git clone https://huggingface.co/nsense/all-MiniLM-L6-v2-onnx
6. Set directory permission
Grant access to the ONNX runtime directory to MSSQLLaunchpad user, for default instance, it’s MSSQLLaunchpad. For named instance, it’s MSSQLLaunchpad$, for example MSSQLLaunchpad$SQLExp_2025.
$AIExtPath = "C:\onnx_runtime";
$Acl = Get-Acl -Path $AIExtPath
$AccessRule = New-Object System.Security.AccessControl.FileSystemAccessRule('MSSQLLaunchpad', "FullControl", "ContainerInherit,ObjectInherit", "None","Allow")
$Acl.AddAccessRule($AccessRule)
Set-Acl -Path $AIExtPath -AclObject $Acl
7. Create the external model
The “PARAMETERS” value used here is a placeholder needed for SQL Server 2025.
LOCATION should point to the directory containing model.onnx and tokenizer.json files.
LOCAL_RUNTIME_PATH should point to directory containing onnxruntime.dll and tokenizer_cpp.dll files.
CREATE EXTERNAL MODEL myLocalOnnxModel
WITH (
LOCATION = 'C:\onnx_runtime\model\all-MiniLM-L6-v2-onnx',
API_FORMAT = 'ONNX Runtime',
MODEL_TYPE = EMBEDDINGS,
MODEL = 'allMiniLM',
PARAMETERS = '{"valid":"JSON"}',
LOCAL_RUNTIME_PATH = 'C:\onnx_runtime\'
);
8. Generate embeddings
Use the ai_generate_embeddings function to test the model:
SELECT ai_generate_embeddings (N'Test Text' USE MODEL myLocalOnnxModel);
9. For troubleshooting, we can enable telemetry (extended events).
CREATE EVENT SESSION newevt
ON SERVER
ADD EVENT ai_generate_embeddings_airuntime_trace
(
ACTION (sqlserver.sql_text, sqlserver.session_id)
)
ADD TARGET package0.ring_buffer
WITH (MAX_MEMORY = 4096 KB, EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS, MAX_DISPATCH_LATENCY = 30 SECONDS, TRACK_CAUSALITY = ON, STARTUP_STATE = OFF);
GO
ALTER EVENT SESSION newevt ON SERVER STATE = START;
GO
Using following SQL query to see the captured telemetry/events:
SELECT
event_data.value('(@name)[1]', 'varchar(100)') AS event_name,
event_data.value('(@timestamp)[1]', 'datetime2') AS [timestamp],
event_data.value('(data[@name="model_name"]/value)[1]', 'nvarchar(200)') AS model_name,
event_data.value('(data[@name="phase_name"]/value)[1]', 'nvarchar(100)') AS phase,
event_data.value('(data[@name="message"]/value)[1]', 'nvarchar(max)') AS message,
event_data.value('(data[@name="request_id"]/value)[1]', 'nvarchar(max)') AS session_id,
event_data.value('(data[@name="error_code"]/value)[1]', 'bigint') AS error_code
FROM (
SELECT CAST(target_data AS XML) AS target_data
FROM sys.dm_xe_sessions AS s
JOIN sys.dm_xe_session_targets AS t
ON s.address = t.event_session_address
WHERE s.name = 'newevt'
AND t.target_name = 'ring_buffer'
) AS data
CROSS APPLY target_data.nodes('//RingBufferTarget/event') AS XEvent(event_data);
References:
Create embeddings in SQL Server 2025 RC0 with a local ONNX model on Windows
ONNX Runtime in .NET: Ship Models Without Python Servers