Create embeddings in SQL2025-John Liu Blog

SQL2025 has the ability to use local ONNX model on the server to create embeddings. Following are steps for configuration.

1. If this feature is still developer preview feature, run following SQL scrit to enable preview features

ALTER DATABASE SCOPED CONFIGURATION SET PREVIEW_FEATURES = ON;

2. Enable AI runtime on SQL2025

EXEC sp_execute 'external AI runtime enabled',1;
RECONFIGURE WITH OVERRIDE;

3. Set up the ONNX runtime library

Create a local folder, say C:\onnx_runtime, to store the runtime libarary files. Then download the ONNX runtime, and unzip and copy the onnxruntime.dll (located in the lib directory) to C:\onnx_runtime foder.

4. Set up the tokenization library

Download and build the tokenizers-cpp libarary from GitHub. Once the dll is created, copy the tokenizer into C:\onnx_runtime foder. Ensure the dll is named tokenizers_cpp.dll. Or directly downloaded the compiled dll from here.

5. Download the ONNX model

Create folder C:\onnx_runtime\model. Clone the all-MiniLM-L6-v2-onnx model from Hugging Face to the model folder. You can use any other model.

The all-MiniLM-L6-v2-onnx model is a sentence-transformers model that maps sentences and paragraphs to a 384-dimensional dense vector space. It has a max sequence length of 256 tokens.

cd C:\onnx_runtime\model
git clone https://huggingface.co/nsense/all-MiniLM-L6-v2-onnx

6. Set directory permission

Grant access to the ONNX runtime directory to MSSQLLaunchpad user, for default instance, it’s MSSQLLaunchpad. For named instance, it’s MSSQLLaunchpad$, for example MSSQLLaunchpad$SQLExp_2025.

$AIExtPath = "C:\onnx_runtime";
$Acl = Get-Acl -Path $AIExtPath
$AccessRule = New-Object System.Security.AccessControl.FileSystemAccessRule('MSSQLLaunchpad', "FullControl", "ContainerInherit,ObjectInherit", "None","Allow")
$Acl.AddAccessRule($AccessRule)
Set-Acl -Path $AIExtPath -AclObject $Acl

7. Create the external model

The “PARAMETERS” value used here is a placeholder needed for SQL Server 2025.

LOCATION should point to the directory containing model.onnx and tokenizer.json files.

LOCAL_RUNTIME_PATH should point to directory containing onnxruntime.dll and tokenizer_cpp.dll files.

CREATE EXTERNAL MODEL myLocalOnnxModel
WITH (
LOCATION = 'C:\onnx_runtime\model\all-MiniLM-L6-v2-onnx',
API_FORMAT = 'ONNX Runtime',
MODEL_TYPE = EMBEDDINGS,
MODEL = 'allMiniLM',
PARAMETERS = '{"valid":"JSON"}',
LOCAL_RUNTIME_PATH = 'C:\onnx_runtime\'
);

8. Generate embeddings

Use the ai_generate_embeddings function to test the model:

SELECT ai_generate_embeddings (N'Test Text' USE MODEL myLocalOnnxModel);

9. For troubleshooting, we can enable telemetry (extended events).

CREATE EVENT SESSION newevt
ON SERVER
ADD EVENT ai_generate_embeddings_airuntime_trace
(
ACTION (sqlserver.sql_text, sqlserver.session_id)
)
ADD TARGET package0.ring_buffer
WITH (MAX_MEMORY = 4096 KB, EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS, MAX_DISPATCH_LATENCY = 30 SECONDS, TRACK_CAUSALITY = ON, STARTUP_STATE = OFF);
GO
ALTER EVENT SESSION newevt ON SERVER STATE = START;
GO

Using following SQL query to see the captured telemetry/events:

SELECT
event_data.value('(@name)[1]', 'varchar(100)') AS event_name,
event_data.value('(@timestamp)[1]', 'datetime2') AS [timestamp],
event_data.value('(data[@name="model_name"]/value)[1]', 'nvarchar(200)') AS model_name,
event_data.value('(data[@name="phase_name"]/value)[1]', 'nvarchar(100)') AS phase,
event_data.value('(data[@name="message"]/value)[1]', 'nvarchar(max)') AS message,
event_data.value('(data[@name="request_id"]/value)[1]', 'nvarchar(max)') AS session_id,
event_data.value('(data[@name="error_code"]/value)[1]', 'bigint') AS error_code
FROM (
SELECT CAST(target_data AS XML) AS target_data
FROM sys.dm_xe_sessions AS s
JOIN sys.dm_xe_session_targets AS t
ON s.address = t.event_session_address
WHERE s.name = 'newevt'
AND t.target_name = 'ring_buffer'
) AS data
CROSS APPLY target_data.nodes('//RingBufferTarget/event') AS XEvent(event_data);

References:

Create embeddings in SQL Server 2025 RC0 with a local ONNX model on Windows
ONNX Runtime in .NET: Ship Models Without Python Servers

FEATURED TAGS

ai api automation availability availability sets availability zones aws vm azure azure automation runbook azure blob azure cosmos db azure data lake azure function app azure openai azure sign-in azure site recovery azure sql database azure sql db azure subscription azure vm base64 certificate change data capture change tracking chrome clr container cte data api builder data conversion data gateway database role database size date table dax db config derived table diagram direct query disk management disk space docker downtime dtc dynamic m parameter embedding encrypted connection excel excel online execution plan extended events external data fabric fabric capacity failover cluster fk geometry hierarchy httpwebrequest hugo hyper-v incognito mode index infrastructure inline tvf json kql lakehouse linked server live query statistics locking m machine learning machine learning model machine learning services master key mcp mdx memory memory grant mermaid mirrored sql server network network card network category onedrive onnx runtime openrowset p2v parquet performance polybase power automate power bi power bi report tricks power platform power query powershell printer public ip address pyspark python qgis qt designer query performance query plan query troubleshooting r regex replication route s3 schema design scripting self-signed certificate server role sharepoint snowflake software development sofware development spark sql sql agent sql availability group sql error sql failover cluster instance sql index sql openjson sql permission sql recovery sql script sql security sql server sql server admin sql server config sql statistics ssis ssisdb ssl ssl/tls error ssms table expression tempdb tips troubleshooting unicode view visual studio visual studio code vmware wait statistics wi-fi connection issue windows settings