SQL Query External Dta

Posted by John Liu on Wednesday, June 4, 2025

Read text file contents as a single value:

SELECT [BulkColumn]
FROM OPENROWSET(BULK 'C:\TEMP\test.json', SINGLE_CLOB) AS Contents

Read data from Excel:

Option1: Using OPENROWSET. Note: if your column has data more than 255 characters, try to format the column as Text. Also try to change following registry key TypeGuessRows to 0 (default is 8), under path HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office\16.0\Access Connectivity Engine\Engines\Excel (for 64-bit machine) or HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\Microsoft\Office\16.0\Access Connectivity Engine\Engines\Excel (for 32-bit machine). Restart of SQL instance required if change registry setting. Even with this it might still not work.

SELECT *
FROM OPENROWSET(
    'Microsoft.ACE.OLEDB.12.0',
    'Excel 12.0;Database=C:\Temp\test.xlsx;HDR=YES;IMEX=1',
    'SELECT * FROM [YourWorksheetName$]'
);

Option 2: Using Python. To use Pyton in SQL to read Excel, you need to install openpyxl if not already installed.

cd "C:\Program Files\Microsoft SQL Server\MSSQL15.SQLDev_2019\PYTHON_SERVICES\Scripts"
pip install openpyxl
TRUNCATE TABLE Test;

DECLARE @ExcelBinary VARBINARY(MAX);

-- Read the file as binary
SELECT @ExcelBinary = BulkColumn
FROM OPENROWSET(BULK N'C:\Temp\test.xlsx', SINGLE_BLOB) AS ExcelFile;

-- Pass binary to Python script
INSERT INTO Test(C1, C2, C3)
EXEC sp_execute_external_script
    @language = N'Python',
    @script = N'
import pandas as pd
import io

# Convert input binary to in-memory stream
bytes_io = io.BytesIO(ExcelData)

# Read Excel file from in-memory stream
df = pd.read_excel(bytes_io, sheet_name=''Core Data Model'', engine="openpyxl")

# to handle any potential line feed in the Excel cell value, as Excel stores line feed in XML format values.
df = df.replace({
    ''_x000D_'': ''\r'',    # carriage return
    ''_x000A_'': ''\n'',    # line feed
    ''_x000D__x000A_'': ''\r\n''  # carriage return + line feed, rarely appears as combined
}, regex=True)

# Return DataFrame to SQL Server
OutputDataSet = df
',
    @params = N'@ExcelData varbinary(max)',
    @ExcelData = @ExcelBinary;