Read text file contents as a single value:
SELECT [BulkColumn]
FROM OPENROWSET(BULK 'C:\TEMP\test.json', SINGLE_CLOB) AS Contents
Read data from Excel:
Option1: Using OPENROWSET. Note: if your column has data more than 255 characters, try to format the column as Text. Also try to change following registry key TypeGuessRows to 0 (default is 8), under path HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office\16.0\Access Connectivity Engine\Engines\Excel (for 64-bit machine) or HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\Microsoft\Office\16.0\Access Connectivity Engine\Engines\Excel (for 32-bit machine). Restart of SQL instance required if change registry setting. Even with this it might still not work.
SELECT *
FROM OPENROWSET(
'Microsoft.ACE.OLEDB.12.0',
'Excel 12.0;Database=C:\Temp\test.xlsx;HDR=YES;IMEX=1',
'SELECT * FROM [YourWorksheetName$]'
);
Option 2: Using Python. To use Pyton in SQL to read Excel, you need to install openpyxl if not already installed.
cd "C:\Program Files\Microsoft SQL Server\MSSQL15.SQLDev_2019\PYTHON_SERVICES\Scripts"
pip install openpyxl
TRUNCATE TABLE Test;
DECLARE @ExcelBinary VARBINARY(MAX);
-- Read the file as binary
SELECT @ExcelBinary = BulkColumn
FROM OPENROWSET(BULK N'C:\Temp\test.xlsx', SINGLE_BLOB) AS ExcelFile;
-- Pass binary to Python script
INSERT INTO Test(C1, C2, C3)
EXEC sp_execute_external_script
@language = N'Python',
@script = N'
import pandas as pd
import io
# Convert input binary to in-memory stream
bytes_io = io.BytesIO(ExcelData)
# Read Excel file from in-memory stream
df = pd.read_excel(bytes_io, sheet_name=''Core Data Model'', engine="openpyxl")
# to handle any potential line feed in the Excel cell value, as Excel stores line feed in XML format values.
df = df.replace({
''_x000D_'': ''\r'', # carriage return
''_x000A_'': ''\n'', # line feed
''_x000D__x000A_'': ''\r\n'' # carriage return + line feed, rarely appears as combined
}, regex=True)
# Return DataFrame to SQL Server
OutputDataSet = df
',
@params = N'@ExcelData varbinary(max)',
@ExcelData = @ExcelBinary;