Azure Databricks – How to read CSV file from blob storage and push the data into a synapse SQL pool table

In this blog, we will learn how to read CSV file from blob storage and push data into a synapse SQL pool table using Azure Databricks python script. In part1 we created an Azure synapse analytics workspace, dedicated SQL pool in this we have seen how to create a dedicated SQL pool. In this blog, we will use a JDBC connection string to connect the SQL pool.

Step 1: Sign to the Azure portal. Open Azure Databricks and click on lunch workspace to create a new Notebook.

Step 2: Once the Azure Databricks Studio opens click on New Notebook and select your language, here I have selected “Python” language.

Step 3: Add the following code to connect your dedicated SQL pool using the JDBC connection string and push the data into a table.

Python script :

from azure.storage.blob import BlobServiceClient

import pandas as pd

import io

import pyspark.sql

storage_account_name = ‘Your Storage account name’

storage_account_access_key = ‘Your Storage account access key’

spark.conf.set(‘fs.azure.account.key.’ + storage_account_name + ‘.blob.core.windows.net’, storage_account_access_key)

blob_container = ‘Your container name’

filePath = “wasbs://” + blob_container + “@” + storage_account_name + “.blob.core.windows.net/Your CSV file name”

empDf = spark.read.format(“csv”).load(filePath, inferSchema = True, header = True)

connectionString=”Your JDSB connection sting;encrypt=true;trustServerCertificate=false;rewriteBatchedStatements=true;loginTimeout=30;”

empDf.write.jdbc(connectionString,”[dbo].[Employee]”, mode=”append”)

Step 4: You can get the JDBC connection string >> First open Synapse work space on the left pane in Analytics pools open SQL pool. Select your SQL pool, in overview you can find the link “Show database connection strings” and clicked on JDBC tab and copy the connection string.


Step 5: Once your script is ready, click on the Run All button at the top.
Step 6: Once you click on Run All and if you get an error like “ModuleNotFoundError: No module named ‘azure’”. Run the following command on separate command.

Step 7: Now click on the Run All button to execute the main script. Your script will execute successfully and also check in the SQL table.

Hope this will help.


Share Story :

SEARCH BLOGS :

FOLLOW CLOUDFRONTS BLOG :


Secured By miniOrange