Azure Databricks – Part 2 – How to read Amazon DynamoDB table data using Notebooks
In this blog, we will learn, step by step, how to connect to AWS DynamoDB and read table data using a Python script.
Step 1: In the left pane, select Azure Databricks. From the Common Tasks, select New Notebook.
Step 2: In the Create Notebook dialog box, enter a name, select Python as the language, and select the Spark cluster that you created earlier.
Step 3: Once the notebook is created, you can write a Python script that connects to AWS DynamoDB using the boto3 client library. To connect to AWS DynamoDB, you must have an AWS access key ID and an AWS secret access key.
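If you would rather not paste the keys directly into the notebook, one option is to read them from environment variables. The snippet below is only a minimal sketch: it assumes the keys have been exposed to the cluster under the standard AWS variable names, and the helper name aws_session_kwargs is made up for illustration.

```python
import os


def aws_session_kwargs(env=None):
    """Build the keyword arguments for boto3.session.Session from environment variables.

    Hypothetical helper: it assumes the standard AWS variable names are set.
    """
    env = os.environ if env is None else env
    # Session keyword argument -> environment variable that holds its value
    mapping = {
        "aws_access_key_id": "AWS_ACCESS_KEY_ID",
        "aws_secret_access_key": "AWS_SECRET_ACCESS_KEY",
        "region_name": "AWS_DEFAULT_REGION",
    }
    missing = [var for var in mapping.values() if not env.get(var)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {kwarg: env[var] for kwarg, var in mapping.items()}
```

In the notebook, the result can be passed straight to the session, for example boto3.session.Session(**aws_session_kwargs()).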
Python script:
# Databricks notebook source
import boto3
import pandas as pd

# Create a session with your AWS credentials and region
session = boto3.session.Session(aws_access_key_id='your AWS access key ID', aws_secret_access_key='your AWS secret access key', region_name='your region')
dynamodb = session.resource("dynamodb")
table = dynamodb.Table("Table Name")

# Scan the table (a single scan call returns at most 1 MB of data)
response = table.scan()
items = response["Items"]

# Load the items into a DataFrame and print them as CSV text
data = pd.DataFrame(items)
output = data.to_csv(index_label="idx", encoding="utf-8")
print(output)
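Note that a single scan call returns at most 1 MB of data, so the script above may return only part of a larger table. When more data remains, the response includes a LastEvaluatedKey, which can be passed back as ExclusiveStartKey to fetch the next page. The following is a rough sketch of that loop; the function name scan_all_items is hypothetical, and it takes the table object created above as its argument.

```python
def scan_all_items(table, **scan_kwargs):
    """Collect every item from a table by following scan pagination.

    Hypothetical helper: `table` is expected to be a boto3 DynamoDB Table resource.
    """
    items = []
    while True:
        response = table.scan(**scan_kwargs)
        items.extend(response.get("Items", []))
        # LastEvaluatedKey is present only when there are more pages to read
        last_key = response.get("LastEvaluatedKey")
        if last_key is None:
            break
        scan_kwargs["ExclusiveStartKey"] = last_key
    return items
```

With this in place, items = scan_all_items(table) could replace the two lines that call table.scan() and read response["Items"]. Keep in mind that scanning a large table reads every item and consumes read capacity accordingly.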
Step 4: Now you can check the output by pressing Shift + Enter or by clicking Run Cell.
I hope this helps.