Category Archives: Azure Databricks
Struggling with Siloed Systems? Here’s How CloudFronts Gets You Connected
In today’s world, we use many different applications for our daily work. No single application can handle everything, because most apps are designed for specific tasks. That’s why organizations use multiple applications, which often leads to data being stored separately, in isolation. In this blog, we’ll take you on a journey from siloed systems to connected systems through a customer success story.

About BÜCHI

Büchi Labortechnik AG is a Swiss company renowned for providing laboratory and industrial solutions for R&D, quality control, and production. Founded in 1939, Büchi specializes in a range of laboratory technologies, and its equipment is widely used in pharmaceuticals, chemicals, food & beverage, and academia for sample preparation, formulation, and analysis. Büchi is known for its precision, innovation, and strong customer support worldwide.

Systems Used by BÜCHI

To streamline operations and ensure seamless collaboration, BÜCHI leverages a variety of enterprise systems. Infor and SAP Business One are used for managing critical business functions such as finance, supply chain, manufacturing, and inventory.

Reporting Challenges Due to Siloed Systems

Organizations often rely on multiple disconnected systems across departments, such as ERP, CRM, marketing platforms, spreadsheets, and legacy tools. These siloed systems make consistent, organization-wide reporting difficult and leave teams working from conflicting versions of the same data.

The Need for a Single Source of Truth

To solve these challenges, it’s critical to establish a Single Source of Truth (SSOT): a central, trusted data platform where all key business data is consolidated, consistent, and up to date.

How We Helped Büchi Connect Their Systems

To build a seamless and scalable integration framework, we leveraged the following Azure services:

- Azure Logic Apps – Enabled no-code/low-code automation for integrating applications quickly and efficiently.
- Azure Functions – Provided serverless computing for lightweight data transformations and custom logic execution.
- Azure Service Bus – Ensured reliable, asynchronous communication between systems, with FIFO message processing and decoupling of sender/receiver availability (a minimal sketch appears later in this post).
- Azure API Management (APIM) – Secured and simplified access to backend services by exposing only the required APIs, enforcing policies such as authentication and rate limiting, and unifying multiple APIs under a single endpoint.

BÜCHI’s case study was published on the Microsoft website, highlighting how CloudFronts helped connect their systems and prepare their data for insights and AI-driven solutions.

Why a Single Source of Truth (SSOT) Is Important

A Single Source of Truth means having one trusted location where your business stores consistent, accurate, and up-to-date data.

How We Did This

We used Azure Function Apps, Service Bus, and Logic Apps to seamlessly connect the systems. Databricks was implemented to build a Unity Catalog, establishing a Single Source of Truth (SSOT). On top of this unified data layer, we enabled advanced analytics and reporting using Power BI.

In May, we hosted an event with BÜCHI at the Microsoft office in Zurich. During the session, one attending customer remarked, “We are five years behind BÜCHI.” Another added, “If we don’t start now, we’ll be out of the race in the future.” This clearly reflects the urgent need for businesses to evolve. Today, Connected Systems, a Single Source of Truth (SSOT), Advanced Analytics, and AI are not optional; they are essential for sustainable growth and improved human efficiency.
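Before closing, here is the minimal Service Bus sketch referenced above, assuming the azure-servicebus Python SDK. The connection string, queue name, and session id are hypothetical placeholders, not values from the BÜCHI implementation.

from azure.servicebus import ServiceBusClient, ServiceBusMessage

# Hypothetical placeholders; replace with your own namespace and queue.
CONNECTION_STR = "Endpoint=sb://<your-namespace>.servicebus.windows.net/;SharedAccessKeyName=<name>;SharedAccessKey=<key>"
QUEUE_NAME = "erp-sync-queue"  # FIFO ordering requires a session-enabled queue

with ServiceBusClient.from_connection_string(CONNECTION_STR) as client:
    with client.get_queue_sender(queue_name=QUEUE_NAME) as sender:
        # Messages that share a session_id are delivered in FIFO order,
        # and the queue decouples sender and receiver availability.
        sender.send_messages(ServiceBusMessage('{"orderId": 1001}', session_id="sap-b1"))

Messages that share a session_id are consumed in order by a single receiver, and the queue buffers them whenever the downstream system is unavailable, which is exactly the decoupling this kind of integration relies on.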
The pace of transformation has accelerated: tasks that once took months can now be achieved in days, and soon, perhaps, with just a prompt. To conclude, if you’re operating with multiple disconnected systems and relying heavily on manual processes, it’s time to rethink your approach. System integration and automation free your teams from repetitive work and empower them to focus on high-impact, strategic activities. We hope you found this blog useful. If you would like to discuss anything, you can reach out to us at transform@cloudfronts.com.
Azure Databricks – How to read a CSV file from Blob Storage and push the data into a Synapse SQL pool table
In this blog, we will learn how to read a CSV file from Blob Storage and push the data into a Synapse SQL pool table using an Azure Databricks Python script. In Part 1 we created an Azure Synapse Analytics workspace and saw how to create a dedicated SQL pool. In this blog, we will use a JDBC connection string to connect to that SQL pool.

Step 1: Sign in to the Azure portal. Open Azure Databricks and click Launch Workspace.

Step 2: Once the Azure Databricks workspace opens, click New Notebook and select your language; here I have selected Python.

Step 3: Add the following code to connect to your dedicated SQL pool using the JDBC connection string and push the data into a table.

Python script:

# Read a CSV file from Azure Blob Storage and append it to a dedicated
# SQL pool table over JDBC. (spark is provided by the Databricks notebook.)
storage_account_name = 'Your Storage account name'
storage_account_access_key = 'Your Storage account access key'

# Make the storage account key available to Spark for wasbs:// access.
spark.conf.set('fs.azure.account.key.' + storage_account_name + '.blob.core.windows.net', storage_account_access_key)

blob_container = 'Your container name'
filePath = "wasbs://" + blob_container + "@" + storage_account_name + ".blob.core.windows.net/Your CSV file name"

# Load the CSV into a Spark DataFrame, inferring column types from the data.
empDf = spark.read.format("csv").load(filePath, inferSchema=True, header=True)

# JDBC connection string for the dedicated SQL pool (see Step 4).
connectionString = "Your JDBC connection string;encrypt=true;trustServerCertificate=false;rewriteBatchedStatements=true;loginTimeout=30;"

# Append the DataFrame to the [dbo].[Employee] table.
empDf.write.jdbc(connectionString, "[dbo].[Employee]", mode="append")

Step 4: To get the JDBC connection string, first open the Synapse workspace; in the left pane, under Analytics pools, open SQL pools and select your SQL pool. On the Overview page, click the “Show database connection strings” link, open the JDBC tab, and copy the connection string.

Step 5: Now click the Run All button to execute the script. The script will execute successfully; verify the loaded data in the SQL table (a read-back snippet for this check follows at the end of this post).

Hope this will help.
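As a quick sanity check for Step 5, you can read the table back over the same JDBC connection. This is a minimal sketch that reuses the connectionString placeholder from the script above.

# Read the table back through the same JDBC connection to verify the load.
resultDf = spark.read.jdbc(connectionString, "[dbo].[Employee]")
print(resultDf.count())  # should be at least the number of rows in the CSV
resultDf.show(5)         # preview a few rows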
Azure Databricks – Part 1 – How to create an Azure Databricks workspace and a Spark cluster
In this blog, we will learn how to create an Azure Databricks workspace and a Spark cluster step by step using the Azure portal.

Create an Azure Databricks workspace:

Step 1: Sign in to the Azure portal. In the upper-left corner of the home page, select Create a resource. In the Search the Marketplace box, enter Azure Databricks and press Enter.

Step 2: Select Azure Databricks from the search results and click the Create button.

Step 3: Enter the following information:
Subscription
Resource group
Workspace name
Region
Pricing tier

Step 4: Review your entries on the Review + create tab, then click the Create button. Once you click Create, it takes 3 to 4 minutes to create the resource.

Create a Spark cluster in Azure Databricks:

Step 1: In the Azure portal, go to the Databricks workspace that you created, and then click Launch Workspace.

Step 2: You are redirected to the Azure Databricks portal. From the portal, click New Cluster.

Step 3: On the New Cluster page, provide the values to create the cluster. (If you would rather script this step, see the REST API sketch at the end of this post.)

Hope this will help.
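As referenced in Step 3 above, a cluster can also be created programmatically through the Databricks Clusters REST API instead of the UI. This is a sketch only; the workspace URL, personal access token, and cluster settings below are hypothetical placeholders.

import requests

# Hypothetical placeholders; use your own workspace URL and access token.
WORKSPACE_URL = "https://adb-1234567890123456.7.azuredatabricks.net"
TOKEN = "<your-personal-access-token>"

payload = {
    "cluster_name": "demo-cluster",
    "spark_version": "13.3.x-scala2.12",  # pick a runtime version offered in your workspace
    "node_type_id": "Standard_DS3_v2",    # an Azure VM size available to Databricks
    "num_workers": 2,
}

# POST /api/2.0/clusters/create returns the new cluster_id on success.
response = requests.post(
    f"{WORKSPACE_URL}/api/2.0/clusters/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=payload,
)
response.raise_for_status()
print(response.json())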
Azure Databricks – Part 2 – How to read Amazon DynamoDB table data using Notebooks
In this blog, we will learn how to connect to AWS DynamoDB and read table data using a Python script, step by step.

Step 1: In the left pane, select Azure Databricks. From the Common Tasks, select New Notebook.

Step 2: In the Create Notebook dialog box, enter a name, select Python as the language, and select the Spark cluster that you created earlier.

Step 3: Once the notebook is created, you can write a Python script to connect to AWS DynamoDB using the boto3 client library. To connect to AWS DynamoDB you must have an AWS access key ID and an AWS secret access key.

Python script:

# Databricks notebook source
import boto3
import pandas as pd

# Open a boto3 session with your AWS credentials.
session = boto3.session.Session(
    aws_access_key_id='your AWS access key ID',
    aws_secret_access_key='your AWS secret access key',
    region_name='your region',
)

# Connect to DynamoDB and scan the table.
dynamodb = session.resource("dynamodb")
table = dynamodb.Table("Table Name")
response = table.scan()
items = response["Items"]

# Convert the items to a pandas DataFrame and print it as CSV text.
data = pd.DataFrame(items)
output = data.to_csv(index_label="idx", encoding="utf-8")
print(output)

Step 4: Now you can check the output by pressing Shift + Enter or by clicking Run Cell. (For tables larger than 1 MB, see the pagination sketch at the end of this post.)

Hope this will help.
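One caveat to the script above: table.scan() returns at most 1 MB of items per call, so large tables arrive in pages. Here is a minimal sketch of the standard boto3 pagination loop, assuming the same table object from Step 3.

# Keep scanning while DynamoDB returns a LastEvaluatedKey,
# accumulating every page of items (each scan call returns at most 1 MB).
items = []
response = table.scan()
items.extend(response["Items"])
while "LastEvaluatedKey" in response:
    response = table.scan(ExclusiveStartKey=response["LastEvaluatedKey"])
    items.extend(response["Items"])

data = pd.DataFrame(items)
print(len(data))  # total row count across all pages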
