Category Archives: Thought Leadership Article
Inside SmartPitch: How CloudFronts Built an Enterprise-Grade AI Sales Agent Using Microsoft and Databricks Technologies
1. Why SmartPitch? – The Idea and Pain Point

The idea for SmartPitch came directly from observing the day-to-day struggles of sales and pre-sales teams. Every Marketing Qualified Lead (MQL) to Sales Qualified Lead (SQL) conversion required hours of manual work: searching through documents stored in SharePoint, combing through case studies, aligning them with solution areas, and finally packaging them into a client-ready pitch deck.

The reality was that documents across systems – SharePoint, Dynamics 365, PDFs, PPTs – remained underutilized because there was no intelligent way to bring them together. Sales teams often relied on tribal knowledge or reused existing decks with limited personalization.

We asked: what if a sales assistant could automatically pull the right case studies, map them to solution areas, and draft an elevator pitch on demand, in minutes? That became the SmartPitch vision: an AI-powered agent that does exactly this. The product has since helped us reduce pitch creation time by 70%.

2. The First Prototype – Custom Copilot Studio

Our first step was to build SmartPitch using Custom Copilot Studio. It gave us a low-code way to experiment with conversational flows, integrate with Azure AI Search, and provide sales teams with a chat interface. The build covered:

1. Knowledge Sources Integration
2. Data Flow
3. Conversational Flow Design
4. Integration and Security
5. Technical Stack
6. Business Process Enablement
7. Early Prototypes

With Custom Copilot, we successfully demoed these early prototypes in Zurich and New York. They showed that the idea worked, but they also revealed serious limitations.

3. Challenges in Custom Copilot

Despite proving the concept, Custom Copilot Studio had critical shortcomings: it lacked support for model fine-tuning or advanced RAG customization, and incorporating complex external APIs or custom workflows was difficult. This meant SmartPitch, in its Copilot form, couldn't scale to meet enterprise standards.

4. Rebuilding in Azure AI Foundry – Smarter, Extensible, Connected

The next phase was Azure AI Foundry, Microsoft's enterprise AI development platform. Unlike Copilot Studio, AI Foundry gave us far more control over models, orchestration, and integration.

Extending SmartPitch with Logic Apps

One of the biggest upgrades was the ability to integrate Azure Logic Apps as external tools for the agent. This modular approach meant we could add new functionality simply by publishing a new Logic App; no redeployment of SmartPitch was required.

Automating Document Vectorization

We also solved one of the biggest bottlenecks – document ingestion and retrieval – by building a pipeline for automatic document vectorization from SharePoint. This allowed SmartPitch to search across text, images, tables, and PDFs, providing relevant answers instead of keyword matches. A minimal sketch of what such a pipeline can look like follows.
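To make the pattern concrete, here is a small, hedged sketch, not the production SmartPitch code: extract text from a SharePoint document, embed it, and upsert it into an Azure AI Search index that has a vector field. The endpoint, index name, key, and field names are illustrative assumptions.

```python
# Illustrative document-vectorization step (not the production pipeline).
# Assumes: text already extracted from the SharePoint file, an Azure AI
# Search index with a vector field named "contentVector", and an OpenAI
# embedding model. All names and keys below are placeholders.
import requests
from openai import OpenAI

openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment

SEARCH_ENDPOINT = "https://<search-service>.search.windows.net"
SEARCH_INDEX = "smartpitch-docs"      # hypothetical index name
SEARCH_API_KEY = "<admin-api-key>"

def embed(text: str) -> list[float]:
    """Convert extracted document text into an embedding vector."""
    resp = openai_client.embeddings.create(
        model="text-embedding-3-small", input=text
    )
    return resp.data[0].embedding

def index_document(doc_id: str, title: str, content: str) -> None:
    """Upsert one document chunk into the Azure AI Search index."""
    payload = {
        "value": [{
            "@search.action": "mergeOrUpload",
            "id": doc_id,
            "title": title,
            "content": content,
            "contentVector": embed(content),  # vector field on the index
        }]
    }
    resp = requests.post(
        f"{SEARCH_ENDPOINT}/indexes/{SEARCH_INDEX}/docs/index",
        params={"api-version": "2023-11-01"},
        headers={"api-key": SEARCH_API_KEY},
        json=payload,
        timeout=30,
    )
    resp.raise_for_status()
```

In the actual pipeline this kind of step runs automatically whenever a file lands in SharePoint (for example, triggered by a Logic App), so the index stays current without manual re-uploads.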
But There Were Limitations

Even with these improvements, we hit roadblocks. At this point, we realized the true bottleneck wasn't the agent itself; it was the quality of the data powering it.

5. Bad Data, Governance, and the Medallion Architecture

SmartPitch's performance was only as good as the data it retrieved from, and much of the enterprise data was dirty: duplicate case studies, outdated documents, inconsistent file formats. This led to irrelevant or misleading responses in pitches.

To address this, we turned to Databricks' Unity Catalog and Medallion Architecture. (You can read our post on building a clean data foundation with Medallion Architecture [Link].) Now, every result SmartPitch surfaced could be trusted, audited, and tied to a governed source.

6. SmartPitch in Mosaic AI – The Final Evolution

The last stage was migrating SmartPitch into Databricks Mosaic AI, part of the Lakehouse AI platform. This was where SmartPitch matured into an enterprise-grade solution.

What We Gained in Mosaic AI

In Mosaic AI, SmartPitch wasn't just a chatbot; it became a data-native enterprise sales assistant. From this journey, we mapped the differences between agent development in Azure AI Foundry and Databricks Mosaic AI:

| Attribute / Aspect | Azure AI Foundry | Mosaic AI |
|---|---|---|
| Focus | Developers and data scientists | Data engineers, analysts, and data scientists |
| Core Use Case | Create and manage your own AI agent | Build, experiment, and deploy data-driven AI models with analytics + AI workflows |
| Interface | Code-first (SDKs, REST APIs, notebooks) | No-code/low-code UI + notebooks + APIs |
| Data Access | Azure Blob, Data Lake, vector DBs | Native integration with Databricks Lakehouse, Delta Lake, Unity Catalog, vector DBs |
| MCP Server | Only custom MCP servers supported; built-in option complex | Native MCP support within the Databricks ecosystem; simpler setup |
| Models | 90 models available | Access to open-source + foundation models (MPT, Llama, Mixtral, etc.) + partner models |
| Model Customization | Full model fine-tuning, prompt engineering, RAG | Fine-tuning, instruction tuning, RAG, model orchestration |
| Publish to Channels | Complex (Azure Bot SDK + Bot Framework + App Service) | Direct integration with Databricks workflows, APIs, dashboards, and third-party apps |
| Agent Update | Real-time updates in Microsoft Teams | Updates deployed via Databricks workflows; versioning and rollback supported |
| Key Capabilities | Prompt flow orchestration, RAG, model choice, vector search, CI/CD pipelines, Azure ML & responsible AI integration | Data + AI unification (native to the Lakehouse), RAG with Lakehouse data, multi-model orchestration, fine-tuning, end-to-end ML pipelines, secure governance via Unity Catalog, real-time deployment |
| Key Components | Workspace & agent orchestration, 90+ models, OpenAI pay-as-you-go or self-hosted, security via Azure identity | Mosaic AI Agent Framework, Model Serving, Fine-Tuning, Vector Search, RAG Studio, Evaluation & Monitoring, Unity Catalog integration |
| Cost / License | Vector DB: external; model serving: token-based pricing (GPT-3.5, GPT-4); fine-tuning: case-by-case; total agent cost variable (~$5k–$7k+/month) | Vector Search: $605–$760/month for 5M vectors; model serving: $90–$120 per million tokens; fine-tuning Llama 3.3: $146–$7,150; managed compute built into DBU usage; end-to-end AI agent ~$5k–$7k+/month |
| Use Cases / Capabilities | Agents are intelligent and can interact with/modify responses; single AI Search instance per agent; infrastructure setup required; custom MCP server registration | Agents are intelligent and can interact with/modify responses; AI search via APIs (Google/Bing); built-in MCP server; complex infrastructure; slower responses, as results are sent in batches |
| Development Approach | Low-code, faster agent creation, SDK-based, easier experimentation | Manual coding using the MLflow library, more customization, API integration, higher chance of errors, slower build |
| Models Comparison | 90 models, Azure OpenAI (GPT-3.5, GPT-4), multi-modal | ~10 base models, OSS & partner models (Llama, Claude, Gemma); many models don't support tool usage |
| Knowledge Source | One knowledge source of each type (adding a new one replaces the previous) | No limitation; supports data cleaning via Medallion Architecture; SQL-only access inside the agent; Spark/PySQL not supported in the agent |
| Memory / Context Window | 8K–128K tokens (up to 1M for GPT-4.1) | Moderate, not specified |
| Modalities | Text, code, vision, audio (some models) | Likely text-only |
| Special Enhancements | Turbo efficiency, reasoning, tool calling, multimodal | Varies per model (Llama, Claude, Gemma architectures) |
| Availability | Deployed via Azure AI Foundry | Through the Databricks platform |
| Limitations | Only one knowledge source of each type; infrastructure complexity for the MCP server | No multi-modal support; Spark/PySQL not available in the agent; slower batch responses; limited model count; high manual development effort |

7. Lessons Learned

…
Before You Add AI, Fix Your Foundations: How to Prepare Your Data for Intelligent Tools
Everyone wants AI. Few are ready for it. The question isn't "When do we start?" but "Are we prepared to get it right?" Because switching on Copilots without fixing your foundations doesn't accelerate you; it amplifies chaos. This article covers how to fix your foundations for AI so that the AI tools you deploy are accurate and reliable.

Challenges of Deploying AI Directly

Deploying AI directly on top of your business applications comes with common challenges, and these issues can render the AI implementation a failure, immediately dismissing trust in using AI at all. But they can be overcome once the foundations of AI are in place, which we'll discuss in the next section.

Foundation of AI

At CloudFronts, we call this the 3 Pillars of AI Readiness. Here's how I sum up the foundation of systems for AI: when CloudFronts helped Tinius Olsen modernize their systems, for example, the focus wasn't just technical uplift or upgrading from legacy systems. It was about ensuring every business process was cloud-ready so AI models could actually trust the data. This is the foundation that needs to be in place before AI can be implemented at your organization.

Data & AI Maturity Curve by Databricks

With the above foundations in place for your AI adoption strategy, and the right framework chosen for your implementation, the Data & AI Maturity Curve can be used to see where your organization sits on the curve and where you want to get to. At a high level, the foundation lets you look back at the data to see what has happened in the past, and AI tools can surface this information accurately. Further, once trust is established, you can have AI predict the future state of operations, prescribe steps, and even take decisions on your behalf, provided you really want that to happen. It might be too soon just yet.

To conclude, AI success = Foundations × Trust. Without modern systems, connected data, and governed access, AI is just noise. But with these in place, every AI tool you deploy, whether predictive analytics or Copilots, becomes an accelerator for decision-making, not a distraction.

Before you deploy AI, fix your foundations. If you're serious about making AI a trusted accelerator, not a costly experiment, start with modernization, connection, and governance. At CloudFronts, we help enterprises build these foundations with confidence. Let's connect over email: Transform@cloudfronts.com
From Manual Orders to Global Growth: How E-Commerce + ERP Integration Transformed this company
In today's global manufacturing landscape, businesses need more than just strong products to stay competitive. They need digital operations that connect customers, distributors, and internal teams across regions. One powerful way to achieve this is by integrating e-commerce platforms with enterprise resource planning (ERP) systems. This is the story of a 140-year-old global leader in materials testing machine manufacturing that transformed its order-taking process through a Shopify–Dynamics 365 Finance & Operations integration.

The Challenge

With offices in five countries and sales across the UK, Europe, China, India, and multiple U.S. territories, this manufacturer had a truly global footprint. Yet order-taking remained manual and inefficient. In short: their legacy setup couldn't keep up with modern customer expectations or their own ambitions for global growth.

The Solution

Over the course of a decade-long partnership, we helped the company modernize and digitize its business processes. The centerpiece was a seamless integration between Shopify and Dynamics 365 Finance & Operations (F&O), built natively within F&O (no recurring middleware costs). The key integrations ensured that high data volumes and complex processing demands could be handled efficiently within F&O.

The Results

The change has reshaped how the company works.

Lessons for Other Global Manufacturers

This journey highlights critical lessons for manufacturers, distributors, and global businesses alike.

The Road Ahead

After integrating Shopify with Dynamics 365 F&O, the company has launched a dedicated distributor website where approved distributors can place orders directly on behalf of customers. This portal creates a new revenue stream, strengthens the distribution network, and ensures orders flow into F&O with the same automation, inventory sync, and reporting as direct sales. By extending digital integration to distributors, the company is simplifying order-taking while expanding its business model for global growth.

Ending Thoughts

The journey of this global manufacturer shows that true digital transformation isn't about adding more tools; it's about connecting the right ones. By integrating Shopify with Dynamics 365 F&O, they moved from fragmented, manual processes to a scalable, automated ecosystem that empowers customers, distributors, and internal teams alike.

For any organization operating across regions, the lesson is clear: e-commerce and ERP should not live in silos. When they work together, they create a foundation that not only accelerates order-taking but also unlocks new revenue streams, sharper insights, and stronger global relationships. In a world where speed, accuracy, and customer experience define competitiveness, the question isn't whether you can afford to integrate; it's whether you can afford not to.

What's next: don't let manual processes slow you down. Connect with us at transform@cloudfronts.com and let's design an integration roadmap tailored for your business.
Setting Up Unity Catalog in Databricks for Centralized Data Governance
The fastest way to lose control of enterprise data? Managing governance separately across workspaces. Unity Catalog solves this with one centralized layer for security, lineage, and discovery.

Data governance is crucial for any organization looking to manage and secure its data assets effectively. Databricks' Unity Catalog is a centralized solution that provides a unified interface for managing access control, auditing, data lineage, and discovery. This blog will guide you through the process of setting up Unity Catalog in your Databricks workspace.

What is Unity Catalog?

Unity Catalog is Databricks' answer to centralized data governance. It enables organizations to enforce standards-compliant security policies, apply fine-grained access controls, and visualize data lineage across multiple workspaces. It ensures compliance and promotes efficient data management.

Key Features:

1] Standards-Compliant Security: ANSI SQL-based access policies that apply across all workspaces in a region.
2] Fine-Grained Access Control: Support for row- and column-level permissions.
3] Audit Logging: Tracks who accessed what data and when.
4] Data Lineage: Provides visualization of data flow and dependencies.

Unity Catalog Object Hierarchy

Before diving into the setup, it's important to understand the hierarchical structure of Unity Catalog:

1] Catalogs: The top-level container (e.g., Production, Development) that represents an organizational unit or environment.
2] Schemas: Logical groupings of tables, views, and AI models within a catalog.
3] Tables and Views: These include managed tables fully governed by Unity Catalog and external tables referencing existing cloud storage.

Here is the procedure to set up a Unity Catalog metastore with Azure Storage, as I have done for one of our products (the SmartPitch Sales & Marketing Agent):

1] First, create a storage account with the primary service set to "Azure Blob Storage or Azure Data Lake Storage Gen2". Performance and redundancy can be chosen based on the requirement for which the Databricks service is being used. For my Mosaic AI agent, I used locally redundant storage and Data Lake Gen2.
2] Once the storage account is created, ensure that you have enabled "Hierarchical Namespace". When creating a Unity Catalog metastore with Azure Blob Storage, Hierarchical Namespace (HNS) is required because Unity Catalog needs:
a] A folder-like structure to organize catalogs, schemas, and tables.
b] Atomic operations (rename, move, delete) on directories and files.
c] POSIX-style access controls for fine-grained permissions.
d] Faster metadata handling for lineage and governance.
HNS turns Azure Blob into ADLS Gen2, which supports these features.
3] Upload any raw/unclean files you will need in Databricks to your metastore folder in blob storage.
4] Create a Unity Catalog connector in the Azure Portal and assign it the "Storage Blob Data Contributor" role.
5] Assign CORS (Cross-Origin Resource Sharing) settings for that storage account. This is necessary because, without CORS configured, Databricks cannot communicate with your storage container to read/write managed tables, schema metadata, or logs.
6] Generate a SAS token.
7] Navigate to your workspace and select "Manage Account"; this should be done by the account admin.
8] Select the Catalog tab on the left and click "Create Metastore".
9] Assign a name, a region (the same as the workspace), the path to the storage account, and the connector ID.
10] Once the metastore is created, assign it to a workspace.
11] Once this is done, the catalogs, and the schemas and tables within them, can be created, as sketched below.
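To make step 11 concrete, here is a small, hedged example of what creating that hierarchy can look like from a Databricks notebook. The `spark` session is ambient in notebooks; the catalog, schema, table, and group names are illustrative (borrowed from the SmartPitch examples referenced later), not a prescribed layout.

```python
# Illustrative only: create a catalog -> schema -> table hierarchy under
# the newly attached metastore, then grant fine-grained access.
# `spark` is the SparkSession Databricks provides in every notebook.
spark.sql("CREATE CATALOG IF NOT EXISTS knowledge_base")
spark.sql("CREATE SCHEMA IF NOT EXISTS knowledge_base.raw")
spark.sql("""
    CREATE TABLE IF NOT EXISTS knowledge_base.raw.case_studies (
        id        STRING,
        title     STRING,
        content   STRING,
        modified  TIMESTAMP
    )
""")

# ANSI SQL-based, fine-grained access control; the group name is a placeholder.
spark.sql("GRANT SELECT ON TABLE knowledge_base.raw.case_studies TO `data-analysts`")
```

Because the metastore is centralized, these objects and grants apply across every workspace attached to it, which is exactly what the Hive Metastore below cannot do.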
How does Unity Catalog differ from Hive Metastore?

| Feature | Hive Metastore | Unity Catalog |
|---|---|---|
| Scope | Workspace- or cluster-specific | Centralized; spans multiple workspaces and regions |
| Architecture | Single metastore tied to Spark/Hive | Cloud-native service integrated with Databricks |
| Object Hierarchy | Databases → Tables → Partitions | Catalogs → Schemas → Tables/Views/Models |
| Data Assets Supported | Tables, views | Tables, views, files, ML models, dashboards |
| Security | Basic GRANT/DENY at database/table level | Fine-grained, ANSI SQL-based (catalog, schema, table, column, row) |
| Lineage | Not available | Built-in lineage and impact analysis |
| Auditing | Limited or external | Integrated audit logs across workspaces |
| Storage Management | Points to storage locations; no governance | Manages external and managed tables with governance |
| Cloud Integration | Primarily cluster storage or external paths | Secure integration with ADLS Gen2, S3, GCS |
| Permissions Model | Spark SQL statements | Attribute- and role-based access, unified policies |
| Use Cases | Basic metadata store for Spark/Hive workloads | Enterprise-wide data governance, sharing, and compliance |

To conclude, Unity Catalog is the next-generation governance and metadata solution for Databricks, designed to give organizations a single, secure, and scalable way to manage data and AI assets. Unlike the older Hive Metastore, it centralizes control across multiple workspaces, supports fine-grained access policies, delivers built-in lineage and auditing, and integrates seamlessly with cloud storage like Azure Data Lake, S3, or GCS.

When setting it up, the key steps include:

1] Creating a metastore and linking it to your workspaces.
2] Enabling hierarchical namespace on Azure storage for folder-level security and operations.
3] Configuring CORS to allow Databricks domains to interact with storage.
4] Defining catalogs, schemas, and tables for structured governance.

Business Outcomes of Unity Catalog

By implementing Unity Catalog, organizations achieve stronger security, better compliance, and faster data discovery, making the Databricks environment enterprise-ready for analytics and AI.

Why now? As data volumes and regulatory requirements grow, organizations can no longer rely on fragmented or legacy governance tools. Unity Catalog offers a future-proof foundation for unified data management and AI governance, essential for any modern data-driven enterprise.

At CloudFronts, we help enterprises implement and optimize Unity Catalog within Databricks to ensure secure, compliant, and scalable enterprise data governance. Book a consultation with our experts to explore how Unity Catalog can simplify compliance and boost productivity for your teams. Contact us today at Transform@cloudfronts.com to get started.

To learn more about the functionalities of Databricks and other Azure AI services, please refer to my other blogs:

1] The Hidden Cost of Bad Data: How Strong Data Management Unlocks Scalable, Accurate AI – CloudFronts
2] Automating Document Vectorization from SharePoint Using Azure Logic Apps and Azure AI Search – CloudFronts
3] Using Open AI and Logic Apps to develop a Copilot agent for …
From Data Chaos to Clarity: The Path to Becoming AI-Ready
Most organizations don't fail because of bad AI models; they fail because their data isn't clear. When your data is consistent, connected, and governed, your systems begin to "do the talking" for your teams. That's when deals close faster, billing runs itself, projects stay on track, and leaders decide with confidence.

In our earlier article on whether your tech stack is holding you back from achieving AI success, we discussed the importance of building the right foundation. This time, we go deeper, because even the best stack cannot deliver without data clarity. This blueprint shows how to move from fragments to structure to precision.

The Reality Check

For many businesses today, operations still depend on spreadsheets, scattered files, and long email threads. Reports exist in multiple versions, important numbers are tracked manually, and teams spend more time reconciling than acting. This is not a tool problem; it is a clarity problem. If your inventory lives in spreadsheets, and if the "latest report" means three versions shared over email, no algorithm will save you. What scales is clarity: one language for data, one source of truth, and one way to connect systems and decisions.

💡 If you cannot trust your data today, you will second-guess your decisions tomorrow.

What Data Clarity Really Means (in Plain Business Terms)

Clarity is not a buzzword. It makes data usable and enables scalability, giving every AI-ready business its foundation.

💡 Clarity is not another software purchase. It is a shared business agreement across systems and teams.

From Chaos to Clarity: The Three Levels of Readiness

Data clarity evolves step by step. Think of it as moving from fragments to gemstones, to jewels.

Level 1: Chaos (fragments everywhere). Data is spread across applications, spreadsheets, inboxes, and vendor portals. Duplicates exist, numbers conflict, and no one fully trusts the information.

Level 2: Structure (the gemstone taking shape). Core entities like Customers, Products, Projects, and Invoices are standardized and mapped across systems. Data is stored in structured tables or databases. Reporting becomes stable, handovers reduce, and everyone begins pulling from a shared source.

Level 3: Composable Insights (precision). Data is modular, reusable, and intelligent. It feeds forecasting, guided actions, and proactive alerts. Business leaders move from asking "what happened?" to acting on "what should we do next?"

[Image: Data Fragments to Jewel Levels]

💡 Fragments are refined into gemstones, and gemstones are polished into jewels. Data clarity and maturity are the jewels of your business. In fragments, they hold little value.

The Minimum Readiness Stack (Microsoft First)

You don't need a long list of tools. You need a stack that works together and grows with you.

💡 A small, well-integrated tech stack outperforms a large, disconnected toolkit, every time.

What Data Clarity Unlocks

Clarity does not just organize your data. It transforms how systems work with each other and how your business operates, with benefits enabled across departments.

💡 When systems talk, your teams stop chasing data. They gain clarity. And clarity means speed, control, and growth.

Governance That Accelerates

Governance is not red tape; it is acceleration. Effective governance translates directly into business impact.

💡 Trust in data is engineered, not assumed.
Outcomes That Matter to the Business

Clarity shows up in business outcomes, not just dashboards:

- Tinius Olsen: Migrating from TIBCO to Azure Logic Apps for seamless integration between D365 Field Service and Finance & Operations – CloudFronts
- BÜCHI's customer-centric vision accelerates innovation using Azure Integration Services | Microsoft Customer Stories

💡 Once clarity is established, every new process, report, or integration lands faster, cleaner, and with more impact.

To conclude, getting AI-ready is not about chasing new models. It is about making your data consistent, connected, and governed so that systems quietly remove manual work while surfacing what matters. All of this leads to one truth: clarity is the foundation for AI readiness. This is where Dynamics 365, Power Platform, and Azure integrations shine, providing a Microsoft-first path from fragments to precision.

If your business is ready to move from fragments to precision, let's talk at transform@cloudfronts.com. CloudFronts will help you turn data chaos into clarity, and clarity into outcomes.
The Hidden Cost of Bad Data: How Strong Data Management Unlocks Scalable, Accurate AI
Key Takeaways

1. Bad data kills ML performance: duplicate, inconsistent, missing, or outdated data can break production models even if training accuracy is high.
2. The Databricks Medallion Architecture (Raw → Silver → Gold) is essential for turning raw chaos into structured, trustworthy datasets ready for ML & BI.
3. Raw layer: capture all data as-is. Silver layer: clean, normalize, and standardize. Gold layer: curate, enrich, and prepare for modeling or reporting.
4. Structured pipelines reduce compute cost, improve model reliability, and enable proactive monitoring and feature freshness.
5. Data quality is as important as algorithms: invest in transformation and governance for scalable, accurate AI.

While developing a machine learning agent for SmartPitch (a sales/presales/marketing assistant chatbot) within the Databricks ecosystem, I've seen firsthand how unclean, inconsistent, and incomplete data can cripple even the most promising machine learning initiatives. While much attention is given to models and algorithms, what often goes unnoticed is the silent productivity killer lurking in your pipelines: bad data. In the chase to build intelligent applications, developers often overlook the foundational layer: data quality.

Here's the truth: it is difficult to scale machine learning on top of chaos. That's where the Raw → Silver → Gold data transformation framework in Databricks becomes not just useful, but essential.

The Hidden Cost of Bad Data

Imagine you've built a high-performance ML model. It's accurate during training but underperforms in production. Why? Here's what I frequently detect when I scan input pipelines:

- Duplicate records
- Inconsistent data types
- Missing values
- Outdated information
- Schema drift (schema drift introduces malformed, inconsistent, or incomplete data that breaks validation rules and compromises downstream processes)
- Noise in data

These issues inflate compute costs, introduce bias, and produce unstable predictions, resulting in wasted hours debugging pipelines, increased operational risk, and eroded trust in your AI outputs.

Ref.: Data Poisoning: A Silent but Deadly Threat to AI and ML Systems | by Anya Kondamani | nFactor Technologies | Medium

In one example, despite successfully fetching data, the agent was unable to render it because it hit the maximum request limit; bad data was involved. AI agents face issues when brute-forced with raw/unclean data.

But this isn't just a data science problem. It's a data engineering problem. And the solution lies in structured, governed data management, beginning with a robust medallion architecture.

Ref.: Data Intelligence End-to-End with Azure Databricks and Microsoft Fabric | Microsoft Community Hub

Raw → Silver → Gold: The Databricks Way

Databricks ML agents thrive when your data is managed through the medallion architecture, transforming raw chaos into clean, trustworthy features. For those new to it, the medallion architecture in Databricks is a structured data processing framework that progressively improves data quality through bronze (raw), silver (validated), and gold (enriched) layers, enabling scalable and reliable analytics.

Raw Layer: Ingest Everything

The raw layer is where we land all data, regardless of quality. It's your unfiltered feed: logs, events, customer input, third-party APIs, CSV dumps, IoT signals, etc. For example, consider CloudFronts case studies fetched directly from the WordPress API, as in the sketch below.
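The original post shows the full ingestion script as a screenshot; what follows is a minimal, hedged sketch of the same raw-layer idea, landing WordPress content as-is in a raw Delta table. The endpoint, field choices, and table name are illustrative assumptions, not the production code.

```python
# Illustrative raw-layer ingestion: fetch case studies from a WordPress
# REST endpoint and land them, unmodified, in a raw Delta table.
import requests
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Assumed endpoint; a site may expose case studies as a custom post type.
resp = requests.get(
    "https://www.cloudfronts.com/wp-json/wp/v2/posts",
    params={"per_page": 100},
    timeout=30,
)
resp.raise_for_status()
posts = resp.json()

# Raw layer principle: keep everything as-is (HTML noise included),
# so cleaning decisions are deferred to the silver layer.
rows = [
    (str(p["id"]), p["title"]["rendered"], p["content"]["rendered"], p["modified"])
    for p in posts
]
df = spark.createDataFrame(rows, ["id", "title", "html_content", "modified"])
df.write.format("delta").mode("append").saveAsTable("knowledge_base.raw.case_studies")
```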
The full script goes further: it automates the extraction, transformation, and optional upload of WordPress-based case study content into Azure Blob Storage, as well as locally to the DBFS of Databricks, making it ready for further processing (e.g., vector databases, AI ingestion, or analytics). On running the Raw-to-Silver cleaning code in a Databricks notebook, we get far more formatted, relevant, and cleaned fields specific to our requirement, visible in Databricks under Unity Catalog → Schema → Tables.

Silver Layer: Structure and Clean

Once ingested, we promote data to the silver layer, where transformation begins. Think of this layer as the data refinery. Here we start to analyze trends, infer schemas, and prepare for more active feature generation. Once I have removed the noise, I begin to see patterns.

The next script in the pipeline transforms unstructured or semi-clean data from the raw Delta table knowledge_base.raw.case_studies (effectively silver, as much of the cleaning was already done in the previous step) into a gold Delta table, knowledge_base.gold.case_studies, using Apache Spark. The transformation focuses on HTML cleaning, type parsing, and schema standardization, enabling downstream ML or BI workflows. What it does:

- Start a Spark session: initializes a Spark job using SparkSession, enabling distributed data processing within the Databricks environment.
- Define UDFs (user-defined functions) for HTML cleaning and type parsing.
- Read data from the silver table: loads structured data from the Delta table knowledge_base.raw.case_studies, which acts as the silver layer in the medallion architecture.
- Clean and normalize key columns.
- Write to the gold table: saves the final transformed DataFrame to the gold layer as the Delta table knowledge_base.gold.case_studies, using overwrite mode and allowing schema updates.

As you can see, we have maintained separate schemas for raw, silver, gold, etc., and at gold we finally get fully cleaned, noiseless data suited to our requirement.

Gold Layer: Ready for ML & BI

At the gold layer, data becomes a polished product, curated for specific use cases: ML models, dashboards, reports, or APIs. This is the layer where scale and accuracy become feasible, because the model finally has clean, enriched, semantically meaningful data to learn from.

Ref.: What is a Medallion Architecture?

We can make retrieval even more efficient with vectorization. Once we reach this stage and test again in the Agent Playground, we no longer face the error seen previously, because it is now easy for the agent to retrieve gold-standard vectorized data.

Why This Matters for AI at Scale

The difference between "good enough" and "state of the art" often hinges on data readiness. Here's how strong data management impacts real-world outcomes:

| Without Medallion Architecture | With Raw → Silver → Gold |
|---|---|
| Data drift goes undetected | Proactive schema monitoring |
| Models degrade silently | Continuous feature freshness |
| Expensive debugging cycles | Clean lineage via Delta Lake |
| Inconsistent outputs | Predictable, testable results |

(Delta Lake is an open-source storage layer that brings ACID transactions, scalable metadata handling, and unified batch and streaming processing to data lakes, enabling reliable analytics on massive datasets.)

…
Is Your Tech Stack Holding You Back from AI Success?
The AI Race Has Begun, but Most Businesses Are Crawling

Artificial Intelligence (AI) is no longer experimental; it's operational. Across industries, companies are trying to harness it to improve decision-making, automate intelligently, and gain a competitive edge. But here's the problem: only 48% of AI projects ever make it to production (Gartner, 2024). It's not because AI doesn't work. It's because most tech stacks aren't built to support it.

The Real Bottleneck Isn't AI. It's Your Foundation

You may have data. You may even have AI tools. But if your infrastructure isn't AI-ready, you'll stay stuck in POCs that never scale. The common signs you're blocked show up beneath the surface: in your data pipelines, infrastructure, and architecture. Most machine learning systems fail not because of poor models, but because of broken data and infrastructure pipelines.

What Does an AI-Ready Tech Stack Look Like?

Being AI-ready means preparing your infrastructure, data, and processes to fully support AI capabilities. This is not a checklist or a quick fix. It is a structured alignment of technology and business goals across the following areas:

| Area | Traditional Stack | AI-Ready Stack | Why It Matters |
|---|---|---|---|
| Infrastructure | On-premises servers, outdated VMs | Azure Kubernetes Service (AKS), Azure Functions, Azure App Services; also AWS EKS, Lambda; GCP GKE, Cloud Run | AI workloads need scalable, flexible compute with container orchestration and event-driven execution |
| Data Handling | Siloed databases, batch ETL jobs | Azure Data Factory, Power Platform connectors, Azure Event Grid, Synapse Link; also AWS Glue, Kinesis; GCP Dataflow, Pub/Sub | Enables real-time, consistent, and automated data flow for training and inference |
| Storage & Retrieval | Relational DBs, Excel, file shares | Azure Data Lake Gen2, Azure Cosmos DB, Microsoft Fabric OneLake, Azure AI Search (with vector search); also AWS S3, DynamoDB, OpenSearch; GCP BigQuery, Firestore | Modern AI needs scalable object storage and vector DBs for unstructured and semantic data |
| AI Enablement | Isolated scripts, manual ML | Azure OpenAI Service, Azure Machine Learning, Copilot Studio, Power Platform AI Builder; also AWS SageMaker, Bedrock; GCP Vertex AI, AutoML; OpenAI, Hugging Face | Simplifies AI adoption with ready-to-use models, tools, and MLOps pipelines |
| Security & Governance | Basic firewall rules, no audit logs | Microsoft Entra (Azure AD), Microsoft Purview, Microsoft Defender for Cloud, Compliance Manager, Dataverse RBAC; also AWS IAM, Macie; GCP Cloud IAM, DLP API | Ensures responsible AI use, regulatory compliance, and data protection |
| Monitoring & Ops | Manual monitoring, limited observability | Azure Monitor, Application Insights, Power Platform Admin Center, Purview Audit Logs; also AWS CloudWatch, X-Ray; GCP Ops Suite; Datadog, Prometheus | AI success depends on observability across infrastructure, pipelines, and models |

In summary: AI-readiness is not a buzzword, and not a checklist. It's an architectural reality.

Why This Matters Now

AI is moving fast, and so are your competitors. But success doesn't depend on building your own LLM or becoming a data science lab. It depends on whether your systems are ready to support intelligence at scale. If your tech stack can't deliver real-time data, run scalable AI, and ensure trust, your AI ambitions will stay just that: ambitions.

How We Help

We work with organizations across industries to build the architecture that enables action, whether you're just starting or scaling AI across teams.
Because AI success isn't about plugging in a tool. It's about building a foundation where intelligence thrives. I hope you found this blog useful, and if you would like to discuss anything, you can reach out to us at transform@cloudfronts.com.
Building the AI Bridge: How CloudFronts Helps You Connect Systems That Talk to Each Other
When we say we are building a bridge, does it mean something isn't connected? And what is it that isn't connected? It's AI itself and your systems. This means that although your AI can access your systems to derive information, it's still unreliable and slow.

What is needed for AI to be successful?

For AI to be successful, there are failure patterns to avoid, and to eliminate them we must have a "catalog" layer which houses all business data together so that a common vocabulary is established between systems. AI then pulls from this data catalog to perform agentic actions. At a high level, this is how it looks, and all of it is defined by how well the integrations between these systems are established.

How CloudFronts Can Help

CloudFronts has deep integration expertise in connecting cloud-based applications with each other. Often, the ready-made, plug-and-play cloud-based integration solutions come with hefty licensing that keeps going up every few years. Using such integration tools not only affects cash flow but also adds a layer of opaqueness, as we don't control the flow of integration and cannot granularize it beyond what's offered. Custom integration gives you better control and analytics, which ready-made solutions can't. Here's a CloudFronts case study published by Microsoft, wherein we connected multiple systems for our customer, driving data and insights.

To conclude, the AI agents meant for your organization aren't optimized to work right away. This disconnect needs to be engineered just like any other implementation project today. As this gap is real and must be filled by something like a Unity Catalog plus integrations, CloudFronts can help bridge it and make AI work for your organization, so you can continue to optimize cash flow against rising costs.

We hope you found this blog useful, and if you would like to discuss anything, you can reach out to us at transform@cloudfronts.com.
Struggling with Siloed Systems? Here's How CloudFronts Gets You Connected
In today’s world, we use many different applications for our daily work. One single application can’t handle everything because some apps are designed for specific tasks. Thatās why organizations use multiple applications, which often leads to data being stored separately or in isolation. In this blog, weāll take you on a journey from siloed systems to connected systems through a customer success story. About BĆCHI Büchi Labortechnik AG is a Swiss company renowned for providing laboratory and industrial solutions for R&D, quality control, and production. Founded in 1939, Büchi specializes in technologies such as: Their equipment is widely used in pharmaceuticals, chemicals, food & beverage, and academia for sample preparation, formulation, and analysis. Büchi is known for its precision, innovation, and strong customer support worldwide. Systems Used by BĆCHI To streamline operations and ensure seamless collaboration, BĆCHI leverages a variety of enterprise systems: Infor and SAP Business One are utilized for managing critical business functions such as finance, supply chain, manufacturing, and inventory. Reporting Challenges Due to Siloed Systems Organizations often rely on multiple disconnected systems across departments ā such as ERP, CRM, marketing platforms, spreadsheets, and legacy tools. These siloed systems result in: The Need for a Single Source of Truth To solve these challenges, itās critical to establish a Single Source of Truth (SSOT) ā a central, trusted data platform where all key business data is: How We Helped Büchi Connect Their Systems To build a seamless and scalable integration framework, we leveraged the following Azure services: >Azure Logic Apps ā Enabled no-code/low-code automation for integrating applications quickly and efficiently. >Azure Functions ā Provided serverless computing for lightweight data transformations and custom logic execution. >Azure Service Bus ā Ensured reliable, asynchronous communication between systems with FIFO message processing and decoupling of sender/receiver availability. >Azure API Management (APIM) ā Secured and simplified access to backend services by exposing only required APIs, enforcing policies like authentication and rate limiting, and unifying multiple APIs under a single endpoint. BĆCHI’s case study was published on the Microsoft website, highlighting how CloudFronts helped connect their systems and prepare their data for insights and AI-driven solutions. Why a Single Source of Truth (SSOT) Is Important A Single Source of Truth means having one trusted location where your business stores consistent, accurate, and up-to-date data. Key Reasons It Matters: How we did this We used Azure Function Apps, Service Bus, and Logic Apps to seamlessly connect the systems. Databricks was implemented to build a Unity Catalog, establishing a Single Source of Truth (SSOT). On top of this unified data layer, we enabled advanced analytics and reporting using Power BI. In May, we hosted an event with BĆCHI at the Microsoft Office in Zurich. During the session, one of the attending customers remarked, “We are five years behind BĆCHI.” Another added, “If we donāt start now, weāll be out of the race in the future.” This clearly reflects the urgent need for businesses to evolve. Today, Connected Systems, a Single Source of Truth (SSOT), Advanced Analytics, and AI are not optional ā they are essential for sustainable growth and improved human efficiency. 
In May, we hosted an event with BÜCHI at the Microsoft office in Zurich. During the session, one of the attending customers remarked, "We are five years behind BÜCHI." Another added, "If we don't start now, we'll be out of the race in the future." This clearly reflects the urgent need for businesses to evolve. Today, connected systems, a Single Source of Truth (SSOT), advanced analytics, and AI are not optional; they are essential for sustainable growth and improved human efficiency.

The pace of transformation has accelerated: tasks that once took months can now be achieved in days, and soon, perhaps, with just a prompt.

To conclude, if you're operating with multiple disconnected systems and relying heavily on manual processes, it's time to rethink your approach. System integration and automation free your teams from repetitive work and empower them to focus on high-impact, strategic activities.

We hope you found this blog useful, and if you would like to discuss anything, you can reach out to us at transform@cloudfronts.com.
Getting Your Organization’s Data Ready for AI
Since the turn of 2025, AI has been thrown around a lot in conversations, both individual and at an organizational level, and major technology providers have launched their own suites of tools to build AI agents. While these tools are good enough for simpler AI use cases, like fetching data from systems and presenting it to us, complex use cases, like predicting patterns, collating data from multiple systems, and deriving insights from connected systems, need AI implementations to be treated as projects: architected and implemented with the organization's vision of AI. Let's look at how we can make sure that AI implementations give us over 95% accuracy, rather than answers we merely assume to be correct.

Is AI enough by itself?

A common perception is that AI agents are simply deployed on top of applications and can then interact with the underlying systems to do whatever users need. This perception stems from our use of AI tools like ChatGPT, Claude, and Gemini, which interact with the Internet to answer our queries; since these tools are available independently, there's no technical setup and they are ready to go.

Whether a Copilot is enough on its own depends on where the data is sourced from and what the intent of the agent is. If your custom Copilot or AI agent is meant to look only at some SharePoint files, some websites, and one system within your M365 gated access, you should be able to attach the knowledge sources and let the AI agent give you the information in the format you need.

The challenge arises when you expect AI agents to make sense of data that is stored differently in different systems with different naming conventions. That's when AI agents fall short, because they cannot understand that when you point to an "Account" in CRM, the same record is stored as a "Customer" in Business Central. This is where something like a Unity Catalog comes into the picture. The term itself describes data coming together in a catalog for common access, from which AI agents can source. Let's look at how to picture this unity catalog in the next section.

Unity Catalog

Unity Catalog can be thought of as an implementation strategy and a collection of connected systems on which AI agents can be based. Here's how I summarize the process: AI implementations will scale within organizations and take on different variations of the same pattern.

To encapsulate: while independent AI agents can be implemented for personal use within the organization, given the appropriate privileges, for AI to make sense of data and enable trusted decision-making, AI implementations need data readiness in place, with clarity. Hopefully, this summarizes the direction in which organizations can think about AI implementation, beyond just building agents.

We hope you found this blog useful, and if you would like to discuss anything, you can reach out to us at transform@cloudfronts.com.