Introduction to Azure Event Hubs
- Microsoft Azure Event Hubs is a managed platform service that can take in large amounts of data for various scenarios.
- It is a highly scalable, low-latency data ingestion system. Data is ingested in the form of events.
- Event Publishers submit data to the Event Hub, and Event Consumers consume the data at their own pace.
- Some scenarios where Event Hubs apply are application instrumentation, user experience, and the Internet of Things (IoT).
- Event Hubs reside in the Service Bus namespace.
- Event Hubs uses AMQP and HTTP as its primary API interfaces.
The diagram below gives a high-level overview of where Event Hubs sit:
Partitions in Event Hubs
- A partition is an ordered sequence of events that resides in an Event Hub. Newer events are added to the end of the sequence as they arrive.
- Partitions retain data for a configured period of time. This setting is common across all partitions in the Event Hub.
- Every partition is populated at its own pace and not necessarily sequentially. Hence, the data in each partition grows independently.
- The number of partitions is specified when the Event Hub is created. This number should be between 2 and 32. The default is 4.
- The number of partitions you choose relates mainly to the number of concurrent consuming applications you expect to have.
- This partition count cannot be changed once the event hub is created.
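The partition rules above can be modeled in a few lines. This is only an in-memory sketch, not the service's implementation: the class names `Partition` and `EventHub`, the tuple layout of a stored event, and the one-day default retention are assumptions made for illustration. The 2 to 32 range and the default of 4 come from the text.

```python
import time
from collections import deque

# Values quoted in the text above.
MIN_PARTITIONS, MAX_PARTITIONS, DEFAULT_PARTITIONS = 2, 32, 4


class Partition:
    """An ordered, append-only sequence of events (hypothetical model)."""

    def __init__(self):
        self.events = deque()  # items are (sequence_number, enqueued_at, body)
        self.next_sequence = 0

    def append(self, body, now=None):
        # New events always go to the end of the sequence.
        now = time.time() if now is None else now
        self.events.append((self.next_sequence, now, body))
        self.next_sequence += 1

    def expire(self, retention_seconds, now=None):
        # Drop events older than the retention period, oldest first.
        now = time.time() if now is None else now
        while self.events and now - self.events[0][1] > retention_seconds:
            self.events.popleft()


class EventHub:
    """A fixed set of partitions sharing one retention setting."""

    def __init__(self, partition_count=DEFAULT_PARTITIONS, retention_seconds=86400):
        if not MIN_PARTITIONS <= partition_count <= MAX_PARTITIONS:
            raise ValueError("partition count must be between 2 and 32")
        # A tuple, because the count cannot change after creation.
        self.partitions = tuple(Partition() for _ in range(partition_count))
        self.retention_seconds = retention_seconds

    def expire(self, now=None):
        # The same retention period applies to every partition.
        for partition in self.partitions:
            partition.expire(self.retention_seconds, now)
```

Each partition keeps its own sequence numbers, which is why partitions can grow at different rates without affecting one another.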
So who are event publishers?
- The entities that send data to an Event Hub are the event publishers.
- Event publishers can publish data to the event hubs either using HTTPS or AMQP 1.0.
- Event publishers use a SAS (Shared Access Signature) token to authenticate themselves to Event Hubs.
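As a rough illustration of the HTTPS path, the sketch below builds (but does not send) a publish request with Python's standard library. The URL shape and the `Content-Type` value follow the Event Hubs REST "send event" operation as I understand it, so treat them as assumptions to verify against the official reference. The function name `build_send_request`, the namespace and hub names, and the token value are all hypothetical.

```python
import json
import urllib.request


def build_send_request(namespace, hub_name, sas_token, payload):
    """Build an HTTPS POST that would publish one event (not sent here)."""
    # Assumed endpoint shape: https://{namespace}.servicebus.windows.net/{hub}/messages
    url = f"https://{namespace}.servicebus.windows.net/{hub_name}/messages"
    body = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(url, data=body, method="POST")
    # The SAS token travels in the Authorization header.
    request.add_header("Authorization", sas_token)
    request.add_header(
        "Content-Type", "application/atom+xml;type=entry;charset=utf-8"
    )
    return request


# Hypothetical names and a placeholder token, for illustration only.
request = build_send_request(
    "example-ns", "example-hub", "SharedAccessSignature sr=...", {"temperature": 21.5}
)
```

Passing `request` to `urllib.request.urlopen` would perform the call; that step is left out because it needs a real namespace and a valid token.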
Common Tasks for a Publisher
Acquire an SAS Token
- SAS is the authentication mechanism for Event Hubs. Service Bus provides SAS policies at the namespace level and at the Event Hub level.
- When a token is presented, Service Bus can recompute the signature from the policy key and so authenticate the sender.
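A SAS token is an HMAC-SHA256 signature over the URL-encoded resource URI and an expiry time, computed with the policy key. The sketch below follows the commonly documented token layout (`sr`, `sig`, `se`, `skn`). The function name `generate_sas_token`, the `now` parameter (added so the result is reproducible), and the example URI, policy name, and key are assumptions for illustration.

```python
import base64
import hashlib
import hmac
import time
import urllib.parse


def generate_sas_token(resource_uri, key_name, key, ttl_seconds=3600, now=None):
    """Return a SAS token string signed with the given policy key."""
    issued_at = time.time() if now is None else now
    expiry = int(issued_at + ttl_seconds)
    encoded_uri = urllib.parse.quote_plus(resource_uri)
    # The string to sign is the encoded URI, a newline, then the expiry.
    string_to_sign = f"{encoded_uri}\n{expiry}".encode("utf-8")
    digest = hmac.new(key.encode("utf-8"), string_to_sign, hashlib.sha256).digest()
    signature = urllib.parse.quote_plus(base64.b64encode(digest).decode("ascii"))
    return (
        f"SharedAccessSignature sr={encoded_uri}"
        f"&sig={signature}&se={expiry}&skn={key_name}"
    )


# Hypothetical resource, policy name, and key.
token = generate_sas_token(
    "https://example-ns.servicebus.windows.net/example-hub",
    "send-policy",
    "not-a-real-key",
    now=1000,
)
```

Because the service holds the same policy key, it can recompute the signature from the `sr` and `se` values and compare it with `sig`, without the key ever being sent over the wire.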
Publishing an Event
- Service Bus provides an EventHubClient class for publishing events to an Event Hub from .NET clients.
- Events can either be published individually or batched.
- A single publication is limited to 256 KB, whether it is an individual event or a batch. Publishing anything larger than this results in an error.
Partition Key
- A partition key is a value used to map incoming messages to specific partitions for data organization purposes.
- It is a sender-supplied value passed to the Event Hub.
- It is processed through a hashing function, which produces the partition assignment.
- Partition Keys are important for organizing data for downstream processing.
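The publisher-side rules above, the 256 KB publication limit and hash-based partition assignment, can be sketched as follows. The text does not say which hashing function the service uses, so SHA-256 is a stand-in here, and `assign_partition` and `batch_events` are hypothetical helpers. The size check also ignores any envelope overhead a real batch would carry.

```python
import hashlib

MAX_PUBLICATION_BYTES = 256 * 1024  # limit quoted in the text


def assign_partition(partition_key, partition_count):
    """Map a key to a partition index; SHA-256 is a stand-in hash."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def batch_events(events, max_bytes=MAX_PUBLICATION_BYTES):
    """Group byte payloads into batches that each fit within max_bytes."""
    batches, current, current_size = [], [], 0
    for event in events:
        if len(event) > max_bytes:
            raise ValueError("event exceeds the publication size limit")
        # Start a new batch when adding this event would overflow the current one.
        if current and current_size + len(event) > max_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(event)
        current_size += len(event)
    if current:
        batches.append(current)
    return batches


# Events with the same key always land on the same partition.
first = assign_partition("device-42", 4)
second = assign_partition("device-42", 4)
```

Keeping one device's events on one partition preserves their relative order, which is what makes the key useful for downstream processing.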
The diagram below explains how partition keys work:
Event Consumers
- An entity that reads event data from an Event Hub is an event consumer.
- All event consumers read the event stream through the partitions, as part of a consumer group.
- Each partition should only have one active reader at a time.
Consumer Groups
- A consumer group is a view (state, position, or offset) of the entire Event Hub.
- Consumer groups let each consuming application have its own separate view of the entire Event Hub.
- There is always a default consumer group. You can create up to 20 consumer groups in an Event Hub.
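To make the idea of a "view" concrete, here is a minimal in-memory sketch in which each consumer group keeps its own read position per partition. The `ConsumerGroup` class, the group names, and the use of list indices in place of real offsets are assumptions for illustration only.

```python
class ConsumerGroup:
    """A named view over a hub: one read position per partition."""

    def __init__(self, name, partition_count):
        self.name = name
        self.positions = [0] * partition_count

    def read(self, partitions, partition_id, max_events=10):
        # Read from this group's own position, then advance only that position.
        start = self.positions[partition_id]
        events = partitions[partition_id][start:start + max_events]
        self.positions[partition_id] = start + len(events)
        return events


# Two partitions of already-stored events (plain lists stand in for the hub).
partitions = [["e0", "e1", "e2"], ["f0"]]
dashboard = ConsumerGroup("dashboard", partition_count=2)
archive = ConsumerGroup("archive", partition_count=2)

dashboard_first = dashboard.read(partitions, 0, max_events=2)
archive_first = archive.read(partitions, 0)
dashboard_second = dashboard.read(partitions, 0)
```

Reading through one group never moves another group's position, so a slow application does not hold back a fast one.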
Stream Offsets & Checkpointing
- An offset is the position of an event within a partition. It is a client-side marker that specifies the point from which processing should start.
- Consumers should store their own offsets.
- Checkpointing is a process where readers mark their position in a partition in the event hubs.
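Checkpointing can be sketched as "process an event, then persist the next position". The example below stores checkpoints in a local JSON file. `FileCheckpointStore`, `process_partition`, and the use of list indices as offsets are assumptions made for illustration; a production consumer would normally keep checkpoints in durable shared storage.

```python
import json
import os


class FileCheckpointStore:
    """Persist one offset per partition in a small JSON file."""

    def __init__(self, path):
        self.path = path

    def load(self):
        if not os.path.exists(self.path):
            return {}
        with open(self.path, encoding="utf-8") as handle:
            return json.load(handle)

    def save(self, partition_id, offset):
        checkpoints = self.load()
        checkpoints[str(partition_id)] = offset  # JSON object keys are strings
        with open(self.path, "w", encoding="utf-8") as handle:
            json.dump(checkpoints, handle)


def process_partition(events, partition_id, store, handle_event):
    """Resume from the last checkpoint and record progress after each event."""
    start = store.load().get(str(partition_id), 0)
    for offset in range(start, len(events)):
        handle_event(events[offset])
        # Checkpoint the position of the next event still to be processed.
        store.save(partition_id, offset + 1)
```

Checkpointing after every event keeps reprocessing to a minimum after a restart, at the cost of one write per event. Checkpointing less often trades fewer writes for more repeated work.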
Common Consumer Tasks
- All consumers connect to the Event Hub via AMQP 1.0. It is a session and state-aware bidirectional communication channel.
- Because this is a partitioned consumer model, only one consumer can be active on a partition at a time within a consumer group.
- The following data is read from the Event Hub with each event:
  - Sequence Number
  - User Properties
  - System Properties
- As mentioned above, it is the consumer’s responsibility to maintain its own offset.
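The "one active reader per partition within a consumer group" rule can be pictured as a small lease table. This is an in-memory illustration only, not how the service enforces the rule. The `PartitionLeases` class and the reader and group names are hypothetical.

```python
class PartitionLeases:
    """Track which reader currently owns each (consumer group, partition) pair."""

    def __init__(self):
        self._owners = {}  # (consumer_group, partition_id) -> reader name

    def acquire(self, consumer_group, partition_id, reader):
        # The first reader to ask becomes the owner; others are refused.
        key = (consumer_group, partition_id)
        owner = self._owners.setdefault(key, reader)
        return owner == reader

    def release(self, consumer_group, partition_id, reader):
        # Only the current owner can give the partition up.
        key = (consumer_group, partition_id)
        if self._owners.get(key) == reader:
            del self._owners[key]


leases = PartitionLeases()
reader_a_ok = leases.acquire("$Default", 0, "reader-a")
reader_b_ok = leases.acquire("$Default", 0, "reader-b")
other_group_ok = leases.acquire("archive", 0, "reader-b")
```

The key includes the consumer group, so the same partition can be read at the same time by readers that belong to different groups.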
So now you know about Event Hubs!
Azure Event Hubs provides a highly scalable, low-latency telemetry processing service that can be used for common application scenarios.
In the next part of this blog, I’ll take a technical look at Event Hubs, in which Dynamics CRM publishes data to an Event Hub and applications consume it. Watch this space for the upcoming post! Hope this overview was helpful.