Backend Engineering 2024

Intelligent Notification & Messaging Engine

Designed an event-driven communication system that delivers alerts via SMS, email, and in-app notifications based on business triggers. Includes retry logic, scheduling, user preference management, and template rendering — ensuring reliable message delivery without coupling notification concerns into core business services.

Technology Stack:
Python, Message Queue, Background Workers, API

Problem Statement

Notification logic scattered across business services becomes a maintenance problem: duplicate delivery code, inconsistent retry handling, no central audit trail of what was sent to whom and when, and difficulty adding new channels or respecting user communication preferences. Centralising notifications into a dedicated engine removes this complexity from application code while providing reliability, personalisation, and observability.

Key Challenges:

  • Reliable delivery across third-party channels (SMS, email) with variable uptime
  • User preference management — channel, frequency, and opt-out handling
  • Preventing duplicate delivery on retries
  • Template rendering supporting personalisation tokens and conditional content
  • Scheduling future notifications and cancelling them if the triggering condition resolves

System Architecture

Business services publish notification events to a message queue. The engine consumes events, resolves user preferences, renders templates, and dispatches to the appropriate channel via provider APIs. Background workers handle retries for failed deliveries. A delivery log records every dispatch attempt and outcome.

Event-Driven Intake

Services publish typed notification events (e.g., OrderConfirmed, PaymentFailed, AppointmentReminder) to a queue. The engine consumes events and resolves recipient identity, preferences, and the appropriate template without the publishing service knowing delivery details.
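
As an illustration of the intake contract, the sketch below models a typed event and a publish/consume round trip. It is a minimal sketch, not the production code: the `NotificationEvent` fields, the `publish` and `consume_one` helpers, and the in-process `queue.Queue` standing in for the real message broker are all assumptions made for the example.

```python
import json
import queue
import uuid
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class NotificationEvent:
    """Hypothetical event envelope; field names are illustrative."""
    event_type: str      # e.g. "OrderConfirmed", "PaymentFailed"
    recipient_id: str    # the engine resolves contact details from this ID
    payload: dict        # data made available to the template
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))


# In-process stand-in for the message queue (assumption for the sketch).
event_queue = queue.Queue()


def publish(event):
    """Called by a business service; it knows nothing about channels."""
    event_queue.put(json.dumps(asdict(event)))


def consume_one():
    """Called by the engine; rebuilds the typed event from the wire format."""
    raw = event_queue.get_nowait()
    return NotificationEvent(**json.loads(raw))
```

The publishing service supplies only the event type, recipient, and payload; channel selection and templating stay on the engine side.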

Preference Resolution

Per-user preference store defines allowed channels, quiet hours, frequency limits, and opt-out status per notification category. The engine filters delivery channels before dispatch, ensuring user preferences are respected without burdening the publishing service.
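
The filtering step might look like the following sketch. The `Preferences` shape, the channel names, and the rule that in-app messages are still allowed during quiet hours are assumptions for illustration; frequency limits are left out here.

```python
from dataclasses import dataclass, field
from datetime import time
from typing import Optional


@dataclass
class Preferences:
    """Hypothetical per-user preference record."""
    allowed_channels: set
    opted_out_categories: set = field(default_factory=set)
    quiet_start: Optional[time] = None
    quiet_end: Optional[time] = None


def in_quiet_hours(prefs, now):
    """True if `now` falls inside the user's quiet window."""
    if prefs.quiet_start is None or prefs.quiet_end is None:
        return False
    if prefs.quiet_start <= prefs.quiet_end:
        return prefs.quiet_start <= now < prefs.quiet_end
    # The window wraps past midnight, e.g. 22:00 to 07:00.
    return now >= prefs.quiet_start or now < prefs.quiet_end


def resolve_channels(prefs, category, requested, now):
    """Return the subset of requested channels the user permits right now."""
    if category in prefs.opted_out_categories:
        return []
    channels = [c for c in requested if c in prefs.allowed_channels]
    if in_quiet_hours(prefs, now):
        # Assumption: in-app messages are silent, so they pass quiet hours.
        channels = [c for c in channels if c == "in_app"]
    return channels
```

An empty result means nothing is dispatched for that event, which is how an opt-out is honoured without the publishing service being involved.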

Template Engine

Notification templates support personalisation tokens, conditional blocks, and localisation. Templates are stored and versioned separately from code, allowing content updates without deployment. Previews are available for review before publishing.
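
A rough sketch of versioned, localised lookup with token substitution follows. It uses the standard library's `string.Template` purely for illustration; the template store layout, the `render` helper, and the sample templates are assumptions, and conditional blocks would need a fuller templating library than this example uses.

```python
from string import Template

# Hypothetical template store keyed by (name, locale, version).
# In the described system this lives outside the codebase.
TEMPLATES = {
    ("order_confirmed", "en", 1): "Hi $name, your order $order_id is confirmed.",
    ("order_confirmed", "en", 2): "Hi $name, order $order_id is confirmed. Thanks!",
    ("order_confirmed", "fr", 1): "Bonjour $name, votre commande $order_id est confirmée.",
}


def latest_version(name, locale):
    """Highest published version for a template and locale, or None."""
    versions = [v for (n, loc, v) in TEMPLATES if n == name and loc == locale]
    return max(versions) if versions else None


def render(name, locale, tokens, default_locale="en"):
    """Render the newest version, falling back to the default locale."""
    version = latest_version(name, locale)
    if version is None:
        locale = default_locale
        version = latest_version(name, locale)
    if version is None:
        raise KeyError(f"no template named {name!r}")
    # substitute() raises KeyError on a missing token, so an incomplete
    # payload fails loudly instead of sending a half-filled message.
    return Template(TEMPLATES[(name, locale, version)]).substitute(tokens)
```

For example, `render("order_confirmed", "fr", {"name": "Ada", "order_id": "A1"})` picks the French text, while an unknown locale falls back to the latest English version.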

Delivery & Retry

Each delivery attempt is logged with the provider response. Failed deliveries are retried with exponential backoff up to configurable limits. Idempotency keys prevent duplicate messages on retry. Persistent failure triggers an alert for investigation.
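
The retry loop can be sketched roughly as below. The base delay, cap, attempt limit, and the `send` callable are illustrative assumptions, not the engine's actual configuration; in the described system the waiting is handled by background workers rather than an inline sleep.

```python
import random
import time


def backoff_delay(attempt, base=2.0, cap=300.0, jitter=0.0, rng=random.random):
    """Delay in seconds before the next try: base * 2^attempt, capped.

    `jitter` adds up to that fraction of the delay at random, which helps
    avoid many workers retrying at the same instant.
    """
    delay = min(cap, base * (2 ** attempt))
    return delay + delay * jitter * rng()


def deliver_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call `send` until it succeeds or attempts run out.

    Returns (succeeded, log), where log holds one entry per attempt:
    (attempt_number, "ok" or "failed", provider response or error text).
    """
    log = []
    for attempt in range(max_attempts):
        try:
            response = send()
        except Exception as exc:  # provider errors vary; log and retry
            log.append((attempt, "failed", str(exc)))
            if attempt + 1 < max_attempts:
                sleep(backoff_delay(attempt))
            continue
        log.append((attempt, "ok", response))
        return True, log
    return False, log  # persistent failure: caller raises an alert
```

With the defaults shown, successive waits are 2, 4, 8, and 16 seconds, and no wait follows the final attempt.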

Key Engineering Challenges

Duplicate Delivery on Retry

Challenge: Retrying a failed delivery without idempotency guarantees sends the same message multiple times, frustrating recipients.

Solution: An idempotency key derived from the event ID and the recipient channel is stored in the delivery log. Before dispatching, the engine checks whether that combination has already been delivered successfully and skips the send if it has.
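
A minimal sketch of that check follows. The key format (a SHA-256 hash over event ID, recipient ID, and channel), the in-memory `DeliveryLog`, and the `dispatch_once` helper are assumptions for illustration; the real delivery log would be a persistent store.

```python
import hashlib


def idempotency_key(event_id, recipient_id, channel):
    """Stable key for one logical delivery (assumed composition)."""
    raw = f"{event_id}|{recipient_id}|{channel}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class DeliveryLog:
    """In-memory stand-in for the persistent delivery log."""

    def __init__(self):
        self._delivered = set()
        self.attempts = []  # (key, success, provider response or error)

    def already_delivered(self, key):
        return key in self._delivered

    def record(self, key, success, response):
        self.attempts.append((key, success, response))
        if success:
            self._delivered.add(key)


def dispatch_once(log, event_id, recipient_id, channel, send):
    """Send unless this event/recipient/channel already succeeded."""
    key = idempotency_key(event_id, recipient_id, channel)
    if log.already_delivered(key):
        return "skipped"
    try:
        response = send()
    except Exception as exc:
        log.record(key, False, str(exc))
        return "failed"
    log.record(key, True, response)
    return "sent"
```

Only successful sends mark a key as delivered, so a failed attempt can still be retried. With several workers, the check and the write would need to be atomic in the backing store (for example via a unique constraint on the key), which this single-process sketch does not show.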

Provider API Reliability

Challenge: SMS and email providers have variable availability; a provider outage should not permanently drop notifications.

Solution: Each provider sits behind its own circuit breaker, with fallback to alternate providers where configured. Failed events remain in the queue until the provider recovers or the maximum retry window expires.
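
One way to picture the breaker and fallback is the sketch below. The failure threshold, cool-down period, and the `send_with_fallback` helper are hypothetical; the source does not state the actual thresholds or provider list.

```python
import time


class CircuitBreaker:
    """Opens after consecutive failures; allows a trial call after a cool-down."""

    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allow(self):
        if self.opened_at is None:
            return True
        # Half-open: permit a trial call once the cool-down has elapsed.
        return self.clock() - self.opened_at >= self.reset_after

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()


def send_with_fallback(providers, message):
    """Try providers in priority order, skipping any with an open circuit.

    `providers` is a list of (name, breaker, send_fn). Returns the name of
    the provider that accepted the message, or None if none did.
    """
    for name, breaker, send in providers:
        if not breaker.allow():
            continue
        try:
            send(message)
        except Exception:
            breaker.record_failure()
            continue
        breaker.record_success()
        return name
    return None  # caller leaves the event queued for a later retry
```

A `None` result corresponds to the event staying in the queue until a provider recovers or the retry window runs out.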

Scheduled Notification Cancellation

Challenge: An appointment reminder scheduled 24 hours ahead should be cancelled if the appointment is cancelled before the reminder fires.

Solution: Scheduled notifications are stored as pending jobs with a cancellation key. Publishing a cancellation event with the same key removes the pending job before it executes.
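
The pending-job store could be sketched as follows. The `Scheduler` class, its heap-based layout, and the numeric timestamps are assumptions for the example; a real deployment would persist pending jobs rather than hold them in memory.

```python
import heapq


class Scheduler:
    """Pending notifications ordered by due time, removable by key."""

    def __init__(self):
        self._heap = []      # entries: (run_at, seq, cancellation_key)
        self._pending = {}   # cancellation_key -> (seq, payload)
        self._seq = 0

    def schedule(self, run_at, cancellation_key, payload):
        """Register a job; scheduling the same key again replaces it."""
        seq = self._seq
        self._seq += 1
        self._pending[cancellation_key] = (seq, payload)
        heapq.heappush(self._heap, (run_at, seq, cancellation_key))

    def cancel(self, cancellation_key):
        """Drop a pending job. Returns False if nothing was pending."""
        return self._pending.pop(cancellation_key, None) is not None

    def due(self, now):
        """Pop and return payloads whose time has come and that are still live."""
        ready = []
        while self._heap and self._heap[0][0] <= now:
            _, seq, key = heapq.heappop(self._heap)
            entry = self._pending.get(key)
            # Skip heap entries that were cancelled or replaced.
            if entry is not None and entry[0] == seq:
                del self._pending[key]
                ready.append(entry[1])
        return ready
```

For an appointment reminder, the cancellation key could be the appointment ID, so the service that cancels the appointment only needs to publish that ID.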

Frequency Capping

Challenge: System events can generate bursts of notifications that overwhelm users with messages in a short period.

Solution: Per-user, per-category frequency counters in Redis enforce maximum message rates. Notifications that exceed the cap within the capping window are dropped or deferred to the next allowed window.
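
The counting logic can be illustrated with a fixed-window counter. This sketch keeps counts in a Python dict so it runs on its own; it is a stand-in for the Redis counters described above, and the `FrequencyCap` class, the limit, and the window length are assumptions.

```python
import time


class FrequencyCap:
    """Fixed-window counter per (user, category).

    Mirrors the usual increment-and-expire pattern on a keyed counter.
    Unlike a store with key expiry, this dict never evicts old windows.
    """

    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self._counts = {}  # (user_id, category, window_index) -> count

    def allow(self, user_id, category):
        """Count this notification and report whether it is within the cap."""
        window_index = int(self.clock() // self.window_seconds)
        key = (user_id, category, window_index)
        count = self._counts.get(key, 0) + 1
        self._counts[key] = count
        return count <= self.limit
```

When `allow` returns `False`, the caller decides whether to drop the notification or defer it to the next window.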

Solutions Implemented

  • Event-Driven Architecture: Decoupled intake via message queue, separating notification concerns entirely from business service code.
  • Preference Engine: Per-user opt-out, channel selection, quiet hours, and frequency caps enforced before every delivery.
  • Versioned Templates: Personalised, localised templates stored separately from code with preview capability and A/B variant support.
  • Idempotent Delivery: Delivery log with idempotency keys preventing duplicate sends on retry or event replay.
  • Delivery Audit Trail: Complete log of every dispatch attempt, provider response, and final outcome per notification event.

Outcome & Impact

  • 99.5% Delivery Rate: Across all channels
  • Zero Duplicate Deliveries: With idempotency enforcement
  • 3 Channels Unified: SMS, Email, In-App
  • 100% Preference Compliance: User opt-outs always respected