Over the last eight years, Blackthorn Events has gone through an incredible journey of expansion and change. As businesses like ours scale, continually updating and enhancing infrastructure is a natural step. It keeps us equipped to manage increased complexity and volume so that you, our valued customers and partners, receive the seamless service you’ve come to expect.
Transparency is fundamental to our values, and in the spirit of openness, we are providing you with detailed information about the recent challenges we’ve encountered, our strategies for addressing them, and an estimated timeline for resolution.
Redis Cache: Short-term Improvements
Blackthorn caches public Event data, including details such as ticket quantities, in Redis databases. Our Redis instances offer high availability but have limits on storage, memory, and buffer size. When the value stored under any single key (the file containing your org’s event data) exceeds the buffer size limit, it adds latency and, in the worst cases, can cause the Redis request to reset and retry. Recent unplanned maintenance by our Redis provider, Redislabs, further impacted performance, particularly in October 2023.
As a temporary fix, we’ve upgraded our US Redis plan, which provides more resources and better performance. We aim to ensure a consistent and speedy experience across all applications.
All regions now utilize dedicated Redis resources. The US, due to its customer base and data size, is on an upgraded plan, but we will continue to monitor other regions and assess upgrade needs.
Regionalization Update: Reducing Geographic Latency
Our Events webapp infrastructure resources are distributed across the US, Europe, APAC, and Australia. However, they rely on some shared US-based resources located in Northern Virginia, which can result in latency due to geographic distance. The shared resources are:
Connect360: Our own intermediary application linking the Webapp and Salesforce organizations on AWS.
MongoDB: Used for storing Connect360 configurations.
Redis: An in-memory cache aimed at reducing data access latency.
We have replicated these instances across Europe, Australia, and APAC, significantly improving performance and reducing latency.
Customers in Australia, APAC, and Europe are already benefiting from these regional resources.
Traffic Redirection: Enhancing Availability
Using Cloudflare’s proxying, we direct customers to the Webapp in their specific geographic region.
Remediation Steps Taken:
If regional resources are unavailable, we’ll reroute traffic to another region, ensuring higher availability until issues are resolved.
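For technically minded readers, the fallback idea can be sketched as follows. This is a minimal illustration, not Blackthorn's production routing: the region codes, the fallback order, and the `pick_region` helper are all assumptions made for the example.

```python
from typing import Dict, List, Optional

# Hypothetical region codes and fallback order (assumed for illustration only).
FALLBACK_ORDER: Dict[str, List[str]] = {
    "us": ["us", "eu", "apac", "au"],
    "eu": ["eu", "us", "apac", "au"],
    "apac": ["apac", "au", "us", "eu"],
    "au": ["au", "apac", "us", "eu"],
}


def pick_region(home_region: str, healthy: Dict[str, bool]) -> Optional[str]:
    """Return the first healthy region in the home region's fallback order.

    `healthy` maps a region code to the result of some health check;
    regions missing from the map are treated as unavailable.
    """
    for region in FALLBACK_ORDER.get(home_region, []):
        if healthy.get(region, False):
            return region
    return None  # no region is currently available
```

A customer whose home region is healthy stays there; only when it is marked unavailable does the request move to the next region in the list.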
Data Persistence and Redis Cache: Long Term Improvements
As noted above, in order to provide the fastest and most resource-efficient experience, Blackthorn Events caches Event data from Salesforce in Redis as key-value pairs so it can easily be retrieved to quickly load Event pages. We’ve seen a large increase in both the quantity and complexity of events for many customer orgs, which results in large file sizes and higher traffic on our Redis instances. These large file sizes create latency issues which result in delays or failures in Events data updates from Salesforce being displayed on Event and Event Group pages.
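The caching pattern described above can be sketched roughly as below. It is a simplified illustration that uses an in-memory dictionary in place of Redis; the `EventCache` class, the `events:<org_id>` key format, and the loader function are hypothetical and not taken from Blackthorn's code.

```python
import json
from typing import Callable, Dict


class EventCache:
    """Dict-backed stand-in for a key-value cache such as Redis."""

    def __init__(self, load_from_source: Callable[[str], dict]) -> None:
        self._store: Dict[str, str] = {}
        self._load = load_from_source  # e.g. a query against the system of record
        self.source_calls = 0

    def get_events(self, org_id: str) -> dict:
        """Serve from cache when possible; otherwise load and cache the data."""
        key = f"events:{org_id}"  # hypothetical key format: one key per org
        cached = self._store.get(key)
        if cached is not None:
            return json.loads(cached)
        self.source_calls += 1
        data = self._load(org_id)
        self._store[key] = json.dumps(data)
        return data

    def value_size(self, org_id: str) -> int:
        """Size in bytes of the serialized value stored for an org (0 if absent)."""
        return len(self._store.get(f"events:{org_id}", "").encode("utf-8"))
```

Because all of an org's event data sits under one key in this sketch, `value_size` grows with every event added, which is the kind of growth that leads to the latency described above.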
Phase 1: Data Persistence
The initial phase involved creating a relational Postgres database outside Salesforce to store a synchronized copy of Event data, excluding PII or GDPR-sensitive information. This bypasses Salesforce API and query limitations, ensuring the most current data is readily available for the Redis cache. It also establishes a crucial data layer for the subsequent phase, which will soon be in development.
This phase is complete and was shipped behind a feature flag in our October release. Some post-launch issues are being actively addressed so that we can remove the feature flag.
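To illustrate what excluding sensitive information from a synchronized copy could look like, here is a minimal sketch. The field names in `PII_FIELDS` and the `to_sync_record` helper are assumptions for the example and do not reflect the actual Salesforce or Postgres schema.

```python
from typing import Any, Dict

# Hypothetical field names treated as personal data (assumed for illustration).
PII_FIELDS = frozenset({"attendee_name", "attendee_email", "phone"})


def to_sync_record(event: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of an event record with personal-data fields removed."""
    return {key: value for key, value in event.items() if key not in PII_FIELDS}
```

Filtering before the record is written means the synchronized copy never holds the excluded fields in the first place.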
Phase 2: Cache Refactor
To alleviate the latency issues created by large per-key file sizes, we will be refactoring our cache approach to store event data in smaller, more granular files.
Our team is actively working on defining and identifying the optimal approach. We’ll be building POCs over the coming weeks to test our solution before handing it over to our core Engineering team to make it production-ready. We anticipate delivering these changes in early 2024.
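Since the final design is still being defined, the following is only one possible shape of a more granular cache layout, offered as an illustration. The key naming scheme and the `split_into_keys` helper are hypothetical.

```python
import json
from typing import Any, Dict


def split_into_keys(org_id: str, events: Dict[str, Dict[str, Any]]) -> Dict[str, str]:
    """Build one small cache entry per event, plus an index entry listing event IDs.

    `events` maps an event ID to that event's data. The returned dict maps
    cache keys to serialized values, ready to be written to a key-value store.
    """
    entries = {
        f"org:{org_id}:event:{event_id}": json.dumps(event)
        for event_id, event in events.items()
    }
    # The index lets a reader discover which per-event keys exist for the org.
    entries[f"org:{org_id}:index"] = json.dumps(sorted(events))
    return entries
```

With a layout like this, an update to one event rewrites only that event's small entry instead of a single large value holding everything for the org.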
We are committed to providing a reliable and high-performance experience for all Blackthorn customers and partners. Your trust in our service is paramount, and we sincerely appreciate your patience as we implement these enhancements. If you have any questions or require further information, please do not hesitate to contact us.
Blackthorn Engineering Team & Andrea Adcock, Chief Product Officer (email@example.com)