Past Does Not Require Premium
by @pbrisbin on April 27, 2022
It’s not uncommon for Freckle to experience a period of high database load in September, as School traffic returns and code changes made over the summer start to see real scale for the first time. We typically spend a lot of that month finding small changes causing big performance problems and fixing them.
One “summer regression” from this past September was such an excellent example of a bug correctable through types, that I’d like to describe it here.
Premium Timespans
Freckle considers viewing certain historical data a premium feature and we have
a requireTimespan
helper that enforces that in the APIs for such data.
One such usage looks like this:
premiumLicenses <- fetchPremiumLicensesForSchool now schoolId
timeSpan <- requireTimeSpan MaxSpanAllTime PastDoesNotRequirePremium premiumLicenses
The function is given the current user’s licenses and two other arguments to control how to enforce things. Shortly after students returned to school, our instrumentation alerted us to unusually high database load. It turns out, this was caused by the addition of exactly this code to a very hot path.
Have you spotted the landmine yet?
requireTimeSpan MaxSpanAllTime PastDoesNotRequirePremium premiumLicenses
-- ^^^^^^^^^^^^^^^^^^^^^^^^^
If “Past Does Not Require Premium”, then all these queries are for naught. We’re fetching licenses that the underlying code doesn’t even consider!
The following would function exactly the same:
requireTimeSpan MaxSpanAllTime PastDoesNotRequirePremium mempty
And that’s exactly how we… considered resolving things at the time.
Instead, we wondered: how many other places might we be doing this? Should we
grep
for it? No, that’s a dynamic-typist talking. We should make it invalid
through the types!
Smarter Data
Here’s a small change with a large effect:
data PastRequiresPremium
- = PastRequiresMathPremium
- | PastRequiresElaPremium
- | PastRequiresAnyPremium
+ = PastRequiresMathPremium PremiumLicenses
+ | PastRequiresElaPremium PremiumLicenses
+ | PastRequiresAnyPremium PremiumLicenses
| PastDoesNotRequirePremium
deriving stock (Eq, Show)
This minor re-arrangement gets the compiler forcing us to fix this bug in all code, present and future.
Rather than requireTimeSpan
(et al) accepting PastRequiresPremium
and
PremiumLicenses
separately, we made the PastDoesNotRequirePremium
type
smarter. By holding the PremiumLicenses
only in the constructors that need it,
we can ensure we don’t demand them for the constructor that doesn’t.
This felt a bit like solving Data Clump, it’s following Smart data structures and dumb code, and has similar effectiveness to Make Invalid States Unrepresentable.
Chasing Compiler Errors
From here, we can’t get things wrong. The compiler forces us to never need or
use PremiumLicenses
for the PastDoesNotRequirePremium
case:
-> PastRequiresPremium
-> UTCTime
-> TimeSpan
- -> PremiumLicenses
-> Validation ValidationErrors ()
- validateTimeSpan maxTimeSpan pastRequiresPremium now timeSpan premiumLicenses
+ validateTimeSpan maxTimeSpan pastRequiresPremium now timeSpan
| needToCheckPremium && not hasValidPremium = legacyFailure Errors.NoPremium
| isValidTimeSpan = success ()
| otherwise = legacyFailure Errors.InvalidDateRange
where
needToCheckPremium =
deltaTimeSpan > week && TimeSpan.elem oneWeekAgo timeSpan
hasValidPremium = case pastRequiresPremium of
- PastRequiresAnyPremium -> premiumLicensesHasAny premiumLicenses
- PastRequiresMathPremium -> premiumLicensesHasMath premiumLicenses
- PastRequiresElaPremium -> premiumLicensesHasEla premiumLicenses
+ PastRequiresAnyPremium premiumLicenses ->
+ premiumLicensesHasAny premiumLicenses
+ PastRequiresMathPremium premiumLicenses ->
+ premiumLicensesHasMath premiumLicenses
+ PastRequiresElaPremium premiumLicenses ->
+ premiumLicensesHasEla premiumLicenses
PastDoesNotRequirePremium -> True
Cases that do require licenses take a minor hit to calling syntax:
- timeSpan <- requireTimeSpan MaxSpanAllTime PastRequiresElaPremium premiumLicenses
+ timeSpan <- requireTimeSpan MaxSpanAllTime $ PastRequiresElaPremium premiumLicenses
Cases that don’t get simpler:
- getFetchAssignmentsAndSessionsArgs premiumLicenses = do
+ getFetchAssignmentsAndSessionsArgs = do
mCompletedAt <- sendValidateT $ optionalUTCTimeFilter
MaxSpanAllTime
PastDoesNotRequirePremium
- premiumLicenses
"completed-at"
mAssignmentIds <- optionalGetParam "assignmentIds"
And eventually we are led directly to our original hot path:
- now <- liftIO getCurrentTime
- premiumLicenses <- runDB $ fetchPremiumLicensesForSchool now schoolId
- args <- getFetchAssignmentsAndSessionsArgs premiumLicenses
+ args <- getFetchAssignmentsAndSessionsArgs
This graph is DB load the day the fix went out (~10AM) with a dotted-line representing the day before:
Prologue
Apparently we hit this issue at some point in the past already. That time, we chose to solve it in a more explicit way:
unless (requiresPremium == PastDoesNotRequirePremium) $ do
teacherId <- requireTeacherId
premiumLicenses <- runDB $ fetchPremiumLicenses now teacherId
sendValidation $ validateTimeSpan
maxTimeSpan
requiresPremium
now
timeSpan
premiumLicenses
This is exactly the sort of thing folks mean when they talk about conditionals as a smell and the benefits of moving business logic into the types. This original fix, in addition to being localized and unable to prevent the same regression elsewhere, also introduced another bug: we were nope-ing out of all validation of the timespan parameters in order to avoid fetching licenses unnecessarily for the premium portion of it.
Now that the types are better, the obvious code is also the correct code:
sendValidation $ validateTimeSpan maxTimeSpan requiresPremium now timeSpan