From Fetch to Fetchless

An exercise in trade-offs in encoding DSLs

Ryan Peters @ 47 Degrees

Builds on top of my ScalaCon 2021 Talk:

Your Program Is A Language

What is Fetch?

  • Library for optimizing requests for data
  • Automatically batches independent requests
  • Deduplicates requests made in sequence
  • Ensures requests run in parallel whenever possible

//Represents a K/V store where strings have an integer ID
val dataSource: DataSource[IO, Int, String] = ???

def fetch(i: Int) = Fetch.optional(i, dataSource)

//Fetches both 1 and 2, at once
//Deduplicates the second request for 1
val dedupedFetch = (fetch(1), fetch(2), fetch(1)).tupled

Fetch.run(dedupedFetch).map {
	case (Some(first), Some(second), Some(firstAgain)) => ???
	//And so on...
}
				

Fetch == A DSL?

DSLs are just high-level code

DSLs can be written in multiple styles

Fetch is written in a "free" style, as a data structure


//Defines all terms in our language
sealed trait MyLanguage

//Concepts, operations, etc are just data
final case class MyIntOp(i: Int) extends MyLanguage
final case class MyStrOp(s: String) extends MyLanguage

//We need a way to combine our ops into a program
//List is a simple way to imply a sequence or ordering
val myProgram = List(MyIntOp(42), MyStrOp("Hello world!"))

//Programs as data structure must be interpreted explicitly
def interpret(ops: List[MyLanguage]): Unit
				

List is a "free monoid" - giving us the ability to combine ops "for free"

Other "free" variants:

  • Free Monad - lets us describe dependent computations (flatMap)
  • Free Applicative - lets us describe independent computations (product, etc)

You can find these in Cats and other libraries

Note on Applicative v Monad

Monads MUST be dependent, Applicatives can't


//Must work as a strict, dependent sequence
List(Right(1), Left(1), Right(1))

//Applicatives are independent,
//so there is room for optimizing
List(Valid(1), Invalid(1), Valid(1))
				

How this affected Fetch

Cats 2.7.0 changed an implementation detail of the type classes Fetch used.

Fetch is unlawful - it wants to be Monadic without the guarantee of sequencing

Fetch 3 won't guarantee automatic sequence batching

Fetch 3.1.0 brings it back w/ a targeted override

A New Fetch

  • Removing auto-batching makes it a different library
  • Fetch has various "problems" stemming from its design style
  • It's a fun proof-of-concept exercise! :)

Fetch "Problems"

DataSource is heavy


trait DataSource[F[_], I, A] {
	def data: Data[I, A] //Used to optimize access

	//Needs to capture the Concurrent instance
	implicit def CF: Concurrent[F]

	def fetch(id: I): F[Option[A]]
	def batch(ids: NonEmptyList[I]): F[Map[I, A]]
	
	//Smells like implementation details
	def maxBatchSize: Option[Int] = None
	def batchExecution: BatchExecution = InParallel
}
				

Running requires many capabilities


//Slightly simplified from the source code
//You need `Concurrent` and `Clock` wherever you run a fetch
def run[F[_]: Concurrent: Clock, A](fa: Fetch[F, A])
				

Logging, etc must be baked-in

Functions such as runLog need to be hard-coded in, and you can't write your own

See: GitHub Issue #429 where a user wants to define a timeout on a single Fetch, which is not possible

Converting Free -> Tagless

  • Programs are not deferred, but interpreted immediately
  • Data becomes syntax - i.e. MyIntOp becomes [int].myOp
  • You can capture capability away from your program

A Basic Fetch DSL


//Replaces DataSource
trait Fetch[F[_], I, A] {
	val id: String //Naive identifier per-instance
	def single(i: I): F[Option[A]]
	def many(is: Set[I]): F[Map[I, A]]
}

val intFetch: Fetch[IO, Int, Int] =
  Fetch.echo[IO, Int, Int]("intFetch")

//Can support syntax
4.fetch(intFetch) >> 5.fetch(intFetch) //Manual sequencing

(4, 5, 6).fetchTupled(intFetch) //Applicative-like syntax
				

Pros: Very little code, syntax makes intent apparent

Cons: No concept of deduping, auto-batching

Adding deduping


//We need to cache results between fetches
type CacheMap = Map[(Any, String), Option[Any]]

//An already-ran fetch, can be used for linear deduping
final case class DedupedFetch[F[_], A](
	unsafeCache: CacheMap, last: A
)

//Supported with syntax, not as nice
intFetch.fetchDedupe(5)
  .alsoFetch(intFetch)(5)
	.alsoFetchAll(intFetch)(Set(4, 5, 6))

(4, 5, 6).fetchTupledDedupe(intFetch)
				

Pros: Now we can dedupe fetches in a strict linear order!

Cons:

  • Need even more syntax since all results are in F
  • You can't compose fetches in any order - MUST be linear
  • Still does not support implicit batching

Composing fetch requests


//Lets defer running a fetch, make it lazy!
final case class LazyFetch[F[_], A](
	k: Kleisli[F, CacheMap, DedupedFetch[F, A]]
)

//Much nicer syntax again
5.fetchLazy(intFetch) >> 6.fetchLazy(intFetch)

//Can compose fetches in any direction
val oneOrder = fetch1 >> fetch2 >> fetch3
val otherOrder = fetch2 >> fetch3 >> fetch1

//Run any time
otherOrder.run //F[DedupedFetch[F, A]]
				

We can fetch, dedupe, even interleve effects...

...But we still can't auto-batch!

A little closer...


final case class LazyFetch[F[_]: FlatMap, A](
	k: Kleisli[F, CacheMap, DedupedFetch[F, A]]
)

//Kinda looks like...

final case class OptionT[F[_], A](value: F[Option[A]])
				

We've (unintentionally) made a transformer!

Lets fix auto-batching:


//How to get a batch
type LazyBatchGetter[F[_]] = 
  (Set[Any], CacheMap) => F[DedupedFetch[F, Map[Any, Any]]]

//Fetch ID -> (IDs -> Getters)
type LazyBatchReqs[F[_]] = 
  Map[String, (Set[Any], LazyBatchGetter[F])]

final case class LazyBatch[F[_]: Applicative, A](
	reqs: LazyBatchReqs[F],
	getResult: Kleisli[F, CacheMap, DedupedFetch[F, A]]
)
				

//LazyBatch will auto-batch requests
def fetch(i: Int) = LazyBatch.single(intFetch)(i)

(fetch(1), fetch(2), fetch(3)).tupled

//LazyFetch forms a Parallel with LazyBatch
List(5, 6, 7).parTraverse(_.fetchLazy(intFetch))

//Equivalent to a manual batch
LazyBatch.many(intFetch)(Set(5, 6, 7))
				

Layers

In order of speed:

  1. Immediate fetching with manual syntax
  2. Linear deduping that gets a little clumsy
  3. Lazy monadic deduping
  4. Lazy applicative batching, incl. a Parallel instance

Other additions


//Fetch instances determine parallelism, can be hard-coded
object Fetch {
	def singleParallel[F[_]: Monad: Parallel, I, A](
		fetchId: String
	)(f: I => F[Option[A]]): Fetch[F, I, A]
}

object LazyFetch {
	//Interleve effects between individual fetches
	def liftF[F[_]: FlatMap](fa: F[A])
}
				

case class LazyFetch[F[_]: FlatMap, A](...) {
	//No requirements to run after creating
	def run: F[DedupedFetch[F, A]]
}

case class LazyBatch[F[_]: Applicative, A](...) {
	//Some still needed for LazyBatch
	def run(implicit M: Monad[F], P: Parallel[F])
}
				

Philosophy Argument

  • You can not optimize flatMap, >>-style code anyway
  • Users are more and more used to parallelizing by-default (parTupled, parTraverse)
  • Giving opportunities for optimization also means more flexibility

In Summary

  • TF is easier, faster, less code, more flexible, but...
  • You still need effects (data types) to propagate context!

Code is @ github.com/sloshy/fetchless

Slides are up @ slides.rpeters.dev/fetchless

Thank you!