Tagless Final in Scala: A Practical example
Table of Contents
- Source Code
- Disclaimer
- Introduction
- Problem example
- Imperative approach
- Tagless Final Encoding
- Algebras
- Algebra's Interpreter
- Handling Errors: Sum Type to Rescue
- Testing
- Conclusion
Source Code
To get a complete version of the source code presented here, go here Github Repo.
All the examples of this post can be found on
demo-end
branch and NOT inmaster
Disclaimer
All the techniques and code that is going to be shown here are completely agnostic of any library. We are not going to use any Scala FP library such as cats, scalaz or any other. Obviously using any of those libraries could help us to implement these techniques without needing to write so much code, but the goal here is to shown how we can implement Tagless Final Encoding using only pure Scala language features.
Introduction
This is my first blog post about Scala and i would like to describe a well known topic for the whole Scala FP community which is Tagless Final Encoding.
There are so many great code examples and blog post about the subject around there, but since there is always room to understand the technique from other perspective, I decided to take the chance and maybe help others to understand it in a more practical point of view.
I am going to deep dive in the technique with a real use case example and in the middle i will mix it with more theoretical concepts. This journey is going to start from an imperative approach until we get to a Tagless Final FP approach. So, fasten your seat belts and lets enjoy the ride!!!
Problem example
I've created an example problem to be solved based on some things of my personal work, so everything in the example could be real production code. This is obviously a minimalist version of the problem.
We are going to describe a Recommender Program that given an algorithm and a user is going to generate recommendations for that user based on the selected algorithm. A Recommender system could be for example the most likely recommendations you received when visit any Marketplace such as Amazon, Ebay, or any other after searching for some products and navigate through these sites. For example: "Since you have been searching for X cars, maybe you are interested in R, Y and Z also".
We have the following requirements defined in terms of User Stories:
As an user i want to get recommendations from an specific algorithm, but if there are no recommendations for this algorithm or i forgot to specify what algorithm should be use i would like to have default recommendations from the best algorithm the system has.
As an user i want to get a message if recommendation's algorithm i requested is wrong.
As an user i want to be able to be retrieve with a limited number of recommendations.
Imperative approach
def getUser(userId: Option[Int]): Option[Int] =
userId.filter(user => users.exists(_.userId == user))
def getAlgorithm(recommenderId: Option[String]): Option[Algorithm] =
recommenderId.orElse(algoDefault).flatMap(algorithms.get(_))
def program(userId: Option[Int],
recommenderId: Option[String] = None,
limit: Option[Int] = None): Unit =
Although this imperative code version is fine, we can take the advantage of for-comprehension syntax sugar since we are manipulating Option[+A]
type.
def program(userId: Option[Int],
recommenderId: Option[String] = None,
limit: Option[Int] = None): Unit =
We can even go further and separate our program in 2 functions:
getRecommendations
: for-comprehension where the program logic takes placeprintResults
: print results or errors to the user.
def getRecommendations(userId: Option[Int],
recommenderId: Option[String],
limit: Option[Int]): Option[Result] =
def printResults(userId: Option[Int], result: Option[Result]): Unit =
private def getUser(userId: Option[Int]): Option[UserId] =
userId.filter(user => users.exists(_.userId == user)).map(UserId)
private def getAlgorithm(recommenderId: Option[String]): Option[Algorithm] =
recommenderId.orElse(algoDefault).flatMap(algorithms.get(_))
private def executeAlgorithm(user: UserId, algorithm: Algorithm): Option[UserRec] =
algorithm.run(user)
private def filterResults(result: UserRec, limitFilter: Int): Option[UserRec] =
Some(result.copy(recs = recs.slice(0, limitFilter).toList))
Now we are done with our imperative code style and we can say it is in a good shape.
But we don't like imperative style, we love doing Functional Programming and applying all the Math science we have at our disposal, don't we? So, lets start to refactor our code with Tagless Final Encoding approach.
Tagless Final Encoding
Tagless Final Encoding is a technique for embedding a DSL (Domain Specific Language) in a Type Functional Language such as Scala, Haskell, OCaml or other similar. Since we are embedding a DSL (a.k.a. Language) we are defining a Language in order to use its syntax and semantic (Algebra) in our program. We are defining also an interpreter of the Language to indicate how it should behave on each used term (Interpreter).
If you want to visit a more advanced approach of this, I would strongly recommend after this lecture to take a look on Free Monads.
We can say that in Tagless Final style there are:
Algebras: Set of operations over a Structure. In a Programming language idiom could be a set of functions that operates over some Type.
Interpreter: The way of those operations behave according to an specific Type. In a Programming language idiom the implementation of those functions depending on the specific Type.
Algebras
The first thing to do with this technique is to define our Algebras, our operations that are needed to solve our domain problem. If you already have some program in a good shape but with an imperative style approach like describe it above, it is quite easy to define these operations because it is already provided by the semantic of the for-comprehension. In our case we have the following operations:
def getUser(..)
def getAlgorithm(..)
def executeAlgorithm(..)
def filter(...)
Now the only thing left to build our Algebra is to group those operation in different Algebras depending on the structure they are operating on:
- User: operations related to handle users
- Algorithm: operations related to handle algorithms
- Filter: operation for filtering results
Since our Algebras are only definition of our operations we are going to use Trait for defining this abstract representation.
Here we have a couple of things perhaps new for the reader:
We are defining our Algebras in Traits. So far so good. Nothing new or fancy.
We are defining our Types with Higher-Kinded Types parameters (
F[_]
) to abstract out the Container Structure that it is going to be use in each Interpreter. A Higher-Kinded Type or Type Constructor is a Type which constructs a new Type based on a Type Parameter. For exampleOption[+A]
is a Type constructor which takes a Type, for exampleString
and constructs the final Type, for exampleOption[String]
. You can probe this in a Scala console with :kind command:
scala> :k String
String's kind is A
scala> :k Option
Option's kind is F[+A]
scala> :k Option[String]
Option[String]'s kind is A
- A last thing we are adding here in companion objects are a technique called Summoned values, which allow us to get the implicit value using the Companion Object Trait's constructor.
With all this machinery in place we can add some utility functions to avoid calling companion objects and just calling functions from the client program:
def getUser(userId: Option[Int]): F[UserId] =
UserRepo.getUser(userId)
def filter(userRec: UserRec, limit: Int): F[UserRec] =
Filter.filter(userRec, limit)
def getAlgorithm(recommenderId: Option[String]): F[Algorithm] =
AlgorithmRepo.getAlgorithm(recommenderId)
def execute(algo: Algorithm, userId: UserId): F[UserRec] =
AlgorithmRepo.execute(algo, userId)
Now it is time to use our Algebras from getRecommendations
program. In order to do that we are going to use Context Bounds allowing the compiler to infer implicit values from context.
def getRecommendations
( userId: Option[Int],
recommenderId: Option[String],
limit: Option[Int]): Option[Result] =
Algebra's Interpreter
Although it seems we have arrived to an acceptable solution we are not ready yet. We need to have at least one interpreter which is the real implementation of our Algebra. Recall that we are resting everything on Option[+A]
for the moment, so our interpreter should be on Option[+A]
Type.
One possible interpreter could be:
We need to do for letting the compiler infer the implicit values it is just importing our implicit values on the context.
def program(userId: Option[Int],
recommenderId: Option[String] = None,
limit: Option[Int] = None): Unit =
Have you notice that we are calling getRecommendations
with Option[+A]
as the Higher-Kinded Type or Type Constructor?. Since we have indicated in getRecommendations
that F[_]
is context bounded by UserRepo and AlgorithmRepo and Filter
, and
$$ F[\_] \cong Option[+A] $$
the compiler should look for an implicit
value for each Algebras whose F
is Option[+A]
. And that is what exactly we have provided to the compiler. But is this code compile and runs or only runs? Lets check
λx.x> $ sbt run
[error] value flatMap is not a member of type parameter F[program.DataSource.UserId]
[error] user <- getUser(userId)
[error] ^
[error] value flatMap is not a member of type parameter F[program.DataSource.Algorithm]
[error] algorithm <- getAlgorithm(recommenderId)
[error] ^
[error] two errors found
Why is this happening? Because getRecommendations
is Context Bound or Constrained on UserRepo: AlgorithmRepo: Filter
and for-comprehension in Scala is only a syntactic sugar for flatMap
, map
and withFilter
. Of course Option[+A]
type implements those methods and can be used in a for-comprehension syntax, but for-comprehension used on getRecommendations
doesn't know anything about Option[+A]
or any other Type until it is bound in program
function.
Remember as i mentioned in Disclaimer section that we are not using any extra FP library to do the job. If we are using cats or scalaz we can easily Constraint
getRecommendations
withFlatMap
orMonad
Typeclasses
Program Syntax: For-Comprehension aware
We need a way to tell the compiler that getRecommendations
supports for-comprehension syntax. We can do that creating an Algebra to support that syntax.
I would like to point out that this part of the code is inspired on John De Goes Talk FP to the Max and i would strongly encourage you to watch this video.
- In our Algebra we are going to have this:
implicit (fa: F[A])
- An in our Interpreter we need resolution bind for
Option[+A]
:
implicit extends Program[Option]
- Last thing to do is to add
Program
Constraint togetRecommendations
:
def getRecommendations
(userId: Option[Int],
recommenderId: Option[String],
limit: Option[Int]): F[Result] =
Now the code compiles and run!!
Printing results: The right way
Recalling to our printing results function. In this case our fold
over result is working because we are expecting an Option[+A]
result, but getRecommendations
is agnostic in that sense and printResults
should also be.
def printResults(userId: Option[Int], result: F[Result]): Unit =
This is a better version but our program doesn't compile because Program
Algebra doesn't have defined a fold
operation. So lets do it:
implicit (fa: F[A])
Now we have defined our fold
operation which is going to fold over F
. It is time to add the interpretation of this operation:
implicit extends Program[Option]
The code compiles and runs again with an abstract version of printResults
. Lets see some examples:
λx.x> $ sbt run
[info] Done packaging.
[info] Running (fork) program.ToScalaFP
[info] Recommnedations for userId UserId(1)...
[info] Algorithm algo1
[info] Recs: List(Rec(a,0.054459512), Rec(b,0.8465745), Rec(c,0.656385),
Rec(d,0.13308495), Rec(e,0.7825986), Rec(f,0.29209626), Rec(g,0.4820329),
Rec(h,0.1532129), Rec(i,0.16719013), Rec(j,0.9551664))
[info] ------------------------------
[info] Recommnedations for userId UserId(2)...
[info] Algorithm algo2
[info] Recs: List(Rec(a,0.054459512), Rec(b,0.8465745), Rec(c,0.656385),
Rec(d,0.13308495), Rec(e,0.7825986))
[info] ------------------------------
[info] Error Unexpected Error
[info] ------------------------------
[info] Error Unexpected Error
[info] ------------------------------
[info] Error Unexpected Error
[info] ------------------------------
[info] Error Unexpected Error
[info] ------------------------------
[success] Total time: 9 s, completed Feb 17, 2019 8:45:15 PM
It doesn't look accurate, does it? Error cases such as userId not found, no recommendations found and so on are not displayed and the messages for the user are vague in those cases. This is because we are dealing with Option[+A]
type and it doesn't give us the expressiveness we need to notify the user with the exact errors on our program.
Handling Errors: Sum Type to Rescue
We need to alert program's user about what specific errors have been found during the execution. For that purpose it will be great to have our execution in terms of Either[+A,+B]
instead of Option[+A]
.
The cost of doing that with all the machinery we have defined until now is minimum because of the following:
Either[+A,+B]
is a Higher-Kinded Type but with 2 Type parameters instead of 1 as our Algebras are requesting. We are going to see in a minute how to solve that problem.We only need to write an interpreter for that Type, bind
getRecommendations
call withEither[+A,+B]
and let the compiler use the correct interpreter on runtime for us.
Lambda Types
Lets try to figure out our first problem which is how to bind a Type Constructor with 1 Type parameters (Either[+A,+B]
) in a definition with only 1 Type parameter (F[_]
)
$$ F[\_] \ncong Either[+A,+B] $$
We can check this incongruence in Scala console very easily
scala> :kind Either
Either's kind is F[+A1,+A2]
As we can appreciate Either[+A,+B]
has kind F[+A,+B]
and we are asking a kind F[+A]
, but we cannot change our Algebras to support F[+A,+B]
because it is not going to accept anymore Option[+A]
and we want to support both. Instead of changing our algebra we are going to adapt Either[+A,+B]
to be a Type Constructor with 1 Type parameter. For that job we are going to use Lambda Types.
Basically a Lambda Type is similar to a Partially applied function but at a Type Level. We can curry our 2 parameter Type Constructor to obtain another 1 parameter Type Constructor. This is how it is done:
scala> :kind ({ type T[A] = Either[AppError, A] })#T
scala.util.Either[AppError,?]'s kind is F[+A]
As we can see in Scala console example we are fixing Left Either
parameter type with AppError
since all errors we are going to generate are subtypes of this, and let this phantom type be parameterized only in its Right value Type which is the Type it is going to change during execution.
In our code we are going to use kind-projector compiler plugin to avoid this boilerplate syntax.
With kind-projector we can have a more readable Lambda Type like this:
scala> :kind Either[AppError, ?]
scala.util.Either[AppError,?]'s kind is F[+A]
Interpreter for Either[+A,+B]
Now the only thing missing is to write our interpreter for this new type:
And we can execute both programs running on different interpreters at the same time:
def program(userId: Option[Int],
recommenderId: Option[String] = None,
limit: Option[Int] = None): Unit =
If we run this program now we can compare detailed error when we interpret the Algebra with Either[+A,+B]
against unknown errors with Option[+A]
:
λx.x> $ sbt run
Recommnedations for userId UserId(1)...
Algorithm algo1
Recs: List(Rec(a,0.66836), Rec(b,0.8242624), Rec(c,0.74691266),
Rec(d,0.9902125), Rec(e,0.775927), Rec(f,0.015915632), Rec(g,0.19724733),
Rec(h,0.92668074), Rec(i,0.2997946), Rec(j,0.1962437))
Recommnedations for userId UserId(1)...
Algorithm algo1
Recs: List(Rec(a,0.66836), Rec(b,0.8242624), Rec(c,0.74691266),
Rec(d,0.9902125), Rec(e,0.775927), Rec(f,0.015915632), Rec(g,0.19724733),
Rec(h,0.92668074), Rec(i,0.2997946), Rec(j,0.1962437))
------------------------------
Recommnedations for userId UserId(2)...
Algorithm algo2
Recs: List(Rec(a,0.66836), Rec(b,0.8242624), Rec(c,0.74691266),
Rec(d,0.9902125), Rec(e,0.775927))
Recommnedations for userId UserId(2)...
Algorithm algo2
Recs: List(Rec(a,0.66836), Rec(b,0.8242624), Rec(c,0.74691266),
Rec(d,0.9902125), Rec(e,0.775927))
------------------------------
Error Algorithm not found for id algo5
Error Unexpected Error
------------------------------
Error User not found for id UserId(14)
Error Unexpected Error
------------------------------
Error User id must be provided
Error Unexpected Error
------------------------------
Error Recommendations not found for UserId(1) with algorithm 'algo3'
Error Unexpected Error
------------------------------
Process finished with exit code 0
Testing
With this approach we can test our code in a straight forward way. We only need to provide a Test
1 parameter Type constructor and write our interpreters. Thats all.
To have an idea of this approach to that this could be done like this:
// Provide Type parameter Test which wraps a value
(value: A)
// Provide Interpreters for example for userRepo
implicit extends UserRepo[Test]
Conclusion
Tagless Final Encoding is a very good technique to encode DSL and separate the interpretation of DSL definition from implementation in a pure Functional way and as i pointed out on Disclaimer section, implementing this technique with some Scala FP library such as cats, scalaz or any other do the work strait forward and easy, removing a lot of boilerplate code we are using here, specially Program Algebra