Object-Oriented Haskell
Part 1: Objects, Fields, Methods, Interfaces
Oct 17, 2017
Object-oriented Haskell, you say? But isn’t Haskell a functional programming language? Aren’t functional programming and object-oriented programming mutually exclusive?
Well, no, they are not, and I will show you how to write object-oriented code in Haskell using only a minimal set of utility code, and without giving up much of Haskell’s pure functional benefits.
What Is OOP, Anyway
When people talk about OOP (Object-Oriented Programming), you will rarely see the term actually defined, and it seems that there isn’t a lot of agreement on what it really means.
Well then, for the purpose of this article, we will need a working definition; rather than going all crazy on this, I will write down a few key features for the particular flavor of object-oriented programming that I consider crucial, and then we will see how we can implement that in Haskell.
- Objects. Naturally, in order to do object-oriented programming, you need objects. Objects have state (fields), and behavior (methods) associated with them.
- Runtime Polymorphism. This really is the killer feature: we want objects to expose common interfaces while providing different implementations, such that we can decouple the two - the caller doesn’t need to know anything about the implementation, yet the object knows how to select the right method, and the call is dispatched at runtime.
- Open Recursion. Meaning that when an object calls one of its own methods from within another method, that call, too, is dispatched at runtime, based on the actual object on which the call is made, even if the calling method was inherited from elsewhere.
Case Study: SQL Query Generation DSL
For our case study, we will implement a little EDSL for generating SQL queries, such that user code can provide queries in a backend-agnostic form, and our code will render them as backend-specific SQL query strings. For illustration purposes, we will only support simple SELECT queries; the data types to model these look like this:
data SelectQuery
  = SelectQuery
      { selectColumns :: [String]
      , selectTable :: String
      , selectWhere :: Condition
      , selectOrder :: [OrderSpec]
      , selectLimit :: Maybe Limit
      }
data Condition
= Always
| Equals Value Value -- WHERE a = b
| Not Condition -- WHERE NOT a
| IsNull Value -- WHERE a IS NULL
| And Condition Condition -- WHERE a AND b
| Or Condition Condition -- WHERE a OR b
data Value
= ColumnRef String -- Reference a column
| Literal String -- A literal value
| Param String -- Named query parameter
data Limit
= Limit Integer -- LIMIT n
| LimitOfs Integer Integer -- LIMIT ofs n
data OrderSpec
= OrderBy String AscDesc
data AscDesc = Asc | Desc
Great, this should be enough to write single-table SELECT statements with some basic support for WHERE clauses, LIMIT, and ORDER BY. For example:
myQuery =
  SelectQuery
    ["id", "username", "password"]
    "users"
    (Equals (ColumnRef "username") (Param "username"))
    [OrderBy "id" Asc]
    (Just (Limit 1)) -- selectLimit is a Maybe Limit, hence the Just
Which should render to SQL similar to this:
SELECT "id", "username", "password"
FROM "users"
WHERE "username" = :username
ORDER BY "id"
LIMIT 1
Now we want a function renderSqlQuery :: SqlDialect -> SelectQuery -> String, such that the SqlDialect determines the details of how the query gets rendered.
First step: Defining An Interface.
So far, everything we’ve done is uncontroversial plain old Haskell. Now we need to define an interface for SqlDialect, and a good way to model this is using a plain old data type:
data SqlDialect
= SqlDialect
{ renderSqlQuery :: SqlDialect -> SelectQuery -> String
}
We will extend this type later; note, for now, that the renderSqlQuery field is a function that takes an additional argument of type SqlDialect. This is on purpose, as we will see later. For now, it means that if we want to call this function, we will need to pass the SqlDialect object twice:
myQuery = renderSqlQuery dialect dialect query
This is a bit awkward, and we will solve this in a minute. But first, we need to address one other thing.
Our interface so far is not polymorphic; all we can do is provide an SqlDialect value and do things with it, but it’s still a dumb old Haskell data type. To make things polymorphic yet type-safe, we will need some more juice; particularly, we will define a typeclass Is, which tells the compiler that a certain type implements a certain interface:
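{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TypeOperators #-}
-- FlexibleInstances is needed for the reflexive instance a `Is` a further down.
{-# LANGUAGE FlexibleInstances #-}

class a `Is` b where
  cast :: a -> b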
Now whenever an instance foo `Is` SqlDialect exists, we can write:
myQuery = renderSqlQuery (cast dialect) (cast dialect) query
And now we can resolve the awkwardness by writing a little function that puts these things together:
member :: cls `Is` inst => (inst -> inst -> a) -> cls -> a
member prop obj =
  prop vt vt
  where
    vt = cast obj
Or, for convenience, as a binary operator:
(==>) :: cls `Is` inst => cls -> (inst -> inst -> a) -> a
(==>) = flip member
Now we can write:
myQuery = (dialect ==> renderSqlQuery) query
…which, in something like Java, might look something like:
String myQuery = dialect.renderSqlQuery(query);
For completeness’ sake, let’s define:
instance a `Is` a where
  cast = id
That is, every interface implements itself.
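To see the pieces fit together outside of the SQL example, here is a minimal, self-contained sketch. The Greeter interface, the English type and the greeting value are made up for illustration; only Is, cast, member and (==>) come from the text above.

```haskell
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE FlexibleInstances #-}

-- A toy interface (hypothetical): a record of methods, each of which
-- takes the interface value itself as an explicit first argument.
data Greeter = Greeter
  { greet :: Greeter -> String -> String
  }

class a `Is` b where
  cast :: a -> b

-- Every interface implements itself.
instance a `Is` a where
  cast = id

-- Cast the object to the interface once, then pass the result both as
-- the record to select the method from and as the method's "this".
member :: cls `Is` inst => (inst -> inst -> a) -> cls -> a
member prop obj = prop vt vt
  where
    vt = cast obj

(==>) :: cls `Is` inst => cls -> (inst -> inst -> a) -> a
(==>) = flip member

-- A hypothetical "class" implementing the interface.
data English = English

instance English `Is` Greeter where
  cast English = Greeter {greet = \_this name -> "Hello, " ++ name}

-- Method call syntax: (object ==> method) arguments
greeting :: String
greeting = (English ==> greet) "world"
```

Here the type of greet fixes the interface to Greeter, so the compiler looks for an instance English `Is` Greeter and rejects the call if there is none.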
Implementing The Interface (Part 1)
We’ll start with a naive “vanilla” SQL dialect that has no state of its own.
data VanillaSql = VanillaSql
Wow, that was easy! Well, we haven’t implemented anything yet, so let’s:
import Data.List (intercalate)

vanillaRenderSelect :: SelectQuery -> String
vanillaRenderSelect query =
  "SELECT " ++
  intercalate ", " (map vanillaQuoteColumn $ selectColumns query) ++
  " FROM " ++
  vanillaQuoteTable (selectTable query) ++
  vanillaRenderWhere (selectWhere query) ++
  vanillaRenderOrders (selectOrder query) ++
  vanillaRenderLimit (selectLimit query)

-- Each optional clause brings its own leading space, so that the
-- clauses do not run into each other in the rendered query.
vanillaRenderWhere :: Condition -> String
vanillaRenderWhere Always = ""
vanillaRenderWhere cond = " WHERE " ++ vanillaRenderWhereCond cond

vanillaRenderWhereCond Always =
  "TRUE"
vanillaRenderWhereCond (Equals a b) =
  vanillaRenderValue a ++ " = " ++ vanillaRenderValue b
vanillaRenderWhereCond (Not cond) =
  "NOT (" ++ vanillaRenderWhereCond cond ++ ")"
vanillaRenderWhereCond (IsNull a) =
  vanillaRenderValue a ++ " IS NULL"
vanillaRenderWhereCond (And a b) =
  "(" ++ vanillaRenderWhereCond a ++ " AND " ++ vanillaRenderWhereCond b ++ ")"
vanillaRenderWhereCond (Or a b) =
  "(" ++ vanillaRenderWhereCond a ++ " OR " ++ vanillaRenderWhereCond b ++ ")"

vanillaRenderOrders [] =
  ""
vanillaRenderOrders orders =
  " ORDER BY " ++ intercalate ", " (map vanillaRenderOrder orders)

vanillaRenderOrder (OrderBy field Asc) = vanillaQuoteColumn field
vanillaRenderOrder (OrderBy field Desc) = vanillaQuoteColumn field ++ " DESC"

vanillaRenderLimit Nothing = ""
vanillaRenderLimit (Just (Limit n)) = " LIMIT " ++ show n
vanillaRenderLimit (Just (LimitOfs ofs n)) = " LIMIT " ++ show ofs ++ " " ++ show n

vanillaRenderValue (ColumnRef field) = vanillaQuoteColumn field
vanillaRenderValue (Param param) = vanillaQuoteParam param
vanillaRenderValue (Literal val) = vanillaQuoteLiteral val

vanillaQuoteColumn col = "\"" ++ col ++ "\""

vanillaQuoteParam = const "?"

vanillaQuoteTable table = "\"" ++ table ++ "\""

-- Not actually accurate, we would also have to perform escaping, but
-- for demonstration purposes this will have to do.
vanillaQuoteLiteral val = "'" ++ val ++ "'"
Cool. Now we could write our instance VanillaSql `Is` SqlDialect, but before we do, let’s take a step back. At some point, we will want to write other SQL dialect implementations, but they will share a lot of code with the vanilla flavor - for example, we could probably reuse most of the above for MySQL, but we would want to override the quoting behavior such that it follows the MySQL custom of using backticks for table and column names. The way we’ve written our SQL generation functions, this isn’t possible, because the vanilla renderer functions always call into other vanilla functions - we need those nested calls to somehow be aware of the runtime object they are being called on. In other words, we need runtime dispatch and open recursion, and this is why we added the additional argument to our methods in the interface definition.
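To make the problem concrete, here is a cut-down sketch with just two functions. All names in it (plainQuote, plainColumns, backtickQuote, backtickColumns) are hypothetical stand-ins for the much longer vanilla renderer above.

```haskell
import Data.List (intercalate)

-- Closed recursion: the renderer is hard-wired to one quoting function.
plainQuote :: String -> String
plainQuote col = "\"" ++ col ++ "\""

plainColumns :: [String] -> String
plainColumns cols = intercalate ", " (map plainQuote cols)

-- A backtick-style quoting function for a second dialect.
backtickQuote :: String -> String
backtickQuote col = "`" ++ col ++ "`"

-- plainColumns cannot be told to use backtickQuote, so the only way
-- to get backticks is to write the whole renderer a second time.
backtickColumns :: [String] -> String
backtickColumns cols = intercalate ", " (map backtickQuote cols)
```

With two functions the duplication is tolerable; with the dozen or so functions of the real renderer, it is not.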
Virtual Methods
In order to make our methods virtual, that is, making the choice of implementation dependent on the runtime object, we need to pass an additional copy of the object around, which we will use for open-recursive calls to other methods. By convention, we will name this additional argument this.
So, let us first extend the interface.
data SqlDialect
  = SqlDialect
      { renderSqlQuery :: SqlDialect -> SelectQuery -> String
      , renderWhere :: SqlDialect -> Condition -> String
      , renderWhereCond :: SqlDialect -> Condition -> String
      , renderOrders :: SqlDialect -> [OrderSpec] -> String
      , renderOrder :: SqlDialect -> OrderSpec -> String
      , renderValue :: SqlDialect -> Value -> String
      , renderLimit :: SqlDialect -> Maybe Limit -> String
      , quoteColumn :: SqlDialect -> String -> String
      , quoteTable :: SqlDialect -> String -> String
      , quoteParam :: SqlDialect -> String -> String
      , quoteLiteral :: SqlDialect -> String -> String
      }
Note that every method takes the additional this argument.
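Before applying this to the full interface, here is a small sketch of what the extra argument buys us. It uses a hypothetical two-method record called Dialect (not the SqlDialect above) and plain function application instead of the ==> operator, so that the mechanics are visible.

```haskell
import Data.List (intercalate)

-- A two-method interface; every method receives "this" explicitly.
data Dialect = Dialect
  { columns :: Dialect -> [String] -> String
  , quote :: Dialect -> String -> String
  }

-- One shared implementation of columns. It does not call a fixed
-- quoting function; it looks quote up on whatever "this" it is given.
sharedColumns :: Dialect -> [String] -> String
sharedColumns this cols = intercalate ", " (map (quote this this) cols)

-- Two objects that share sharedColumns but differ in quote.
plain :: Dialect
plain =
  Dialect
    { columns = sharedColumns
    , quote = \_this col -> "\"" ++ col ++ "\""
    }

backticks :: Dialect
backticks =
  Dialect
    { columns = sharedColumns
    , quote = \_this col -> "`" ++ col ++ "`"
    }

-- Calling a method: select it from the object, pass the object as "this".
call :: (Dialect -> Dialect -> a) -> Dialect -> a
call method obj = method obj obj
```

Evaluating call columns plain ["id", "name"] gives "id", "name" in double quotes, while call columns backticks ["id", "name"] gives `id`, `name`, even though both objects run the very same sharedColumns.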
Implementing The Interface (Part 2)
Armed with this, we can extend our Vanilla SQL implementation to match the type signatures:
vanillaRenderSelect :: SqlDialect -> SelectQuery -> String
vanillaRenderSelect this query =
  "SELECT " ++
  intercalate ", " (map (this ==> quoteColumn) $ selectColumns query) ++
  " FROM " ++
  (this ==> quoteTable) (selectTable query) ++
  (this ==> renderWhere) (selectWhere query) ++
  (this ==> renderOrders) (selectOrder query) ++
  (this ==> renderLimit) (selectLimit query)

-- Each optional clause brings its own leading space, so that the
-- clauses do not run into each other in the rendered query.
vanillaRenderWhere :: SqlDialect -> Condition -> String
vanillaRenderWhere this Always = ""
vanillaRenderWhere this cond = " WHERE " ++ (this ==> renderWhereCond) cond

vanillaRenderWhereCond this Always =
  "TRUE"
vanillaRenderWhereCond this (Equals a b) =
  (this ==> renderValue) a ++ " = " ++ (this ==> renderValue) b
vanillaRenderWhereCond this (Not cond) =
  "NOT (" ++ (this ==> renderWhereCond) cond ++ ")"
vanillaRenderWhereCond this (IsNull a) =
  (this ==> renderValue) a ++ " IS NULL"
vanillaRenderWhereCond this (And a b) =
  "(" ++ (this ==> renderWhereCond) a ++ " AND " ++ (this ==> renderWhereCond) b ++ ")"
vanillaRenderWhereCond this (Or a b) =
  "(" ++ (this ==> renderWhereCond) a ++ " OR " ++ (this ==> renderWhereCond) b ++ ")"

vanillaRenderOrders this [] =
  ""
vanillaRenderOrders this orders =
  -- Each element is rendered through renderOrder (singular).
  " ORDER BY " ++ intercalate ", " (map (this ==> renderOrder) orders)

vanillaRenderOrder this (OrderBy field Asc) = (this ==> quoteColumn) field
vanillaRenderOrder this (OrderBy field Desc) = (this ==> quoteColumn) field ++ " DESC"

vanillaRenderLimit this Nothing = ""
vanillaRenderLimit this (Just (Limit n)) = " LIMIT " ++ show n
vanillaRenderLimit this (Just (LimitOfs ofs n)) = " LIMIT " ++ show ofs ++ " " ++ show n

vanillaRenderValue this (ColumnRef field) = (this ==> quoteColumn) field
vanillaRenderValue this (Param param) = (this ==> quoteParam) param
vanillaRenderValue this (Literal val) = (this ==> quoteLiteral) val

vanillaQuoteColumn this col = "\"" ++ col ++ "\""

vanillaQuoteParam this = const "?"

vanillaQuoteTable this table = "\"" ++ table ++ "\""

-- Not actually accurate, we would also have to perform escaping, but
-- for demonstration purposes this will have to do.
vanillaQuoteLiteral this val = "'" ++ val ++ "'"
And now, writing our instance is easy:
instance VanillaSql `Is` SqlDialect where
  cast VanillaSql =
    SqlDialect
      { renderSqlQuery = vanillaRenderSelect
      , renderWhere = vanillaRenderWhere
      , renderWhereCond = vanillaRenderWhereCond
      , renderOrders = vanillaRenderOrders
      , renderOrder = vanillaRenderOrder
      , renderValue = vanillaRenderValue
      , renderLimit = vanillaRenderLimit
      , quoteColumn = vanillaQuoteColumn
      , quoteTable = vanillaQuoteTable
      , quoteParam = vanillaQuoteParam
      , quoteLiteral = vanillaQuoteLiteral
      }
Inheritance
Now things get interesting: with the way we have separated query generation out, we can now build other SQL dialect implementations on top of our basic “vanilla” flavor, overriding methods selectively:
data MySQL = MySQL

mysqlQuoteColumn this col = "`" ++ col ++ "`"

mysqlQuoteTable this table = "`" ++ table ++ "`"

instance MySQL `Is` SqlDialect where
  cast MySQL =
    (cast VanillaSql)
      { quoteColumn = mysqlQuoteColumn
      , quoteTable = mysqlQuoteTable
      }
Now the SqlDialect instance for MySQL inherits everything from VanillaSql, except the quoting rules for columns and tables. And because we’re explicitly passing this around, the calls to quoteColumn and quoteTable inside all the other methods are resolved against that object, so the code does the right thing, just like you’d expect.
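The full dialect is a lot to take in at once, so here is the same inheritance pattern shrunk down to a self-contained sketch. MiniDialect, Base, Backtick, renderColumns and quoteName are hypothetical, cut-down stand-ins for SqlDialect, VanillaSql, MySQL and their methods; Is and ==> are as defined earlier.

```haskell
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE FlexibleInstances #-}

import Data.List (intercalate)

class a `Is` b where
  cast :: a -> b

instance a `Is` a where
  cast = id

(==>) :: cls `Is` inst => cls -> (inst -> inst -> a) -> a
obj ==> prop = prop vt vt
  where
    vt = cast obj

-- Cut-down interface with one rendering method and one quoting method.
data MiniDialect = MiniDialect
  { renderColumns :: MiniDialect -> [String] -> String
  , quoteName :: MiniDialect -> String -> String
  }

data Base = Base

data Backtick = Backtick

-- The "parent": renderColumns reaches quoteName through "this".
instance Base `Is` MiniDialect where
  cast Base =
    MiniDialect
      { renderColumns = \this cols ->
          intercalate ", " (map (this ==> quoteName) cols)
      , quoteName = \_this name -> "\"" ++ name ++ "\""
      }

-- The "child": start from the parent's method table, override one slot.
instance Backtick `Is` MiniDialect where
  cast Backtick =
    (cast Base) {quoteName = \_this name -> "`" ++ name ++ "`"}
```

With these definitions, (Base ==> renderColumns) ["id", "name"] renders the names in double quotes, and (Backtick ==> renderColumns) ["id", "name"] renders `id`, `name`. The inherited renderColumns picks up the overridden quoteName because it only ever reaches it through this.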
Conclusion
We’re certainly not done at this point, but so far, we have:
- Objects (VanillaSql, MySQL)
- Interfaces (SqlDialect)
- Interface methods
- Runtime polymorphism (method implementation selected based on runtime interface implementation)
- Full dynamic dispatch / open recursion (methods can call back into the this object, and the correct implementation is selected at runtime)
- Interface-based inheritance (we can provide interface implementations based on the interface implementations of other types); unlike most OOP languages in the wild, however, we always inherit through an interface. Our “classes” are just plain old Haskell types, they have no virtual methods and there is no direct class inheritance.
We have also managed to avoid a few problems commonly associated with OOP:
- Diamond Inheritance. Interface methods are always implemented explicitly; when we inherit from multiple “classes”, we have to pick one of them as the primary one, mixing in methods from others as needed (we’ll see more of that in future installments).
- The dreaded null. We retain all of Haskell’s type safety; nullability is still guarded with Maybe.
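As a rough preview of what “picking a primary and mixing in the rest” might look like, here is a sketch using plain records. This is only my guess at the shape of it, not the approach of the later installments; MiniDialect and all the values below are hypothetical.

```haskell
-- A small interface with two overridable methods and one method that
-- uses both of them through "this".
data MiniDialect = MiniDialect
  { quoteName :: MiniDialect -> String -> String
  , placeholder :: MiniDialect -> String -> String
  , equalsParam :: MiniDialect -> String -> String
  }

-- The common ancestor.
ansi :: MiniDialect
ansi =
  MiniDialect
    { quoteName = \_this name -> "\"" ++ name ++ "\""
    , placeholder = \_this _name -> "?"
    , equalsParam = \this name ->
        quoteName this this name ++ " = " ++ placeholder this this name
    }

-- Two "siblings", each overriding a different method of the ancestor.
namedParams :: MiniDialect
namedParams = ansi {placeholder = \_this name -> ":" ++ name}

backtick :: MiniDialect
backtick = ansi {quoteName = \_this name -> "`" ++ name ++ "`"}

-- Inheriting from both: backtick is the primary parent, and the
-- placeholder method is mixed in from namedParams explicitly.
both :: MiniDialect
both = backtick {placeholder = placeholder namedParams}
```

Evaluating equalsParam both both "username" yields `username` = :username. There is never any doubt about which parent a method comes from, because every slot is filled in explicitly.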
And we have retained most of the advantages of Haskell:
- Type safety: we have successfully leveraged Haskell’s type system to express appropriate constraints to model our OOP paradigm; particularly, we enforce at compile time:
  - …that an implementation of an interface is complete (i.e., all the methods are implemented; note that GHC reports a missing record field as a warning by default, so this only becomes a hard guarantee with -Werror=missing-fields)
  - …that methods can only be called on objects that support the interface that defines them (so no MethodNotImplementedException or the like)
  - …that we can only make valid casts, and thus, that casts will always succeed
  - …that the this object is of the correct type
  - …that effects are controlled and explicit; we are not introducing any effects through any backdoors, nor are we sweeping anything under the rug
  - …that we cannot call methods through interfaces that we haven’t declared as constraints: unless we have a constraint a `Is` intf, we cannot call methods from intf on a value of type a
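The last point is easiest to see in a function that works for any implementing type. The sketch below is self-contained; MiniDialect, Ansi, NotADialect and quoteAll are hypothetical names, and Is and ==> are repeated from earlier so that the block compiles on its own.

```haskell
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE FlexibleInstances #-}

class a `Is` b where
  cast :: a -> b

instance a `Is` a where
  cast = id

(==>) :: cls `Is` inst => cls -> (inst -> inst -> a) -> a
obj ==> prop = prop vt vt
  where
    vt = cast obj

data MiniDialect = MiniDialect
  { quoteName :: MiniDialect -> String -> String
  }

data Ansi = Ansi

instance Ansi `Is` MiniDialect where
  cast Ansi = MiniDialect {quoteName = \_this name -> "\"" ++ name ++ "\""}

-- A type that deliberately has no MiniDialect instance.
data NotADialect = NotADialect

-- Works for every d that implements MiniDialect. Without the
-- constraint in the signature, the call to ==> would not type-check.
quoteAll :: d `Is` MiniDialect => d -> [String] -> [String]
quoteAll d = map (d ==> quoteName)

quoted :: [String]
quoted = quoteAll Ansi ["id", "name"]

-- Rejected at compile time, because there is no instance
-- NotADialect `Is` MiniDialect:
-- broken = quoteAll NotADialect ["id"]
```

The failure shows up as a missing-instance error when compiling, not as an exception when the method is called.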
There are also some loose ends that we haven’t addressed yet:
- Stateful classes. What patterns emerge when we add state to our objects? What does that mean for inheritance?
- Visibility. Class-based OOP languages like Java, C# and C++ have visibility modifiers (public / private); how can we get similar features in idiomatic Haskell?
- Mutability / state updates. Haskell is a pure functional language; this means that we can model updates as first-class actions in some EDSL (typically a monadic one like IO, State or ST), but we haven’t yet developed a strategy for implementing this in our OOP framework.
- Mutability control. Once we have mutable objects, we will want to mark methods as “immutable” (that is, we want to express the expectation that the object cannot be mutated through these methods), and we want to enforce this on both sides (caller and callee).
- Multi-interface classes. So far, all our classes have only ever implemented one interface, but what happens when a class implements more than one interface? How do they interact? How does inheritance work in such cases? Particularly, how do multiple interfaces play together with mutable objects?
I will go into these questions in future posts.
Oh, and by the way, the framework laid out in this blog series is also available on Hackage, under the name boop.