Object-Oriented Haskell

Part 1: Objects, Fields, Methods, Interfaces

Oct 17, 2017

Object-oriented Haskell, you say? But isn’t Haskell a functional programming language? Aren’t functional programming and object-oriented programming mutually exclusive?

Well; no, they are not, and I will show you how it is possible to write object-oriented code in Haskell using only a minimal set of utility code, and without giving up much of Haskell’s pure functional benefits.

What Is OOP, Anyway

When people talk about OOP (Object-Oriented Programming), you will rarely see the term actually defined, and it seems that there isn’t a lot of agreement on what it really means.

Well then, for the purpose of this article, we will need a working definition; rather than going all crazy on this, I will write down a few key features for the particular flavor of object-oriented programming that I consider crucial, and then we will see how we can implement that in Haskell.

Objects. Naturally, in order to do object-oriented programming, you need objects. Objects have state (fields), and behavior (methods) associated with them.
Runtime Polymorphism. This really is the killer feature: we want objects to expose common interfaces while providing different implementations, such that we can decouple the two - the caller doesn’t need to know anything about the implementation, yet the object knows how to select the right method, and the call is dispatched at runtime.
Open Recursion. Meaning that when an object calls one of its own methods from within another method, that call, too, is dispatched at runtime, based on the actual object on which the call is made, even if the calling method was inherited from elsewhere.

Case Study: SQL Query Generation DSL

For our case study, we will implement a little EDSL for generating SQL queries, such that user code can provide queries in a backend-agnostic form, and our code will render them as backend-specific SQL query strings. For illustration purposes, we will only support simple SELECT queries; the data types to model these look like this:

data SelectQuery
  = SelectQuery
      { selectColumns :: [String]
      , selectTable :: String
      , selectWhere :: Condition
      , selectOrder :: [OrderSpec]
      , selectLimit :: Maybe Limit
      }

data Condition
  = Always
  | Equals Value Value -- WHERE a = b
  | Not Condition -- WHERE NOT a
  | IsNull Value -- WHERE a IS NULL
  | And Condition Condition -- WHERE a AND b
  | Or Condition Condition -- WHERE a OR b

data Value
  = ColumnRef String -- Reference a column
  | Literal String -- A literal value
  | Param String -- Named query parameter

data Limit
  = Limit Integer -- LIMIT n
  | LimitOfs Integer Integer -- LIMIT ofs n

data OrderSpec
  = OrderBy String AscDesc

data AscDesc = Asc | Desc

Great, this should be enough to write single-table SELECT statement with some basic support for WHERE clauses, LIMIT, and ORDER BY. For example:

myQuery = SelectQuery
            ["id", "username", "password"]
            "users"
            (Equals (ColumnRef "username") (Param "username"))
            [OrderBy "id" Asc]
            (Limit 1)

Which should render to SQL similar to this:

SELECT "id", "username", "password"
FROM "users"
WHERE "username" = :username
ORDER BY "id"
LIMIT 1

Now we want a function renderSqlQuery :: SqlDialect -> SelectQuery -> String, such that the SqlDialect determines the details of how the query gets rendered.

First step: Defining An Interface.

So far, everything we’ve done is uncontroversial plain old Haskell. Now we need to define an interface for SqlDialect, and a good way to model this is using a plain old data type:

data SqlDialect
  = SqlDialect
      { renderSqlQuery :: SqlDialect -> SelectQuery -> String
      }

We will extend this type later; note, for now that the renderSqlQuery field is a function that takes an additional argument of type SqlDialect. This is on purpose, as we will see later. For now, it means that if we want to call this function, we will need to pass the SqlDialect object twice:

myQuery = renderSqlQuery dialect dialect query

This is a bit awkward, and we will solve this in a minute. But first, we need to address one other thing.

Our interface so far is not polymorphic, all we can do is provide an SqlDialect value and do things with it, but it’s still a dumb old Haskell data type. To make things polymorphic yet type-safe, we will need some more juice; particularly, we will define a typeclass Is, which tells the compiler that a certain type implements a certain interface:

{-#LANGUAGE MultiParamTypeClasses #-}
{-#LANGUAGE TypeOperators #-}

class a `Is` b where
  cast :: a -> b

Now whenever an instance foo `Is` SqlDialect exists, we can write:

myQuery = renderSqlQuery (cast dialect) (cast dialect) query

And now we can resolve the awkwardness by writing a little function that puts these things together:

member :: cls `Is` inst => (inst -> inst -> a) -> cls -> a
member prop obj =
  prop vt vt
  where
    vt = cast obj

Or, for convenience, as a binary operator:

(==>) :: cls `Is` inst => cls -> (inst -> inst -> a) -> a
(==>) = flip member

Now we can write:

myQuery = (dialect ==> renderSqlQuery) query

…which, in something like Java, might look something like:

string myQuery = dialect.renderSqlQuery(query);

For completeness sake, let’s define:

instance a `Is` a where
  cast = id

That is, every interface implements itself.

Implementing The Interface (Part 1)

We’ll start with a naive “vanilla” SQL dialect that has no state of its own.

data VanillaSql = VanillaSql

Wow, that was easy! Well, we haven’t implemented anything yet, so let’s:

vanillaRenderSelect :: SelectQuery -> String
vanillaRenderSelect query =
  "SELECT " ++
    intercalate ", " (map vanillaQuoteColumn $ selectColumns query) ++
    " FROM " ++
    vanillaQuoteTable (selectTable query) ++
    vanillaRenderWhere (selectWhere query) ++
    vanillaRenderOrders (selectOrder query) ++
    vanillaRenderLimit (selectLimit query)

vanillaRenderWhere :: Condition -> String
vanillaRenderWhere Always = ""
vanillaRenderWhere cond = "WHERE " ++ vanillaRenderWhereCond cond

vanillaRenderWhereCond Always =
  "TRUE"
vanillaRenderWhereCond (Equals a b) =
  vanillaRenderValue a ++ " = " ++ vanillaRenderValue b
vanillaRenderWhereCond (Not cond) =
  "NOT (" ++ vanillaRenderWhereCond cond ++ ")"
vanillaRenderWhereCond (IsNull a) =
  vanillaRenderValue a ++ " IS NULL "
vanillaRenderWhereCond (And a b) =
  "(" ++ vanillaRenderWhereCond a ++ " AND " ++ vanillaRenderWhereCond b ++ ")"
vanillaRenderWhereCond (Or a b) =
  "(" ++ vanillaRenderWhereCond a ++ " OR " ++ vanillaRenderWhereCond b ++ ")"

vanillaRenderOrders [] =
  ""
vanillaRenderOrders orders =
  "ORDER BY " ++ intercalate ", " (map vanillaRenderOrder orders)

vanillaRenderOrder (Order field Asc) = vanillaQuoteColumn field
vanillaRenderOrder (Order field Desc) = vanillaQuoteColumn field ++ " DESC"

vanillaRenderLimit Nothing = ""
vanillaRenderLimit (Just (Limit n)) = "LIMIT " ++ show n
vanillaRenderLimit (Just (LimitOfs ofs n)) = "LIMIT " ++ show ofs ++ " " ++ show n

vanillaRenderValue (ColumnRef field) = vanillaQuoteColumn field
vanillaRenderValue (Param param) = vanillaQuoteParam param
vanillaRenderValue (Literal val) = vanillaQuoteLiteral val

vanillaQuoteColumn col = "\"" ++ col ++ "\""
vanillaQuoteParam = const "?"

vanillaQuoteTable table = "\"" ++ table ++ "\""

-- Not actually accurate, we would also have to perform escaping, but
-- for demonstration purposes this will have to do.
vanillaQuoteLiteral val = "'" ++ val ++ "'"

Cool. Now we could write our instance VanillaSql `Is` SqlDialect, but before we do, let’s take a step back. At some point, we will want to write other SQL dialect implementations, but they will share a lot of code with the vanilla flavor - for example, we could probably reuse most of the above for MySQL, but we would want to override the quoting behavior such that it follows the MySQL custom of using backticks for table and column names. The way we’ve written our SQL generation functions, this isn’t possible, because the vanilla renderer functions always call into other vanilla functions - we need those nested calls to somehow be aware of the runtime object they are being called on. In other words, we need runtime dispatch and open recursion, and this is why we added the additional argument to our methods in the interface definition.

Virtual Methods

In order to make our methods virtual, that is, making the choice of implementation dependent on the runtime object, we need to pass an additional copy of the object around, which we will use for open-recursive calls to other methods. By convention, we will name this additional argument this.

So, let us first extend the interface.

data SqlDialect
  = SqlDialect
      { renderSqlQuery :: SqlDialect -> SelectQuery -> String
      , renderWhere :: SqlDialect -> Condition -> String
      , renderWhereCond :: SqlDialect -> Condition -> String
      , renderOrders :: SqlDialect -> [OrderSpec] -> String
      , renderOrder :: SqlDialect -> OrderSpec -> String
      , renderValue :: SqlDialect -> Value -> String
      , renderLimit :: SqlDialect -> Maybe Limit -> String
      , quoteColumn :: SqlDialect -> String -> String
      , quoteTable :: SqlDialect -> String -> String
      , quoteParam :: SqlDialect -> String -> String
      , quoteLiteral :: SqlDialect -> String -> String
      }

Note that every method takes the additional this argument.

Implementing The Interface (Part 2)

Armed with this, we can extend our Vanilla SQL implementation to match the type signatures:

vanillaRenderSelect :: SqlDialect -> SelectQuery -> String

vanillaRenderSelect this query =
  "SELECT " ++
    intercalate ", " (map (this ==> quoteColumn) $ selectColumns query) ++
    " FROM " ++
    (this ==> quoteTable) (selectTable query) ++
    (this ==> renderWhere) (selectWhere query) ++
    (this ==> renderOrders) (selectOrder query) ++
    (this ==> renderLimit) (selectLimit query)

vanillaRenderWhere :: SqlDialect -> Condition -> String
vanillaRenderWhere this Always = ""
vanillaRenderWhere this cond = "WHERE " ++ (this ==> renderWhereCond) cond

vanillaRenderWhereCond this Always =
  "TRUE"
vanillaRenderWhereCond this (Equals a b) =
  (this ==> renderValue) a ++ " = " ++ (this ==> renderValue) b
vanillaRenderWhereCond this (Not cond) =
  "NOT (" ++ (this ==> renderWhereCond) cond ++ ")"
vanillaRenderWhereCond this (IsNull a) =
  (this ==> renderValue) a ++ " IS NULL "
vanillaRenderWhereCond this (And a b) =
  "(" ++ (this ==> renderWhereCond) a ++ " AND " ++ (this ==> renderWhereCond) b ++ ")"
vanillaRenderWhereCond this (Or a b) =
  "(" ++ (this ==> renderWhereCond) a ++ " OR " ++ (this ==> renderWhereCond) b ++ ")"

vanillaRenderOrders this [] =
  ""
vanillaRenderOrders this orders =
  "ORDER BY " ++ intercalate ", " (map (this ==> renderOrders) orders)

vanillaRenderOrder this (Order field Asc) = (this ==> quoteColumn) field
vanillaRenderOrder this (Order field Desc) = (this ==> quoteColumn) field ++ " DESC"

vanillaRenderLimit this Nothing = ""
vanillaRenderLimit this (Just (Limit n)) = "LIMIT " ++ show n
vanillaRenderLimit this (Just (LimitOfs ofs n)) = "LIMIT " ++ show ofs ++ " " ++ show n

vanillaRenderValue this (ColumnRef field) = (this ==> quoteColumn) field
vanillaRenderValue this (Param param) = (this ==> quoteParam) param
vanillaRenderValue this (Literal val) = (this ==> quoteLiteral) val

vanillaQuoteColumn this col = "\"" ++ col ++ "\""
vanillaQuoteParam this = const "?"

vanillaQuoteTable this table = "\"" ++ table ++ "\""

-- Not actually accurate, we would also have to perform escaping, but
-- for demonstration purposes this will have to do.
vanillaQuoteLiteral this val = "'" ++ val ++ "'"

And now, writing our instance is easy:

instance VanillaSql `Is` SqlDialect where
  cast VanillaSql =
    SqlDialect
      { renderSelect = vanillaRenderSelect
      , renderWhere = vanillaRenderWhere
      , renderWhereCond = vanillaRenderWhereCond
      , renderOrders = vanillaRenderOrders
      , renderOrder = vanillaRenderOrder
      , renderValue = vanillaRenderValue
      , quoteColumn = vanillaQuoteColumn
      , quoteTable = vanillaQuoteTable
      , quoteParam = vanillaQuoteParam
      , quoteLiteral = vanillaQuoteLiteral
      }

Inheritance

Now things get interesting: with the way we have separated query generation out, we can now build other SQL dialect implementations on top of our basic “vanilla” flavor, overriding methods selectively:

data MySQL = MySQL

mysqlQuoteColumn this col = "`" ++ col ++ "`"
mysqlQuoteTable this table = "`" ++ table ++ "`"

instance MySQL `Is` SqlDialect where
  cast MySQL =
    (cast VanillaSql)
      { quoteColumn = mysqlQuoteColumn
      , quoteTable = mysqlQuoteTable
      }

Now the SqlDialect instance for MySQL inherits everything from VanillaSql, except the quoting rules for columns and tables. And because we’re explicitly passing this around, the calls to quoteColumn and quoteTable are resolved based on that inside all the other calls, so the code does the right thing, just like you’d expect.

Conclusion

We’re certainly not done at this point, but so far, we have:

Objects (VanillaSql, MySQL)
Interfaces (SqlDialect)
Interface methods
Runtime polymorphism (method implementation selected based on runtime interface implementation)
Full dynamic dispatch / open recursion (methods can call back into the this object, and the correct implementation is selected at runtime)
Interface-based inheritance (we can provide interface implementations based on the interface implementations of other types); unlike most OOP languages in the wild, however, we always inherit through an interface. Our “classes” are just plain old Haskell types, they have no virtual methods and there is no direct class inheritance.

We have also managed to avoid a few problems commonly associated with OOP:

Diamond Inheritance. Interface methods are always implemented explicitly; when we inherit from multiple “classes”, we have to pick one of them as the primary one, mixing in methods from others as needed (we’ll see more of that in future installments).
The dreaded null. We retain all of Haskell’s type safety, nullability is still guarded with Maybe.

And we have retained most of the advantages of Haskell:

Type safety: we have successfully leveraged Haskell’s type system to express appropriate constraints to model our OOP paradigm; particularly, we enforce at compile time:
- …that an implementation of an interface is complete (i.e., all the methods are implemented)
- …that methods can only be called on objects that support the interface that defines them (so no MethodNotImplementedException or the like)
- …that we can only make valid casts, and thus, that casts will always succeed
- …that the this object is of the correct type
- …that effects are controlled and explicit; we are not introducing any effects through any backdoors, nor are we sweeping anything under the rug
- …that we cannot call methods through interfaces that we haven’t declared as constraints: unless we have a constraint a `Is` intf, we cannot call methods from intf on a value of type a

There are also some loose ends that we haven’t addressed yet:

Stateful classes. What patterns emerge when we add state to our objects? What does that mean for inheritance?
Visibility. Class-based OOP languages like Java, C# and C++ have visibility modifiers (public / private); how can we get similar features in idiomatic Haskell?
Mutability / state updates. Haskell is a pure functional language; this means that we can model updates as first-class actions in some EDSL (typically a monadic one like IO, State or ST), but we haven’t yet developed a strategy for implementing this in our OOP framework.
Mutability control. Once we have mutable objects, we will want to mark methods as “immutable” (that is, we want to express the expectation that the object cannot be mutated through these methods), and we want to enforce this on both sides (caller and callee).
Multi-interface classes. So far, all our classes have only ever implemented one interface, but what happens when a class implements more than one interface? How do they interact? How does inheritance work in such cases? Particularly, how do multiple interfaces play together with mutable objects?

I will go into these questions in future posts.

Oh, and by the way, the framework laid out in this blog series is also available on Hackage, under the name boop.

Revision History

Oct 17, 2017
Oct 18, 2017: Cosmetic improvements
Oct 19, 2017: Added conclusion
Oct 26, 2017: Subtitle