Swift Parsers - Functional
In the previous post we built a decode function to parse data out of XML and into an Animal Model using Imperative techniques. This required some effort in order to satisfy the robustness requirements from the first post.
In this post we’ll cover how we can use Functional Programming techniques on top of the language features of Swift to decode XML to an Animal model. This post assumes that you are comfortable with Parameterized Types and Generics. Sample code is available on GitHub.
Thinking Functionally
I’m using the fantastic and lightweight Swiftz for go-to implementations[1] of many Higher-Order Functions. It also has a great implementation of the Result type from the previous post.
One important concept in Functional Programming is that functions can be thought of as First-Class types, just like any other variable or constant in code. Objective-C has had blocks for some time now, allowing us to think in terms of passing functions around, such as assigning a block to a property of an Object. However, this isn’t the same thing as First-Class functions. Swift allows Functions to be First-Class whilst providing equivalence in functions no matter how they are declared: a Closure in Swift is just another kind of function, just like a method or Free Function. Blocks are no longer the best way of passing around a computation. We can consider the equivalence of functions declared with func in a Global or Class scope with local Closures:
let intToNumber: Int -> NSNumber = NSNumber.numberWithInt
let intToNumber1: Int -> NSNumber = { NSNumber.numberWithInt($0) }
By now it should be second nature to think of the concatenation of two Strings using the + operator. As functions are types like any other, two functions can be joined together just like a String.
public func •<A, B, C>(f: B -> C, g: A -> B) -> A -> C // The 'compose' operator from Swiftz
func prefixer(prefix: String)(string: String) -> String { // Equivalent to String -> String -> String
    return prefix + string
}
func postfixer(postfix: String)(string: String) -> String { // Equivalent to String -> String -> String
    return string + postfix
}
let happyPrefix = prefixer("😃 ") // String -> String. The first String argument of 'prefixer' is applied.
let sadPostfix = postfixer(" 😞") // String -> String. The first String argument of 'postfixer' is applied.
let infixer = happyPrefix • sadPostfix // String -> String. Two functions are joined: sadPostfix runs first and its output becomes the input of happyPrefix.
let string = infixer("Good Morning") // "😃 Good Morning 😞"
In this example, the Swiftz compose operator (•) is used in conjunction with Curried Functions. This might look a little crazy for now, but two important concepts are:
- A specialized function can be created from a curried function, by applying some, but not all of the arguments.
- New functions can be created locally by composing other functions with fancy operators.
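The two points above can be sketched in current Swift. This is only an illustration under some assumptions: curried declarations of the form func f(a)(b) were removed in Swift 3, so each curried function here returns a closure instead, and the • operator is a hypothetical stand-in written for this example rather than the Swiftz definition.

```swift
// Assumption: a locally defined compose operator, standing in for the one from Swiftz.
precedencegroup CompositionPrecedence {
    associativity: right
}
infix operator • : CompositionPrecedence

// (f • g)(x) == f(g(x)): the right-hand function runs first.
func • <A, B, C>(f: @escaping (B) -> C, g: @escaping (A) -> B) -> (A) -> C {
    return { x in f(g(x)) }
}

// Curried by hand: applying the first argument returns a specialized function.
func prefixer(_ prefix: String) -> (String) -> String {
    return { string in prefix + string }
}

func postfixer(_ postfix: String) -> (String) -> String {
    return { string in string + postfix }
}

let happyPrefix = prefixer("😃 ")        // (String) -> String
let sadPostfix = postfixer(" 😞")        // (String) -> String
let infixer = happyPrefix • sadPostfix   // (String) -> String
let greeting = infixer("Good Morning")   // "😃 Good Morning 😞"
```

Neither happyPrefix nor infixer needed a func declaration of its own; both fall out of partially applying or joining functions that already exist.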
Building new Parsers
The idea of building new functions by composition, without having to declare each one, can be carried over to our problem of decoding XML to a Model: a specialized Parser function can be made for each of the values that need to be parsed out in the Model. In order to construct an Animal Model three parsers are required, with the values placed into the context of a Result for failure information:
let kindParser: XMLParsableType -> Result<String> = ...
let nameParser: XMLParsableType -> Result<String> = ...
let urlParser: XMLParsableType -> Result<String> = ...
Further, we can think of each of the properties of a decoded Animal as the application of the above functions to the source XMLParsableType from the XML Document.
let kind: Result<String> = kindParser(xml)
let name: Result<String> = nameParser(xml)
let url: Result<String> = urlParser(xml)
We’ve already seen that we can use Swift’s Optional Chaining in our parser to limit the number of places where failure has to be handled. However, Result isn’t blessed by the language with special syntax for chaining. It would be great to get the same behaviour for the Result type.
It turns out that Swiftz has the following function declared[2]:
public func >>-<A, B>(a: Result<A>, f: A -> Result<B>) -> Result<B>
It is called ‘Bind’ and it can be described in the following way:
“If the parameter a is a Value case of the result apply the function f returning the Result of f. If the parameter a is an Error, just return the original Result”
Sounds the same as Optional Chaining! In fact Optional Chaining can be defined in this way. It should be possible to construct a Result<String> for the kind value of an Animal this way:
let kind: Result<String> = xml.parseChildren("kind").first >>- { $0.parseText() } // Compiler Error!
Damn, looks like the types don’t match up, since the kind value should be a Result<String> instead of a String?. Besides, this looks much worse, needing an inline closure to invert the 0-arg parseText instance method on xml into a function that takes an xml parameter and executes parseText. There has to be a better way of doing this…
XMLParser: A Helper
A new approach is to create a few Helper Functions that build on top of the basic functionality of XMLParsableType[3]. They extract a single child of a given Element name, provide the value in the context of a Result type, and accept the XMLParsableType as a parameter of a function that isn’t bound to an instance:
public final class XMLParser {
    public class func parseChild<X: XMLParsableType>(elementName: String)(xml: X) -> Result<X>
    public class func parseText<X: XMLParsableType>(xml: X) -> Result<String>
}
This can then be used to extract out the kind value with our understanding of the Bind[4] operator:
let kind: Result<String> = XMLParser.parseChild("kind")(xml: xml) >>- XMLParser.parseText
Now we’re talking! The XMLParser.parseText function is not invoked directly; it is placed into a chain of operations to extract the text of an XML Element. These functions are being joined as if they were any other type of data.
The name property of the Animal model is nested a few XML Elements deep, but it can be extracted by chaining the Parsing functions in the same way:
let name: Result<String> = XMLParser.parseChild("nested_nonsense")(xml: xml) >>- XMLParser.parseChild("name") >>- XMLParser.parseText
The XMLParser.parseChild("name") expression evaluates to a function of XMLParsableType -> Result<XMLParsableType> instead of just a Result value. Currying is being used to create a specialized Parser function for a specific XML element, which is then chained together with the others in a sequence of functions.
This is the heart of Function Composition, with Bind performing the behaviour of continuing on success and bailing out as soon as an Error occurs. The possibility of failure within any of the sequences of operations is an essential property of this Parser. Previously this was handled by Optional Chaining and if statements in the Imperative version. Using >>- there isn’t a branching statement in sight[5].
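To make the continue-on-success, bail-on-error behaviour concrete, here is a minimal sketch in current Swift. ParseResult is a hypothetical stand-in for the Swiftz Result type, the >>- operator is defined locally for the example, and parseInt and requirePositive are made-up steps that can each fail.

```swift
// Assumption: a simplified Result-like type carrying a String message on failure.
enum ParseResult<A> {
    case value(A)
    case error(String)
}

precedencegroup BindPrecedence {
    associativity: left
}
infix operator >>- : BindPrecedence

// Bind: apply f on success, pass the original error through untouched on failure.
func >>- <A, B>(a: ParseResult<A>, f: (A) -> ParseResult<B>) -> ParseResult<B> {
    switch a {
    case .value(let x): return f(x)
    case .error(let message): return .error(message)
    }
}

// Two hypothetical steps, each of which may fail.
func parseInt(_ text: String) -> ParseResult<Int> {
    if let number = Int(text) { return .value(number) }
    return .error("'\(text)' is not an Int")
}

func requirePositive(_ number: Int) -> ParseResult<Int> {
    return number > 0 ? .value(number) : .error("\(number) is not positive")
}

let good = parseInt("42") >>- requirePositive    // .value(42)
let bad = parseInt("4x2") >>- requirePositive    // .error; requirePositive never runs
```

The only branch lives inside the operator itself, which is why the call sites can stay free of if statements.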
Common Operations
Some common chains of functions are starting to appear. These can be extracted into the XMLParser class[6] and built using the same Higher-Order Functions and the other functions and methods that have been defined for XMLParsableType:
extension XMLParser {
    public class func parseChildText<X: XMLParsableType>(elementName: String)(xml: X) -> Result<String> {
        let textParser: X -> Result<String> = promoteXmlError("Could not parse text for child \(elementName)") • { $0.parseText() }
        return self.parseChild(elementName)(xml: xml) >>- textParser
    }
    public class func parseChildRecursive<X: XMLParsableType>(elementNames: [String])(xml: X) -> Result<X> {
        // Fold Bind over the path of element names, starting from the root element.
        return elementNames.reduce(Result.value(xml)) { (parsable: Result<X>, currentElementName) in
            return parsable >>- self.parseChild(currentElementName)
        }
    }
    public class func parseChildRecursiveText<X: XMLParsableType>(elementNames: [String])(xml: X) -> Result<String> {
        let textParser: X -> Result<String> = promoteXmlError("Could not parse text for child \(elementNames.last)") • { $0.parseText() }
        return self.parseChildRecursive(elementNames)(xml: xml) >>- textParser
    }
}
reduce is another Higher-Order function of the Sequence type making an appearance. Using it means that parseChildRecursive doesn’t need to be implemented in a recursive manner.
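The fold can be tried out on a toy document. The sketch below is written in current Swift under a few assumptions: Node is a hypothetical stand-in for an XMLParsableType, ParseResult stands in for Result, and bind is a named version of >>-.

```swift
// Hypothetical stand-ins for an XML element and the Result type.
struct Node {
    let name: String
    let children: [Node]
}

enum ParseResult<A> {
    case value(A)
    case error(String)
}

// Named version of Bind (>>- in the post).
func bind<A, B>(_ a: ParseResult<A>, _ f: (A) -> ParseResult<B>) -> ParseResult<B> {
    switch a {
    case .value(let x): return f(x)
    case .error(let message): return .error(message)
    }
}

// Curried by hand: fix the element name now, supply the node later.
func parseChild(_ elementName: String) -> (Node) -> ParseResult<Node> {
    return { node in
        if let child = node.children.first(where: { $0.name == elementName }) {
            return .value(child)
        }
        return .error("No child named \(elementName)")
    }
}

// Walk a path of element names by folding bind over it; no explicit recursion.
func parseChildRecursive(_ elementNames: [String]) -> (Node) -> ParseResult<Node> {
    return { node in
        elementNames.reduce(ParseResult<Node>.value(node)) { parsable, currentElementName in
            bind(parsable, parseChild(currentElementName))
        }
    }
}

let document = Node(name: "animal", children: [
    Node(name: "nested_nonsense", children: [Node(name: "name", children: [])])
])
let found = parseChildRecursive(["nested_nonsense", "name"])(document)
let missing = parseChildRecursive(["nested_nonsense", "age"])(document)
```

Once one step has failed, every later step in the fold simply passes the same error along, so the first missing element is the one that gets reported.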
These functions can now be used within the Animal decode function to make extracting data from the XML more obvious:
let kind: Result<String> = XMLParser.parseChildText("kind")(xml: xml)
let name: Result<String> = XMLParser.parseChildRecursiveText(["nested_nonsense", "name"])(xml: xml)
Lovely.
Parsers for Other Types
The definitions inside the XMLParser Helper only produce Strings. Numeric and Complex types within an XML document are represented in textual form, and the Helper doesn’t provide methods for interpreting the String of an XML Text element as every possible Type. Our Model classes are concerned with Numeric types such as Double and Int, so the parsing of these types will need to be incorporated into the decode method.
Unlike JSON, there is no explicit syntax for a Numeric value, so the interpretation of these types isn’t a requirement of an XML Parser. Instead of bloating the XMLParser class with every variation of interpreting Text as other Types, the Parser functions for values in the Model can be composed from XMLParser functions and other functions that interpret Strings as the other types.
Coercing a value from a String to a Numeric type may not always work. The String 14123 can be interpreted as an Int but the value 134djk23 cannot. Again, this falls into our notion of Model decoding failure. The NSNumberFormatter class is a Cocoa way of interpreting Strings as Numbers, so we can write an extension for the Int type to interpret a String as an Int, with Optional.None used to represent failure.
public extension Int {
    public static func parseString(string: String) -> Int? {
        let formatter = NSNumberFormatter()
        let number = formatter.numberFromString(string)
        return number?.integerValue
    }
}
As previously mentioned, Cocoa APIs in Swift expose failure as the nil/.None case of an Optional Type. However, our Parser requires the additional information in a Result type. One approach is to extend Cocoa classes with additional methods that return a Result instead of an Optional, but this might not be the ideal solution[7]. Instead we can again think in terms of Function Composition to make a function that returns Result<Int> instead of Int?.
For example, a definition of toiletCount requires interpreting a String in the XML as an Int in the model. This function can be built from closures:
let toiletCountParser: String -> Result<Int> = { promoteDecodeError("Could not parse 'disabled_parking'")(value: Int.parseString($0)) }
Or we can use the Compose and Bind Operators again:
let toiletCountParser: String -> Result<Int> = promoteDecodeError("Could not parse 'disabled_parking'") • Int.parseString
let toiletCount = XMLParser.parseChildRecursiveText(["facilities", "toilet"])(xml: xml) >>- toiletCountParser
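The promotion from Optional to Result can be sketched on its own in current Swift. Everything here is an assumption made for illustration: ParseResult stands in for Result, promoteError is a hypothetical analogue of promoteDecodeError, compose is a named version of •, and Int's failable initializer is swapped in for the NSNumberFormatter extension.

```swift
// Assumption: a simplified Result-like type carrying a String message on failure.
enum ParseResult<A> {
    case value(A)
    case error(String)
}

// Hypothetical analogue of promoteDecodeError: turn a missing value into an error
// that carries a message chosen by the caller.
func promoteError<A>(_ message: String) -> (A?) -> ParseResult<A> {
    return { optional in
        if let value = optional { return .value(value) }
        return .error(message)
    }
}

// Named composition: compose(f, g)(x) == f(g(x)).
func compose<A, B, C>(_ f: @escaping (B) -> C, _ g: @escaping (A) -> B) -> (A) -> C {
    return { x in f(g(x)) }
}

// (String) -> Int?; Int's initializer returns nil when the text is not a whole number.
let parseInt: (String) -> Int? = { Int($0) }

// The caller supplies the context for the failure message.
let promote: (Int?) -> ParseResult<Int> = promoteError("Could not parse 'toilet' as an Int")
let toiletCountParser: (String) -> ParseResult<Int> = compose(promote, parseInt)

let parsed = toiletCountParser("14123")       // .value(14123)
let rejected = toiletCountParser("134djk23")  // .error("Could not parse 'toilet' as an Int")
```

Because the message is supplied where the parser is composed, the error can name the element being decoded rather than just the offending text.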
The Animal and Zoo Models both implement the XMLDecoderType protocol. In this case there are a number of Animal Models belonging to a Zoo, so the decode functions can be reused to extract out each of the Animal Models contained in a parent Zoo:
let animals = XMLParser.parseChild("animals")(xml: xml) >>- XMLParser.parseChildren("animal") >>- resultMap(Animal.decode)
There is another function here that takes the output of XMLParser.parseChildren, of type Result<[X]>, as an input. The resultMap function is like a regular map on an Array, except that it returns a Result.Error if any of the applications of the mapped function fails:
public func resultMap<A, B>(map: A -> Result<B>)(array: [A]) -> Result<[B]>
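One possible body for a function with this shape is sketched below in current Swift. It is not the implementation from the sample code: ParseResult is a hypothetical stand-in for Result, the curried form is written as a function returning a closure, and parseInt is a made-up element parser.

```swift
// Assumption: a simplified Result-like type carrying a String message on failure.
enum ParseResult<A> {
    case value(A)
    case error(String)
}

// Map over the array, stopping at the first element whose transform fails.
func resultMap<A, B>(_ transform: @escaping (A) -> ParseResult<B>) -> ([A]) -> ParseResult<[B]> {
    return { array in
        var collected: [B] = []
        for element in array {
            switch transform(element) {
            case .value(let mapped): collected.append(mapped)
            case .error(let message): return .error(message)
            }
        }
        return .value(collected)
    }
}

// A made-up element parser that can fail.
func parseInt(_ text: String) -> ParseResult<Int> {
    if let number = Int(text) { return .value(number) }
    return .error("'\(text)' is not an Int")
}

let allGood = resultMap(parseInt)(["1", "2", "3"])   // .value([1, 2, 3])
let oneBad = resultMap(parseInt)(["1", "x", "3"])    // .error("'x' is not an Int")
```

Returning on the first error means a single malformed child fails the whole collection, which matches the all-or-nothing description above.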
Applicatives
Now we have everything we need to extract values out of XMLParsableType and into Result containers for each of the values that need to be extracted. Now the Result values need to be chained in the following way:
“If all of the Results corresponding to each of the Values are Successful, return our Model with the Values from the Result context applied to the Model Constructor function, otherwise just return the first Errored Result.”
Let’s turn back to Curried Functions, this time to a Curried ‘Constructor’[8]:
static func build(kind: String)(name: String)(url: NSURL) -> Animal {
    return self(kind: kind, name: name, url: url)
}
By adding a little more context to each of the applications of Higher-Order functions, we should be able to follow what is going on as the Animal.build function is reduced down to a Result<Animal>:
// These are previously defined using Higher-Order functions and XMLParser
let kind: Result<String> = ...
let name: Result<String> = ...
let url: Result<NSURL> = ...
// The Build Function is a Curried function that will yield functions with every application until we have the Animal value.
let build: String -> String -> NSURL -> Animal = self.build
// Each of these stages gets the Animal structure closer to being initialized
let first: Result<String -> NSURL -> Animal> = self.build <^> kind
let second: Result<NSURL -> Animal> = first <*> name
let third: Result<Animal> = second <*> url
This feels a lot like solving a Mathematical equation by reducing the variables to balance each side of the equation. There are a few more Higher-Order functions being used here, doing the heavy lifting of pulling values and functions out of a Result context, executing a function and then sticking the return value back in a Result again.
Firstly, the fmap operator:
public func <^><A, B>(f: A -> B, a: Result<A>) -> Result<B>
It can be thought of in the following way:
“If the Result a is a Value, apply the function f and place the result back in a Result. If the Result a is an Error, just return the Result immediately.”
The property of this operator is that the applied function f never has to be aware that the applied argument has been part of a Result context.
Secondly, we can take a look at the apply operator:
public func <*><A, B>(f: Result<A -> B>, a: Result<A>) -> Result<B>
It too can be described:
“If the Result f is an Error just return it immediately, if the Result a is an Error just return it immediately. Otherwise pull the function out of f and the value out of a, apply a to f then place the returned value in a Result”
Again, these functions themselves no longer need to be aware that values are contained in a Result context.
Between <^>, <*> and >>- all the combinations of applying Values to Functions inside and outside of a Result are covered. These Operators allow us to pull values and functions into and out of Result contexts, using functions that don’t have to be aware of the fact that the values and Functions may have come out of a Result[9].
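The two descriptions translate fairly directly into code. The following is a sketch in current Swift, not the Swiftz source: ParseResult is a hypothetical stand-in for Result, both operators are declared locally, and Pet and buildPet are made-up names playing the role of Animal and its curried build function.

```swift
// Assumption: a simplified Result-like type carrying a String message on failure.
enum ParseResult<A> {
    case value(A)
    case error(String)
}

precedencegroup ApplicativePrecedence {
    associativity: left
}
infix operator <^> : ApplicativePrecedence
infix operator <*> : ApplicativePrecedence

// fmap: apply a plain function to the value inside a result, if there is one.
func <^> <A, B>(f: (A) -> B, a: ParseResult<A>) -> ParseResult<B> {
    switch a {
    case .value(let x): return .value(f(x))
    case .error(let message): return .error(message)
    }
}

// apply: both the function and the value are wrapped; the first error wins.
func <*> <A, B>(f: ParseResult<(A) -> B>, a: ParseResult<A>) -> ParseResult<B> {
    switch (f, a) {
    case (.error(let message), _): return .error(message)
    case (_, .error(let message)): return .error(message)
    case (.value(let function), .value(let x)): return .value(function(x))
    }
}

struct Pet {
    let kind: String
    let name: String
}

// Curried by hand: each application returns a function awaiting the next argument.
func buildPet(_ kind: String) -> (String) -> Pet {
    return { name in Pet(kind: kind, name: name) }
}

let kind: ParseResult<String> = .value("cat")
let name: ParseResult<String> = .value("Tom")
let missingName: ParseResult<String> = .error("Could not parse name")

let pet = buildPet <^> kind <*> name            // .value(Pet(kind: "cat", name: "Tom"))
let failed = buildPet <^> kind <*> missingName  // .error("Could not parse name")
```

buildPet itself knows nothing about ParseResult; the operators do all of the wrapping and unwrapping around it.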
Teach the Controversy
There’s a lot to take in. Some of the benefits should be clear, others are a bit more subtle. I don’t deny that some of this has a steep learning curve, and it may go against many years of experience with languages that don’t have semantics for dealing with Functions as things that can be combined in these ways. None of these concepts and implementations are new; they are the products of high-shouldered giants. With the modernity of Swift we have the opportunity to incorporate more Functional Programming into Native iOS and Mac development. Given that there are certain programming techniques that can’t be carried over from Objective-C, it’s worth considering incorporating some Functional Programming if it fits the task at hand.
There’s a lot of concern that Operator Overloading is going to lead to a lot of Smart People doing some very silly things[10] for the purpose of making dense, concise and far-too-clever code. The reality of these operators is that they are small in number and aren’t specific to just Optional and Result. As Arithmetic Operators deal with Numeric values, Functional Operators deal with the transformation of Functions themselves. These concepts are taken from other languages, so they aren’t at all specific to Swift.
There are also some new terminologies and another way of describing the interaction of components, but this is no less true of a Domain Specific Language that we can graft on top of Objective-C or of a more flexible language. Sure, Objective-C has the benefit of a preprocessor to graft new features onto the language and reduce the amount of boilerplate required[11], but these can become pretty opaque over time.
However, unlike a Domain Specific Language, all of the code used is valid Swift; there is no transformation from a language in one domain to a syntax that the compiler understands. The Compiler is fully aware of the Types of all of the transformed Functions, leaving no room for ambiguity. The chances of a runtime error because of a keypath not existing, or the type being different to the one expected by the code, are greatly diminished.
I honestly believe it’s worth taking a jump at the conceptual hurdle, whether it is for writing code that is more robust and predictable or to satisfy a curiosity for learning something new.
Thanks for Reading! I suggest reading some other brilliant posts that do a better job than I of laying this all out! As a bonus, the next post will focus on how XMLParsableType can be implemented in a variety of ways.
1. Swiftz has separated its functionality across a core library and the library proper. This post will only use the Core library.
2. I’m lying, I’ve changed the Generic Parameters from VA & VB to A & B.
3. They consume a Protocol, abstracting away how the XML Parser is implemented. As the Protocol has an associated type requirement it has to be fulfilled with Generics. Composed behaviours don’t pollute the Implementation of the Protocol but still augment the behaviour.
4. When starting out it can be really helpful to do this, as it makes the types of each of the elements in the chain more visible. You can use Alt+Click on the value name to get Xcode to print out the inferred type. It’s also a good illustration of the power of type inference.
5. Swift makes strong guarantees about the existence of values, so there’s no need to check for the existence of values in arguments that are non-optional. In an Objective-C implementation this contract can be enforced with a litany of assertions. If the language and compiler can enforce these guarantees we’ve got a huge productivity win on our hands.
6. As the functions consume a Protocol with associated type requirements the Protocol type has to be defined as a Generic type. The functions are Pure in that they do not consume or modify any Global State; only the arguments are used. The guarantee of no side-effects cannot be enforced in Swift. In essence the class is a bunch of similar functions that are kind-of-namespaced as Class methods of a final class with no constructor.
7. And it could be damaging to the codebase. Cocoa Classes could easily become polluted with Result variants of every possible method that could fail, and this would continue over time as more classes are required to be parsed. More importantly it is not possible for a general Result returning type to have some of the crucial context surrounding a failure. We want our NSErrors to contain the Element within the XML responsible for the failure, and an Extension method would lose this context. ‘Failed to interpret ‘toilet’ as an Int’ is preferable to ‘Failed to interpret ‘123a’ as an Int’.
8. ‘Factory Method’ is probably a better term for this since ‘Constructor’ is a term reserved for init methods in Swift.
9. Optional and Result are examples of Functors and Monads. This fantastic article covers the operators in a visual way. There are more of these (Monads) than you might first think and it’s a crucial part of many functional languages.
10. This happens when any ‘new’ language feature comes along. I know I abused blocks when they appeared in Objective-C.
11. In Objective-C this can be handled with Macros and Early Return, but we can’t rewrite/mangle the rules of the language in Swift as we don’t have Macros.