2014-02-10

The Objective-C collections classes from Foundation, NSArray, NSSet and NSOrderedSet, already support various forms of functional-style programming.

First, and probably oldest, is:

[NSArray makeObjectsPerformSelector:]

Key-Value Coding has:

[NSArray valueForKey:]

and of course, there’s the block-based:

[NSArray enumerateObjectsUsingBlock:]

Similar methods are available for NSSet and NSOrderedSet.
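
For reference, here's a minimal sketch of those three built-in styles side by side on an NSArray. The function name demoBuiltInStyles is made up for illustration, and the items are plain strings rather than real model objects.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: exercises the three existing Foundation styles.
static NSArray *demoBuiltInStyles(void)
{
    NSArray *names = @[@"batman", @"catwoman"];

    // 1. Selector-based: the message is sent to every item, results are discarded,
    //    so it's only useful for methods with side effects.
    NSArray *shouted = @[[names[0] mutableCopy], [names[1] mutableCopy]];
    [shouted makeObjectsPerformSelector:@selector(appendString:) withObject:@"!"];

    // 2. KVC: valueForKey: on an array collects the value of each item.
    NSArray *uppercased = [names valueForKey:@"uppercaseString"];

    // 3. Blocks: arbitrary code runs for each item.
    NSMutableArray *lengths = [NSMutableArray array];
    [names enumerateObjectsUsingBlock:^(NSString *name, NSUInteger idx, BOOL *stop) {
        [lengths addObject:@(name.length)];
    }];

    return @[shouted, uppercased, lengths];
}
```

Only the KVC call returns a new collection directly; the other two need either mutable items or a mutable accumulator.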

However, I feel there’s room for improvement. I spent a few hours last weekend experimenting with different techniques and API styles for Collections Functional Programming.

Here’s what I want:

  • A consistent API for NSArray, NSSet and NSOrderedSet,
  • The basic higher-order functions:
    • Map (a.k.a. collect)
    • Filter (a.k.a. select)
    • Fold/reduce could be added easily, but I never see the need for it. [1]

I’ll implement four different approaches, using various Objective-C coding styles:

Style #1 - Block-based enumeration

That’s the most modern and flexible style, but not the clearest or most concise. It’s also the one with the most existing support in Foundation, with the base methods available on NSArray, NSSet and NSOrderedSet.

- (void)enumerateObjectsUsingBlock:(void (^)(id obj, NSUInteger idx, BOOL *stop));
- (NSIndexSet *)indexesOfObjectsPassingTest:(BOOL (^)(id obj, NSUInteger idx, BOOL *stop));

This isn’t exactly the API I want, so I’ll add two other methods:

- (instancetype) map:(id(^)(id obj))block;
- (instancetype) filteredCollectionWithTest:(BOOL(^)(id obj))test;

map: and filteredCollectionWithTest: do exactly what they say, and return a collection of the same type as the original receiver: an NSArray for an NSArray, and so on.

Here’s what it looks like in a real-world example [2]:

id heroes = @[batman, catwoman];

// Block variables need their parameter list in the declaration.
id (^getIdentity)(id) = ^id(id hero) {
    // Dot syntax doesn't compile on a plain id, so use a message send.
    return [hero identity];
};
id identities = [heroes map:getIdentity];
// -> @[@"Bruce Wayne", @"Selina Kyle"];

BOOL (^isBruce)(id) = ^BOOL(id hero) {
    return [[hero identity] hasPrefix:@"Bruce Wayne"];
};
id batmen = [heroes filteredCollectionWithTest:isBruce];
// -> @[batman]

Easy, right?
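
For the curious, here's a minimal sketch of how these two methods might be written for NSArray alone. The sketch_ prefixed names and the BlockSketch category are made up for this example; the NSSet and NSOrderedSet variants are left out, and the real code in the repo may well differ.

```objc
#import <Foundation/Foundation.h>

// Sketch only: NSArray-specific, with made-up `sketch_` method names.
@interface NSArray (BlockSketch)
- (NSArray *)sketch_map:(id (^)(id obj))block;
- (NSArray *)sketch_filteredArrayWithTest:(BOOL (^)(id obj))test;
@end

@implementation NSArray (BlockSketch)

- (NSArray *)sketch_map:(id (^)(id obj))block
{
    NSMutableArray *values = [NSMutableArray arrayWithCapacity:self.count];
    for (id object in self) {
        id value = block(object);
        // Collections can't hold nil, so nil results are simply skipped.
        if (value) {
            [values addObject:value];
        }
    }
    return [values copy];
}

- (NSArray *)sketch_filteredArrayWithTest:(BOOL (^)(id obj))test
{
    // Adapt the one-argument test to the three-argument Foundation block.
    NSIndexSet *indexes = [self indexesOfObjectsPassingTest:
        ^BOOL(id obj, NSUInteger idx, BOOL *stop) {
            return test(obj);
        }];
    return [self objectsAtIndexes:indexes];
}

@end
```

Skipping nil values means the mapped array can be shorter than the receiver; inserting NSNull instead would be another reasonable choice.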

Style #2 - Key-value coding

Like with blocks, there’s already good support in the frameworks for KVC on collections. KVC has a lot of very cool features, including Collection Operators. [3]
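
As a quick illustration of Collection Operators, here's a small sketch on an array of numbers. The function name demoCollectionOperators is invented for the example; the key paths themselves (@sum, @max, @distinctUnionOfObjects) are standard KVC operators.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: applies a few KVC Collection Operators to an NSArray.
static NSDictionary *demoCollectionOperators(void)
{
    NSArray *numbers = @[@3, @1, @2, @3];
    return @{
        // "self" is the key path applied to each item before aggregating.
        @"sum":      [numbers valueForKeyPath:@"@sum.self"],
        @"max":      [numbers valueForKeyPath:@"@max.self"],
        // Removes duplicates; the order of the result is not guaranteed.
        @"distinct": [numbers valueForKeyPath:@"@distinctUnionOfObjects.self"]
    };
}
```

The operators are plain strings, so a typo in a key path only shows up at runtime.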

Map is trivially done using valueForKey::

id identities = [heroes valueForKey:@"identity"];
// -> @[@"Bruce Wayne", @"Selina Kyle"];

Filter requires some work, but it’s also very easy to write:

- (instancetype) filteredCollectionWithValue:(id)value
                                      forKey:(NSString*)key;

Then, back to our real-world example:

batmen = [heroes filteredCollectionWithValue:@"Bruce Wayne"
                                      forKey:@"identity"]; 
// -> @[batman]

Wait a minute! I just cheated. There’s no easy way to do custom algorithms using KVC like the hasPrefix: match we’ve done with blocks.

Until we can write custom collection operators, the workaround is to add the test method directly to the items’ class:

- (BOOL) isBruce {
    return [self.identity hasPrefix:@"Bruce Wayne"];
}

and use it like this:

id batmen = [heroes filteredCollectionWithValue:@YES
                                         forKey:@"isBruce"]; 
// -> @[batman]

It kinda works, but it’s way less flexible than blocks.

On top of that, it’s completely type-unsafe, and Apple doesn’t seem to care a lot about it: the KVO APIs could use some love, and NSOrderedSet doesn’t support Collection Operators at all.

So: mixed feelings about KVC.
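
To make the KVC filter a bit more concrete, here's a rough sketch of how a filteredCollectionWithValue:forKey: style method could work for NSArray. The sketch_ name and the KVCSketch category are assumptions for this example, not the code from the repo.

```objc
#import <Foundation/Foundation.h>

// Sketch only: NSArray-specific, with a made-up `sketch_` method name.
@interface NSArray (KVCSketch)
- (NSArray *)sketch_filteredArrayWithValue:(id)value forKey:(NSString *)key;
@end

@implementation NSArray (KVCSketch)

- (NSArray *)sketch_filteredArrayWithValue:(id)value forKey:(NSString *)key
{
    NSMutableArray *matches = [NSMutableArray array];
    for (id object in self) {
        // valueForKey: boxes scalars (a BOOL becomes an NSNumber),
        // which is why comparing against @YES works.
        id candidate = [object valueForKey:key];
        // The pointer check also covers the "both nil" case.
        if (candidate == value || [candidate isEqual:value]) {
            [matches addObject:object];
        }
    }
    return [matches copy];
}

@end
```

With dictionaries standing in for heroes, [people sketch_filteredArrayWithValue:@"Bruce Wayne" forKey:@"identity"] keeps only the entries whose identity is exactly that string. Anything fancier than equality still needs a dedicated method on the items, as described above.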

Style #3 - NSInvocation-based enumeration

Moving on, our next coding style is an old lady of Objective-C development, NSInvocation. Usually, when you see NSInvocation, <objc/runtime.h> isn’t very far.

An NSInvocation is conceptually very similar to a block: it records a specific code invocation. But where a block can contain arbitrary code, NSInvocation represents the invocation of a message and its parameters on a target.

NSInvocations are rather tedious to create and manipulate. They also have special memory management rules, so handle with care.

Anyway, here’s what the map/filter API looks like with invocations:

- (instancetype) map:(NSInvocation*)invocation;
- (instancetype) filteredCollectionWithTest:(NSInvocation*)invocation;

This is very similar to the block-based API, except we’re now passing NSInvocations instead of blocks.

NSMethodSignature * signature =
    [batman methodSignatureForSelector:@selector(identity)];
NSInvocation * getIdentity =
    [NSInvocation invocationWithMethodSignature:signature];
getIdentity.selector = @selector(identity);

id identities = [heroes map:getIdentity];
// -> @[@"Bruce Wayne", @"Selina Kyle"];

Not that bad. Things get a little more complex when using methods with arguments:

NSMethodSignature * signature =
    [batman methodSignatureForSelector:@selector(identityHasPrefix:)];
NSInvocation * isBruce =
    [NSInvocation invocationWithMethodSignature:signature];
isBruce.selector = @selector(identityHasPrefix:);
NSString * searchedName = @"Bruce";
[isBruce setArgument:&searchedName atIndex:2];

id batmen = [heroes filteredCollectionWithTest:isBruce];
// -> @[batman]

There again, I had to cheat a little, and add an identityHasPrefix: method to batman’s class.

It’s a bit less flexible than blocks, of course, but it’s actually a good thing: identityHasPrefix: can be unit-tested, while the (anonymous) block cannot.

One last thing about invocations: with the blocks, I was able to specify the return type of the blocks directly in the method declarations. Map blocks return id, while Filter blocks return a BOOL. As far as the API is concerned, NSInvocations are just objects; there’s no way to indicate what their result type should be.
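
Here's a minimal sketch of what could sit behind the invocation-based map and filter, for NSArray only. The sketch_ method names and the InvocationSketch category are made up for this example. It also shows the memory management quirk mentioned earlier: under ARC, getReturnValue: just copies raw bytes, so the object buffer must be declared __unsafe_unretained.

```objc
#import <Foundation/Foundation.h>

// Sketch only: NSArray-specific, with made-up `sketch_` method names.
@interface NSArray (InvocationSketch)
- (NSArray *)sketch_mapWithInvocation:(NSInvocation *)invocation;
- (NSArray *)sketch_filteredArrayWithInvocation:(NSInvocation *)invocation;
@end

@implementation NSArray (InvocationSketch)

- (NSArray *)sketch_mapWithInvocation:(NSInvocation *)invocation
{
    NSMutableArray *values = [NSMutableArray arrayWithCapacity:self.count];
    for (id object in self) {
        [invocation invokeWithTarget:object];
        // ARC must not release this reference: the bytes were copied behind its back.
        __unsafe_unretained id value = nil;
        [invocation getReturnValue:&value];
        if (value) {
            [values addObject:value];
        }
    }
    return [values copy];
}

- (NSArray *)sketch_filteredArrayWithInvocation:(NSInvocation *)invocation
{
    // The API can't express "must return BOOL", so this is a cheap runtime check.
    NSParameterAssert(invocation.methodSignature.methodReturnLength == sizeof(BOOL));
    NSMutableArray *matches = [NSMutableArray array];
    for (id object in self) {
        [invocation invokeWithTarget:object];
        BOOL passes = NO;
        [invocation getReturnValue:&passes];
        if (passes) {
            [matches addObject:object];
        }
    }
    return [matches copy];
}

@end
```

The size check only catches gross mismatches, such as passing an object-returning selector to the filter; it can't tell a BOOL from a char.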

Of course, NSInvocations are almost never created by hand, because that’s the job of the runtime. This leads to…

Style #4 - Higher-Order Messaging

NSObject (and NSProxy) have a method called forwardInvocation:. It’s called when the receiver doesn’t know what to do with a message.

We’re going to use it to redirect a single invocation to each of the collection’s items.

To do this, we’ll use [4] a trampoline object that will receive an arbitrary message, and forward it to the items of the collection, using the NSInvocation API described above.

id identities = [[heroes map] identity];
// -> @[@"Bruce Wayne", @"Selina Kyle"];

Easy, right? map returns a proxy that receives the identity message, forwards it to each object of the collection, and returns the results in a new collection.
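
Here's a rough sketch of such a trampoline for NSArray, to show the moving parts. MapTrampolineSketch and sketch_map are invented names, and this is one possible way to write it rather than the implementation from the repo. It assumes every item responds to the forwarded message and that the message returns an object.

```objc
#import <Foundation/Foundation.h>

// Hypothetical trampoline: collects the result of one message sent to every item.
@interface MapTrampolineSketch : NSProxy
- (instancetype)initWithArray:(NSArray *)array;
@end

@implementation MapTrampolineSketch {
    NSArray *_items;
}

- (instancetype)initWithArray:(NSArray *)array
{
    // NSProxy has no -init to call.
    _items = [array copy];
    return self;
}

- (NSMethodSignature *)methodSignatureForSelector:(SEL)selector
{
    // Assumption: all items share the first item's signature for this selector.
    NSMethodSignature *signature = [_items.firstObject methodSignatureForSelector:selector];
    // Fallback for an empty array: "returns an object, takes no argument".
    return signature ?: [NSMethodSignature signatureWithObjCTypes:"@@:"];
}

- (void)forwardInvocation:(NSInvocation *)invocation
{
    NSMutableArray *results = [NSMutableArray arrayWithCapacity:_items.count];
    for (id item in _items) {
        [invocation invokeWithTarget:item];
        __unsafe_unretained id value = nil;   // raw copy, not owned by ARC
        [invocation getReturnValue:&value];
        if (value) {
            [results addObject:value];
        }
    }
    // Hand the array to the autorelease pool so it outlives this method;
    // setReturnValue: only copies the pointer.
    __unsafe_unretained id mapped =
        (__bridge id)CFAutorelease(CFBridgingRetain([results copy]));
    [invocation setReturnValue:&mapped];
}

@end

@interface NSArray (HOMSketch)
- (id)sketch_map;
@end

@implementation NSArray (HOMSketch)
- (id)sketch_map
{
    return [[MapTrampolineSketch alloc] initWithArray:self];
}
@end
```

With this in place, [[names sketch_map] uppercaseString] on an array of strings yields an array of uppercased strings. The explicit CFAutorelease is the kind of special memory management care that invocations demand.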

There’s a small catch: for the compiler, identity returns an NSString*, but our trampoline actually returns a collection (NSArray, NSSet or NSOrderedSet). We can work around it by simply casting to id, but this is going to be a little more complex for filter.

The problem is that identityHasPrefix: returns a BOOL:

id batmen = [[heroes filter] identityHasPrefix:@"Bruce"];
// implicit conversion of 'BOOL' to 'id' is disallowed with ARC

This isn’t going to work. Alternatively, we can pass a return pointer as a parameter:

id result;
[[heroes filteredCollectionInto:&result] identityHasPrefix:@"Bruce"];

But in fact, the best solution is to simply mutate the collection itself:

heroes = [heroes mutableCopy];
[[heroes filter] identityHasPrefix:@"Bruce"];
// -> heroes is now @[batman]

Beautiful.

So what

I’ll confess, the whole goal of this post was to write a Higher-Order Messaging API for the Foundation collections. While blocks are clearly more powerful, message-passing and invocations feel more natural to me, more native to the language. I usually avoid language-changing hacks, but I may make an exception here.

There’s a small repo on GitHub with the various implementations for NSArray, NSSet and NSOrderedSet.

Feel free to fork and hack. Comments are welcome, either on GitHub or via Twitter.


Bonus: How to Implement a Method Once For Several Classes

As I said at the beginning of this post, I want a consistent API for all the Collection classes. To achieve this, I declared the various methods in several protocols:

@protocol BlockCollecting
- (instancetype) block_map:(id(^)(id obj))block;
- (instancetype) block_filteredCollectionWithTest:(BOOL(^)(id obj))test;
- (id) block_oneObjectPassingTest:(BOOL(^)(id obj))block; // If the collection is ordered, returns the first matching object
@end
[...]

I later state that these protocols are adopted by all the Collection classes:

@interface NSArray      (Collecting) <BlockCollecting, KVCCollecting, InvocationCollecting, HOMCollecting> @end
@interface NSSet        (Collecting) <BlockCollecting, KVCCollecting, InvocationCollecting, HOMCollecting> @end
@interface NSOrderedSet (Collecting) <BlockCollecting, KVCCollecting, InvocationCollecting, HOMCollecting> @end

However, the actual implementation doesn’t follow that pattern. Instead, the methods are implemented on NSObject, which is tweaked just enough to make the compiler believe it’s working.

@implementation NSObject (BlockCollecting)
- (instancetype) block_map:(id(^)(id obj))block
{
    id values = [self col_emptyMutableContainer];
    for (id object in self) {
        id value = block(object);
        if (value) {
            [values addObject:value];
        }
    }
    return [values col_immutableCopy];
}
[...]
@end

This way, each method has to be implemented only once, for three unrelated classes. [5]
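
For comparison, here's a more pedestrian way to get a single implementation for the three classes: a plain C function that takes anything enumerable and picks the matching mutable container at runtime. This is a different technique from the NSObject category above, and SketchMap is a made-up name.

```objc
#import <Foundation/Foundation.h>

// Hypothetical free function: one map implementation for NSArray, NSSet and NSOrderedSet.
static id SketchMap(id<NSFastEnumeration> collection, id (^block)(id obj))
{
    // Pick a mutable container that mirrors the input type.
    id source = collection;
    id values;
    if ([source isKindOfClass:[NSOrderedSet class]]) {
        values = [NSMutableOrderedSet orderedSet];
    } else if ([source isKindOfClass:[NSSet class]]) {
        values = [NSMutableSet set];
    } else {
        values = [NSMutableArray array];
    }

    for (id object in collection) {
        id value = block(object);
        if (value) {
            // All three mutable containers respond to addObject:.
            [values addObject:value];
        }
    }
    // -copy on a mutable container returns its immutable counterpart.
    return [values copy];
}
```

It loses the method-call syntax, but needs no tweaking of NSObject. Anything that isn't a set or an ordered set falls back to an NSArray result.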

  1. I fail to find real-world code patterns that would benefit from reduce. If you’ve ever used it beyond the obvious “sum of numbers” and “string concatenation” samples, please do tell me. I’ll update the post.

  2. My sample data may not be accurate. 

  3. I already mentioned KVC in previous posts.

  4. Once again

  5. Yes, I prefix methods of categories of Foundation classes.