Community, Random

Scratching a 7-Year Itch

Today marks fszmq’s seventh birthday. Or maybe call it an anniversary? Very technically, the library’s a few weeks older than this. But seven years ago — on 2 May 2011 — I put the source code on GitHub.com, which officially introduced the project to the world. In observance of this milestone, I thought I’d talk a little about its origins, what I’ve learned over the years, and speculate a bit on what the future might hold for fszmq.


What, exactly, is fszmq?

fszmq is an MPLv2-licensed CLR (e.g. Mono, .NET) binding to the ZeroMQ distributed computing library. ZeroMQ, meanwhile, provides a wide range of building blocks for developing high-performance, message-passing systems.


Waaaaay Back When (in 2011)

To set the stage, let’s note that seven years ago I was in NYC, doing front-office development for a hedge fund. Ostensibly, I was building a medium-volume, gray-box trading system. But practically, I spent my time juggling loads of little tools needed for various business functions. And it had become painfully clear that we needed a better way of getting various services to coordinate. HTTP had too much overhead. WCF was too… well, WCF — with all the over-engineered pains one might expect. So we looked at various pieces of middle-ware. The big servers (MQSeries, TIBCO, et cetera) were well out of scope, due to time and budget constraints. The more approachable products (MSMQ, NServiceBus, et cetera) raised more than few performance and maintenance concerns. Then, one day, a colleague of mine mentioned a buddy of his (at another hedge fund) raved about a new tool they were using for high-frequency trading systems: ZeroMQ. In need of a solution, I took ZeroMQ for a test drive. I was very impressed by the speed, the flexibility and relative simplicity of the API, and the superb support. I was sold on a light-weight, decentralized messaging library. The only catch? I was not enjoying a return to C/C++ code. Fortunately, I had enough use-cases, with sufficient latency/throughout tolerance, that a managed wrapper was feasible. In fact, there was even a nascent version of a (now defunct) C# wrapper for libzmq (the native C library which serves at the canonical implementation of the ZMT protocol). However, the wrapper was buggy. And it had a fairly clumsy, poorly documented API. And was not receiving much support. So, rather than run back to C/C++, I decided to build my own wrapper. And because I was so taken by F# (at that point, I’d been using it for 4 years — 2 casually and then 2 professionally), I decided to put its P/Invoke support to the test. My initial results were solid. So, I put the work out on GitHub. Thus, fs-zmq (note the hyphen — now a long-forgotten vestige) came into existence, born out of equal parts need and curiosity.

Fast forward seven years. I think the library has held up fairly well. Certainly, it’s seen regular production use in a few different companies, both in and out of the financial sector. It’s also moved from my personal project to being part of the ZeroMQ organization. And it has received love and attention from nearly a dozen contributors over the years.

fszmq Milestones
Major events in the history of fszmq

The Good Parts of fszmq

But I want to take this opportunity to focus on the API. Growing fszmq has taught me a lot about the craft of library-as-interface. It has been especially instructive regarding the mixing of languages, both in terms of working with C from F# and in terms of providing a friendly experience to C# (or VB) consumers. As I’ve written about the P/Invoke “gotchas” elsewhere, I’d now like to talk about the aspects of the managed API which I think were rather successful.
It terms of a model, fszmq is primarily three types: Context, Socket, and Message. Each of these loosely follows the same pattern. A core type (Socket, et cetera) is primarily concerned with managing native memory safely and efficiently. I say “primarily” because the Context type is a little bit… well, special. In addition to managing its own native memory, a Context instance also tracks Socket instances so it may participate in program termination. This is unfortunately needed to get clean shutdown semantics (and is a long-standing pain point for wrappers of libzmq). Socket and Message instances, however, truly are only focused on allocating and releasing their own native memory. Meanwhile, the operations which define the meat of fszmq are all stateless functions which take an instance of a core type, e.g. Message (which is treated as an opaque token). This separation of data from behavior leads to a few interesting consequences. It aligns very closely to the underlying C API, reducing cognitive load. It also means adding individual behaviors is more isolated (and thus, testable). It also helps to increase the composability of certain workflows. If you think this sounds vaguely like the “abstract data type” pattern, you’re not wrong. The overall design is inspired by it. Although, it may be argued that fszmq deviates in some ways. At any rate, I’ve previously written a more detailed review (in case you’re interested).

There are two other aspects of the API which I feel are worth noting: cross-language support and documentation. There is an explicit effort throughout the code base to present a friendly and usable interface to non-F# consumers. I’ve written more generally about F# API design elsewhere. But, for fszmq this means several things:

  • Operational functions often serve “double duty” as extension methods, so as to appear like instance methods in C#.
  • In some places, additional overloads are given to convert between F#-centric types and more common BCL types.
  • CompiledNameAttribute is applied — judiciously — to help ensure discoverability when callers navigate the API.
  • An assembly-level ExtensionAttribute is emitted so that VB will properly detect extension methods, offering consumers in that language another possible means of expression.

Finally, I’ve always been impressed with the amount of genuinely useful documentation produced for libzmq. And I’m pleased that fszmq has tried to achieve something similar. Originally, this meant contributing to the ZGuide, an in-depth multi-language tour of ZeroMQ. However, in recent years, it has come to mean having comprehensive API documentation and instructional examples hosted along-side the actual fszmq repository.

Recent statistics on fszmq

The Not-So-Good Parts of fszmq

Of course, all libraries have warts, and fszmq is no exception. Seven years on, the API is showing its age in a few places. Additionally, there are a few things I’d do rather differently (if only I knew then what I know now). What follows is a partial airing of grievances, in rough order of how badly things irk me. Hopefully, not all of them have calcified to the point of permanence (but most of them have).

I should’ve (somehow) automated the code for getting and setting options on the core types — especially on Socket instances. It’s not particularly complicated code. But it’s tedious. And requires tweaking every time libzmq adds or deprecates an option (which is nearly every release). Ideally, I’d handle it with a type provider. But barring that (since they didn’t exist when I first started fszmq), I should’ve made code generation part of the build process (T4, GSL, et cetera).

Errors are a normal part of software development. But fszmq could handle them a lot better. There are several places where monadic error handling (i.e. railway-oriented programming) would enable much simpler and more composable workflows. There are also some places where errors really ought to be flat out swallowed. In these cases, it’s entire possible to anticipate and adapt. Finally, even in the places where raising an exception is the best course of action, fszmq could do better. Most times the underlying (native) error is just “passed through”. Encoding these errors into meaningful sub-classes of System.Exception (with any relevant meta-data) would provide a much better experience for consumers.

The polling API is too cumbersome. Polling is an essential technique in real-world applications. Yet the fszmq API — despite being reactive — feels very clunky. It requires complex mutable state management and closures in all but the simplest of scenarios. Now, admittedly, at least receiving scenarios can use tryPollInput (or TryGetInput, depending on the programming language). But even there you have a bit of overhead and no way to tune Message instance allocations.

Finally, how many custom operators is too many? In the case of fszmq — all of them! There’s no need for any of them. They just confuse things and add more surface area to maintain.


The Best Part of fszmq

Over the years I’ve come to appreciate a core strength of ZeroMQ is its tremendous community. And that certainly includes the following contributors who each had a hand in shaping fszmq. They forever have my respect and gratitude. Thanks!


Looking beyond 2017

So what’s next for fszmq? Well, there’s the usual tinkering (bug patching, keeping up with changes to libzmq, et cetera). But I’ve also begun producing content for an on-line video-based training course, which aims to get C# developers quickly building distributed solutions with ZeroMQ. There are also definite plans to further extend the documentation based on some community feedback. And I’ve begun work on adding a limited amount of integration between fszmq and F#’s async workflows. Also, some early-stage (mostly experimental) work was recently begun to make fszmq run on .NET Core. But really, I’m hoping to grow the contributors to fszmq. So why not check out the issue tracker and hack something up? Or at the very least, leave a suggestion for anything you think needs to be added or addressed in the library. I want to here from you!

Anyway, thanks for reading. And here’s to another seven years!


In Memoriam: Pieter Hintjens (3 December 1962 – 4 October 2016)

Nothing I’ve done with fszmq would’ve been possible without the vision and encouragement of Pieter. I’m honored to have collaborated (however briefly) with him. And no one I’ve met in my 18-year career has had a greater impact on my work — or my world view. I consider him a teacher. Hopefully, I have been and continue to be a worthy student.


 

Advertisements
Random, Source Code

Tips & Tricks to Improve Your F# Library’s Public API

This post is part of the 2016 F# Advent Calendar.

With the addition of F# in 2010, the .NET run-time gained a terrific, production-grade, functional-first programming language. This post is aimed at anyone who uses that terrific language to develop .NET libraries.

Because, you see, there’s a bit of a problem we need to discuss.

One of the popularized strengths of F# is its ability to both consume and be consumed by other .NET languages. And yet, if not done carefully, consuming F# code from, for instance, Visual Basic can be an absolute nightmare. Fortunately, over the years, I’ve settled upon a few different techniques that can mitigate, and in some cases even obliterate, any unpleasantness. Of course, this article assumes you, as a library author, actually want other developers to have a pleasant experience consuming your work (regardless of the .NET language they employ). If that’s not how you feel, no worries — but you can stop reading now. The rest of this post really won’t appeal to you.

What follows are 12 guidelines, listed in descending order of priority, which each address one, or more, potentially awkward points of integration between F# and other .NET languages. It’s important to note: this advice is intended for the public API of your libraries. By all means, do whatever awesome, clever things you’d like internally. This list is just to help give it a pretty face.

1. Do limit the use of advanced generic constraints

F# supports a wider range of possible generic type constraints than either C# or VB. Not matter how useful, or cool, a constraint might seem (sorry, SRTP fans), it’s meaningless if a consumer can’t possibly comply with it. To that end, public APIs should only leverage the following types of constraints:

Constraint Syntax
Subtype type-parameter :> type
Constructor type-parameter : ( new : unit -> 'a )
Value Type type-parameter : struct
Reference Type type-parameter : not struct

2. Do expose TPL rather than Async workflows

Asynchronous programming in F#, via Async workflows, is a simply unintelligible mess in other languages. Fortunately, it’s quite easy to integrate System.Threading.Tasks.Task into a workflow. Tasks received as input can be sequenced via Async.AwaitTask:

let runAll tasks = 
  let rec loop agg work = async {
    match work with 
    | [  ] -> return List.rev agg
    | h::t -> let! v = Async.AwaitTask h
              let  r = v * 2
              return! loop (r::agg) t } 
  tasks |> loop []

Meanwhile, if you need to return an async workflow to a non-F# caller, you can leverage Async.StartAsTask:

let getUserPrefs conn uid =  
  async { use db = new DbUsers() 
          let! prefs = db.ExecuteAsync(conn,uid) 
          return marshal prefs }
  |> Async.StartAsTask

3. Prefer BCL collection types

F# ships with a small number of persistent functional collections. These are the bread and butter of functional programming. But they’re cumbersome and confusing in other languages. So, for input parameters and return values, consider converting to or from common collection types. For example, when working with List or Set:

(* NOT RECOMMENDED *)
let transform (value :int list) =
  // do stuff with values...

(* DO THIS INSTEAD *)
let transform (values :int seq) =
  let values' = Seq.toList values
  // do stuff with values' ...

Similarly, when working with Map:

(* NOT RECOMMENDED *)
let transform (table :Map<string,int>) =
  // do stuff with table ...

(* DO THIS INSTEAD *)
let transform (table :IDictionary<string,int>) =
  let table' = 
    table 
    |> Seq.map (function KeyValue (k,v) -> (k,v)) 
    |> Map
  // do stuff with table' ...

Note, in the previous samples, type signatures are for demonstration purposes only. Also, note that similar conversions (e.g. Set.toSeq, et cetera) can, and should, be used for return values.

4. Do provide conversions from standard delegates to F# functions

Owning to a number of very good technical and historical reasons, F# uses a different first-class function mechanism than other languages. And, while the F# compiler makes it pretty easy to turn (fun x -> x * 2) into a Func, the inverse is not so easy. So, it becomes important to provide some means of supporting the standard BCL delegates Func and Action (which is what C# and VB use for their first-class functions). This can take several different formats. For instance, we can give the caller the ability to handle converting from a common delegate to an F# function. If we define a utility like:

[<Qualified>]
type Fun =
  static member Of(act :Action<'T>) = (fun t -> act.Invoke t)

Then a VB consumer might use:

Dim opt = BizLogic.ImportantCalc(42)
If Option.IsSome(opt) Then
  Option.Iterate(Fun.Of(PrintOption), opt)

However, often I will provide an extension method which handles the conversion internally, saving the consumer a bit of work. For example, a method like:

  [<Extension>]
  static member IfSome(option, act :Action<'T>) =
    option |> Option.iter (Fun.Of withSome)

Would turn the previous consumer example into something a bit simpler:

Dim opt = BizLogic.ImportantCalc(42)
opt.IfSome(PrintOption)

5. Do emulate matching with higher-order functions

While C# and VB do not support the rich pattern matching enjoyed in F#, we can still leverage higher-order functions to approximate an expression-oriented API. This technique is especially effective with discriminated unions, as seen here:

[<Extension>]
static member Match(option, withSome :Func<'T,'R>, withNone :Func<'R>) =
  match option with
  | Some value  -> (Fun.Of withSome) value
  | None        -> (Fun.Of withNone) ()

Given the above definition, consuming C# code might look like this:

return ValidationSvc.Validate(input).Match(
  withSome: v  => v.ToString(),
  withNone: () => "<NOT SET>"
);

6. Prefer overloaded methods to optional parameters

One of the recurring themes of this post is: F# does things differently. And optional parameters are no exception. In F# they are based on the Option type. However, in other languages they are handled via dedicated syntax. Due to this mismatch, F# “optional” arguments are, in fact, required in a call from C# or VB. In fairness, you can achieve the non-F# behavior in F# by careful use of attributes. However, I generally find it easier to explicitly define overloads, which vary in the number of parameters, and thus give callers the effect of having optional arguments.

ASIDE: a more in-depth look at this topic, by Mauricio Scheffer, may be found here.

7. Do remember to properly export extension methods

I can’t stress this one enough. In order to properly comply with published specifications — and to support Visual Basic.NET even recognizing an extension method — F# code should decorate each method being provided with [<Extension>]. Additionally, any class which contains any such decorated methods should also be decorated with [<Extension>]. Finally, somewhere — anywhere — in a library which provides extension methods, you need to add an assembly-level attribute of the form [<Extension>].

ASIDE: for a more detailed explanation of this mechanism please see this blog post, by Lincoln Atkinson.

8. Do use classes to provide extension methods

F# actually offers two different ways to define extension methods. The first approach is to decorate a module and one, or more, functions inside that module.

[<Extension>]
module Socket =
  [<Extension>]
  let Send (socket :Socket) (frame :byte[]) =
    // ...

  [<Extension>]
  let SendAll (socket :Socket) (frame :byte[][]) =
    // ...

But, as an alternative, you can decorate a class and one, or more, static methods defined on that class. This second approach, besides more closely mirroring the consumer’s mental model, offers a slight advantage: it greatly simplifies the process of providing method overloads. You simply list the separate methods, and implement them as normal.

[<Extension>]
type Socket =
  [<Extension>]
  static member Send(socket :Socket, frame :byte[]) =
    // ...

  [<Extension>]
  static member Send(socket :Socket, frames :byte[][]) =
    // ...

With the first approach, because each function needs a unique name within the module, you must leverage the CompiledNameAttribute to “fake out” method overloads (note: see the next tip for more details).

9. Do make judicious use of CompiledNameAttribute

The CompiledNameAttribute, like much of F#, is a double-edged sword. It can be used to great effect, especially when trying to support C# or VB. But it can also lead to ruin, increasing complexity and confusion for no real benefit. Use with caution. The concern, you see, is that by using this attribute, you cause a construct to have two different names. One will be used by/visible to F#. While the other will be used by/visible to other languages and reflective meta-programming. However, this is sometimes exactly what’s needed. For example, while it often makes sense to collect all of your “language interop” extension methods into a single, dedicated class. For very simple functions, requiring no additional manipulation, it may make sense to avoid the extra layer of indirection. For example, this code:

[<Extension; CompiledName "Send">]
let sendAll (socket :Socket) (msg :byte[][]) =
  // ...

Would be consumed from F# as:

msg |> Socket.sendAll pub

And equally from VB as:

pub.Send(msg)

Another time where CompiledNameAttribute can be helpful in sorting out naming conflicts is when types and modules need to have similar names:

type Result<'T,'Error>

[<Qualified; Module(Suffix)>]
module Result =
  // ...

[<Extension; CompiledName "Result")>]
type ResultExtensions =
  // ...

As this example demonstrates, we can partition the functionality around an abstract data type. We can put all the F#-facing components into a module. Then provide the C#-facing equivalents in a class for of static methods. Adding CompiledName to the mix ensures a clean, per-language experince.

In F#:

// invoking a function on the Result module
Result.tryCatch (fun () -> getUserFromDb conn) 

And in C#:

// invoking a static method on the ResultExtensions class
Result.TryCatch(() => { return getUserFromDb(conn); });

10. Prefer records or unions over tuples or primitives

Folks might shy away from exposing F#-centric types like records or discriminated unions to other languages. However, especially if heading the previous guidelines, there’s no reason not to share these powerful constructs with C# and VB. In particular, using types like Option (rather than “null checking” or using a Nullable) can greatly improve the overall robustness of an API. Consider this example:

let postRequest body :bool =
  // ...

It can only tell a consumer “Yay!” or “Nay!”. Meanwhile, this example gives calls a much richer experience:

let postRequest body :Result<bool,exn> =
  // ...

ASIDE: for more details about this approach to error handling, please see the excellent work presented here, by Scott Wlaschin.

11. Do expose a LInQ provider (if appropriate)

While not appropriate for all domains, exposing F# functionality via Language-Integrated Query (hereafter, LInQ) can be an excellent bit of “sugar” for C# and VB. Consider this example (leveraging techniques discussed earlier), which tries to combine multiple Result instances:

var vsn = major.Match(
    m => minor.Match(
        n => revision.Match(
            r => string.Format($"{m}.{n}.{r}"),
            e => e.Message),
        e => e.Message),
    e => e.Message);

WriteLine(vsn);

Now look at the improvements LInQ offers:

var vsn = from m in major
          from n in minor
          from r in revision
          select string.Format($"{m}.{n}.{r}");

vsn.IfOK(WriteLine); 

It’s true that providing LInQ support will mean defining 3, or more, extension methods against a type. But it’s usually worth the effort.

12. Do NOT rely on type aliases

Type aliases can dramatically improve the intensionality of your F# code. Unfortunately, they are fully erased at compile time (i.e. stored in special metadata only read by other F# libraries). So a C# or VB consumer won’t see your well-intended, self-documenting type alias… just the raw type underneath.

Community, Presentations

The Month NYC Ran Out of Excuses Not to Learn F#

It’s Official! I’m totally declaring September 2013 F# Month (in New York City, at least). I mean, here we are — not seven days into the month yet — and I’ve got a veritable cornucopia (note the reference to impending autumn) of events for you to attend. For starters, on Tuesday 10 September, I’ll be in Parsippany, NJ, at the Northern New Jersey .NET Users’ Group, for an introduction to F#. Then, a few days later, on Saturday 14 September, it’s back to mid-town for Code Camp NYC 2013. This day-long event features several F# talks (one of which will be given by yours truly). Pushing into the following week, on Tuesday 17 September, the inimitable Rachel Reese will present to the NYC F# Users’ Group on actor-based concurrency in F# (also in mid-town). Next up: Wednesday 18 September and Thursday 19 September see the return of Skill Matter’s Progressive F# Tutorials NYC, in Brooklyn’s lovely DUMBO neighborhood. This intense two-day hands-on learning-fest is more than a mere conference. It’s chok-full of in-depth education, taught by many of the foremost minds in the F# community (minds like Don Syme, Tomas Petricek, Phil Trelford, and Rick Minerich … to name but a few). Then, as an added bonus, as if this wasn’t enough, Vermont Code Camp 5 happens, on Saturday 21 September, in beautiful Burlington, VT. It’s not too far from New York, and features a couple of F# talks (including — you guessed it — one of mine). So, that’s five events, spread over two weeks, offering more F#-y goodness than you thought possible. Many of these events are free. Most still have tickets available. All of them will be awesome. So, if you’ve had even a passing interest in F#, you’ve just run out of excuses. See you there!

(Note: Did I miss something? Leave it in the comments and I’ll update this post as necessary. Thanks!)

Source Code

To the A-maze-ment of All: Generating Mazes in F#

Little more than a week ago, I had the great pleasure of attending the inaugural Lambda Jam conference in Chicago, Illinois. It was a terrific experience, and highly recommended for anyone interested in functional programming. One of the more interesting aspects of the conference was the daily “Jam Sessions”. Part hack-a-thon, part coding contest, each jam saw groups of participants trying to solve some interesting challenges in a variety of functionally-oriented languages. And, it just so happens, I was a “Jam Mentor”, supervising and facilitating a jam centered around mazes. So, now that it’s all said and done, I’d like to present my take on the problem of maze generation (in F#, of course).


The problem statement (as given to jam participants):

The goal is to produce a two-dimensional rectangular maze of arbitrary size. The actual dimensions of the maze should be accepted parametrically (ex: inputs to a function, command-line arguments, et cetera). The maze itself should be output… as an array of arrays of integers, where each inner array represents one row of the maze and the number in each cell is a bitmask indicating the number of passages leading out of the cell (Note: the bitmask pattern is: North = 1, South = 2, East = 4, West = 8).

So, it seems the very first thing we’ll need (since I tend to be a “visual” thinker) is a way to turn the maze description into something graphical. Since the format is an array of arrays, it’s very convenient to treat the maze as a series of cells. The greatly reduced issue, then, is: how to draw a single cell? Well, given a value for the width and height (assuming square cells, cellSize in the actual function), we only need an origin point (x and y) and the actual value of the cell (cell). These numbers, together with a little GDI+, gives us this:

let drawCell (x,y) cell =
  let x1,y1 = cellSize * x,cellSize * y
  let x2,y2 = x1 + cellSize,y1 + cellSize

  if cell |> hasWall N then graphics.DrawLine(Pens.Black,x1,y1,x2,y1)
  if cell |> hasWall S then graphics.DrawLine(Pens.Black,x1,y2,x2,y2)
  if cell |> hasWall E then graphics.DrawLine(Pens.Black,x2,y1,x2,y2)
  if cell |> hasWall W then graphics.DrawLine(Pens.Black,x1,y1,x1,y2)

We first calculate the top-left and bottom-right corners of the cell (x1,y1 and x2,y2, respectively). Then, if the cell has a wall on a particular side, we use the calculated points to draw a line representing said wall. This will let us take data like:

var maze = [
  [ 2, 14, 10, 14,  8 ],
  [ 5,  9, 11, 13, 11 ],
  [ 3, 15,  9, 15,  9 ],
  [ 7, 15, 13, 15, 11 ],
  [ 1, 13,  9,  9,  9 ]
]

and turn it into:

A Simple Maze
A Simple Maze

Great. So now that we can visualize, it’s time for some random maze generation.

At Lambda Jam, participants were given two algorithms to try (both being well-documented on the web). One, called Growing Tree, is stack-based and “carves” passages through the maze. Again, quoting from the materials given to jam participants:

Growing Tree

This passage-carving algorithm maintains a list of “carved” cells as it traverses the maze. The general approach is:

1. choose a cell from this list
2. select one of its uncarved neighbors
3. carve that neighbor
4. add the newly carved cell to the list
5. repeat from step 1

If a cell has no uncarved neighbors (at step 2), it is removed from the list and another cell is chosen (return to step 1). The maze is finished when there are no cells left in the list. To start the process, simply pick a cell at random from the (initial) block of uncarved cells (i.e. the maze). This algorithm can simulate several other algorithms and/or produce very interesting results by varying how the next cell is chosen from the list (in step 1). Some strategies include: most-recently added, oldest, or random. One can even blend strategies.

The other algorithm, meanwhile, “builds” walls by a process of (potentially infinite) sub-division. Hence, its name: Recursive Division. More formally:

Recursive Division

This wall-building algorithm starts with the maze as an empty, but bounded, area. It then proceeds as follows:

1. Pick a location at random to build a wall running either the height (vertically) or width (horizontally) of the bounded area (maze)
2. Place an opening randomly within the new wall
3. Recursively repeat from step 1 on each of the two subdivisions created by building the wall (e.g. step 1, but now the bounded area is a subset of the overall maze)
4. Halt the process when an (arbitrary) number of subdivisions has been reached

The orientation of the wall should be biased towards the proportions of a given (subdivision) area. In other words, an area where the height is twice the width should be divided horizontally more often than vertically (and vice-versa for inverse ratios). Please note, this algorithm “builds walls”, but the output format should describe passages. Therefore, an extra conversion step may be warranted.

At Lambda Jam, the first algorithm was chosen unanimously. So, while I’ll include my version of it in the full source for this post, I’d instead like to focus on the neglected Recursive Division approach to maze generation. Rather than take a “purely functional” approach, I chose to exploit F#’s multiparadigmatic nature. So, first step is to initialize an empty rectangular array.

let grid = Array.init height (fun _ -> Array.zeroCreate width)

The meat of the code though is a function which, given a starting point (x and y), a set of bounds (width and height), and an orientation (horizontally or vertically, via orientation),

let rec divide (grid:int[][]) x y width height orientation =

will “build” a wall by setting values in a slice of the aforementioned rectangular array (from x,y to wx,wy). Note, however, that a passage is left into a random location (px,py) during the building of the wall.

let mutable wx,wy = x + (if isHorizontal then 0 else rand.Next (width - 2))
                   ,y + (if isHorizontal then rand.Next (height - 2) else 0)
    
let px,py = wx + (if isHorizontal then rand.Next width else 0)
           ,wy + (if isHorizontal then 0 else rand.Next height)

let dx,dy = if isHorizontal then (1,0) else (0,1)

let length,dir = if isHorizontal then (width ,S) 
                                 else (height,E)

for _ in 1 .. length do
  if wx <> px || wy <> py then
    grid.[wy].[wx] <- grid.[wy].[wx] ||| dir
  wx <- wx + dx
  wy <- wy + dy

It then calls itself on each of the two new smaller sub-divisions created by building the wall.

let mutable nx,ny = x,y
let mutable w,h = 
  if isHorizontal then (width,wy-y+1) else (wx-x+1,height)
divide grid nx ny w h (chooseOrientation w h)

nx <- if isHorizontal then x      else wx + 1
ny <- if isHorizontal then wy + 1 else y
w  <- if isHorizontal then width               else x + width - wx - 1
h  <- if isHorizontal then y + height - wy - 1 else height
divide grid nx ny w h (chooseOrientation w h)

It could continue this recursive process indefinitely. However, the line

 
if width >= 2 && height >= 2 then

places a limit on the depth of the recursion. We don’t sub-divisions less than two cell high by two cells wide (a sensible-but-arbitrary limit). It’s also worth noting the call to chooseOrientation. This simple function balances the ratio of horizontal and vertical walls. It prefers horizontal walls for a region whose height is greater than it’s width and vertical walls when the relationship between width and height is inverted. If it can’t determine a relationship between the dimensions of a given area, it picks an orientation at random.

let chooseOrientation width height =
    if    width < height  then HORIZONTAL
    elif  width > height  then VERTICAL
    elif  rand.Next 2 = 0 then HORIZONTAL
    else                       VERTICAL

So, all together, the function buildMaze_RecursiveDivision looks like

let buildMaze_RecursiveDivision width height =
  let grid = Array.init height (fun _ -> Array.zeroCreate width)

  let HORIZONTAL,VERTICAL = 1,2
  
  let chooseOrientation width height =
    if    width < height  then HORIZONTAL
    elif  width > height  then VERTICAL
    elif  rand.Next 2 = 0 then HORIZONTAL
    else                       VERTICAL

  let rec divide (grid:int[][]) x y width height orientation =
    if width >= 2 && height >= 2 then
      let isHorizontal = (orientation = HORIZONTAL)

      let mutable wx,wy =
         x + (if isHorizontal then 0 else rand.Next (width - 2))
        ,y + (if isHorizontal then rand.Next (height - 2) else 0)
    
      let px,py = wx + (if isHorizontal then rand.Next width else 0)
                 ,wy + (if isHorizontal then 0 else rand.Next height)

      let dx,dy = if isHorizontal then (1,0) else (0,1)

      let length,dir = if isHorizontal then (width ,S) 
                                       else (height,E)

      for _ in 1 .. length do
        if wx <> px || wy <> py then
          grid.[wy].[wx] <- grid.[wy].[wx] ||| dir
        wx <- wx + dx
        wy <- wy + dy

      let mutable nx,ny = x,y
      let mutable w,h = 
        if isHorizontal then (width,wy-y+1) else (wx-x+1,height)
      divide grid nx ny w h (chooseOrientation w h)

      nx <- if isHorizontal then x      else wx + 1
      ny <- if isHorizontal then wy + 1 else y
      w  <- if isHorizontal then width               else x + width - wx - 1
      h  <- if isHorizontal then y + height - wy - 1 else height
      divide grid nx ny w h (chooseOrientation w h)
  divide grid 0 0 width height (chooseOrientation width height)
  grid

At this point (and this blog entry is getting rather lengthy), we’ve mixed the functional and procedural qualities of F# to produce a reasonable implementation of a clever algorithm. But we’re not quite done yet. The maze definition generated by this function describes the location of walls. However, our visualization function expects to be told where the passages are — total mismatch! So, one last little routine is needed.

let translate (x,y) cell =
  let mutable cell' = cell
  if y = 0          then cell' <- cell' ||| N
  if y+1 >= height  then cell' <- cell' ||| S
  if x = 0          then cell' <- cell' ||| W
  if x+1 >= width   then cell' <- cell' ||| E
  All - cell'

This function simply examines the walls in a cell, adding wall values (via bit-wise OR’ing) for being at the “edges” of the overall maze. It then determines the corresponding passages by subtracting the (possibly adjusted) cell value from the total possible number of passages (again via a bitwise OR’ing of values: All = 15 = 1 OR 2 OR 4 OR 8). So, in the end, passages exist where the walls aren’t. We can call this on each cell, in turn, to convert an entire maze

built |> Array.mapi (fun y row  ->
  row |> Array.mapi (fun x cell -> translate (x,y) cell))

from a “built” defintion to a “carved” definition.

Now, we can put it all together. Calling

let mazeRD = buildMaze_RecursiveDivision 10 10
(builtToCarved >> viewMaze 30) mazeRD

produces the final output. I’ve found the whole subject of maze generation to be fascinating. Hopefully, if you’ve read this far, you do as well. As always, thanks for reading. I look forward to your comments.

And happy coding.


The complete source for this article.

open System
open System.Collections.Specialized
open System.Drawing
open System.Drawing.Imaging
open System.IO
open System.Net
open System.Text
open System.Windows.Forms

(* directions -- passages or walls *)
let [<Literal>] Nil =  0
let [<Literal>] N   =  1 
let [<Literal>] S   =  2 
let [<Literal>] E   =  4 
let [<Literal>] W   =  8 
let [<Literal>] All = 15

type date = System.DateTime
let  rand = System.Random(date.Now.Millisecond)

(* maze generation utilities *)
let builtToCarved built = 
  let height = Array.length built
  let width  = ((Array.map Array.length) >> Array.max) built
  let translate (x,y) cell =
    let mutable cell' = cell
    if y = 0          then cell' <- cell' ||| N
    if y+1 >= height  then cell' <- cell' ||| S
    if x = 0          then cell' <- cell' ||| W
    if x+1 >= width   then cell' <- cell' ||| E
    All - cell'
  built |> Array.mapi (fun y row  ->
    row |> Array.mapi (fun x cell -> translate (x,y) cell))

let render cellSize maze =
  
  let hasWall direction cell = 
    //NOTE: direction indicates a passage _out_ of the cell.
    //      thus, not having a passage equals having a wall.
    cell &&& direction = 0
  
  let mazeHeight  = Array.length maze 
  let mazeWidth   = ((Array.map Array.length) >> Array.max) maze
  let visWidth
      ,visHeight   = cellSize * mazeWidth
                    ,cellSize * mazeHeight
  let mazeImage   = new Bitmap(visWidth,visHeight)
  use borderPen   = new Pen(Brushes.Black,2.0f)
  let borderRect  = Rectangle(1,1,visWidth - 2,visHeight - 2)
  use graphics    = Graphics.FromImage mazeImage
    
  let drawCell (x,y) cell =
    let x1,y1 = cellSize * x,cellSize * y
    let x2,y2 = x1 + cellSize,y1 + cellSize
      
    if cell |> hasWall N then graphics.DrawLine(Pens.Black,x1,y1,x2,y1)
    if cell |> hasWall S then graphics.DrawLine(Pens.Black,x1,y2,x2,y2)
    if cell |> hasWall E then graphics.DrawLine(Pens.Black,x2,y1,x2,y2)
    if cell |> hasWall W then graphics.DrawLine(Pens.Black,x1,y1,x1,y2)

  // draw outer "bounds" of maze (with offset to prevent clipping)
  graphics.DrawRectangle(borderPen,borderRect) 
    
  // loop through rows and cells, drawing the walls of each cell
  maze |> Array.iteri (fun y row  ->
    row |> Array.iteri (fun x cell -> drawCell (x,y) cell))

  mazeImage

let viewMaze cellSize maze =  
  use visual  = render cellSize maze
  use viewer  = new Form(Text                  = "Maze Jam: Maze Viewer"
                        ,Width                 = visual.Width
                        ,Height                = visual.Height
                        ,BackColor             = Color.White
                        ,BackgroundImage       = visual
                        ,BackgroundImageLayout = ImageLayout.Center)
  
  viewer.ShowDialog() |> ignore

(* Growing Tree *)
let buildMaze_GrowingTree nextIndex height width =
  let maze = Array.init height (fun _ -> Array.zeroCreate width)
  
  let DX      = dict [(E,+1);(W,-1);(N, 0);(S, 0);]
  let DY      = dict [(E, 0);(W, 0);(N,-1);(S,+1);]
  let INVERSE = dict [(E,W)
                      (W,E)
                      (N,S)
                      (S,N)]
    
  let inMaze (x,y) = x >= 0 && y >= 0 && x < width && y < height
  let unseen (x,y) = inMaze (x,y) && maze.[y].[x] = Nil
  
  let cells = ResizeArray()
  cells.Add(rand.Next width,rand.Next height)

  let rec loop () =
    match cells.Count with
    | n when n > 0 ->
        let index = nextIndex n
        let cx,cy = cells.[index]
        let direc = [ N; S; E; W; ]
                    |> List.sortBy (fun _ -> rand.Next 4)
                    |> List.tryFind(fun d -> let nx = cx + DX.[d]
                                             let ny = cy + DY.[d]
                                             unseen (nx,ny))
        match direc with
        | Some dir  ->  let nx,ny = cx + DX.[dir], cy + DY.[dir]
                        maze.[cy].[cx] <- maze.[cy].[cx] ||| dir
                        maze.[ny].[nx] <- maze.[ny].[nx] ||| INVERSE.[dir]
                        cells.Add (nx,ny)
        | None      ->  cells.RemoveAt index
        loop ()
    | _ ->  // no more unvisited cells, maze is built
            maze
  loop () 

(* Recursive Division *)
let buildMaze_RecursiveDivision width height =
  let grid = Array.init height (fun _ -> Array.zeroCreate width)

  let HORIZONTAL,VERTICAL = 1,2
  
  let chooseOrientation width height =
    if    width < height  then HORIZONTAL
    elif  width > height  then VERTICAL
    elif  rand.Next 2 = 0 then HORIZONTAL
    else                       VERTICAL

  let rec divide (grid:int[][]) x y width height orientation =
    if width >= 2 && height >= 2 then
      let isHorizontal = (orientation = HORIZONTAL)

      let mutable wx,wy =
         x + (if isHorizontal then 0 else rand.Next (width - 2))
        ,y + (if isHorizontal then rand.Next (height - 2) else 0)
    
      let px,py = wx + (if isHorizontal then rand.Next width else 0)
                 ,wy + (if isHorizontal then 0 else rand.Next height)

      let dx,dy = if isHorizontal then (1,0) else (0,1)

      let length,dir = if isHorizontal then (width ,S) 
                                       else (height,E)

      for _ in 1 .. length do
        if wx <> px || wy <> py then
          grid.[wy].[wx] <- grid.[wy].[wx] ||| dir
        wx <- wx + dx
        wy <- wy + dy

      let mutable nx,ny = x,y
      let mutable w,h = 
        if isHorizontal then (width,wy-y+1) else (wx-x+1,height)
      divide grid nx ny w h (chooseOrientation w h)

      nx <- if isHorizontal then x      else wx + 1
      ny <- if isHorizontal then wy + 1 else y
      w  <- if isHorizontal then width               else x + width - wx - 1
      h  <- if isHorizontal then y + height - wy - 1 else height
      divide grid nx ny w h (chooseOrientation w h)
  divide grid 0 0 width height (chooseOrientation width height)
  grid

(* invocation *)
let mazeGT = buildMaze_GrowingTree (fun n -> n - 1) 10 10
viewMaze 30 mazeGT

let mazeRD = buildMaze_RecursiveDivision 10 10
(builtToCarved >> viewMaze 30) mazeRD
Community

Progressive F# Tutorials NYC 2013

So, it’s a new blog theme, and it’s my first post in nearly a year (hey?! I’ve been busy). And it’s really exciting!


I’m thrilled that the folks at SkillsMatter are bringing the Progressive F# Tutorials back to the Big Apple for 2013. This two-day training is the event to attend if you have any interest what-so-ever in F#. It’s not a conference. It’s two days of in-depth learning. In fact, last year, my company’s sister company sent one of their engineers, who had never worked with F# before, to attend the beginner track. He learned enough in two days that, upon returning to the office, he promptly re-wrote a complex piece of MatLab code into a smaller, faster, less-costly F# solution. What’s more, many of the best and brightest minds in the F# community will be on-hand (some guiding. some just soaking up the knowledge). So, if you’re anywhere near NYC on 18-19 September, you should attend. The official announcement follows. It contains more details, links, and a discount code. I’ll see you there!


On the back of the success of the 2013 edition, the Progressive F# Tutorials return to New York in September – this time packing an even bigger punch! With F# UG lead Rick Minerich at the helm, we’ve put together a expert filled line-up – featuring Don Syme (creator of F#), Tomas Petricek, and Miguel de Icaza. The Tutorials will be split in two – a beginners track for those eager to unleash F#’s full power, and a ‘meaty track’ for those more experience f#pers amongst you! Each session will be a 4 hour hands-on deep dive, brushing aside the traditional format of conferences to allow you to truly immerse into the subject topic.

Want to get involved? We’re giving a special community 20% discount!

Just go ahead and enter SkillsMatter_Community on the booking form and the team at Skills Matter will look forward to welcoming you to the Progressive F# Tutorials NYC this September!

Source Code

Managing Pointers, or F#’s Platform-Invoke “Gotcha”

I love how well F# “plays” with other languages. This is obviously true where its in-the-box .NET siblings are concerned. However, over the past few years, I’ve come to find it is just as seamless when mixed with good old-fashioned C code.

Well, it’s nearly as seamless.

For the most part, when using P/Invoke, one may simply copy-and-paste the signatures from a C header file (sans-semi-colons, of course). However, there is at least one scenario where naïvely doing so produces code which is not verifiably type-safe. Let’s look at a specific example. Given the follow function prototype in C:

__declspec(dllexport) void getVersion (int* major, int* minor, int* patch);

One might use the following P/Invoke signature (and associated call) in F#:

[<DllImport("somelib",CallingConvention=CallingConvention.Cdecl)>]
extern void getVersion (int* major, int* minor, int* patch)

// ...

let mutable major,minor,patch = 0,0,0
getVersion(&&major,&&minor,&&patch)
printfn "Version: %i.%i.%i" major minor patch

At this point, everything will compile and run just fine. So where’s the problem? To find out, we have to turn to an under-utilised tool in the .NET developer’s arsenal — PEVerify.



SIDEBAR — What the heck is PEVerify?

According to MSDN:

The PEVerify tool helps developers… to determine whether their MSIL code and associated metadata meet type safety requirements. Some compilers generate verifiably type-safe code only if you avoid using certain language constructs.

That’s great. And the reason we want “verifiably type-safe code” is?

Also, according to MSDN:

The common language runtime relies on the type-safe execution of application code to help enforce security and isolation mechanisms. Normally, code that is not verifiably type-safe cannot run, although you can set security policy to allow the execution of trusted but unverifiable code.

and elsewhere:

Type-safe code accesses only the memory locations it is authorized to access. (For this discussion, type safety specifically refers to memory type safety and should not be confused with type safety in a broader respect.) For example, type-safe code cannot read values from another object’s private fields. It accesses types only in well-defined, allowable ways.

So, clearly, “verifiably type-safe” is a distinction worth having. And PEVerify is a tool worth knowing.



And now back to our program (already in progress)…

Running the tool gives us the following output (reformatted for readability):

Microsoft (R) .NET Framework PE Verifier. Version 4.0.30319.1
 Copyright (c) Microsoft Corporation. All rights reserved.

[IL]: Error:
 [C:\dev\somelibfs.dll : .$Constants::.cctor]
 [mdToken=0x600008f][offset 0x0000000D]
 [found address of Int32][expected unmanaged pointer] 
 Unexpected type on the stack.

[IL]: Error:
 [C:\dev\somelibfs.dll : .$Constants::.cctor]
 [mdToken=0x600008f][offset 0x0000000D]
 [found address of Int32][expected unmanaged pointer]
 Unexpected type on the stack.

[IL]: Error:
 [C:\dev\somelibfs.dll : .$Constants::.cctor]
 [mdToken=0x600008f][offset 0x0000000D]
 [found address of Int32][expected unmanaged pointer]
 Unexpected type on the stack.

3 Error(s) Verifying C:\dev\somelibfs.dll

Clearly, something isn’t right.

It’s not very obvious, but the big clue is where PEVerify tells us it was expecting an unmanaged pointer. Turns out, when dealing with the CLR, there are two types of pointers: unmanaged and managed. The later are what you use when passing around CLR types by-reference (i.e. “byref<‘T>“ in F#, or “ref“ in C#, or “ByRef“ in VB). It also happens that you should use the managed variety if you want your F# code to be verifiably type-safe — and this includes P/Invoke calls. If you think about it, this makes sense. The runtime can only guarantee the bits it can control (i.e. the parts which are “managed”). So here’s what the F# code looks like using managed pointers instead:

[<DllImport("somelib",CallingConvention=CallingConvention.Cdecl)>]
extern void getVersion (int& major, int& minor, int& patch)

// ...

let mutable major,minor,patch = 0,0,0
getVersion(&major,&minor,&patch)
printfn "Version: %i.%i.%i" major minor patch

And, if we run PEVerify on the updated code, we get the following report:

Microsoft (R) .NET Framework PE Verifier.  Version  4.0.30319.1
Copyright (c) Microsoft Corporation.  All rights reserved.

All Classes and Methods in C:\dev\somelibfs.dll

That’s much better!

So, to recap, there are two types of pointers, as summarized in the following table:

Pointer F# Type Declaration Invocation
Unmanaged nativeint <type>* &&<type>
Managed byref <type> <type>& &type

In nearly all cases, a .NET developer should prefer the managed pointer. Leave the unmanaged risks with your C code.



I’d like to give special thanks to Jack Pappas, for finding (and helping me to understand and vanquish) this issue in fszmq.

Presentations

Double Trouble

Updated 23 Jan 2012

A nice recording of my presentation, along with other NYCFSUG talks, can be found here.


Updated 30th Sep 2011

Unfortunately, due to (highly disappointing) hurricane Irene, the New York City F# Users’ Group meeting has been rescheduled. Please check out this link for the details.


Shameless plug!!!

Come check out the next New York City F# Users’ Group meeting on Tuesday 30th August 2011 at 18:30 hours. It’s a double-header featuring two guys from Jersey — Steve Goguen and yours truly. Click the link below for all the pertenant details. It’s gonna be a blast!

Double Trouble with Paulmichael Blasucci and Steve Goguen

Shameless plug!!!