Community, Random

Scratching a 7-Year Itch

Today marks fszmq’s seventh birthday. Or maybe call it an anniversary? Very technically, the library’s a few weeks older than this. But seven years ago — on 2 May 2011 — I put the source code on GitHub.com, which officially introduced the project to the world. In observance of this milestone, I thought I’d talk a little about its origins, what I’ve learned over the years, and speculate a bit on what the future might hold for fszmq.


What, exactly, is fszmq?

fszmq is an MPLv2-licensed CLR (e.g. Mono, .NET) binding to the ZeroMQ distributed computing library. ZeroMQ, meanwhile, provides a wide range of building blocks for developing high-performance, message-passing systems.


Waaaaay Back When (in 2011)

To set the stage, let’s note that seven years ago I was in NYC, doing front-office development for a hedge fund. Ostensibly, I was building a medium-volume, gray-box trading system. But practically, I spent my time juggling loads of little tools needed for various business functions. And it had become painfully clear that we needed a better way of getting various services to coordinate. HTTP had too much overhead. WCF was too… well, WCF, with all the over-engineered pains one might expect. So we looked at various pieces of middle-ware. The big servers (MQSeries, TIBCO, et cetera) were well out of scope, due to time and budget constraints. The more approachable products (MSMQ, NServiceBus, et cetera) raised more than a few performance and maintenance concerns.

Then, one day, a colleague of mine mentioned that a buddy of his (at another hedge fund) raved about a new tool they were using for high-frequency trading systems: ZeroMQ. In need of a solution, I took ZeroMQ for a test drive. I was very impressed by the speed, the flexibility and relative simplicity of the API, and the superb support. I was sold on a light-weight, decentralized messaging library. The only catch? I was not enjoying a return to C/C++ code. Fortunately, I had enough use-cases, with sufficient latency/throughput tolerance, that a managed wrapper was feasible.

In fact, there was even a nascent version of a (now defunct) C# wrapper for libzmq (the native C library which serves as the canonical implementation of the ZMTP protocol). However, the wrapper was buggy. And it had a fairly clumsy, poorly documented API. And it was not receiving much support. So, rather than run back to C/C++, I decided to build my own wrapper. And because I was so taken by F# (at that point, I’d been using it for 4 years: 2 casually and then 2 professionally), I decided to put its P/Invoke support to the test. My initial results were solid. So, I put the work out on GitHub.
Thus, fs-zmq (note the hyphen — now a long-forgotten vestige) came into existence, born out of equal parts need and curiosity.

Fast forward seven years. I think the library has held up fairly well. Certainly, it’s seen regular production use in a few different companies, both in and out of the financial sector. It’s also moved from my personal project to being part of the ZeroMQ organization. And it has received love and attention from nearly a dozen contributors over the years.

fszmq Milestones
Major events in the history of fszmq

The Good Parts of fszmq

But I want to take this opportunity to focus on the API. Growing fszmq has taught me a lot about the craft of library-as-interface. It has been especially instructive regarding the mixing of languages, both in terms of working with C from F# and in terms of providing a friendly experience to C# (or VB) consumers. As I’ve written about the P/Invoke “gotchas” elsewhere, I’d now like to talk about the aspects of the managed API which I think were rather successful.
In terms of a model, fszmq is primarily three types: Context, Socket, and Message. Each of these loosely follows the same pattern. A core type (Socket, et cetera) is primarily concerned with managing native memory safely and efficiently. I say “primarily” because the Context type is a little bit… well, special. In addition to managing its own native memory, a Context instance also tracks Socket instances so it may participate in program termination. This is unfortunately needed to get clean shutdown semantics (and is a long-standing pain point for wrappers of libzmq). Socket and Message instances, however, truly are only focused on allocating and releasing their own native memory.

Meanwhile, the operations which define the meat of fszmq are all stateless functions which take an instance of a core type, e.g. Message (which is treated as an opaque token). This separation of data from behavior leads to a few interesting consequences. It aligns very closely to the underlying C API, reducing cognitive load. It also means adding individual behaviors is more isolated (and thus, testable). It also helps to increase the composability of certain workflows. If you think this sounds vaguely like the “abstract data type” pattern, you’re not wrong. The overall design is inspired by it. Although, it may be argued that fszmq deviates in some ways. At any rate, I’ve previously written a more detailed review (in case you’re interested).

There are two other aspects of the API which I feel are worth noting: cross-language support and documentation. There is an explicit effort throughout the code base to present a friendly and usable interface to non-F# consumers. I’ve written more generally about F# API design elsewhere. But, for fszmq this means several things:

  • Operational functions often serve “double duty” as extension methods, so as to appear like instance methods in C#.
  • In some places, additional overloads are given to convert between F#-centric types and more common BCL types.
  • CompiledNameAttribute is applied — judiciously — to help ensure discoverability when callers navigate the API.
  • An assembly-level ExtensionAttribute is emitted so that VB will properly detect extension methods, offering consumers in that language another possible means of expression.
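
To make the first and third points concrete, here is a minimal sketch. It is not taken from fszmq: the Channel type and its create and describe functions are made up purely for illustration, and the assembly-level attribute from the last point is left out to keep things short.

```fsharp
open System.Runtime.CompilerServices

// `Channel` is a hypothetical opaque type (not part of fszmq)
type Channel internal (address : string) =
  member internal __.Address = address

[<Extension>]
module Channel =

  // F# callers write `Channel.create`; other languages see `Create`
  [<CompiledName("Create")>]
  let create address = Channel address

  // serves "double duty": a plain function in F#, and (via the attributes)
  // a PascalCase extension method on Channel for C# or VB callers
  [<Extension;CompiledName("Describe")>]
  let describe (channel : Channel) (prefix : string) =
    sprintf "%s %s" prefix channel.Address

// F# usage is unchanged by the attributes
let summary = Channel.describe (Channel.create "tcp://localhost:5555") "connected to"
```

Under these assumptions, a C# caller should be able to write something like channel.Describe("connected to"), while F# code keeps the module-and-function style.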

Finally, I’ve always been impressed with the amount of genuinely useful documentation produced for libzmq. And I’m pleased that fszmq has tried to achieve something similar. Originally, this meant contributing to the ZGuide, an in-depth multi-language tour of ZeroMQ. However, in recent years, it has come to mean having comprehensive API documentation and instructional examples hosted along-side the actual fszmq repository.

Recent statistics on fszmq

The Not-So-Good Parts of fszmq

Of course, all libraries have warts, and fszmq is no exception. Seven years on, the API is showing its age in a few places. Additionally, there are a few things I’d do rather differently (if only I knew then what I know now). What follows is a partial airing of grievances, in rough order of how badly things irk me. Hopefully, not all of them have calcified to the point of permanence (but most of them have).

I should’ve (somehow) automated the code for getting and setting options on the core types — especially on Socket instances. It’s not particularly complicated code. But it’s tedious. And requires tweaking every time libzmq adds or deprecates an option (which is nearly every release). Ideally, I’d handle it with a type provider. But barring that (since they didn’t exist when I first started fszmq), I should’ve made code generation part of the build process (T4, GSL, et cetera).
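
As a rough idea of what build-time generation could look like, the following sketch turns a small table of options into F# source text. The three ids are libzmq's ZMQ_LINGER, ZMQ_SNDHWM, and ZMQ_RCVHWM constants. Everything else is an assumption: the friendly names are invented, and getOption and setOption are placeholders for whatever low-level helpers a binding exposes.

```fsharp
// each entry: F#-friendly name (invented), native option id, F# type of the value
let options =
  [ "linger"       , 17, "int"
    "sendHighWater", 23, "int"
    "recvHighWater", 24, "int" ]

// emits the source text of one getter and one setter per entry;
// `getOption` and `setOption` are placeholder names, not a real API
let generate (name : string, optionId : int, typ : string) =
  let pascal = name.Substring(0, 1).ToUpperInvariant() + name.Substring(1)
  [ sprintf "let %s (socket : Socket) : %s = getOption socket %d" name typ optionId
    sprintf "let set%s (socket : Socket) (value : %s) = setOption socket %d value" pascal typ optionId ]

// the full text, ready to be written to a file during the build
let source = options |> List.collect generate |> String.concat "\n"
```

With a table like this, adding or deprecating an option becomes a one-line change, which is the whole appeal.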

Errors are a normal part of software development. But fszmq could handle them a lot better. There are several places where monadic error handling (i.e. railway-oriented programming) would enable much simpler and more composable workflows. There are also some places where errors really ought to be flat out swallowed. In these cases, it’s entirely possible to anticipate and adapt. Finally, even in the places where raising an exception is the best course of action, fszmq could do better. Most times the underlying (native) error is just “passed through”. Encoding these errors into meaningful sub-classes of System.Exception (with any relevant meta-data) would provide a much better experience for consumers.
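
To sketch the direction I have in mind (and only a sketch), consider the following. None of these names exist in fszmq: ZmqError, ZmqException, and trySendResult are hypothetical, and fakeNativeSend stands in for a real native call so the example runs without libzmq.

```fsharp
open System

// hypothetical error cases (illustrative only)
type ZmqError =
  | WouldBlock            // retrying later may succeed
  | Terminated            // the owning context was shut down
  | Other of errno : int  // anything not modeled explicitly

// a meaningful exception sub-class which carries the native error number
type ZmqException (errno : int, message : string) =
  inherit Exception (message)
  member __.ErrorNumber = errno

// stand-in for a native call: returns -1 on failure, else a byte count
let fakeNativeSend (frame : byte[]) =
  if frame.Length = 0 then -1 else frame.Length

// failures become values rather than exceptions
let trySendResult (frame : byte[]) : Result<int,ZmqError> =
  match fakeNativeSend frame with
  | -1 -> Error WouldBlock  // a real binding would inspect the native error here
  | n  -> Ok n

// railway-style composition: the second send only runs if the first succeeded
let sendBoth first second =
  trySendResult first
  |> Result.bind (fun sent ->
      trySendResult second |> Result.map (fun more -> sent + more))
```

Callers who prefer exceptions could still convert an Error case into a ZmqException at the edge of their application, which keeps both styles available.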

The polling API is too cumbersome. Polling is an essential technique in real-world applications. Yet the fszmq API — despite being reactive — feels very clunky. It requires complex mutable state management and closures in all but the simplest of scenarios. Now, admittedly, at least receiving scenarios can use tryPollInput (or TryGetInput, depending on the programming language). But even there you have a bit of overhead and no way to tune Message instance allocations.

Finally, how many custom operators is too many? In the case of fszmq — all of them! There’s no need for any of them. They just confuse things and add more surface area to maintain.


The Best Part of fszmq

Over the years I’ve come to appreciate a core strength of ZeroMQ is its tremendous community. And that certainly includes the following contributors who each had a hand in shaping fszmq. They forever have my respect and gratitude. Thanks!


Looking beyond 2017

So what’s next for fszmq? Well, there’s the usual tinkering (bug patching, keeping up with changes to libzmq, et cetera). But I’ve also begun producing content for an on-line video-based training course, which aims to get C# developers quickly building distributed solutions with ZeroMQ. There are also definite plans to further extend the documentation based on some community feedback. And I’ve begun work on adding a limited amount of integration between fszmq and F#’s async workflows. Also, some early-stage (mostly experimental) work was recently begun to make fszmq run on .NET Core. But really, I’m hoping to grow the contributors to fszmq. So why not check out the issue tracker and hack something up? Or at the very least, leave a suggestion for anything you think needs to be added or addressed in the library. I want to hear from you!

Anyway, thanks for reading. And here’s to another seven years!


In Memoriam: Pieter Hintjens (3 December 1962 – 4 October 2016)

Nothing I’ve done with fszmq would’ve been possible without the vision and encouragement of Pieter. I’m honored to have collaborated (however briefly) with him. And no one I’ve met in my 18-year career has had a greater impact on my work — or my world view. I consider him a teacher. Hopefully, I have been and continue to be a worthy student.


 

Community, Presentations

F# Exchange 2017: Is There Such a Thing as Too Much Awesomeness?

I’m extremely pleased to be attending this year’s F# Exchange (6-7 April 2017). The program is very nearly finalized and the content looks amazing. In fact, this is shaping up to be one of those rare conferences where, no matter which sessions I choose to attend, I’m sure to be missing some fantastic presentations. Of course, it doesn’t hurt that I’ll get to catch up with friends both old and new. I’m also looking forward to finally meeting some “online friends” in real life. But I wanted to take this opportunity to highlight some of the topics on which I’ll be presenting…

Many people will tell you how cool the F# language is (and rightfully so). But it obviously takes more than just coolness to build great software. It takes high-quality tools. So, in April, I’ll be talking about two such tools:

Though really, these libraries are just “F# friendly” ways of plugging into broader concepts (property-based testing and ZMTP-based distributed systems, respectively).

Test What Now?

Fans of property-based testing (sometimes called random testing) will tell you how it lets you ensure your code “upholds invariants” (in the mathematical sense). That’s great. Really. And I’ll certainly be demonstrating some of that. But I really hope attendees will learn how to adopt a metrics-focused view of testing. We’ll also be looking at how random data-generation can help you better understand a problem domain. Taken together, this provides a more robust foundation for quality assurance.

Connect All the Things!

We live in a connected world. ZMTP provides a simple, robust means of developing software in such a world. And while I could spend hours exploring the nuances of this topic, Skills Matter’s only given me 45 minutes. But that should be plenty of time to demonstrate the power and the potential of using ZeroMQ to build distributed systems. The concepts and patterns we’ll cover are the building blocks for all manner of solutions, from micro services to time-series databases to peer-mesh file sharing.

In conclusion

Hopefully, you’ll be at Skills Matter’s F# Exchange 2017. It’s going to be a really incredible couple of days. If you do attend, please don’t hesitate to say hello. I can hardly wait to hear your thoughts on anything (and everything) F#.

Community, Source Code

A Mixed-Paradigm Recipe for Exposing Native Code

(Note: this post assumes some familiarity with either .NET or Mono… it’s also going to help if you’ve worked with C#, VB, or F# before.)


F# is frequently called a “functional first” programming language. Don Syme, creator of the language, has explained it thus:

Functional-first programming uses functional programming as the initial paradigm for most purposes, but employs other techniques such as objects and state as necessary.

However, the simplicity of this statement belies the tremendous power and flexibility of the language. This is seldom more apparent than when trying to wrap unmanaged libraries in F# code. In fact, we may combine two different approaches — one common to OO languages and the other popularized by pure functional programming — into a sort of recipe for wrapping native functionality in F#. Specifically, we’ll bring together deterministic resource management[1][2] with the notion of abstract data types[3][4]. As a case study for exploring this, we’ll look at the fszmq project.


Sidebar: What is fszmq?

fszmq is an MPLv2-licensed CLR (e.g. Mono, .NET) binding to the ZeroMQ distributed computing library. ZeroMQ, meanwhile, provides a complete library of building blocks for developing high-performance, message-passing systems.

fszmq is primarily concerned with Sockets which pass stateless Messages to one another. These messages are comprised of 1 or more frames of 0 or more bytes. fszmq makes no demands on the actual representation of message data. Sockets exchange messages in well-defined patterns which provide proven semantics on which to build distributed systems. Additionally, sockets provide (inaccessible to application code) inbound and outbound in-memory message queues. This makes centralization optional rather than mandatory. Sockets also provide a uniform abstraction over various transport protocols, the most popular of which are In-Process (i.e. threads), IPC, TCP, and PGM. Finally, a Context groups together a collection of sockets into a logically distinct “node”. There is typically one context instance per OS-level process.

A simple example of a server, which receives updates from a client, and then replies with an acknowledgement might look as follows:

// create, configure Context, Socket instances
use context = new Context ()
use server  = router context
Socket.bind server "tcp://eth0:5555"

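// `hook` is assumed to be a CancellationToken defined elsewhere
while not hook.IsCancellationRequested do
  let msg    = Socket.recvAll server
  let sender = Array.get msg 0
  // actual work would go here
  // reply with two frames: the sender's identity, then a one-byte acknowledgement
  [| sender; [| 0x00uy |] |] |> Socket.sendAll server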

For more information on getting started with fszmq and ZeroMQ please visit:

And now, back to the main feature…


F# code is subject to garbage collection, just like any other CLR language. This poses particular issues when working with unmanaged resources, which — by definition — are outside the purview of a garbage collector. However, we can take two explicit steps to help manage this. First, we define a type whose (ideally non-public) constructor initializes a handle to unmanaged memory:

type Socket internal (context,socketType) =
  let mutable disposed  = false // used for clean-up
  let mutable handle    = C.zmq_socket (context,socketType)
  //NOTE: in fszmq, unmanaged function calls are prefixed with 'C.'
  do if handle = 0n then ZMQ.error ()

Then, we both override object finalization (inherited from System.Object) and we implement the IDisposable interface, which allows us to control when clean-up happens:

  override __.Finalize () =
    if not disposed then
      disposed <- true // ensure clean-up only happens once
      let okay = C.zmq_close handle
      handle <- 0n
      assert (okay = 0)

  interface IDisposable with

    member self.Dispose () =
      self.Finalize ()
      GC.SuppressFinalize self // ensure clean-up only happens once

With our creation and destruction in place, we’ve made a (useless, but quite safe) managed type, which serves as an opaque proxy to the unmanaged code with which we’d like to work. However, as we’ve defined no public properties or methods, there’s no way to interact with instances of this type.
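
If you’d like to experiment with this life-cycle without installing libzmq, the same shape can be reproduced using only the BCL. In the sketch below, NativeBlock is a made-up type, and Marshal.AllocHGlobal and Marshal.FreeHGlobal merely stand in for native calls like zmq_socket and zmq_close. The internal IsDisposed property exists only so the behavior can be observed.

```fsharp
open System
open System.Runtime.InteropServices

// `NativeBlock` is hypothetical; it owns a chunk of unmanaged memory
type NativeBlock internal (size : int) =
  let mutable disposed = false // used for clean-up
  let mutable handle   = Marshal.AllocHGlobal size

  member internal __.Handle     = handle
  member internal __.IsDisposed = disposed

  override __.Finalize () =
    if not disposed then
      disposed <- true // ensure clean-up only happens once
      Marshal.FreeHGlobal handle
      handle <- 0n

  interface IDisposable with
    member self.Dispose () =
      self.Finalize ()
      GC.SuppressFinalize self // the finalizer has nothing left to do

// returns the disposal state before and after an explicit Dispose
let demo () =
  let block  = new NativeBlock (16)
  let before = block.IsDisposed
  (block :> IDisposable).Dispose ()
  before, block.IsDisposed
```

Calling Dispose a second time is harmless here, since the disposed flag guards the release.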

And now abstract data types enter into the scene.

Ignoring the bits which pertain to unmanaged memory, our opaque proxy sounds an awful lot like this passage about abstract data types:

[An ADT] is defined as an opaque type along with a corresponding set of operations… [we have] functions that work on the type, but we are not allowed to see “inside” the type itself.

This would exactly describe our situation… if only we had some functions which could manipulate our proxy. Let’s make some!

For the sake of navigability, we group the functions into a module with the same name as the type they manipulate. And the implementations themselves mostly invoke unmanaged functions passing the internal state of our opaque proxy.

module Socket =

  let trySend (socket:Socket) (frame:byte[]) flags =
    match C.zmq_send(socket.Handle,frame,unativeint frame.Length,flags) with
    | Message.Okay -> true
    | Message.Busy -> false
    | Message.Fail -> ZMQ.error()

  let send socket frame = 
    Message.waitForOkay (fun () -> trySend socket frame ZMQ.WAIT)

  let sendMore socket frame : Socket =
    Message.waitForOkay (fun () -> trySend socket frame (ZMQ.WAIT ||| ZMQ.SNDMORE))
    socket

  //NOTE: additional functions elided, though they follow the same pattern

And that’s primarily all there is to this little “recipe”. We can see from the following simple example how our opaque proxy instances are a sort of token which provides scope as it is passed through various function calls.

// create our opaque Socket instance
use client = dealer context 
//NOTE: the `use` keyword ensures `.Dispose()` is called automatically

// configure opaque proxy
Socket.connect client "tcp://eth0:5555"

// ... elsewhere ...
// send a message
date.Stamp () |> Socket.send client

// recv (and log) a message
client 
|> Socket.tryPollInput 500<ms> // timeout
|> Option.iter logAcknowledgement
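
The token-passing style shown above doesn’t depend on native code at all. As a stripped-down sketch (with a made-up Counter type standing in for Socket), the abstract data type half of the recipe boils down to an opaque type plus a like-named module of functions:

```fsharp
// `Counter` is hypothetical: its state is hidden from other assemblies
type Counter internal (start : int) =
  let mutable value = start
  member internal __.Value with get () = value and set v = value <- v

// operations live in a module named after the type they manipulate
module Counter =

  let create start = Counter start

  let increment (counter : Counter) =
    counter.Value <- counter.Value + 1
    counter // returning the token allows pipelining

  let read (counter : Counter) = counter.Value

// the opaque token flows through each function call
let result =
  Counter.create 40
  |> Counter.increment
  |> Counter.increment
  |> Counter.read
```

Consumers never touch the internal Value property directly; they can only go through the module, which is the point of the pattern.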

Now, we could stop here. However, this clean and useful F# code will feel a bit clumsy when used from C#. Specifically, in C# one tends to invoke methods on objects. Also, the tendency is for PascalCase when naming public methods. Fortunately (as an added bonus) we can accommodate C# with only minor decoration to our earlier code. We’ll first add an ExtensionAttribute to our module. This tells compilers and tooling which consume the assembly to look for extension methods inside this module.

[<Extension>]
module Socket =

And then we add two attributes to each public function. The ExtensionAttribute allows our function to appear as a method on the opaque proxy (when used from C#). Meanwhile, the CompiledNameAttribute ensures that C# developers will be presented with the naming pattern they expect. Calling the code from F# remains unaltered.

  [<Extension;CompiledName("SendMore")>]
  let sendMore socket frame : Socket =
    Message.waitForOkay (fun () -> trySend socket frame (ZMQ.WAIT ||| ZMQ.SNDMORE))
    socket

  //NOTE: additional functions elided, though they follow the same pattern

Now C# developers will find it quite straight-forward to use the code… and we’ve maintained all the benefits of both deterministic resource management and abstract data types.

// create our opaque Socket instance
//NOTE: the `using` keyword ensures `.Dispose()` is called automatically
using(var client = context.Dealer())
{
  // configure opaque proxy
  client.Connect("tcp://eth0:5555");
 
  // ... elsewhere ...
  // send a message
  client.Send(date.Stamp());

  // recv (and log) a message
  var msg = new byte[0];
  if(client.TryGetInput(500,out msg)) logger.Log(msg);
}

By combining useful techniques from a few different “styles” of programming, and exploiting the rich, multi-paradigm capabilities of F#, we are able to provide simple, robust wrappers over native code.


TL;DR…

A Mixed-Paradigm Recipe for Exposing Native Code

  1. Make a managed type with no public members, which proxies an unmanaged object
    • Initialize native objects in the type’s constructor
    • Clean-up native objects in the type’s finalizer
    • Expose the finalizer via the IDisposable interface
  2. Use the abstract data type pattern to provide an API
    • Define functions which have “privileged” access to the native objects inside the opaque type from step #1
    • Group said functions into a module named after the opaque type from step #1

Bonus: make the ADT friendly for C# consumers

  • Use ExtensionAttribute to make ADT functions seem like method calls
  • Use CompiledNameAttribute to respect established naming conventions

(This post is part of the 2015 F# Advent.)

Community, Presentations

The Month NYC Ran Out of Excuses Not to Learn F#

It’s Official! I’m totally declaring September 2013 F# Month (in New York City, at least). I mean, here we are (not seven days into the month yet) and I’ve got a veritable cornucopia (note the reference to impending autumn) of events for you to attend. For starters, on Tuesday 10 September, I’ll be in Parsippany, NJ, at the Northern New Jersey .NET Users’ Group, for an introduction to F#. Then, a few days later, on Saturday 14 September, it’s back to mid-town for Code Camp NYC 2013. This day-long event features several F# talks (one of which will be given by yours truly). Pushing into the following week, on Tuesday 17 September, the inimitable Rachel Reese will present to the NYC F# Users’ Group on actor-based concurrency in F# (also in mid-town). Next up: Wednesday 18 September and Thursday 19 September see the return of Skills Matter’s Progressive F# Tutorials NYC, in Brooklyn’s lovely DUMBO neighborhood. This intense two-day hands-on learning-fest is more than a mere conference. It’s chock-full of in-depth education, taught by many of the foremost minds in the F# community (minds like Don Syme, Tomas Petricek, Phil Trelford, and Rick Minerich … to name but a few). Then, as an added bonus, as if this wasn’t enough, Vermont Code Camp 5 happens, on Saturday 21 September, in beautiful Burlington, VT. It’s not too far from New York, and features a couple of F# talks (including, you guessed it, one of mine). So, that’s five events, spread over two weeks, offering more F#-y goodness than you thought possible. Many of these events are free. Most still have tickets available. All of them will be awesome. So, if you’ve had even a passing interest in F#, you’ve just run out of excuses. See you there!

(Note: Did I miss something? Leave it in the comments and I’ll update this post as necessary. Thanks!)

Community

Progressive F# Tutorials NYC 2013

So, it’s a new blog theme, and it’s my first post in nearly a year (hey?! I’ve been busy). And it’s really exciting!


I’m thrilled that the folks at Skills Matter are bringing the Progressive F# Tutorials back to the Big Apple for 2013. This two-day training is the event to attend if you have any interest what-so-ever in F#. It’s not a conference. It’s two days of in-depth learning. In fact, last year, my company’s sister company sent one of their engineers, who had never worked with F# before, to attend the beginner track. He learned enough in two days that, upon returning to the office, he promptly re-wrote a complex piece of MatLab code into a smaller, faster, less-costly F# solution. What’s more, many of the best and brightest minds in the F# community will be on-hand (some guiding, some just soaking up the knowledge). So, if you’re anywhere near NYC on 18-19 September, you should attend. The official announcement follows. It contains more details, links, and a discount code. I’ll see you there!


On the back of the success of the 2013 edition, the Progressive F# Tutorials return to New York in September – this time packing an even bigger punch! With F# UG lead Rick Minerich at the helm, we’ve put together an expert-filled line-up – featuring Don Syme (creator of F#), Tomas Petricek, and Miguel de Icaza. The Tutorials will be split in two – a beginners track for those eager to unleash F#’s full power, and a ‘meaty track’ for those more experienced F#pers amongst you! Each session will be a 4 hour hands-on deep dive, brushing aside the traditional format of conferences to allow you to truly immerse yourself in the subject topic.

Want to get involved? We’re giving a special community 20% discount!

Just go ahead and enter SkillsMatter_Community on the booking form and the team at Skills Matter will look forward to welcoming you to the Progressive F# Tutorials NYC this September!