Launching workflows and River Pro

We launched River only nine months ago [1], and it’s been a wild ride. The project has been well received by the Go community, gathering over 3k GitHub stars and bringing thousands of website visitors per month. We’ve been thrilled to see it adopted by so many of you, from small startups to publicly traded companies handling millions of jobs per day.

We’ve been working hard to make River the best job processing library for Go, and I’m excited to announce the next step in that journey: workflows. Alongside that, I’m launching River Pro, a new subscription model for River that I hope will secure the further development and maintenance of the library.

Workflows

Although basic background jobs are a sufficient primitive to represent a large majority of application uses, it’s often desirable to express work as a more complex set of tasks. For example, a job is inserted initially, and upon its successful completion, two new phases of work are kicked off. If both of those succeed then a final job is inserted to finish the flow.

River now supports workflows, a feature that lets a series of jobs be modeled as a directed acyclic graph (DAG) where each task in the process waits for its dependencies to finish before continuing.
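To make the “waits for its dependencies” idea concrete, here is a minimal, self-contained sketch of how a DAG like the one described above resolves into waves of runnable tasks. It is not River’s implementation or API: the `executionWaves` function and the task names are hypothetical and exist only for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// executionWaves groups tasks into waves: every task in a wave has all of its
// dependencies satisfied by earlier waves. deps maps a task name to the names
// it depends on. An error is returned if the graph has a cycle or references
// an unknown task. (Hypothetical helper, not part of River.)
func executionWaves(deps map[string][]string) ([][]string, error) {
	done := make(map[string]bool, len(deps))
	var waves [][]string

	for len(done) < len(deps) {
		var wave []string
		for name, taskDeps := range deps {
			if done[name] {
				continue
			}
			ready := true
			for _, dep := range taskDeps {
				if _, known := deps[dep]; !known {
					return nil, fmt.Errorf("task %q depends on unknown task %q", name, dep)
				}
				if !done[dep] {
					ready = false
				}
			}
			if ready {
				wave = append(wave, name)
			}
		}
		if len(wave) == 0 {
			return nil, fmt.Errorf("dependency cycle detected")
		}
		sort.Strings(wave) // map iteration order is random; sort for stable output
		for _, name := range wave {
			done[name] = true
		}
		waves = append(waves, wave)
	}
	return waves, nil
}

func main() {
	// One initial job, two phases that fan out from it, and a final job.
	deps := map[string][]string{
		"start":   nil,
		"phase_1": {"start"},
		"phase_2": {"start"},
		"finish":  {"phase_1", "phase_2"},
	}
	waves, err := executionWaves(deps)
	if err != nil {
		panic(err)
	}
	for i, wave := range waves {
		fmt.Printf("wave %d: %v\n", i+1, wave)
	}
}
```

The two phases land in the same wave because neither depends on the other, which is what lets them run in parallel.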

Breaking large jobs into small workflow tasks has a number of benefits:

  • Each individual task becomes simpler to understand and easier to implement correctly in a retryable, idempotent fashion.

  • When a task errors, only that portion needs to be retried. Completed tasks stay complete, and the amount of repeat work is minimized.

  • Tasks are less likely to be repeated for the same reason. Critical sections that are sensitive to retries because they’re either resource intensive or produce external side effects can be separated out from sections that are more likely to fail. Each has its own retry policy for granular control.

  • Sections that are parallelizable can be broken into separate jobs that flow as dependencies into a single task that finalizes/aggregates work.

  • More granular tasks make the overall workflow more observable. An operator can see which jobs are complete, which are in progress, and which still need to be worked.
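The retry benefits above can be illustrated with a toy runner. This is a sketch under stated assumptions, not how River schedules or retries jobs: the `task` type, `runAll`, and the task names are all made up for the example. It shows a side-effecting step with a single attempt sitting next to a flaky step with its own, larger retry budget.

```go
package main

import (
	"errors"
	"fmt"
)

// task is a hypothetical unit of work with its own retry budget.
type task struct {
	name        string
	maxAttempts int
	run         func(attempt int) error
}

// runAll runs tasks in order, retrying each failing task up to its own
// maxAttempts (values below 1 are treated as 1). It returns how many times
// each task ran. A task that has succeeded is never run again.
func runAll(tasks []task) (map[string]int, error) {
	runs := make(map[string]int)
	for _, t := range tasks {
		attempts := t.maxAttempts
		if attempts < 1 {
			attempts = 1
		}
		var err error
		for attempt := 1; attempt <= attempts; attempt++ {
			runs[t.name]++
			if err = t.run(attempt); err == nil {
				break
			}
		}
		if err != nil {
			return runs, fmt.Errorf("task %q failed after %d attempts: %w", t.name, attempts, err)
		}
	}
	return runs, nil
}

// exampleTasks returns a retry-sensitive task followed by a flaky one.
func exampleTasks() []task {
	return []task{
		// Produces an external side effect, so it gets exactly one attempt.
		{name: "charge_card", maxAttempts: 1, run: func(int) error { return nil }},
		// More likely to fail, so it gets a more generous retry policy.
		{name: "send_receipt", maxAttempts: 3, run: func(attempt int) error {
			if attempt < 3 {
				return errors.New("mail server unavailable")
			}
			return nil
		}},
	}
}

func main() {
	runs, err := runAll(exampleTasks())
	fmt.Println(runs, err)
}
```

Here `charge_card` runs once and stays complete while `send_receipt` is attempted three times, so the failures in the flaky step never cause the sensitive step to repeat.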

Defined with code

Workflows are defined in code using the same thoughtful, Go-inspired syntax seen elsewhere in River. Workflow tasks are defined as job args, and dependencies are defined using strongly typed variables so it’s easy to understand a workflow’s structure using LSP tooling like jump-to-definition in your favorite IDE.

workflow := riverworkflow.New(&riverworkflow.Opts{Name: "My first workflow"})

// Add a first task to the workflow, named "a":
taskA := workflow.Add("a", MyJobArgs{}, nil, nil)

// Fan-out to tasks b1 and b2, which both depend on task a:
taskB1 := workflow.Add("b1", MyJobArgs{}, nil,
	&riverworkflow.TaskOpts{Deps: []string{taskA.Name}})
taskB2 := workflow.Add("b2", MyJobArgs{}, nil,
	&riverworkflow.TaskOpts{Deps: []string{taskA.Name}})

// Fan-in to task c, which depends on both b1 and b2:
taskC := workflow.Add("c", MyJobArgs{}, nil,
	&riverworkflow.TaskOpts{Deps: []string{taskB1.Name, taskB2.Name}})

See the workflows documentation for a full example.

River UI support

As much as Go developers love databases and command lines, graphical UIs are a huge timesaver for fast reference and operational work. River UI now includes support for workflows to help fully visualize exactly what they’re doing and to let you quickly check the details of each task.

[Screenshot: workflow detail page]

I don’t think screenshots do it justice, so I encourage you to try it out yourself with the live demo.

River Pro

Workflows are part of River Pro, a new paid subscription offering aimed at making the long-term maintenance of the River project financially sustainable. The core River open-source project will always be free. It doesn’t contain any paid code, and many users will never need any either. The core feature set is already very complete.

At the same time, powerful new features like workflows take a lot of time to build, test, and maintain. These features also save a lot of expensive developer time for those who need them. As more companies adopt River as an essential part of their business, we want to assure them that the library will continue to be maintained and improved.

Pro features will be targeted mainly at businesses with more complex requirements that need sophisticated tools. Additionally, River Pro offers commercial support directly from River’s developers.

Projects like Sidekiq and Oban have provided tremendous inspiration for River Pro. They have shown how this model can bring amazing open-source software to the community while providing a sustainable revenue source. This ensures their authors can continue investing their time in building and maintaining their offerings over the years. I hope that River Pro can demonstrate the viability of this model in the Go ecosystem as well.

Subscription Go modules

Many language ecosystems, such as Ruby or Elixir, offer the ability to add packages from a private authenticated server. In Go, private modules can be distributed either directly from a version control system or via a private module proxy.

River Pro uses the latter option, and is distributed through a private module proxy that doesn’t serve any other public modules. To my knowledge River Pro is the first paid Go module offering distributed in this fashion (let me know if I’ve missed one!).

The road ahead

River Pro is in very early release, and you can expect its feature matrix to be thoroughly fleshed out over the coming months. A few ideas for what’s coming:

  • Concurrency control: Set the maximum concurrency for specific job queues or job kinds. Think errgroup for background jobs.

  • Job batching: Run a number of available jobs of the same kind in the same Work invocation for increased handling efficiency. The idea is to dramatically reduce database operations and overhead, which can be a tremendous help in a busy database.

  • Encryption: If you’re following best practices there’s a good chance you’re encrypting at rest already, but in some cases it’s useful to add another layer of encryption, for example when storing sensitive values like API keys or passwords and trying to protect against a leaked database.
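For readers unfamiliar with the errgroup comparison in the concurrency control idea, the underlying technique is a cap on how many units of work run at once. Below is a minimal in-process sketch using only the standard library, with a buffered channel as a counting semaphore. It says nothing about how River Pro will expose or implement this feature; `runLimited` is a hypothetical helper for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// runLimited runs every job, allowing at most limit jobs to execute at once
// (limits below 1 are treated as 1). It returns the highest number of jobs
// observed running simultaneously. (Hypothetical helper, not a River API.)
func runLimited(limit int, jobs []func()) int {
	if limit < 1 {
		limit = 1
	}
	sem := make(chan struct{}, limit) // buffered channel as a counting semaphore

	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		running int
		peak    int
	)
	for _, job := range jobs {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit jobs are already in flight
		go func(job func()) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot once the job is finished

			mu.Lock()
			running++
			if running > peak {
				peak = running
			}
			mu.Unlock()

			job()

			mu.Lock()
			running--
			mu.Unlock()
		}(job)
	}
	wg.Wait()
	return peak
}

func main() {
	jobs := make([]func(), 20)
	for i := range jobs {
		jobs[i] = func() { time.Sleep(5 * time.Millisecond) }
	}
	fmt.Println("peak concurrency:", runLimited(3, jobs))
}
```

A job only counts as running after it has acquired a slot and stops counting before it releases one, so the reported peak can never exceed the limit.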

Anything else you’d love to see? Send me an email. And as usual, sign up for the mailing list for future updates.

Footnotes

  1. Nine months is especially notable on my personal timeline. My 2nd child is due any day now! 👶