Reactive Workflows in a Cloud-Native World

Matt Magaldi is a member of the APG Graduate Rotational Program on his fourth and last rotation on the APG Core Data Investor Platform team. This team is responsible for building a suite of tools to enable investors and clients to gain insights into BlackRock’s wealth of financial data. Despite being relatively new to the open-source community, Matt leads the Argo Events project.

 

Imagine you want to calculate the risk analytics for all US corporate bonds as soon as their prices have been received from a vendor or written to a data store. Alternatively, imagine you have a machine learning job you want to run every time a new dataset is published to S3. Although these two examples are very different, the underlying mechanics are the same: an event occurs, and work proceeds. How, though, can we build a scalable solution that ties certain events to specific workflows? The answer is a small, simple “device” whose purpose is to detect events or changes in its environment and send information to other devices. In essence, a sensor.
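To make the "an event occurs, work proceeds" idea concrete, here is a minimal Python sketch of a sensor that maps event sources to workflows. It is illustrative only and is not the Argo-events API: the `Event` and `Sensor` classes, the source names, and the two workflow functions are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Event:
    """A hypothetical event: the name of its source plus an arbitrary payload."""
    source: str
    payload: dict


@dataclass
class Sensor:
    """Hypothetical sensor: maps event sources to the workflows they trigger."""
    handlers: Dict[str, List[Callable[[Event], str]]] = field(default_factory=dict)

    def on(self, source: str, workflow: Callable[[Event], str]) -> None:
        # Register a workflow to run whenever an event arrives from `source`.
        self.handlers.setdefault(source, []).append(workflow)

    def notify(self, event: Event) -> List[str]:
        # Run every workflow registered for this source; unknown sources run nothing.
        return [workflow(event) for workflow in self.handlers.get(event.source, [])]


def risk_analytics(event: Event) -> str:
    # Stand-in for a real risk analytics workflow.
    return f"risk analytics for {event.payload['asset_class']}"


def train_model(event: Event) -> str:
    # Stand-in for a real machine learning job.
    return f"training on {event.payload['key']}"


sensor = Sensor()
sensor.on("prices-received", risk_analytics)
sensor.on("s3-object-created", train_model)

results = sensor.notify(Event("prices-received", {"asset_class": "US corporate bonds"}))
```

In this sketch the two very different jobs share one mechanism: each is just a callable registered against an event source.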

Enter Argo Events

Today, BlackRock is excited to announce the release of Argo-events (formerly known as Axis), an open-source, container-native event framework for Kubernetes. Argo-events makes it easy to react to a variety of signals and build event-driven workflows. It is implemented as a Kubernetes Custom Resource that deploys lightweight pods to act as sensors for events. Users can build plugins and extend the signal interface to onboard their own universe of event sources to the framework.
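As a rough illustration of what an extensible signal interface could look like, the Python sketch below defines a plugin contract and a registry. It is an assumption made for illustration, not the actual Argo-events signal interface: the `Signal` base class, the `register` decorator, the `REGISTRY` dictionary, and `InMemorySignal` are all hypothetical names.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterator, Type


class Signal(ABC):
    """Hypothetical plugin contract: a signal yields events from one source."""

    @abstractmethod
    def listen(self) -> Iterator[dict]:
        ...


# Hypothetical registry mapping a signal name to its implementing class.
REGISTRY: Dict[str, Type[Signal]] = {}


def register(name: str):
    """Class decorator that makes a Signal implementation discoverable by name."""
    def decorator(cls: Type[Signal]) -> Type[Signal]:
        REGISTRY[name] = cls
        return cls
    return decorator


@register("in-memory")
class InMemorySignal(Signal):
    """Toy event source backed by a list, standing in for a real stream."""

    def __init__(self, events):
        self.events = list(events)

    def listen(self) -> Iterator[dict]:
        # Emit the stored events one at a time, in order.
        yield from self.events


signal = REGISTRY["in-memory"]([{"id": 1}, {"id": 2}])
received = list(signal.listen())
```

The design choice sketched here is that the framework only depends on the abstract contract, so a new event source can be added without changing the code that consumes events.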

[Figure: Argo Events High-Level Architecture]

 

The Road to Open Source

While BlackRock currently uses Kubernetes in production only for its data science platform, we have adopted Kubernetes to support the future cloud-native strategy for Aladdin. Argo-events originated from an initiative within BlackRock to build a platform-level scheduler on Kubernetes. When we approached the problem, we asked ourselves: How can we automate cross-business workflows to improve the efficiency and resiliency of daily batch processes? We split this question into two functional paradigms.

The first is the ability to execute complex workflows. For this we researched open-source platforms and found Argo, the container-native workflow engine. Originally built for Continuous Integration and Deployment, Argo is powerful enough to handle everything from building CI/CD pipelines to coordinating machine learning jobs on GPUs. When we first started experimenting with Argo, we realized its potential to transform our daily batch process orchestration paradigm, which is mission-critical to meeting client SLAs. We preferred Argo over similar platforms because it is Kubernetes-native, which reduces complexity, and because it has strong support from a growing community of users.

 

[Figure: A sample Argo workflow]

The second paradigm is the ability to react to events. We started by researching FaaS platforms like AWS Lambda and Kubeless. Although these were a good starting point, they fell short of our need to define a full set of dependencies and to take into account not only the events themselves, but also the context in which they occur. In a complex organization with a variety of teams, we knew from experience that most business processes have multiple upstream dependencies, and that those dependencies are managed by completely separate teams. We also wanted to make internal event sources available as triggers to a workflow.

Contributing Back

After consolidating our findings, we decided the best path forward would be to leverage Argo as a workflow execution engine and build a flexible dependency management tool. One could combine the workflow engine directly with a FaaS platform, define custom polling scripts to establish event constraints, and leverage the Kubernetes API to create resources. However, this approach falls short in several critical areas:

  1. Composing multiple signals into non-trivial boolean expressions requires learning different APIs and manually linking them together through another program.
  2. Dependency management across stateless components becomes a difficult problem.
  3. There is no way to react to the nil case, in which an expected event does not occur.
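The three gaps above can be illustrated with a small Python sketch of a stateful dependency tracker that combines signals with a boolean condition and reports when an expected event has not arrived by a deadline. This is a sketch under stated assumptions, not how Argo-events is implemented: `DependencyTracker`, the signal names, and the `ready` condition are hypothetical, and time is passed in explicitly to keep the example deterministic.

```python
from dataclasses import dataclass, field
from typing import Callable, Set


@dataclass
class DependencyTracker:
    """Hypothetical tracker: remembers which signals fired and evaluates a condition."""
    condition: Callable[[Set[str]], bool]  # boolean expression over the signals seen
    deadline: float                        # time after which waiting counts as a miss
    seen: Set[str] = field(default_factory=set)

    def record(self, signal: str) -> None:
        # State lives in the tracker, so each event source can stay simple.
        self.seen.add(signal)

    def status(self, now: float) -> str:
        if self.condition(self.seen):
            return "trigger"   # dependencies satisfied: run the workflow
        if now >= self.deadline:
            return "missed"    # nil case: an expected event never arrived
        return "waiting"


def ready(seen: Set[str]) -> bool:
    # prices AND (positions OR manual-override)
    return "prices" in seen and ("positions" in seen or "manual-override" in seen)


tracker = DependencyTracker(condition=ready, deadline=100.0)
first = tracker.status(now=10.0)    # nothing seen yet
tracker.record("prices")
second = tracker.status(now=20.0)   # still missing positions or an override
tracker.record("positions")
third = tracker.status(now=30.0)    # condition now holds

late = DependencyTracker(condition=ready, deadline=100.0)
late.record("prices")
fourth = late.status(now=100.0)     # deadline reached without positions
```

Note that in this sketch a satisfied condition takes precedence over the deadline; a real system would need to decide explicitly whether late events should still trigger work.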

Over the course of this project, the more we understood its role internally, the more we saw its applicability to other open-source projects. Our reasoning for open-sourcing this project is twofold. First, while the Kubernetes ecosystem is maturing into an enterprise platform, we strongly believe Argo-events fills a niche for event-driven, container-native processes. Second, Argo-events presents an opportunity for many existing tools, like Argo, to gain value from an event-based dependency platform.

Conclusion

Argo-events is our first step towards making Kubernetes more reactive and easier to use for the broader community. We are excited to see how the project evolves, and we are grateful to the Argo community, which has been extremely receptive and has agreed to take on the project so that it lives alongside Argo. We invite you to give Argo-events a try and welcome your feedback.

Argo-events is hosted on GitHub: https://github.com/argoproj/argo-events
