March 2024

Propeller

Open-source WebAssembly orchestrator for secure workload execution across the Cloud–Edge–IoT continuum. I worked across embedded Wasm execution, Zephyr RTOS integration, MQTT-based task delivery, Wasm component invocation, task scheduling, binary delivery, and runtime reliability, helping make Propeller capable of dispatching workloads from cloud servers down to ESP32-S3 microcontrollers running WAMR.

Why This Work Mattered

Most orchestration systems assume Linux, containers, stable networking, and enough memory to run heavyweight control planes.

That assumption breaks at the edge.

Real edge systems are full of devices that do not behave like servers—microcontrollers with unreliable connectivity, RTOS devices with strict memory limits, gateways with inconsistent uptime, and edge nodes where Kubernetes is more burden than solution.

You still need to deploy code there. You still need sandboxing. You still need secure execution. You still need trust.

Propeller exists to remove what I think of as the Linux tax.

The same workload should be deployable on a cloud VM or a constrained ESP32 running Zephyr without changing the execution model. A $4 microcontroller should be treated with the same architectural seriousness as a server.

That was the part of the system I worked on: making the execution layer real across both manager and proplet runtimes, especially where memory, networking, and runtime behavior are all hard constraints.

Systems I Changed

24 Concurrent WebAssembly Instances on a $4 Device

The strongest proof of the system was the embedded benchmark.

Using an ESP32-S3 running Zephyr RTOS and the WebAssembly Micro Runtime, I helped build and validate the embedded Proplet path capable of running 24 concurrent CPU-bound WebAssembly instances on a single microcontroller.

CPU-bound workloads scaled to 24 isolated instances with memory still available. Memory-heavy workloads hit hard limits much earlier, which made the benchmark useful beyond the headline because it mapped the real operating envelope.

That benchmark validated Propeller's central claim: the same orchestration layer can span from cloud infrastructure down to constrained RTOS devices without changing how workloads are dispatched or executed.

Portable execution only matters if it survives real hardware limits.

Embedded Reliability Across Unfriendly Systems

Running Wasm on a microcontroller is mostly not about Wasm.

It is about everything around it: Wi-Fi station mode, MQTT lifecycle ordering, reconnect behavior, socket polling, keepalive failures, stack sizes, memory fragmentation, discovery failures, result publishing, and WAMR compatibility.

A device disappears and the system usually tells you almost nothing.

I worked through failures across Wi-Fi provisioning, MQTT initialization order, reconnect handling, keepalive publishing, Last Will and Testament (LWT) behavior, failed proplet discovery, embedded credential management, stack sizing, and Zephyr kernel configuration.
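The reconnect side of that work can be sketched roughly like this. The topic shape, status payload, and backoff constants below are illustrative, not the actual Propeller conventions:

```rust
use std::time::Duration;

/// Last Will and Testament registered at connect time, so the broker
/// announces the proplet as offline if it vanishes without a clean
/// disconnect. Topic shape is illustrative, not the real Propeller topic.
struct LastWill {
    topic: String,
    payload: &'static str,
}

fn last_will(proplet_id: &str) -> LastWill {
    LastWill {
        topic: format!("proplets/{proplet_id}/status"),
        payload: "offline",
    }
}

/// Exponential reconnect backoff, capped so a flapping device retries
/// forever but never waits unboundedly between attempts.
fn reconnect_delay(attempt: u32) -> Duration {
    const BASE_MS: u64 = 500;
    const CAP_MS: u64 = 30_000;
    let delay = BASE_MS.saturating_mul(1u64 << attempt.min(10));
    Duration::from_millis(delay.min(CAP_MS))
}
```

The point of the cap is operational: a device that loses Wi-Fi overnight should come back within seconds of the network returning, not after an unbounded sleep.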

The goal was simple: make failures boring.

Reliable systems are built when invisible failures become predictable and recoverable.

Extending the Wasm Runtime Beyond Simple Modules

Propeller needed more than basic Wasm module execution.

I extended the Rust Proplet runtime to support custom component exports, WAVE-encoded invocation arguments, string-based task inputs, component export selection, ELASTIC HAL interfaces, and attestation and TEE validation paths.

This moved the platform from "run a Wasm module" toward "run portable Wasm components that can interact with real hardware abstractions."
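In rough terms, that shift means the task model grows from a single default entry point to a selectable export with encoded arguments. This is a sketch under assumed names, not the exact Propeller task schema:

```rust
/// How a task names what to run inside a Wasm artifact (sketch).
/// Field names and the WAVE-encoded argument string are illustrative.
enum Invocation {
    /// Plain module: run the default `_start` entry point.
    Start,
    /// Component: call a named export with WAVE-encoded arguments.
    Export { name: String, wave_args: String },
}

/// Resolve which export the runtime should invoke for a task.
fn entry_point(inv: &Invocation) -> &str {
    match inv {
        Invocation::Start => "_start",
        Invocation::Export { name, .. } => name.as_str(),
    }
}
```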

That matters because production orchestration is not about executing a hello-world module. It is about executing real workloads safely across incompatible systems.

Runtime interfaces are where portability becomes useful.

Task Scheduling That Actually Understands the Edge

I worked on task orchestration across scheduling, dispatch, and execution targeting.

This included scheduled cron tasks, task priority scheduling, broadcast execution, targeted proplet execution, proplet metadata and identity handling, liveness history, and cleaner failure behavior for invalid or missing proplet IDs.

Scheduling is not just about when something runs.

It is also about where it should run, who should execute it, whether it should fan out across multiple devices, and what should happen when the target disappears.
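That target-resolution step can be sketched as follows. The types are illustrative, assuming only that the manager tracks proplet identity and liveness; the real metadata model is richer:

```rust
struct Proplet {
    id: String,
    alive: bool,
}

enum Target {
    /// Fan out to every live proplet.
    Broadcast,
    /// Run on one specific proplet, by ID.
    Proplet(String),
}

/// Resolve a dispatch target against known proplets (sketch).
/// A targeted task with an invalid or dead proplet ID fails fast
/// with a clear error instead of silently dropping the work.
fn resolve(target: &Target, proplets: &[Proplet]) -> Result<Vec<String>, String> {
    match target {
        Target::Broadcast => Ok(proplets
            .iter()
            .filter(|p| p.alive)
            .map(|p| p.id.clone())
            .collect()),
        Target::Proplet(id) => proplets
            .iter()
            .find(|p| p.id == *id && p.alive)
            .map(|p| vec![p.id.clone()])
            .ok_or_else(|| format!("proplet {id} not found or not alive")),
    }
}
```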

Good schedulers reduce operational surprise.

Bad ones create it.

Flexible Binary Delivery Instead of Forced Complexity

Wasm binaries are easy until they are not.

Inline payloads work for development. OCI registries work for production. But many teams already serve binaries over plain HTTP.

I added HTTP-based Wasm delivery so Proplets could fetch binaries directly from HTTP and HTTPS URLs instead of forcing every deployment through registry workflows.

That required new task model fields, PostgreSQL and SQLite storage changes, manager dispatch updates, safe streaming fetch limits, runtime routing logic, and delivery tests.
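The delivery model and the streaming limit can be sketched like this. The size cap and enum shape are illustrative assumptions, not the exact task model fields:

```rust
use std::io::Read;

/// Where task bytes come from (sketch of the delivery model).
#[allow(dead_code)]
enum WasmSource {
    Inline(Vec<u8>),  // payload embedded in the task itself
    Oci(String),      // registry reference
    Http(String),     // plain http:// or https:// URL
}

const MAX_WASM_BYTES: u64 = 32 * 1024 * 1024; // illustrative cap

/// Read at most MAX_WASM_BYTES + 1 bytes so an oversized or malicious
/// response is rejected instead of exhausting memory mid-stream.
fn read_bounded<R: Read>(r: R) -> std::io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    r.take(MAX_WASM_BYTES + 1).read_to_end(&mut buf)?;
    if buf.len() as u64 > MAX_WASM_BYTES {
        return Err(std::io::Error::new(
            std::io::ErrorKind::InvalidData,
            "wasm binary exceeds size limit",
        ));
    }
    Ok(buf)
}
```

Whatever the source variant, the bytes funnel through the same bounded read before reaching the runtime, which is what keeps an HTTP fetch as safe as an inline payload.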

The runtime should not care where the bytes came from.

It should receive Wasm and execute it.

Good infrastructure should support the simplest safe path instead of forcing unnecessary complexity.

Runtime Trust and Confidential Execution

I also worked on the foundation for confidential edge computing.

By extending support for ELASTIC HAL interfaces and attestation paths, I helped move execution trust closer to something verifiable instead of assumed.

If workloads are going to run across shared or multi-tenant infrastructure, especially across the cloud-edge continuum, integrity matters.

You need to know that a workload executed where it was supposed to execute, under the guarantees it expected.

Attestation and TEE validation paths were the first step toward that.
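The shape of such a check is simple even though the machinery behind it is not. Real TEE attestation verifies a signed quote from the hardware; the sketch below reduces that to comparing reported evidence against policy, with all names and platform strings invented for illustration:

```rust
/// Evidence reported by (or about) an executing proplet (sketch).
struct Evidence {
    measurement: Vec<u8>, // stand-in for a hardware-signed measurement
    platform: String,
}

/// What the submitter is willing to trust (sketch).
struct Policy {
    expected_measurement: Vec<u8>,
    allowed_platforms: Vec<String>,
}

/// Accept a result only if it ran where, and as, it was supposed to.
fn verify(evidence: &Evidence, policy: &Policy) -> Result<(), &'static str> {
    if !policy.allowed_platforms.contains(&evidence.platform) {
        return Err("platform not allowed");
    }
    if evidence.measurement != policy.expected_measurement {
        return Err("measurement mismatch");
    }
    Ok(())
}
```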

This was less about security as a feature and more about trust as infrastructure.

Engineering Impact

Most of my work sat between runtime capability and constrained-device reliability—making sure workloads could actually run where the architecture claimed they could.

That meant proving cloud-to-microcontroller execution was not just possible, but operationally reliable.

Reliable orchestration is not just scheduling work.

It is making sure the task can be delivered, fetched, executed, observed, and trusted across environments that behave nothing alike.

The strongest engineering work was usually not adding features.

It was removing uncertainty.

Making devices visible.

Making delivery predictable.

Making execution trustworthy.

That is where infrastructure becomes real.

What Operating This Taught Me

Execution Is the Product

People think orchestration platforms are about scheduling.

They are not.

They are about trust.

Can I submit work and know where it ran?

Can I trust the result?

Can I debug failure without touching the device?

Can a constrained microcontroller behave like part of the same system as a cloud VM?

That is the real product.

Everything else is interface.

Reliability Starts Before Incidents

Most failures begin long before the outage.

Unsafe delivery paths, weak reconnect logic, poor observability, fragile migrations, bad rollback assumptions, and missing liveness signals create incidents long before production notices them.

Working on Propeller reinforced the same lesson repeatedly: reliability is usually decided before the incident exists.

The best systems do not eliminate failure.

They make failure predictable.

Platform Scale

Propeller is part of the EU Horizon Europe ELASTIC project and serves as orchestration infrastructure for secure Cloud–Edge–IoT execution across research and production environments.

The system is built around WebAssembly-native execution, MQTT-based distributed task dispatch, Zephyr RTOS and WAMR support, Wasmtime runtime execution, OCI and HTTP Wasm artifact delivery, secure sandboxed execution, and no dependency on Kubernetes or heavyweight control planes.

This was Day 0 engineering—building the execution model early enough that the rest of the platform could trust it.

Working on systems like this changes how you think.

You stop optimizing for features.

You optimize for trust.

How I Think

Reliable distributed systems come from three things.

Strong execution boundaries. Workloads must run safely and consistently across different hardware and runtime environments.

Safe delivery paths. The system must survive bad networks, constrained devices, and partial failure without losing trust.

Operational clarity. Debugging should be fast, boring, and predictable.

That is why I care equally about runtimes, schedulers, protocols, and observability.

They all solve the same problem: reducing the distance between submitted work and trusted execution.

Working on something in this space?

Get in touch →