The Distributed Computing Update

One of the applications of cryptocurrency we continue to be excited about is distributed computing.

Before crypto, my laptop couldn’t pay a stranger’s idle server as a thank you for running a machine learning program. Cryptocurrencies finally give us the ability to make machine-to-machine payments to compensate participating nodes for running tasks.

In June I wrote an overview of the types of compute projects we were seeing. It’s been two months, but this is a space that moves fast and I wanted to keep sharing what we’ve been learning. Here goes.

Siloed networks vs an open protocol

There are two ways distributed compute could play out. In one model of the world, there is a dominant distributed compute protocol that creates a shared network of machines that anyone can build interfaces and clients on top of. Think of this like Heroku and EC2: both of them run on AWS servers, but they offer interfaces with drastically different experiences that cater to different audiences.

In the other model of the world, there are a few dominant compute projects that each have their own network of machines.

Both worlds allow for there to be coexisting projects that serve different audiences, but in one version of the world, the projects are clients on top of the same shared resource pool, and in the other, they all run their own independent networks. It is possible that these two models co-exist, but I think that is unlikely because of network effects. If given the opportunity, projects may opt to plug into an existing network of machines rather than build their own because having access to more CPU gives them better quality of service for their customers on day one than if they had to start from scratch.

We are seeing attempts at both. SONM is one project trying to build the shared resource layer. Another is Distributed Compute Protocol (DCP), built by Distributed Compute Labs. Most other projects are currently building out their own networks, though with open protocols there is really nothing stopping anyone from building alternative interfaces to any of these projects. We may see projects start as their own system and organically grow to be just one of the clients on top of their now shared resource layer. I am pretty excited about the possibility of a shared compute layer and about the teams and projects that are trying to build it. 

The conventional view of clients built on top of an open protocol is that the business is brutally competitive because it is easy for a user to move from one client to another. Think of this like early Twitter where lots of people were building Twitter clients. It was easy for a user to move from one Twitter client to another, so being in the Twitter client business was hard and competitive. This may be different with computing, where the interface that the client exposes is the product. If the client APIs turn out to differ from one another, the clients could be incredibly sticky even if they all effectively expose the same backend, because developers integrate them into their source code and CI/CD workflows. I think that is an important feature of compute that will even further incentivize projects to contribute to and build on a shared resource pool.

Token Questions

One question we have been thinking about is which tokens will be used by developers versus which tokens will be used by end users. That is: if a user interacts with a dapp that runs code on a distributed compute network, does the user pay the dapp in the same token that the dapp pays the compute service?

Right now the trending answer in compute services is no. Akash, Render, Perlin, Enigma and SONM are some of the compute projects that have their own transactional token. This follows the same model as IPFS/Filecoin where users will presumably pay dapps in whatever the major consumer-facing currency is (right now it is seemingly ETH or BTC) and dapps will behind the scenes exchange that token for the tokens they need to provide the service.

Hypernet and Truebit, on the other hand, are two compute projects with two-token models. In Truebit, for example, buyers can pay for the service in ETH, and the Truebit TRU token is just used for the protocol-specific functions of staking and dispute resolution. This matches a pattern we are seeing this year with infrastructure projects like The Graph and Augur that use the main consumer currency for transactions, and their own token only for governance, staking and dispute resolution. I predict we will see more projects change to the two-token model because it allows the price of governance to increase as the network grows, but doesn’t increase the price of the service with it.

The EC2 Model vs. The Lambda Model

In the existing web2 world, there are two main types of compute services: in the EC2 model, developers are provided an environment to run and host services, and in the Lambda model, developers write functions that can be invoked on demand.

The distributed computing projects break out into these two categories as well. One is like Lambda (or like Cloudflare Workers): the user writes a script, and the project runs it on participating machines. The other approach is the EC2 approach or the “someone else’s computer” approach: the user gets matched with someone on the network and can run a container on that someone’s machine.

Note that the Lambda approach isn’t quite Lambda yet – machines in Lambda-like distributed networks don’t store all of the functions ever pushed to them and invoke them on demand. Instead, these networks are for running offline and async scripts for use cases such as scientific computation or rendering graphics. As latency improves, we can see these becoming more like serverless compute over time.

The ecosystem needs both models: hosting a dapp front end requires a persistent host, and running one-off computations is better on a serverless-like platform.

Two projects working on hosting platforms are Akash and DADI. Akash actually looks very much like traditional compute services from the end user’s point of view – developers manage containers on Akash-deployed machines in a Kubernetes cluster that can be federated across machines on the Akash network. (Not coincidentally, Akash was founded by Greg Osuri, who is also a contributor to Federated Kubernetes.) If you’re curious to try Akash, they recently launched a testnet.

Two projects working on serverless platforms are Ankr and DCP.

Oh, the devices you’ll go!

The thing that distributed serverless compute projects can do that feels unique to cryptocurrency-based distributed computing networks is that they can run code on strangers’ phones and laptops because they don’t need to persist the compute environment beyond running one small script at a time.

The idea here is that these projects can pool together all of the unused end-user CPU to form a giant supercomputer that is cheaper than what is available on the cloud compute market today.

[Side tangent on pricing: The main argument here is that distributed networks will be cheaper because they do not have to pay for physical space and the hardware capex cost has already been committed. However, as Mario @ Placeholder pointed out to me, cloud compute pricing is already racing to the bottom and if distributed services come along and undercut the main players, cloud providers can presumably come all the way down to just above maintenance cost and stay competitive.]

I am very excited about projects that can provide access to high power compute environments by pooling together available CPU on end-user devices.

There are three big challenges with running code on end-user devices. The first is convincing enough individuals to participate, which we covered in our previous post.

The second is that end-user devices are relatively low power. To counter this, we are seeing projects building in parallelization to run code across multiple machines at once. Ankr leaves it up to the user to package their code into chunks and submit them separately to the network, where they will be assigned by a job scheduler to different machines. DCP auto-magically distributes an application’s subtasks across machines in the form of JavaScript objects that execute in web workers (DCP also cleverly uses WebGL to tap into the GPUs on end-user devices for an additional boost).
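To make the chunk-and-schedule idea concrete, here is a minimal Python sketch. It is not Ankr's or DCP's actual API: the function names, the device names, and the round-robin policy are all assumptions made for illustration, and a local thread pool stands in for the remote machines.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, chunk_size):
    """Package a large job into independent chunks of at most chunk_size items."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def assign_round_robin(num_chunks, machines):
    """Hypothetical scheduler policy: chunk k goes to machine k mod len(machines)."""
    return {k: machines[k % len(machines)] for k in range(num_chunks)}


def sum_of_squares(chunk):
    """Example task that each machine runs on its own chunk."""
    return sum(x * x for x in chunk)


def run_job(items, chunk_size, machines, task):
    """Run task on every chunk in parallel; return (assignment, partial results)."""
    chunks = split_into_chunks(items, chunk_size)
    assignment = assign_round_robin(len(chunks), machines)
    # A local thread pool stands in for the remote machines in this sketch.
    with ThreadPoolExecutor(max_workers=len(machines)) as pool:
        partials = list(pool.map(task, chunks))
    return assignment, partials


if __name__ == "__main__":
    machines = ["laptop-a", "phone-b", "laptop-c"]  # made-up device names
    assignment, partials = run_job(list(range(10)), 4, machines, sum_of_squares)
    print(assignment)     # which machine each chunk was assigned to
    print(sum(partials))  # the submitter combines the partial results
```

The point of the sketch is only that each chunk must be runnable on its own, so that a low-power device can finish its share without knowing about the rest of the job.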

The third challenge is that end-user devices are untrusted. There has been a lot of recent momentum in utilizing SGX, a trusted hardware environment built into Intel chips, even since our last post in June. Since then, Enigma released a testnet utilizing SGX for compute, Golem released Graphene-ng to help developers write SGX-enabled code, and Oasis Labs raised $45M led by a16z’s new crypto fund to build SGX-enabled distributed compute. The top 3 laptop makers (HP, Lenovo and Dell) support SGX. MacBooks have SGX-enabled chips, but the BIOS hasn’t been configured to expose that functionality up to the operating system. When Apple adds SGX, the top 4 global laptop brands will all have built-in support for SGX-enabled computing. I am a big supporter of the SGX approach because it is fairly secure and accessible on consumer laptops.

Besides SGX, another way distributed compute protocols can verify computations is with dispute resolution. Truebit is one of the compute projects with a dispute resolution protocol, which they call a “verification game”. In it, a verifier stakes TRU tokens to challenge the result of a computation. In Truebit’s dispute resolution game, the solver’s state is hashed at each time step of running the program (actually, any given instruction might not be executable within Ethereum’s gas limit, so Truebit breaks down each instruction into 12 substeps). Then the verifier queries those hashed states using binary search to find the exact instruction at which things went awry. The disputed step or substep is then run on Ethereum for the final outcome. Whichever side is wrong loses their staked tokens, which are paid out to the winning side.
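The binary search at the heart of the verification game can be sketched in a few lines of Python. This is a toy illustration and not Truebit's implementation: the step functions, the use of SHA-256, and all of the names are assumptions, and it leaves out the 12 substeps, the staking, and the on-chain interaction. The arbitrate function simply stands in for re-running the one disputed step on Ethereum.

```python
import hashlib


def state_hash(state):
    """Commit to a state by hashing it (SHA-256 is an arbitrary choice here)."""
    return hashlib.sha256(repr(state).encode()).hexdigest()


def run_trace(step_fn, initial_state, num_steps):
    """Return the state after every step; trace[0] is the initial state."""
    trace = [initial_state]
    for i in range(num_steps):
        trace.append(step_fn(i, trace[-1]))
    return trace


def find_disputed_step(solver_hashes, verifier_hashes):
    """Binary search for a step i where both sides agree on the state
    before step i but disagree on the state after it."""
    lo, hi = 0, len(solver_hashes) - 1
    if solver_hashes[lo] != verifier_hashes[lo]:
        raise ValueError("parties disagree on the initial state")
    if solver_hashes[hi] == verifier_hashes[hi]:
        raise ValueError("final states match, nothing to dispute")
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if solver_hashes[mid] == verifier_hashes[mid]:
            lo = mid  # still in agreement here, so the fault is later
        else:
            hi = mid  # already diverged here, so the fault is earlier
    return lo


def arbitrate(step_fn, step, pre_state, solver_post_hash):
    """Stand-in for the on-chain re-execution of the single disputed step."""
    honest_post_hash = state_hash(step_fn(step, pre_state))
    return "solver" if honest_post_hash == solver_post_hash else "verifier"


def honest_step(i, state):
    """Made-up program step used only for this demo."""
    return (state * 31 + i) % 1_000_003


def faulty_step(i, state):
    """Hypothetical faulty solver that corrupts the result of step 5."""
    result = honest_step(i, state)
    return result + 1 if i == 5 else result


if __name__ == "__main__":
    solver = run_trace(faulty_step, 7, 16)
    verifier = run_trace(honest_step, 7, 16)
    step = find_disputed_step([state_hash(s) for s in solver],
                              [state_hash(s) for s in verifier])
    winner = arbitrate(honest_step, step, solver[step], state_hash(solver[step + 1]))
    print(step, winner)
```

Because the search halves the range each round, a trace of n steps needs only about log2(n) hash comparisons before a single step is handed to the arbiter, which is what keeps the on-chain work small.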

Where on the stack does compute fit?

One open question is whether compute services will end up being a layer 1 or layer 2 solution. That is: will the next major blockchain include compute as a built-in service, or will compute always be run off-chain?

Compute is done off-chain now because the predominant blockchains available for use are either Bitcoin, which has a limited scripting language, or Ethereum, where compute is expensive and slow. There could very well be a future in which a layer 1 blockchain is able to bake in compute in a way that doesn’t require every node in the network to run the same computations, which would make it cheaper and faster. Perlin is one project attempting to build this, though even in Perlin, compute services are implemented as a side chain of the main Perlin base chain.

Most projects are either building side chains to existing blockchains, or completely off-chain networks that are independent from existing base chains. Render is one example of the first approach – Render is implemented as an Ethereum smart contract that interfaces with the Render network. Akash is an example of the latter: it is a separate network entirely.

I tend to like light, horizontal protocols that can be layered on one another rather than a super-protocol blockchain that can do everything. That is how the internet works now: small protocols that layer on top of one another (SMTP > STARTTLS > TCP > IP). This allows for reusable modules (both QUIC and DNS can use UDP without there needing to be changes to UDP to support that) and the ability to easily swap out and upgrade layers (HTTP can be swapped with SPDY or upgraded from HTTP/1.1 to HTTP/2 without making changes to the layers below it).

Geographical Market

One last, potentially very smart thing we are seeing is projects focusing their approach on one geographical market. DCP, for example, is starting by providing compute to Canadian universities and labs (though through the process they have picked up a lot of interest from outside of Canada as well). Ankr, meanwhile, is putting extra effort into reaching the Chinese computing market, where demand for compute is skyrocketing (Aliyun’s revenue grew 104% year over year) and AWS doesn’t have much of a stronghold (though Aliyun does). We think these targeted approaches could play out well.

Conclusion

It is still early days and there are a lot of unknowns, but we are optimistic about what could be. If you are building interesting projects in this space, we’d love to hear from you. Reach out: I’m [email protected].
