Compute API - Description

The Compute API is a library for working with DCP, the Distributed Compute Protocol, to perform arbitrary computations on the Distributed Computer.

This is a terse API, focused initially on the Scientific Computing audience. It exposes DCP capabilities in a way that lets developers easily build applications for DCP which are forward-compatible with future versions of the Compute API and with changes, optimizations, and so on to the underlying DCP.

Record of Issue

Date          Author        Change
Nov 09 2018   Wes Garland   Initial Release, tracking spec Oct 31 2018

Intended Audience

General Public, unrestricted

Overview

This API focuses on generators, both ad-hoc and from published applications, built around some kind of iteration over a common function, and events. It is implemented in the compute.js module of the dcp package.

Definitions

API

Compute module

The compute module is the holding module for classes and configuration options (especially default options) related to this API.

Generators

Most computations on the Distributed Computer operate by mapping an input set to an output set by applying a function to each element in the input set. Input sets can be arbitrary collections of data, but are frequently easily-described number ranges or distributions.

Generators associate functions with input sets, and enable their mapping and distribution on the Distributed Computer. We can create ad-hoc generators with the compute.run and compute.for APIs, or we can create generators using functions defined in applications via the Application.prototype.run and Application.prototype.for methods.

compute.run

This function returns a generator handle (an object which corresponds to a generator), and accepts one or two arguments, depending on form.

note - When work is a function, it is turned into a string with Function.prototype.toString before being transmitted to the scheduler. This means that work cannot close over local variables, as these local variables will not be defined in the miner’s worker thread. When work is a string, it is evaluated in the worker thread, and is expected to evaluate to a single function.
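The effect of this note can be reproduced locally without the Distributed Computer. The following is a minimal sketch, not the scheduler's actual transport: `transmit` is a hypothetical helper that serializes a function with Function.prototype.toString and re-evaluates the source in a scope that does not share the caller's local variables.

```javascript
// Hypothetical stand-in for what happens to a work function in transit:
// it is serialized to source text, then evaluated somewhere that does not
// share the caller's scope.
function transmit(work) {
  const source = Function.prototype.toString.call(work)
  // Evaluate the source in a fresh function scope; only globals are visible.
  return new Function(`return (${source})`)()
}

function makeWork() {
  const scale = 10          // local variable, not part of the serialized source
  return (i) => i * scale   // closes over `scale`
}

// A self-contained work function survives the round trip.
const selfContained = transmit((i) => i * 10)
console.log(selfContained(3)) // 30

// A work function that closes over a local variable does not.
const closing = transmit(makeWork())
let closureError = null
try {
  closing(3)
} catch (e) {
  closureError = e          // `scale` is not defined after the round trip
}
console.log(closureError.name) // ReferenceError
```

Passing any needed values through the input set, or defining them inside the work function itself, avoids the problem.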

compute.for

This function returns an object which corresponds to a generator, and accepts two or more arguments, depending on form. The final argument, work, is scheduled for execution with one slice of the input set, for each element in the set. It is expected that work could be executed multiple times in the same worker thread, so care should be taken not to write functions which depend on uninitialized global state and so on.

Every form of this function returns a generator which, when executed, causes work to run n times and resolve the returned promise with an array of values returned by work, indexed by slice number (position within the set).

When the input set is composed of unique primitive values, the array which resolves the promise will also have an own-property entries method which returns an array, indexed by slice number, which contains a {key: value} object, where key is the input to work, and value is the return value of work for that input. This array will be compatible with functions accepting the output of Object.entries() as their input.

The for method executes a function, work, in the worker by iterating over an n-dimensional series of values. Each iteration is run as a separate slice, and each receives a single (n-dimensional) value. This is an overloaded function, accepting iteration information in a variety of ways. When work returns, the return value is treated as result, which is eventually used as part of the array or object which resolves the returned promise.

note - When work is a function, it is turned into a string with Function.prototype.toString before being transmitted to the scheduler. This means that work cannot close over local variables, as these local variables will not be defined in the miner’s worker thread. When work is a string, it is evaluated in the worker thread, and is expected to evaluate to a single function.

The promise is resolved following the same rules as in form 1, except the arrays/objects nest with each range object. (See examples for more clarity)
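The nesting rule can be illustrated with a local simulation. This is only a sketch under stated assumptions: `expand` and `mapNested` are hypothetical helpers, ranges are treated as inclusive with a step of 1, and everything runs in-process rather than as slices on the Distributed Computer.

```javascript
// Expand {start, end} (inclusive, step assumed to be 1) into an array.
function expand(range) {
  const values = []
  for (let v = range.start; v <= range.end; v++) values.push(v)
  return values
}

// Recursively add one level of array nesting per range object; once every
// range has contributed a value, call work with the collected arguments.
function mapNested(ranges, work, args = []) {
  if (ranges.length === 0) return work(...args)
  const [first, ...rest] = ranges
  return expand(first).map((v) => mapNested(rest, work, [...args, v]))
}

const nested = mapNested(
  [{start: 1, end: 2}, {start: 3, end: 5}],
  (i, j) => i * j
)
console.log(JSON.stringify(nested)) // [[3,4,5],[6,8,10]]
```

The outer array follows the first range object and each inner array follows the second, so nested[1][2] is the result for i = 2, j = 5.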

Output Sets

Every form of the for() function maps an input set onto an output set by means of a work function.

The promise is resolved with an Array-like object which represents this mapping, indexed by slice number. Additional, non-enumerable methods will be available on this object to make marrying the two sets together more straightforward. These methods are based on methods of Object.
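One way to picture such an object is an ordinary array with extra non-enumerable own properties. The sketch below is an assumption about shape only, modeled on the output shown in example program 1; `makeOutputSet` is a hypothetical helper and not part of the compute module.

```javascript
// Build an array of outputs, indexed by slice number, and attach
// non-enumerable helper methods that relate outputs back to inputs.
function makeOutputSet(inputs, outputs) {
  const results = outputs.slice()
  const define = (name, fn) =>
    Object.defineProperty(results, name, { value: fn, enumerable: false })

  define('keys', () => inputs.map(String))
  define('values', () => outputs.slice())
  define('entries', () => inputs.map((key, slice) => [String(key), outputs[slice]]))
  define('fromEntries', () => Object.fromEntries(results.entries()))
  return results
}

const results = makeOutputSet([1, 2, 3], [10, 20, 30])
console.log(JSON.stringify(results))               // [10,20,30]
console.log(JSON.stringify(results.entries()))     // [["1",10],["2",20],["3",30]]
console.log(JSON.stringify(results.fromEntries())) // {"1":10,"2":20,"3":30}
console.log(Object.keys(results))                  // [ '0', '1', '2' ]
```

Because the helpers are non-enumerable, the object still iterates and serializes like a plain array of results.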

Range Objects

Range objects are vanilla ES objects used to describe value range sets for use by compute.for(). Calculations made to derive the set of numbers in a range are carried out with BigNumber (i.e. arbitrary-precision) support. The numbers Infinity and -Infinity are not supported, and we do not differentiate between +0 and -0.

Describing value range sets, rather than simply enumerating ranges, is important because we need to be able to eventually schedule very large sets without the overhead of transmitting them to the scheduler, storing them, and so on.

Range Objects are plain JavaScript objects with the following properties:

When end - start is not an exact multiple of step, the generator will behave as though end were the nearest number in the range which is an even multiple of step, offset by start. For example, the highest number generated in the range object {start: 0, end: 1000, step: 3} would be 999.
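The rule above can be sketched as a lazy generator, which also illustrates why describing a range is cheaper than enumerating it. This is a simplified illustration: `rangeValues` is a hypothetical helper, it uses ordinary Number arithmetic rather than BigNumber, it assumes a default step of 1, and it only handles positive steps.

```javascript
// Lazily yield the values described by a range object.
function* rangeValues({ start, end, step = 1 }) {
  if (!(step > 0)) throw new RangeError('this sketch only handles step > 0')
  // Number of values, rounding down when (end - start) is not a multiple of step.
  const count = Math.floor((end - start) / step) + 1
  for (let n = 0; n < count; n++) yield start + n * step
}

const multiplesOfThree = [...rangeValues({start: 0, end: 1000, step: 3})]
console.log(multiplesOfThree[multiplesOfThree.length - 1]) // 999
console.log([...rangeValues({start: 10, end: 13, step: 2})]) // [ 10, 12 ]
```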

Distribution Objects

Distribution objects are used with compute.for, much like range objects. They are created by methods of the set exports of the stats module, and are used to describe input sets which follow common distributions used in the field of statistics. The following methods are exported:

let stats = require('stats')
let g = compute.for(stats.set.poisson(10, 0.1, 100), (i) => i)

Generator Handles

Generator handles are objects which correspond to generators. They are created by some exports of the compute module, such as compute.run and compute.for.

Properties

Present once generator has been deployed

Methods

Events

The generator handle is an EventEmitter (see EventEmitters, below), with the following events defined:

Worker Environment

Work functions (i.e. the final argument to compute.for()) are generally executed in worker threads inside miners. These are the functions which map the input set to the output set.

Each work function receives as its input one element in the input set. Multi-dimensional elements, such as those defined in compute.for() form 3, will be passed as multiple arguments to the function. The function returns the corresponding output set element, and must emit progress events.
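The calling convention described above can be imitated locally. The sketch below is not the real worker: `runSlice` is a hypothetical helper, it assumes multi-dimensional elements arrive as arrays, and its `progress` stand-in simply counts calls and returns true.

```javascript
// Run one slice: rebuild the work function from source, spread a
// multi-dimensional element into separate arguments, and insist that
// the work function reported progress at least once.
function runSlice(workSource, element) {
  let progressCalls = 0
  // Hypothetical progress(): returns true so `progress(1) && value` yields value.
  const progress = () => { progressCalls += 1; return true }
  // Evaluate the source with `progress` in scope, as a worker global would be.
  const work = new Function('progress', `return (${workSource})`)(progress)
  const args = Array.isArray(element) ? element : [element]
  const result = work(...args)
  if (progressCalls === 0) throw new Error('work function did not report progress')
  return result
}

console.log(runSlice('(i, j) => (progress(1), i * j)', [2, 5])) // 10
console.log(runSlice('(i) => progress(1) && i / 10', 123))      // 12.3
```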

The execution environment is based on CommonJS, providing access to the familiar require() function, user-defined modules, and modules in packages deployed on Distributed Compute Labs’ module server.

Global Symbols

Applications

Applications are essentially groups of work functions that live on the scheduler which, when combined with an input set, produce generators. Running generators derived from applications also unlocks extra functionality within the miners, such as the ability to have categories (‘hashTags’) which allow miners to choose which types of workloads they are interested in mining. Each application has one or more functions, which are referenced symbolically instead of being explicitly specified in the generator.

Launching an application on the network is a two-step process. The developer creates and names the application and its functions and submits it to the Distributed Computer. Once the application has been reviewed by Distributed Compute Labs’ staff, it will be available for clients to create generators. Applications can only be updated by publishing new versions, and only submission requests signed by the original submitter’s key will be accepted.

Application Handles

Application handles are objects which correspond to applications. They are created by instantiating the compute.Application class.

Constructor

The Application constructor is an overloaded function which is used both for defining and publishing applications and for referencing applications on the scheduler. An instance of Application which is used to define and publish can also be used to reference that application, but that will generally not happen in the real world, as the two events will be separated by significant amounts of time, assuming the application is even approved in the first place.

new Application() - form 1 - preparing to publish

This form of the constructor is used for creating/publishing applications. It accepts three arguments: applicationName, version, and publishWallet.

new Application() - form 2 - published application

This form of the constructor is used to access functions which have already been published. It accepts two arguments: applicationName, and version.

Methods

For new applications
For deployed applications

Properties

These properties are optional, public, and probably displayed or used during mining:

Cost Profiles

Cost profiles are used to describe the fee structure that the user is willing to pay for a specific generator’s output. We define three fixed value profiles for use by DCP users; other profiles can be specified as cost profile objects. The fixed value profiles are

Possible Future Growth - it might be interesting to add compute.dynamicMarketValue at some point. This would cause the scheduler to re-compute the market value for that generator every few slices.

Cost profile objects have the following properties:

Using a cost profile object with both maxPaymentPerSlice and maxTotalPayment undefined is an error. Interfaces can treat this condition the same as ENOFUNDS, since we know there are no bank accounts with infinite funds.

Any interface which accepts a cost profile object (e.g. exec()) must also handle literal numbers, instances of Number, and BigNums. When a number is used, it is equivalent to an object which specifies maxTotalPayment. i.e., .exec(123) is the same as .exec({maxTotalPayment: 123})
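The two rules above (a bare number means maxTotalPayment, and a profile with no limits is an error) could be handled by a small normalizing step. This is a sketch only: `normalizeCostProfile` is a hypothetical helper, attaching the code as an `ENOFUNDS` string on the error is an assumption, and BigNum handling is left out.

```javascript
// Turn anything an interface such as exec() may receive into a
// cost profile object, rejecting profiles that set no limit at all.
function normalizeCostProfile(profile) {
  // Literal numbers and Number instances are shorthand for maxTotalPayment.
  if (typeof profile === 'number' || profile instanceof Number) {
    return { maxTotalPayment: Number(profile) }
  }
  if (profile === null || typeof profile !== 'object') {
    throw new TypeError('cost profile must be a number or an object')
  }
  if (profile.maxPaymentPerSlice === undefined && profile.maxTotalPayment === undefined) {
    const err = new Error('cost profile sets no payment limit')
    err.code = 'ENOFUNDS' // assumption: reported the same way as a lack of funds
    throw err
  }
  return profile
}

console.log(normalizeCostProfile(123))                       // { maxTotalPayment: 123 }
console.log(normalizeCostProfile({ maxPaymentPerSlice: 2 })) // { maxPaymentPerSlice: 2 }
```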

Requirements Objects

Requirements objects are used to inform the scheduler about specific execution requirements, which are in turn used as part of the capabilities exchange portion of the scheduler-to-miner interaction.

This object will grow over time, and needs to be organized in a way that we never need to move properties; this means that good top-level namespacing will be critical.

let requirements = {
  machine: {
    gpu: true
  },
  engine: {
    es7: true,
    spidermonkey: true
  }
}

Boolean requirements are interpreted as such:

In the example above, only workers with a GPU, running ES7 on SpiderMonkey would match. In the example below, any worker which can interpret ES7 but is not SpiderMonkey will match:

let requirements = {
  engine: {
    es7: true,
    spidermonkey: false
  }
}
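The matching behaviour in the two examples can be sketched as a recursive comparison. This illustration rests on assumptions: `matches` is a hypothetical helper, worker capabilities are assumed to use the same nested shape as requirements, true means the capability must be present, false means it must be absent, and any property left out places no constraint.

```javascript
// Does a worker's capabilities object satisfy a requirements object?
function matches(requirements, capabilities = {}) {
  return Object.entries(requirements).every(([name, wanted]) => {
    const actual = capabilities[name]
    if (wanted !== null && typeof wanted === 'object') {
      return matches(wanted, actual || {}) // descend into a namespace
    }
    if (wanted === true) return actual === true
    if (wanted === false) return !actual   // an absent capability counts as "not present"
    return true                            // anything else: no constraint (assumption)
  })
}

const gpuSpiderMonkey = { machine: { gpu: true }, engine: { es7: true, spidermonkey: true } }
const otherEngine = { machine: {}, engine: { es7: true } }

const first = { machine: { gpu: true }, engine: { es7: true, spidermonkey: true } }
const second = { engine: { es7: true, spidermonkey: false } }

console.log(matches(first, gpuSpiderMonkey))  // true
console.log(matches(first, otherEngine))      // false
console.log(matches(second, gpuSpiderMonkey)) // false
console.log(matches(second, otherEngine))     // true
```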

EventEmitters

All EventEmitters defined in this API will be bound (i.e. have this set) to the relevant generator when the event handler is invoked, unless the event handler has previously been bound to something else with bind or an arrow function.
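Node's own EventEmitter behaves the same way, which makes the binding rule easy to try out. In this sketch `GeneratorHandleSketch` is a hypothetical stand-in for a generator handle that models only an `id` property; the "accepted" event name is borrowed from the example programs.

```javascript
const { EventEmitter } = require('events')

// Hypothetical stand-in for a generator handle; only `id` is modeled.
class GeneratorHandleSketch extends EventEmitter {
  constructor(id) {
    super()
    this.id = id
  }
}

const handle = new GeneratorHandleSketch('gen-1')
const other = { id: 'other' }
const seen = []

// Ordinary function: `this` is the emitter, i.e. the generator handle.
handle.on('accepted', function () { seen.push(['function', this.id]) })
// Pre-bound function: the earlier bind() wins.
handle.on('accepted', function () { seen.push(['bound', this.id]) }.bind(other))
// Arrow function: `this` comes from the enclosing scope, never the handle.
handle.on('accepted', () => { seen.push(['arrow', this === handle]) })

handle.emit('accepted')
console.log(seen)
// [ [ 'function', 'gen-1' ], [ 'bound', 'other' ], [ 'arrow', false ] ]
```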

The EventEmitters will have the following methods:

Default Values

The compute module will export an object named default, which is used to specify default values. This object is always exported, even if it has no properties. The following properties are supported:

Modules

The work specified by the generator handle exec and application handle publish methods can depend on modules being available in the worker thread. This will be handled by automatically publishing all of the modules which are listed as relative dependencies of the generator. We can assume that dependencies loaded from the require.path are part of pre-published packages.
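The distinction between the two kinds of dependency could be drawn from the module identifier alone. The following is a sketch under that assumption: `splitDependencies` is a hypothetical helper, and treating identifiers that begin with "./" or "../" as relative follows the usual CommonJS convention.

```javascript
// Separate dependencies that would need publishing alongside the generator
// from those assumed to live in pre-published packages.
function splitDependencies(deps) {
  const isRelative = (id) => id.startsWith('./') || id.startsWith('../')
  return {
    toPublish: deps.filter(isRelative),
    prePublished: deps.filter((id) => !isRelative(id)),
  }
}

console.log(splitDependencies(['./ffmpeg', 'core/serializer']))
// { toPublish: [ './ffmpeg' ], prePublished: [ 'core/serializer' ] }
```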

Modules in Clients

Keystores

Unlocking keystores is handled by the Protocol API, in protocol.unlock(keystore, passphrase). When the passphrase is undefined or incorrect, the protocol will solicit a passphrase from the user. In the browser, this is done via the keychain API; on Node applications, this is done by prompting for a password from the console in a blocking read with stty echo off. If the Node application detects that the console is not a terminal (see isatty(3)), it will throw an exception instead of prompting for a password.
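For the Node case, the decision of whether to prompt can be sketched with the `isTTY` property that Node sets on terminal streams. `passphraseSource` is a hypothetical helper that only reports what would happen; it does not read a passphrase or unlock anything, and it does not model the retry after an incorrect passphrase.

```javascript
// Decide how a passphrase would be obtained: use the one supplied,
// prompt on a terminal, or refuse when no terminal is attached.
function passphraseSource(passphrase, input = process.stdin) {
  if (passphrase !== undefined) return 'supplied'
  if (input.isTTY) return 'prompt'
  throw new Error('cannot prompt for a passphrase: input is not a terminal')
}

console.log(passphraseSource('hunter2', { isTTY: false })) // supplied
console.log(passphraseSource(undefined, { isTTY: true }))  // prompt
```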

Example Programs

1. compute.for() form 2b

const fs = require('fs')
const compute = require('dcp/compute')
const protocol = require('dcp/protocol')
const paymentWallet = protocol.unlock(fs.openFileSync('myKey.keystore'))
let g = compute.for(1, 3, function (i) {
  progress('100%')
  return i*10
})
g.setPrice(compute.safeMarketPrice)
g.setPaymentWallet(paymentWallet)
let results = await g.exec()
console.log('results:    ', results)
console.log('entries:    ', results.entries())
console.log('fromEntries:', results.fromEntries())
console.log('keys:       ', results.keys())
console.log('values:     ', results.values())
console.log('key(2):     ', results.key(2))

Output:

results:     [ 10, 20, 30 ]
entries:     [ [ '1', 10 ], [ '2', 20 ], [ '3', 30 ] ]
fromEntries: { '1': 10, '2': 20, '3': 30 }
keys:        [ '1', '2', '3' ]
values:      [ 10, 20, 30 ]
key(2):      20

2. compute.for() form 1, step overflow

const paymentAccount = protocol.unlock(fs.openFileSync('myKey.keystore'))
let g = compute.for({start: 10, end: 13, step: 2}, (i) => progress(1) && i)
let results = await g.exec(compute.marketValue, paymentAccount)
console.log(results)

Output: [ 10, 12 ]

3. compute.for() form 1 with group

const paymentAccount = protocol.unlock(fs.openFileSync('myKey.keystore'))
let g = compute.for({start: 10, end: 13, group: 2}, (i) => progress(1) && i[1]-i[0])
let results = await g.exec(compute.marketValue, paymentAccount)
console.log(results)

Output: [ 1, 1 ]

4. compute.for() form 3

const paymentAccount = protocol.unlock(fs.openFileSync('myKey.keystore'))
let g = compute.for([{start: 1, end: 2}, {start: 3, end: 5}], (i,j) => (progress(1), i*j))
let results = await g.exec(compute.marketValue, paymentAccount)
console.log(results)

Output: [[3, 4, 5], [6, 8, 10]]

5. compute.for(), form 3

const paymentAccount = protocol.unlock(fs.openFileSync('myKey.keystore'))
let g = compute.for([{start: 1, end: 2000}, {start: 3, end: 5}], function(i,j) {
  let best=Infinity;
  for (let x=0; x < i; x++) {
     best = Math.min(best, require('./refiner').refine(x, j))
     progress(x/i)
  }
  return best
})
let results = await g.exec(compute.marketValue, paymentAccount)
console.log(results)

Output: [[[1, 3, 3], [1, 4, 4], [1, 5, 5]], [[2, 3, 6], [2, 4, 8], [2, 5, 10]]]

6. compute.for() form 4, iterable object (Array)

const paymentAccount = protocol.unlock(fs.openFileSync('myKey.keystore'))
let g = compute.for([123,456], function(i) { 
  progress(1)
  return i
})
let results = await g.exec(compute.marketValue, paymentAccount)
console.log(results)

Output: [ 123, 456 ]

7. Minimal Code to Run 0-Cost Generator

const paymentAccount = protocol.unlock(fs.openFileSync('myKey.keystore'))
let results = await compute
              .for([123,456], (i) => progress(1) && i/10)
              .exec(0, paymentAccount)
console.log(results)

Output: [ 12.3, 45.6 ]

8. compute.for(), form 4, using ES6 function* generator

function* fruitList() {
  yield "banana"
  yield "orange"
  yield "apple"
}

let g = compute.for(fruitList(), (fruit) => progress(1) && fruit + 's are yummy!')
g.requirements = { machine: { gpu: true } }
g.paymentAccount = protocol.unlock(fs.openFileSync('myKey.keystore'))
let results = await g.exec()
console.log(results.join('\n'))

Output:

bananas are yummy!
oranges are yummy!
apples are yummy!

9. Resuming a Generator

const fs = require('fs')
const process = require('process')
let g = compute.for(0, 1000, (i) => progress(1) && i)
g.on("accepted", function (ev) {
  fs.writeFileSync('gId.txt', this.id, 'ascii')
  process.exit(0)
})
let results = await g.exec(compute.marketValue, protocol.unlock(fs.openFileSync('myKey.keystore')))
console.log(results.reduce((a, val) => a + val)) /* probably not reached */

(new program - gets run later)

let gId = fs.readFileSync('gId.txt', 'ascii')
let g = new compute.resume(gId)
let results = await g.exec()
console.log(results.reduce((a, val) => a + val))

Output: 499500

10. Altering Price Parameters

let g = compute.for(2, 1000, (i) => progress(1) && i)
setInterval(function() {
  g.exec(compute.safeMarketValue) /* update bid price every five minutes */
}, 5 * 60000)
let results = await g.exec(compute.safeMarketValue, protocol.unlock(fs.openFileSync('myKey.keystore')))
console.log(results.reduce((a, val) => a + val)) /* probably not reached */

Output: 499499

11. Publish Application

let app = new compute.Application("videoProcessor", "1.0.0", identificationWallet) 
app.requirePath.push("core")
app.defineFunction("enhance", ["./ffmpeg", "core/serializer"], enhanceFunction)
app.defineFunction("vingette", ["./ffmpeg", "core/serializer"], vingetteFunction)
let stabilize = app.defineFunction("stabilize", ["./ffmpeg", "core/serializer"], stabilizeFunction)
stabilize.requires({machine: {gpu: true}})
let appRequestId = await app.publish()

12. Use function in Application

let vp = new compute.Application("videoProcessor", "^1.0.0")
let g = vp.for(frames, "stabilize")
let results = await g.exec(compute.marketPrice, paymentWallet)