SPLAY manual

Introduction

SPLAY provides a deployment system, an execution environment and libraries to easily develop distributed applications. SPLAY applications can also be run locally, without the deployment system, while still using the SPLAY libraries.

This manual presents the basics of using the SPLAY language and libraries. For any questions, comments or suggestions for improvement, please contact us.

Lua has no threads, but it has a coroutine system. To get a mostly transparent thread system, we added calls to the scheduler when waiting on network IO operations.

Lua basics

We strongly recommend that users consult the Lua manual. Nonetheless, this part goes over the most important Lua specificities.

Tables

In Lua, tables are used both for numerically indexed arrays and for hash-indexed maps (and can be a mix of both).

Some functions rely on numerical indexing and others do not.
For example, the size operator (#) only counts from index 1 (numeric) to n (numeric). If there is a hole (a missing numerical index) in the sequence, the reported size is unreliable: it is typically the size before the hole, but Lua only guarantees that it is an index whose successor is missing.

Lua functions such as table.insert() and table.remove() manipulate numerically indexed tables without creating any holes, so if you only use these functions the reported size will always be correct.
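
As a minimal sketch of the behaviour described above, the following plain Lua snippet builds an array with table.insert() and table.remove() only, so the # operator stays reliable. The variable names are purely illustrative.

```lua
-- Build a numerically indexed table without creating holes.
local peers = {}
table.insert(peers, "alice")      -- appended at index 1
table.insert(peers, "bob")        -- appended at index 2
table.insert(peers, 1, "carol")   -- inserted at index 1, the others shift up

print(#peers)                     -- size is 3

-- table.remove() shifts the following elements down, leaving no hole.
local removed = table.remove(peers, 1)
print(removed, #peers)            -- removed is "carol", size is now 2

-- Hash-style keys are not counted by the size operator.
peers.owner = "dave"
print(#peers)                     -- size is still 2
```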

Two syntaxes exist for accessing a table by a key.
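
The two syntaxes are the dot form and the bracket form. A short sketch in plain Lua (the table content is only an example):

```lua
local node = { ip = "127.0.0.1", port = 20000 }

-- Dot syntax: the key is a fixed identifier.
print(node.ip)

-- Bracket syntax: the key is any expression (here a string literal).
print(node["ip"])

-- The bracket form is required when the key is computed at run time
-- or is not a valid identifier.
local key = "port"
print(node[key])
node["last-seen"] = 42
```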

Tables can contain anything, including functions.

Note that tables can be used for an object-oriented programming style. Keep in mind, however, that Lua is closer to prototype-based programming than to a class-based system. When a table function is called with ":" instead of ".", the first parameter given to the function is the table itself.
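
The difference between "." and ":" can be sketched as follows in plain Lua. The "counter" table is a made-up example.

```lua
local counter = { value = 0 }

-- A function stored in a table; "self" is an ordinary first parameter.
function counter.add(self, n)
  self.value = self.value + n
  return self.value
end

-- These two calls are equivalent: ":" passes the table as first argument.
counter.add(counter, 2)
counter:add(3)

print(counter.value) -- 5
```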

For advanced OOP/prototype-style programming and inheritance, metatables are needed (see the Lua manual).
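
One common way to get prototype-style inheritance with metatables is sketched below. This is a generic Lua idiom with illustrative names ("Animal", "Dog"), not something specific to SPLAY.

```lua
-- "Animal" acts as a prototype for the tables created by Animal.new().
local Animal = {}
Animal.__index = Animal

function Animal.new(name)
  -- Keys missing from the new table are looked up in Animal.
  return setmetatable({ name = name }, Animal)
end

function Animal:describe()
  return self.name .. " makes " .. self:sound()
end

function Animal:sound()
  return "no sound"
end

-- "Dog" inherits from Animal and overrides one method.
local Dog = setmetatable({}, { __index = Animal })
Dog.__index = Dog

function Dog.new(name)
  return setmetatable(Animal.new(name), Dog)
end

function Dog:sound()
  return "woof"
end

print(Animal.new("generic"):describe())
print(Dog.new("rex"):describe())
```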

Multiple return
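
A Lua function can return several values at once. The sketch below uses made-up helper names ("divmod", "safe_div") to show how the values are captured or discarded, and the common idiom of returning nil plus an error message on failure.

```lua
-- Return the quotient and the remainder of an integer division.
local function divmod(a, b)
  local q = math.floor(a / b)
  return q, a - q * b
end

local q, r = divmod(17, 5)   -- both values are captured
local only_q = divmod(17, 5) -- extra values are discarded

-- Wrapping the call in a table constructor collects every value.
local all = { divmod(17, 5) }

-- Common Lua idiom: return nil plus a message on failure.
local function safe_div(a, b)
  if b == 0 then return nil, "division by zero" end
  return a / b
end
local res, err = safe_div(1, 0)
print(q, r, #all, res, err)
```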

Functions as first-class objects
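
In Lua, functions are values: they can be stored in variables, passed as arguments and returned by other functions. A minimal sketch with illustrative names:

```lua
-- A function passed as an argument.
local function apply_twice(f, x)
  return f(f(x))
end

local double = function(x) return x * 2 end
print(apply_twice(double, 3)) -- 12

-- A function returning a function (closure): each counter keeps its own
-- private state.
local function make_counter()
  local n = 0
  return function()
    n = n + 1
    return n
  end
end

local next_id = make_counter()
next_id()
print(next_id()) -- 2
```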

Local execution vs. deployment over a testbed

This section describes how to code applications that will run without modifications both locally and in the deployment execution environment.

The first important piece of information is whether, and how, the application is running under a deployment. If it is deployed, a new variable is available in the environment: "job". "job" is a table and contains:

For more information about what a "node" is, consult the API section.

Just by checking for the presence of the "job" variable, an instance of the application can know whether it runs locally or under deployment. In the local case, the application has probably been launched directly with the Lua interpreter, and one can use additional command-line arguments.
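
A minimal sketch of that check in plain Lua. The helper name "is_deployed" and the meaning given to the first command-line argument (a port) are assumptions made for this example.

```lua
-- "job" is only defined when the application runs under deployment.
local function is_deployed()
  return type(job) == "table"
end

local port
if is_deployed() then
  print("running under deployment")
else
  -- Local run: fall back to a command-line argument, for example
  -- "lua app.lua 20000". Using arg[1] as a port is an assumption.
  port = tonumber(arg and arg[1]) or 20000
  print("running locally on port " .. port)
end
```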

You can also specialize the behavior of your application using the job information. For example, if the job instance receives a list of type "head", the rendez-vous node can be the first in the list. If the list type is "random", the rendez-vous node can be an external hard-coded node.
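
The rendez-vous selection described above could look like the sketch below. The field names "list_type" and "nodes", the helper "pick_rendezvous" and the fallback address are all assumptions made for illustration; check the actual content of "job" before relying on them.

```lua
-- Hypothetical external rendez-vous node, hard coded for the "random" case.
local FALLBACK = { ip = "192.0.2.1", port = 20000 }

-- "list_type" and "nodes" are assumed field names, not confirmed ones.
local function pick_rendezvous(j)
  if j and j.list_type == "head" and j.nodes and j.nodes[1] then
    return j.nodes[1]   -- first node of the list
  end
  return FALLBACK       -- "random" list, or local run without a job
end

local fake_job = {
  list_type = "head",
  nodes = { { ip = "10.0.0.1", port = 20001 }, { ip = "10.0.0.2", port = 20002 } },
}
local rdv = pick_rendezvous(fake_job)
print(rdv.ip, rdv.port)
```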

Given this information, it is very practical to have a single source code for all tests, whatever the deployment (or local) parameters.

See also splay.utils.generate_job() to automatically generate a fake "job" variable for local testing.

SPLAY API

Nodes

As a convention, a node is a Lua table with two keys: "ip" and "port".

All network-related functions that require ip and port parameters (sometimes only port) will directly accept a "node" (including LuaSocket when used through SPLAY).

So in most cases where you need (..., ip, port, ...) parameters, you can replace them (if you want) with (..., node, ...).
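
To illustrate the convention, here is a sketch of how a function can accept either form. The helpers "to_ip_port" and "describe_target" are hypothetical and are not part of the SPLAY API.

```lua
-- Accept either (ip, port) or a single node table { ip = ..., port = ... }.
local function to_ip_port(ip, port)
  if type(ip) == "table" then
    return ip.ip, ip.port
  end
  return ip, port
end

local function describe_target(ip, port)
  ip, port = to_ip_port(ip, port)
  return ip .. ":" .. port
end

local node = { ip = "10.0.0.1", port = 20000 }
print(describe_target("10.0.0.1", 20000))
print(describe_target(node))
```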

Upon deployment, a SPLAY application receives the list of other peers in "node" format. In most applications, ip and port can be completely hidden by using "nodes".

splay.base

splay.base contains the minimum needed for a SPLAY application. Each SPLAY application begins with:

After requiring splay.base, some global variables will be set:

One can use them directly without having to "require" them again (but it doesn't hurt).

splay.events

The event system is the core of the SPLAY runtime environment. It is used by the thread system, network IO, ... It basically acts as a global scheduler.

loop([func]) Start the main application loop.

If you pass an argument, it will be forwarded to events.thread(). By convention, the (anonymous) function given to loop() is considered as the "main" of the application.

You can pass the same parameters as thread().
thread(func) Add a new thread. The function given as a parameter will be run as a new thread. A function that takes arguments needs to be wrapped in an anonymous function. This function can also take an array of functions as parameter.

This function returns a thread object (coroutine).
periodic(func, time[, force]) Periodically (every "time" seconds) call the function "func". The function is only called if the previous call has ended (setting "force" to true bypasses this check).
dead(thread) Check if a thread is dead.
sleep(time) Sleep "time" seconds.
fire(name[, arg]*) Fire a new event called "name" with optional arguments.
wait(name[, timeout]) Wait for an event named "name".

  • If no timeout value is given:
    Returns the value(s) sent by fire (or nil if there is none).
  • If a timeout value is given:
    • If wait has not timed out:
      Returns true, followed by the value(s) sent by fire (or nil if there is none).
    • If wait has timed out:
      Returns false, "timeout"
yield() Lua has no "true" threading support but uses a coroutine (cooperative) approach. To support SPLAY threads, we have wrapped network IO so that it calls the events scheduler. Sometimes a function has to do a long computation without doing any IO. In that case, it is recommended to split the work of that function and regularly call yield() to give other threads a chance to run.

Always returns true.
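
The cooperative model behind yield() can be sketched with plain Lua coroutines. This is not the SPLAY scheduler: the round-robin loop below is a simplified stand-in, and coroutine.yield() plays the role that yield() plays in a SPLAY thread.

```lua
-- Minimal round-robin scheduler over plain Lua coroutines.
local trace = {}

local function worker(name, steps)
  return coroutine.create(function()
    for i = 1, steps do
      trace[#trace + 1] = name .. i -- one slice of the "long computation"
      coroutine.yield()             -- give other threads a chance to run
    end
  end)
end

local threads = { worker("a", 3), worker("b", 2) }

-- Resume each live coroutine in turn until all of them are dead.
while #threads > 0 do
  local alive = {}
  for _, co in ipairs(threads) do
    assert(coroutine.resume(co))
    if coroutine.status(co) ~= "dead" then
      alive[#alive + 1] = co
    end
  end
  threads = alive
end

print(table.concat(trace, " "))
```

Without the call to coroutine.yield(), the first worker would run all of its steps before the second one could start.
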
lock() Return a new lock object.
semaphore(size) Like a lock, but "size" threads can be granted the right to pass through.
synchronize(func[, timeout]) Synchronizes access to a function "func".
stats() Returns a string containing some stats about the scheduler.

splay.rpc and splay.urpc

RPC (Remote Procedure Call) makes it easy to call a function on a remote host and get its result locally. RPCs are often used in SPLAY applications as they allow very clean, concise and readable code.

SPLAY provides two types of RPC: using TCP (called RPC) and using UDP (called URPC).

Both RPC systems can be used at the same time (and on the same port).

TCP RPC characteristics:

UDP RPC characteristics:

The API is the same for both types of RPC:

You can not only call functions but also get the value of a variable, a variable in a table, ... You can remotely access everything except function pointers.

When doing function calls, the separate syntax, i.e. a table with the function name followed by each argument, is more efficient (less overhead on the remote node, faster calls), especially when the arguments are large.

server(port[, max]) Run a new RPC server on port "port". "max" (TCP only) is the maximum number of RPCs (sockets) served at the same time (default = unlimited).
a_call(ip, port, func_a, timeout) "func_a" is an array that contains (1) the name of the function to be called (the first parameter) and (2) parameters for the function being called (subsequent parameters).

If the RPC is successful, returns:
true, numerical array with function return values

If the RPC has failed, returns:
nil, error message
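
The "func_a" array and the two return conventions can be illustrated without any network. The sketch below is a local stand-in, not the SPLAY implementation: "exported" and "fake_a_call" are hypothetical names, and the call is dispatched in the same process.

```lua
local unpack = table.unpack or unpack -- Lua 5.1 and 5.2+ compatibility

-- Functions that a remote node would expose (illustrative).
local exported = {
  add = function(a, b) return a + b end,
  minmax = function(a, b) return math.min(a, b), math.max(a, b) end,
}

-- Local stand-in for a_call(): func_a[1] is the function name, the rest
-- are its arguments. No network is involved in this sketch.
local function fake_a_call(func_a)
  local f = exported[func_a[1]]
  if type(f) ~= "function" then
    return nil, "unknown function: " .. tostring(func_a[1])
  end
  local results = { pcall(f, unpack(func_a, 2)) }
  if not results[1] then
    return nil, tostring(results[2])
  end
  table.remove(results, 1) -- drop pcall's status flag
  return true, results
end

local ok, r = fake_a_call({ "minmax", 7, 3 })
if ok then
  print(r[1], r[2])
else
  print("RPC failed: " .. r)
end
```

Checking the first return value before using the second one is what makes it possible to tell a failed call apart from a function that legitimately returned nil.
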
call(ip, port, func_a, timeout) Same as a_call() but directly returns the function call results (as if you had called a local function). The problem is that it can be difficult or impossible to distinguish between a function returning an error and a transmission problem.

Use it only in non-faulty environments, or when the error can easily be detected from the reply.
ping(ip, port[, timeout]) RPC-based ping. Returns true if the ping succeeded, false otherwise.
proxy(ip, port) Create a new RPC proxy for one host. Calls are expressed in an even more natural manner.

splay.net

Collection of methods to ease the use of TCP and UDP.

server(func, port[, max]) Start a new TCP server listening on port "port". Each time a new client connects, the handler function will be called with the socket as parameter. "max" is the maximum number of connections (sockets) at the same time (default = unlimited).

close() is called on the socket when the handler exits.
udp_helper(func, port) Return a new "UDP object" and start a server listening on port "port" that calls the function "func" when an incoming UDP packet is received.

If the handler is nil, a new event named "udp:" is fired for each packet.

This function makes it possible to reuse the UDP socket used for the server.

splay.log

The log system makes it possible to log messages of different levels to different outputs.

Levels:

new(level, prefix) Create a new log object that will log with level 'level' and put 'prefix' before the output.
level [variable] From 1 (debug) to 5 (print), selects the minimum log level.
log:write(level, msg)

log:debug(msg)
log:notice(msg)
log:warn(msg)
log:error(msg)
log:print(msg)
All except write() are aliases for the write() function, logging with the level corresponding to their name.
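
The relationship between write() and its aliases can be sketched in plain Lua. This is not the splay.log code: "new_logger" and the "lines" buffer are hypothetical, the output format is invented, and the numbering of the intermediate levels (notice = 2, warn = 3, error = 4) is an assumption derived from the order in which the functions are listed.

```lua
-- Assumed numbering: only 1 (debug) and 5 (print) are stated in the manual.
local LEVELS = { debug = 1, notice = 2, warn = 3, error = 4, print = 5 }

local function new_logger(level, prefix)
  local l = { level = level, prefix = prefix, lines = {} }

  -- Keep the message only if its level reaches the minimum level.
  function l:write(lvl, msg)
    if lvl >= self.level then
      self.lines[#self.lines + 1] = self.prefix .. ": " .. tostring(msg)
      return true
    end
    return false
  end

  -- debug(), notice(), ... simply forward to write() with their own level.
  for name, lvl in pairs(LEVELS) do
    l[name] = function(self, msg) return self:write(lvl, msg) end
  end
  return l
end

local logger = new_logger(LEVELS.warn, "demo")
logger:debug("ignored, below the minimum level")
logger:error("kept")
print(logger.lines[1])
```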

splay.misc

Container for various functions.

dup(e) Recursively duplicates an element (generally a table).

split(s, sep) Split a string 's' with 'sep'. Return a table.
size(t) Return the size of a table, counting all elements (compare with isize()).
isize(t) Return the size of a table, considering only the highest numerical index.
random_pick(t, n) Pick 'n' elements from a table 't'. Return a new table with picked elements.
random_pick_one(t) Pick one element from a table 't' and return it.
time() Return the Unix time (with millisecond precision).
between_c(i, a, b) Circular between. Return true if 'i' is between 'a' and 'b'. Useful for DHT overlay.
hash_ascii_to_byte(s) Transform a hexadecimal ASCII string into bytes (half as long).
gen_string(base, mult) Generate a string made of 'mult' concatenations of 'base'.
table_concat(t1, t2) Concatenate two tables into one.
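
The difference between size() and isize() can be sketched with two plain Lua stand-ins. "count_all" and "highest_index" are hypothetical names that follow the descriptions above; they are not the splay.misc implementations.

```lua
-- Count every key/value pair, whatever the key type (like size()).
local function count_all(t)
  local n = 0
  for _ in pairs(t) do n = n + 1 end
  return n
end

-- Return the highest numerical index, ignoring holes below it (like isize()).
local function highest_index(t)
  local max = 0
  for k in pairs(t) do
    if type(k) == "number" and k > max then max = k end
  end
  return max
end

-- Index 3 is a hole and "owner" is a hash-style key.
local t = { "a", "b", nil, "d", owner = "x" }
print(count_all(t), highest_index(t))
```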

splay.llenc

Block encoding for network transfers.

wrap(socket) After wrapping, the socket can send and receive full blocks of any kind of data.
socket:send(block) Send a block (it can be a numerical array of blocks, in which case multiple sends are done).
socket:receive(max_length) Receive a block, or nil, "error" if it is too long.
socket:receive_array(number, max_length) Same as receive() but waits for "number" blocks and returns a table containing them.

splay.json

Data encoding for network transfers. Used by various libraries like Log and RPCs.

wrap(socket) After wrapping, the socket can send and receive complex Lua structures (like tables). This wrapper also uses the LLenc wrapper.
socket:send(data) Send a Lua data structure.
socket:receive() Receive a Lua data structure or nil, error.

splay.benc

BitTorrent encoding library.

encode(data) Encode data with bencoding.
decode(data) Decode bencoded data.
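
For readers unfamiliar with bencoding, here is a minimal encoder sketch in plain Lua. It is not the splay.benc implementation: "bencode" is a hypothetical function, and the rule used to tell lists from dictionaries (a table with a non-zero length, or an empty table, is a list) is an assumption of this sketch.

```lua
local function bencode(v)
  local tv = type(v)
  if tv == "number" then
    return "i" .. string.format("%d", v) .. "e"   -- integers: i<digits>e
  elseif tv == "string" then
    return #v .. ":" .. v                         -- strings: <length>:<bytes>
  elseif tv == "table" then
    if #v > 0 or next(v) == nil then
      -- Treated as a list (an empty table is encoded as an empty list).
      local out = {}
      for i = 1, #v do out[i] = bencode(v[i]) end
      return "l" .. table.concat(out) .. "e"
    end
    -- Otherwise a dictionary: string keys, emitted in sorted order.
    local keys = {}
    for k in pairs(v) do
      assert(type(k) == "string", "dictionary keys must be strings")
      keys[#keys + 1] = k
    end
    table.sort(keys)
    local out = {}
    for _, k in ipairs(keys) do
      out[#out + 1] = bencode(k) .. bencode(v[k])
    end
    return "d" .. table.concat(out) .. "e"
  end
  error("cannot bencode a " .. tv)
end

print(bencode({ "spam", 42 }))
print(bencode({ cow = "moo", spam = "eggs" }))
```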

splay.utils

Collection of methods for local testing.

generate_job(position, [nb_nodes, [first_port, [list_size, [random]]]]) Generate a fake job variable for local testing.
Then create a file "run.sh" with the following content:

And make it executable (chmod 755 run.sh). Now you can use it to run your local SPLAY application (./run.sh).

splay.restricted_io

When a SPLAY application is deployed, it is sandboxed. This section describes how sandboxing is done for the Lua IO library, particularly for file system accesses.

Like restricted_socket does for LuaSocket, restricted_io wraps the original IO functions behind a security layer that applies additional restrictions.

First, the library needs to be initialized via the init() function. This function sets exactly the restrictions we need and, in particular, gives the path to a folder that will represent the root of the virtual restricted file system.

When you open a file in RIO (restricted IO), the file path is mapped to a flat file using an MD5 hash. So all your virtual files will appear in the same directory with hash names. Then, additional restrictions are applied, such as:

The total disk space used can be a little more than the configured maximum size, because file systems generally allocate whole blocks to store data. So, in the worst case, the true size on disk will be: block_size * max_files + max_size.
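
A worked instance of that bound, with illustrative settings (the numbers below are not SPLAY defaults):

```lua
-- Worst-case disk usage: each file may waste up to one block.
local function worst_case_disk_usage(block_size, max_files, max_size)
  return block_size * max_files + max_size
end

-- Illustrative settings: 4 KiB blocks, 100 files, 1 MiB of data.
local bytes = worst_case_disk_usage(4096, 100, 1024 * 1024)
print(bytes) -- 1458176
```

With these values, the overhead is 409600 bytes on top of the 1048576 bytes of data.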

init(settings) RIO will not work correctly if it is not initialized, so if you want to test it locally, call this function first.

All Lua IO functions: see the Lua IO API.

LuaSocket

As previously said, we have kept the whole LuaSocket API intact, adding a restriction layer.

We have also extended the syntax to accept nodes where possible.

But one must take care to close() the socket after using it, because close() decreases the resource counter (otherwise, the resource will be freed by the garbage collector but the resource counter will not be decreased).

Restrictions

When you deploy your SPLAY applications, they are run in a sandbox.

Here is the full list of removed functions:

They are disabled because they could introduce a security issue or make it possible to get information about the host running SPLAYd.

All other Lua functions (and libraries) are available. Most IO functions have been wrapped to work in a secure virtual filesystem but the API is exactly the same.

If you really need to use one of these functions locally, you can, as usual, test for the function's existence (or test whether the 'job' variable is set) before calling it.
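
Testing for a function's existence can be sketched as follows. "call_if_available" is a hypothetical helper, and os.getenv is used only as an illustration: whether it is actually among the removed functions is an assumption of this example.

```lua
-- Call f(...) only if it exists; return false when it has been removed.
local function call_if_available(f, ...)
  if type(f) ~= "function" then
    return false
  end
  return true, f(...)
end

-- os.getenv is only an illustration; whether it is removed in the
-- sandbox is an assumption of this sketch.
local available, home = call_if_available(os and os.getenv, "HOME")
if not available then
  print("os.getenv is not available here, skipping")
end
```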

Extend SPLAY

SPLAY can be extended with Lua modules (libraries) and/or C modules.

The additional module will be loaded as usual (see the Lua documentation), but you need to edit jobd.lua and add its name to the white list of accepted modules.

Within the official SPLAYd, we only include modules that do not present any security or resource issues.