SPLAY manual
- Lua basics
- Local execution VS Deployment over a testbed
- SPLAY API
- LuaSocket
- Restrictions
- Extend SPLAY
Introduction
SPLAY provides a deployment system, an execution environment, and libraries for easily developing distributed applications. SPLAY applications can also be run locally, without the deployment system, while still using the SPLAY libraries.
This manual presents the basics of using the SPLAY language and libraries. For any questions, comments, or suggestions for improvement, please contact us.
- First, get the latest SPLAYd and install it (read INSTALL in the tarball).
- SPLAY uses Lua as its base language. Lua is a simple, easy-to-learn, and well-documented language. We introduce Lua syntax in this document, but we recommend that you also take a look at the Lua manual.
- Read the LuaSocket documentation (if you need to use TCP and UDP sockets directly).
Lua has no threads, only a coroutine system. To provide a mostly transparent thread system, we added calls to the scheduler whenever a network IO operation has to wait.
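As a rough illustration of the cooperative model, the sketch below uses plain Lua coroutines (not the SPLAY scheduler itself) to run two tasks in turn. Each task has to yield explicitly before the other one can make progress. The names make_task and trace are invented for this example.

```lua
-- Plain-Lua sketch of cooperative multitasking with coroutines.
-- This is not SPLAY's scheduler; make_task and trace are made-up names.
local trace = {}

local function make_task(name, steps)
  return coroutine.create(function()
    for i = 1, steps do
      trace[#trace + 1] = name .. i
      coroutine.yield() -- hand control back to the loop below
    end
  end)
end

local tasks = { make_task("a", 2), make_task("b", 2) }

-- Minimal round-robin loop: resume every task that is still alive.
local alive = #tasks
while alive > 0 do
  alive = 0
  for _, co in ipairs(tasks) do
    if coroutine.status(co) ~= "dead" then
      coroutine.resume(co)
      if coroutine.status(co) ~= "dead" then
        alive = alive + 1
      end
    end
  end
end

print(table.concat(trace, " ")) --> a1 b1 a2 b2
```

If a task never yielded, the other one would only start after the first had finished, which is why blocking operations need to give control back to a scheduler.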
Lua basics
We strongly recommend that users consult the Lua manual; nonetheless, this part goes over the most important Lua specifics.
- Lua's "natural" numbering/indexing begins at 1, not 0.
- Not equal is ~=
- Dynamic typing.
- Functions can return more than one value.
- By convention, when there is an error, a function returns nil, "error message".
- The table is the only complex structure.
- Functions are first class objects. One can pass a function to another function as a parameter, or redefine a function.
- If not declared "local", a variable is global.
- Variable number of arguments (advanced topic, see documentation).
- Meta-tables (advanced topic, see documentation).
- Upvalues and closures (advanced topic, see documentation).
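A few of the points above can be checked with a short plain-Lua sketch. The variable names are arbitrary and only serve this example.

```lua
-- Indexes start at 1.
local letters = { "a", "b", "c" }
local first = letters[1]         -- "a"
local before_first = letters[0]  -- nil: nothing is stored at index 0

-- "Not equal" is written ~=.
local differs = (first ~= "b")   -- true

-- Dynamic typing: the same variable can hold values of different types.
local value = 42
value = "forty-two"

-- Without "local", an assignment creates a global variable.
local function set_counter()
  counter = 10       -- global: still visible after the call
  local hidden = 20  -- local: only visible inside this function
end
set_counter()

print(first, before_first, differs, type(value), counter, hidden)
```

Here `hidden` prints as nil outside the function because the local declared inside set_counter() is out of scope, while `counter` keeps its value.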
Tables
In Lua, tables are used both for numerically indexed arrays and for hash-indexed maps (and can be a mix of both).
Some functions rely on numerical indexing and others do not.
For example, the size operator (#) only counts from index 1 (numeric) to n (numeric). If there is a hole (a missing numerical index) in the sequence, the reported size is not reliable: Lua may report the index just before any hole.
The Lua functions table.insert(), table.remove(), ... manipulate numerically indexed tables without creating any holes, so if you only use these functions you will always get a correct size.
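The following plain-Lua sketch shows how table.insert() and table.remove() keep the numerical part of a table free of holes, so that # stays meaningful. The table contents are arbitrary.

```lua
local list = {}
table.insert(list, "x")     -- appended at index 1
table.insert(list, "y")     -- appended at index 2
table.insert(list, 1, "w")  -- inserted at index 1, the others shift up
local n_before = #list      -- 3: { "w", "x", "y" }

local removed = table.remove(list, 2) -- removes "x"; "y" shifts down to 2
local n_after = #list                 -- 2: { "w", "y" }

-- Non-numerical keys are not counted by #.
list.name = "demo"
local n_with_key = #list              -- still 2

print(n_before, removed, n_after, n_with_key)
```

Assigning nil directly to an index in the middle of the sequence (for example `list[1] = nil`) is what creates a hole, and is best avoided when the size matters.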
Two syntaxes exist for accessing a table by a key.
Tables can contain everything, including functions.
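A minimal plain-Lua sketch of the two access syntaxes, and of a table holding a function, could look like this (the keys and values are arbitrary):

```lua
local t = { color = "red" }

local a = t.color     -- dot syntax: the key must be a valid identifier
local b = t["color"]  -- bracket syntax: the key can be any expression

local key = "color"
local c = t[key]      -- the key can be computed at run time

t["two words"] = 2    -- keys that are not identifiers need brackets

-- A table field can hold a function.
t.greet = function(name)
  return "hello " .. name
end
local g = t.greet("splay")

print(a, b, c, t["two words"], g)
```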
Note that tables can be used to program in an object-oriented style, but keep in mind that Lua is closer to prototype-based programming than to a class-and-object system. When a table function is called with ":" instead of ".", the first parameter given to the function is the table itself.
For advanced OOP/prototype-style programming and inheritance, meta-tables are needed (see the Lua manual).
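The difference between "." and ":" can be seen in this small plain-Lua sketch. The account table and its deposit function are made up for the example.

```lua
local account = { balance = 0 }

-- The first parameter is written explicitly here.
function account.deposit(self, amount)
  self.balance = self.balance + amount
  return self.balance
end

local r1 = account.deposit(account, 10) -- ".": pass the table yourself
local r2 = account:deposit(5)           -- ":": the table is passed implicitly

print(r1, r2, account.balance)
```

Both calls run the same function; the colon form only saves repeating the table name.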
Multiple return
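As a sketch of multiple return values, including the nil, "error message" convention mentioned in the Lua basics list, consider the following plain-Lua functions. The names divide and min_max are invented for this example.

```lua
-- Returns the quotient, or nil plus an error message.
local function divide(a, b)
  if b == 0 then
    return nil, "division by zero"
  end
  return a / b
end

local q, err = divide(10, 2)   -- q is 5, err is nil
local q2, err2 = divide(1, 0)  -- q2 is nil, err2 is the message

-- Returns two values at once.
local function min_max(t)
  local lo, hi = t[1], t[1]
  for _, v in ipairs(t) do
    if v < lo then lo = v end
    if v > hi then hi = v end
  end
  return lo, hi
end

local lo, hi = min_max({ 3, 9, 1 })
```

A caller that is only interested in the first value can simply ignore the others: `local q = divide(10, 2)`.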
Functions as first class object
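A short plain-Lua sketch of what "first class" means in practice: functions can be stored in variables, passed as parameters, and replaced. All names below are made up for the example.

```lua
-- A function that receives another function as a parameter.
local function apply_twice(f, x)
  return f(f(x))
end

local function add_one(n)
  return n + 1
end

local r1 = apply_twice(add_one, 5) -- 7

-- An anonymous function stored in a variable.
local double = function(n) return n * 2 end
local r2 = apply_twice(double, 3)  -- 12

-- Redefining a function: wrap the original to count calls.
local calls = 0
local original = add_one
add_one = function(n)
  calls = calls + 1
  return original(n)
end

local r3 = add_one(1) -- 2, and calls is now 1
```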
Local execution VS Deployment over a testbed
This section describes how to code applications that will run without modifications both locally and in the deployment execution environment.
The first important piece of information is whether, and how, the application is running under a deployment. If it is deployed, a new variable is available in the application's environment: "job". "job" is a table and contains:
- job.me: the node actually running this instance of the application
- job.position: the node's absolute position in the job list
- job.nodes: the list of nodes. job.me == job.nodes[job.position] only if the type of list is "head" (not "random") and if list size >= job.position
- job.list_type: the type of the list: "head" or "random"
For more information about what a "node" is, consult the API section.
Just by checking for the presence of the "job" variable, an instance of the application can know whether it runs locally or under deployment. In the former case, the application has probably been launched directly with the Lua interpreter. In that case, one can use additional command line arguments.
You can also specialize the behavior of your application using the job information. For example, if the job instance receives a "head" list type, the rendez-vous node can be the first in the list. If the list type is "random", the rendez-vous node can be an external hard-coded node.
Given this information, it is very practical to have a single source code used for all tests, with every possible deployment (and local) configuration.
Also see splay.utils.generate_job() to auto generate a fake "job" variable for local testing.
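The checks described in this section could be sketched as follows. The function describe_run, the address 192.0.2.1, and port 20000 are hypothetical choices for this example; only the fields of "job" come from the description above.

```lua
-- Returns how the application is running and which rendez-vous node
-- (or command line argument) to use. Hypothetical helper, not part of SPLAY.
local function describe_run(job_table, cli_args)
  if job_table then
    if job_table.list_type == "head" then
      -- "head" list: use the first node of the list as rendez-vous.
      return "deployed", job_table.nodes[1]
    end
    -- "random" list: fall back to an external hard-coded node.
    return "deployed", { ip = "192.0.2.1", port = 20000 }
  end
  -- No "job" variable: local run, use the command line instead.
  return "local", cli_args[1]
end

-- "job" is nil when the script is started directly with the interpreter.
local mode, detail = describe_run(job, arg or {})
print(mode)
```

Passing the job table and the arguments explicitly keeps the decision in one place and makes it easy to exercise both branches with a fake job table.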
SPLAY API
- Nodes
- splay.base
- splay.events
- splay.rpc
- splay.net
- splay.log
- splay.misc
- splay.llenc
- splay.json
- splay.benc
- splay.utils
- splay.restricted_io
Nodes
As a convention, a node is a Lua table with two keys: "ip" and "port".
All network-related functions that require ip and port parameters (sometimes only a port) will directly accept a "node" (including LuaSocket when used through SPLAY).
So in most cases where you need (..., ip, port, ...) parameters, you can replace them (if you want) by (..., node, ...).
Upon deployment, a SPLAY application receives the list of other peers in "node" format. In most applications, ip and port can be completely hidden using "nodes".
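To make the convention concrete, here is a plain-Lua sketch of how a function might accept either (ip, port) or (node). The helper to_ip_port is hypothetical and is not SPLAY's own code; it only mimics the behaviour described above.

```lua
-- Hypothetical helper (not part of SPLAY): accept (ip, port) or (node).
local function to_ip_port(a, b)
  if type(a) == "table" then
    return a.ip, a.port -- a node was given
  end
  return a, b           -- separate ip and port were given
end

local node = { ip = "127.0.0.1", port = 20000 }

local ip1, port1 = to_ip_port(node)
local ip2, port2 = to_ip_port("127.0.0.1", 20000)

print(ip1, port1, ip2, port2)
```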
splay.base
splay.base contains the minimum needed for a SPLAY application. Each SPLAY application begins by requiring splay.base.
After requiring splay.base, some global variables will be set:
- events (splay.events)
- misc (splay.misc)
- log (splay.log)
- socket (splay.luasocket, splay.socket_events)
splay.events
The event system is the core of the SPLAY runtime environment. It is used by the thread system, network IO, ... It basically acts as a global scheduler.
| loop([func]) | Start the main application loop. If you pass an argument, it will be forwarded to events.thread(). By convention, the (anonymous) function given to loop() is considered the "main" of the application. You can pass the same parameters as to thread(). |
|---|---|
| thread(func) | Add a new thread. The function given as a parameter will be run as a new thread. A function with arguments needs to be wrapped in an anonymous function. thread() can also take an array of functions as parameter. It returns a thread object (coroutine). |
| periodic(func, time[, force]) | Periodically (every "time" seconds) call the function "func". The function is only called if the previous call has ended (setting "force" to true bypasses this check). |
| dead(thread) | Check if a thread is dead. |
| sleep(time) | Sleep "time" seconds. |
| fire(name[, arg]*) | Fire a new event called "name" with optional arguments. |
| wait(name[, timeout]) | Wait for an event named "name". |
| yield() | Lua has no "true" threading support but uses a coroutine (cooperative) approach. To support SPLAY threads, we have wrapped network IO so that it calls the events scheduler. Sometimes a function has to do a long computation without doing any IO. In that case, it is recommended to split the work of that function and regularly call yield() to give other threads a chance to run. Always returns true. |
| lock() | Return a new lock object. |
| semaphore(size) | Like a lock, but "size" threads can be granted the right to pass through. |
| synchronize(func[, timeout]) | Synchronizes access to some function "func". |
| stats() | Returns a string containing some stats about the scheduler. |
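The sketch below shows how loop(), thread(), sleep(), and periodic() from the table above might be combined. It assumes a working SPLAY installation, in which requiring splay.base sets the globals "events" and "log". The worker function, its name argument, and the timing values are arbitrary, and the pcall guard is only there so that the file still loads on a machine without SPLAY.

```lua
-- Sketch only: needs SPLAY to do anything useful.
local function main()
  -- A function with arguments is wrapped in an anonymous function
  -- before being handed to events.thread().
  local function worker(name)
    for i = 1, 3 do
      log:print(name .. " step " .. i)
      events.sleep(1) -- lets other threads run in the meantime
    end
  end
  events.thread(function() worker("w1") end)
  events.thread(function() worker("w2") end)

  -- Print scheduler statistics every 5 seconds.
  events.periodic(function() log:print(events.stats()) end, 5)
end

-- Guard (an assumption of this example): skip the loop without SPLAY.
local has_splay = pcall(require, "splay.base")
if has_splay then
  events.loop(main) -- main() becomes the "main" thread of the application
end
```

Because the scheduler is cooperative, the two workers interleave only at the points where they call events.sleep() or perform network IO.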
splay.rpc and splay.urpc
RPC (Remote Procedure Call) makes it easy to call a function on a remote host and get its result locally. RPCs are often used in SPLAY applications because they allow very clean, concise, and readable code.
SPLAY provides two types of RPC: using TCP (called RPC) and using UDP (called URPC).
Both RPC systems can be used at the same time (and on the same port).
TCP RPC characteristics:
- Functions can accept arguments of unlimited size and can return replies of unlimited size.
- Establishing a connection is slower than with UDP.
- Each ongoing RPC uses one socket (important when running in a resource-limited environment).
UDP RPC characteristics:
- Parameters and returned values must be smaller than 8 kB.
- No connection establishment => faster than TCP-based RPC.
- Timeout + retry (configurable) if packet loss.
- Only 1 socket used for both receiving and sending.
The API is the same for both types of RPC.
You can not only call functions but also get the value of a variable, a variable in a table, ... You can remotely access everything except function pointers.
When making function calls, the separate syntax, i.e. a table with the function name followed by each argument, is more efficient (less overhead on the remote node, faster calls), especially when the arguments are large.
| server(port[, max]) | Run a new RPC server on port "port". At most "max" (TCP only) RPCs (sockets) are served at the same time (default = unlimited). |
|---|---|
| a_call(ip, port, func_a, timeout) | "func_a" is an array that contains (1) the name of the function to be called (the first element) and (2) the parameters for the function being called (subsequent elements). If the RPC is successful, returns: true, numerical array with the function's return values. If the RPC has failed, returns: nil, error message. |
| call(ip, port, func_a, timeout) | Same as a_call() but directly returns the function call's replies (as if you had called a local function). The problem is that it can be difficult or impossible to distinguish between a function replying with an error and a transmission problem. Use it only in non-faulty environments or if the error can easily be detected from the reply. |
| ping(ip, port[, timeout]) | RPC based ping. Returns true if the ping succeeded, false otherwise. |
| proxy(ip, port) | Create a new RPC proxy for one host. Calls are expressed in an even more natural manner. |
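As an illustration of server(), a_call(), and call(), the sketch below starts an RPC server and calls it from the same process. It assumes a working SPLAY installation; port 20000, the 5 second timeout, and the function add are arbitrary choices, and declaring add as a global so that it can be found by name is an assumption of this example. The pcall guard only lets the file load where SPLAY is not installed.

```lua
-- The function exposed through RPC (assumed to need to be global).
function add(a, b)
  return a + b
end

local function main()
  local rpc = require "splay.rpc"
  rpc.server(20000)

  -- a_call(): the first value tells whether the RPC itself succeeded.
  local ok, res = rpc.a_call("127.0.0.1", 20000, { "add", 1, 2 }, 5)
  if ok then
    log:print("add returned " .. tostring(res[1]))
  else
    log:error("RPC failed: " .. tostring(res))
  end

  -- call(): returns the remote function's values directly, so a
  -- transmission problem is harder to tell apart from a normal reply.
  local sum = rpc.call("127.0.0.1", 20000, { "add", 3, 4 })
  log:print("sum = " .. tostring(sum))
end

-- Guard (an assumption of this example): skip the loop without SPLAY.
if pcall(require, "splay.base") then
  events.loop(main)
end
```

Following the node convention, the ("127.0.0.1", 20000) pair could presumably be replaced by a single node table such as an entry of job.nodes.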
splay.net
Collections of methods to ease usage of TCP and UDP.
| server(func, port[, max]) | Start a new TCP server listening on port "port". Each time a new client connects, the handler function "func" is called with the socket as parameter. At most "max" connections (sockets) are accepted at the same time (default = unlimited). The socket's close() is called when the handler exits. |
|---|---|
| udp_helper(func, port) | Return a new "UDP object" and start a server listening on port "port" that calls the function "func" when an incoming UDP packet is received. If the handler value is nil, a new event is fired for each packet, with a name beginning with "udp:". This function permits reusing the UDP socket used for the server. |
splay.log
The log system makes it possible to log messages of different levels through different outputs.
Levels:
- (1) debug
- (2) notice
- (3) warning
- (4) error
- (5) print
| new(level, prefix) | Create a new log object that logs at level 'level' and prepends 'prefix' to each output. |
|---|---|
| level [variable] | From 1 (debug) to 5 (print); sets the minimum log level. |
| log:write(level, msg), log:debug(msg), log:notice(msg), log:warn(msg), log:error(msg), log:print(msg) | All except write() are aliases of the write() function, logging with the level corresponding to their names. |
splay.misc
Container for various functions.
| dup(e) | Recursively duplicates an element (generally a table). |
|---|---|
| split(s, sep) | Split a string 's' with 'sep'. Return a table. |
| size(t) | Return the size of a table, counting all elements (compare with isize()). |
| isize(t) | Return the size of a table, looking only at the highest numerical index. |
| random_pick(t, n) | Pick 'n' elements from a table 't'. Return a new table with picked elements. |
| random_pick_one(t) | Pick one element from a table 't' and return it. |
| time() | Return unixtime (with milliseconds precision). |
| between_c(i, a, b) | Circular between. Return true if 'i' is between 'a' and 'b'. Useful for DHT overlay. |
| hash_ascii_to_byte(s) | Transform a hexadecimal ASCII string into bytes (half the length). |
| gen_string(base, mult) | Generates a string resulting of 'mult' times 'base' concatenations. |
| table_concat(t1, t2) | Concatenate two tables into one. |
splay.llenc
Block encoding for network transfers.
| wrap(socket) | After wrapping, the socket can send and receive full blocks of any kind of data. |
|---|---|
| socket:send(block) | Send a block (can be a numerical array of blocks, in which case multiple sends will be done). |
| socket:receive(max_length) | Receive a block, or nil, "error" if it is too long. |
| socket:receive_array(number, max_length) | Same as receive() but waits for "number" blocks and returns a table containing them. |
splay.json
Data encoding for network transfers. Used by various libraries like Log and RPCs.
| wrap(socket) | After wrapping, the socket can send and receive complex Lua structures (like tables). This wrapper also uses the LLenc wrapper. |
|---|---|
| socket:send(data) | Send a Lua data structure. |
| socket:receive() | Receive a Lua data structure or nil, error. |
splay.benc
Bittorrent encoding library.
| encode(data) | Encode data with bencoding. |
|---|---|
| decode(data) | Decode bencoded data. |
splay.utils
Collections of methods for local testing.
| generate_job(position, [nb_nodes, [first_port, [list_size, [random]]]]) | Generate a fake job variable for local testing. Then make a file "run.sh" with that content, and make it executable (chmod 755 run.sh). Now you can use it to run your local SPLAY application (./run.sh). |
|---|---|
splay.restricted_io
When a SPLAY application is deployed, it is sandboxed. This section describes how sandboxing is done for the Lua IO library, particularly file system accesses.
As restricted_socket does for LuaSocket, restricted_io wraps the original IO functions behind a security layer that applies additional restrictions.
First, the library needs to be initialized via the init() function. With that function we set exactly the restrictions we need and, in particular, give the path to a folder that will represent the root of the virtual restricted file system.
When you open a file in RIO (restricted IO), the file path is mapped to a flat file using an MD5 hash. So all your virtual files will appear in the same directory with hash names. Then, additional restrictions are applied, such as:
- number of file descriptors
- maximum number of files
- maximum disk space
| init(settings) | RIO will not work correctly if not initialized, so if you want to use it locally to test it, call this function first. |
|---|---|
| All Lua IO functions | Lua IO API |
LuaSocket
As previously said, we have kept the whole LuaSocket API intact, adding a restrictions layer.
We have also extended the syntax to accept nodes where possible.
But one must take care to close() the socket after using it, because the close() function decreases the resource counter (without it, the resource will be freed by the garbage collector but the resource counter will not be decreased).
Restrictions
When you deploy your SPLAY applications, they are run in a sandbox.
Here is the full list of removed functions:
- load()
- loadfile()
- dofile()
- os.execute()
- os.getenv()
- os.setlocale()
- io.popen()
- debug.*
All other Lua functions (and libraries) are available. Most IO functions have been wrapped to work in a secure virtual filesystem but the API is exactly the same.
If you really need to use one of these functions locally, you can as usual test for the function existence (or test if the 'job' variable is set) before calling it.
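Testing for a function's existence is plain Lua and could look like the following sketch, here with os.execute(), one of the removed functions listed above. The variable note is just for illustration.

```lua
-- os.execute is one of the functions removed in the sandbox,
-- so check that it exists before relying on it.
local note
if type(os.execute) == "function" then
  note = "os.execute is available (probably a local run)"
else
  note = "os.execute has been removed (sandboxed run)"
end

print(note)
```

The same pattern applies to the other removed functions, for example io.popen or dofile.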
Extend SPLAY
SPLAY can be extended both by Lua modules (libraries) and/or C modules.
The additional module will be loaded as usual (see the Lua documentation), but you need to edit jobd.lua and add its name to the white list of accepted modules.
Within the official SPLAYd, we only include modules that do not present any security or resource issues.