Sunday, January 26, 2020

Using arrays with parallel

OpenAF is a mix of Javascript and Java, but "pure" Javascript isn't "thread-safe" in the Java world. Nevertheless, being able to use Java threads from Javascript is very useful, starting with performance.

One common pitfall is not paying attention to what is "thread-safe" and what isn't. That is, what is "aware" that it may be running in a multi-threaded environment, where several threads compete for access to a resource, and what isn't.

Let's see a practical example of a common pitfall with arrays.

Example of what should NOT be done

In this example we will create a simple array and push new elements to it in parallel. We would expect it to always end up with the same number of elements as the source array:

var targetArray = [];
// making up a source array with 10 elements to push to targetArray
var sourceArray = repeat(9, '-').split(/-/);
parallel4Array(sourceArray, v => {
    targetArray.push(v);
    return v;
});

print(sourceArray.length);  // 10
print(targetArray.length);  // 10

Great, so the source array and the target array have the same size. It works, right? Let's increase the "thread competition" to 10000 elements and compare again:

var targetArray = [];
var sourceArray = repeat(9999, '-').split(/-/);
parallel4Array(sourceArray, v => {
    targetArray.push(v);
    return v;
});

print(sourceArray.length);  // 10000
print(targetArray.length);  // 9961

They are no longer equal. And if you run it again with more elements in the source array, the difference will increase.

This is what happens when you forget that a javascript array is not thread-safe (and it's actually a good thing because being thread-safe is usually slower when you are not using threads).
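To see why elements go missing, here is a minimal, deterministic sketch in plain Javascript (no real threads and no OpenAF functions involved). It assumes that appending to an array boils down to "read the current length, then write at that index", which is a simplification for illustration and not a description of how the engine behind OpenAF implements push. The helper names unsafeAppendSteps and runInterleaved are hypothetical.

```javascript
// Simplified model: an append is two separate steps, so another
// "thread" can run in between them. Illustration only.
function unsafeAppendSteps(target, value) {
    var index;
    return [
        function () { index = target.length; },   // step 1: read the current length
        function () { target[index] = value; }    // step 2: write at the index read earlier
    ];
}

// Runs the steps of two simulated threads alternately: A1, B1, A2, B2.
function runInterleaved(stepsA, stepsB) {
    for (var i = 0; i < stepsA.length; i++) {
        stepsA[i]();
        stepsB[i]();
    }
}

var target = [];
runInterleaved(unsafeAppendSteps(target, "a"), unsafeAppendSteps(target, "b"));

// Both simulated threads read length 0, so both wrote to index 0
// and one of the two values was overwritten.
var lostUpdateLength = target.length;  // 1, not 2
```

With real threads the interleaving is not fixed, which is why the number of lost elements changes from run to run.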

Examples of what should be done

Using sync

The first immediate way to solve this is to ensure that just one thread will access the targetArray at a time. You can do this in OpenAF with the sync function:

var targetArray = [];
var sourceArray = repeat(9999, '-').split(/-/);
parallel4Array(sourceArray, v => {
    sync(() => {
        targetArray.push(v);
    }, targetArray);
    return v;
});

print(sourceArray.length);  // 10000
print(targetArray.length);  // 10000

Underneath, the sync function relies on Java synchronization to ensure that, among the threads accessing targetArray, only one at a time executes the provided function.

So the problem is solved, right? Well, if you measure the performance with and without sync you will notice that the sync version can be up to twice as slow.

Why? Because while one thread is inserting a value into targetArray all other threads have to stop and wait. That slows everything down and will probably make it no more effective than running it sequentially (depending on the processing done outside the sync function).

Using syncArray

The OpenAF library ow.obj provides a wrapper that lets you use a "thread-safe" Java array: ow.obj.syncArray

ow.loadObj();
var targetArray = new ow.obj.syncArray();
var sourceArray = repeat(99999, '-').split(/-/);
parallel4Array(sourceArray, v => {
    targetArray.add(v);
    return v;
});

print(sourceArray.length);            // 100000
print(targetArray.toArray().length);  // 100000

The Java array version is optimized to be faster in these conditions, so the bigger the sourceArray, the bigger the benefit of using ow.obj.syncArray.

The ow.obj.syncArray has more methods that you can explore in the OpenAF help, including: addAll, clear, get, indexOf, length, remove and even getJavaObject, which lets you interact with the original Java object directly.

Comparison

Changing the above examples to perform something "harder" for each array element like calculating the Math.sin of each value and then comparing the performance you would get something similar to:

Strategy          Source size  Target size  Average time
----------------  -----------  -----------  ------------
Parallel          100000       <100000      2.11s
Sync              100000       100000       2.44s
ow.obj.syncArray  100000       100000       2.06s

But what's the time if it's done sequentially? In this simple example: a lot better (~1.1s). Keep in mind that you only gain a performance advantage when the time spent dealing with threads and concurrent access is a lot lower than the time spent processing each array element.
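The measurement code behind these numbers is not shown here, so the following is only a sketch of one way to take a sequential baseline in plain Javascript. The helper names averageMs and sequentialSin are hypothetical, and the choice of 5 runs is arbitrary.

```javascript
// Hypothetical helper: runs fn `runs` times and returns the average
// elapsed time in milliseconds.
function averageMs(fn, runs) {
    var total = 0;
    for (var i = 0; i < runs; i++) {
        var start = Date.now();
        fn();
        total += Date.now() - start;
    }
    return total / runs;
}

// Sequential baseline: the same kind of per-element work (Math.sin),
// with no threads and no shared-array contention.
function sequentialSin(size) {
    var result = [];
    for (var i = 0; i < size; i++) {
        result.push(Math.sin(i));
    }
    return result;
}

var baselineMs = averageMs(function () { sequentialSin(100000); }, 5);
```

Timing each strategy with the same helper, and comparing against this baseline, shows whether the per-element work is heavy enough to pay for the thread overhead.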

So, in conclusion, there is no right or wrong answer. You need to test to get the best for your case.

Tuesday, January 14, 2020

Relaxed JSON parser

In OpenAF, when using the jsonParser function, the parsing sticks to the strict JSON definition.

For example the following behaves as expected:

> jsonParser("{ \"a\": 1 }");
{
   "a": 1
}
> JSON.parse("{ \"a\": 1 }");
{
   "a": 1
}

But using a more "relaxed" JSON definition, the same functions will fail:

> jsonParser("{ a: 1 }");
{ a: 1 }
> JSON.parse("{ a: 1 }");
-- SyntaxError: Unexpected token in object literal

The jsonParser function will return the text string representation, as it's unable to parse the JSON string. The native JSON.parse will actually throw an exception.
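The difference between the two behaviours can be sketched in plain Javascript. The helper name parseOrReturnText is hypothetical, and this is not how OpenAF implements its parser; it only mimics the outcome described above by wrapping the strict JSON.parse.

```javascript
// Hypothetical helper mimicking the behaviour described above:
// return the parsed value, or the original text when strict parsing fails.
function parseOrReturnText(text) {
    try {
        return JSON.parse(text);
    } catch (e) {
        return text;   // strict JSON.parse threw, hand the string back untouched
    }
}

var strictResult  = parseOrReturnText("{ \"a\": 1 }");  // an object with a = 1
var relaxedResult = parseOrReturnText("{ a: 1 }");      // the string "{ a: 1 }"
```

Because a failed parse hands back a string rather than throwing, callers of such a function need to check the type of the result before using it as a map.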

Using GSON

OpenAF includes the GSON library, which can parse more "relaxed" JSON definitions. Since OpenAF version 20200108, the OpenAF jsonParser function can also use the GSON library. There is a new second boolean argument that, if true, switches to GSON for parsing the string provided in the first argument:

> jsonParse("{ a: 1 }", true);
{
   "a": 1
}

Monday, January 13, 2020

oJob exception handling

When you create an OpenAF oJob, each job runs independently and you are responsible for handling exceptions in each one.

But couldn't we have a global try/catch function for all jobs? Yes, you can add a function with ojob.catch.

oJob throwing exceptions example

todo:
  - Normal job
  - Bug job
  - Another Bug job

jobs:
  #-----------------
  - name: Normal job
    exec: |
      print("I'm a normal job.");

  #--------------
  - name: Bug job
    exec: |
      print("I'm a buggy job.");
      throw "BUG!";

  #----------------------
  - name: Another Bug job
    exec: |
      print("I'm another buggy job.");
      throw "BUG!";

This oJob has 3 jobs. "Normal job" will execute without throwing any exceptions, but "Bug job" and "Another Bug job" will each throw an exception when executed.

When you execute it, you will see the three jobs running, with two of them failing. The entire oJob process will finish with exit code 0.

oJob general exception example

todo:
  - Normal job
  - Bug job
  - Another Bug job

ojob:
  catch: |
    var msg = "Error executing job '" + job.name + "'\n";
    msg += "\nArguments:\n" + stringify(args, void 0, "") + "\n";
    msg += "\nException:\n" + stringify(exception, void 0, "") + "\n";
    logErr(msg, { async: false });

    if (String(exception) == "BUG!") exit(1);

jobs:
  #-----------------
  - name: Normal job
    exec: |
      print("I'm a normal job.");

  #--------------
  - name: Bug job
    exec: |
      print("I'm a buggy job.");
      throw "BUG!";

  #----------------------
  - name: Another Bug job
    exec: |
      print("I'm another buggy job.");
      throw "BUG!";

This example is identical to the previous one except that it adds ojob.catch. The catch function receives several arguments:

Argument   Type       Description
---------  ---------  ------------------------------------------------------------
args       Map        The current args map at the point of the exception.
job        Map        The job map where the exception occurred.
id         Number     The sub-id, if the job was executed within a specific sub-id.
deps       Map        A map of job dependencies.
exception  Exception  The exception itself.

If the function returns true the exception will be recorded as usual and the job will be registered as failed. If the function returns false the exception will be ignored.

In this example the function will actually stop the entire oJob process with exit code 1.
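The true/false contract can be modelled with a toy runner in plain Javascript. This is not the oJob implementation: runJobs is a hypothetical helper, the handler arguments simply follow the order in which they are listed above, and treating any return value other than false as "record the failure" is an assumption of this sketch.

```javascript
// Toy runner, not the oJob implementation: it only models the
// true/false contract of a global catch function.
function runJobs(jobs, globalCatch) {
    var failed = [];
    jobs.forEach(function (job) {
        try {
            job.exec({});
        } catch (exception) {
            // Arguments in the order listed above: args, job, id, deps, exception.
            var outcome = globalCatch({}, job, undefined, {}, exception);
            // true records the failure, false ignores it.
            // Assumption: any other return value is treated like true here.
            if (outcome !== false) failed.push(job.name);
        }
    });
    return failed;
}

var jobs = [
    { name: "Normal job",      exec: function () { } },
    { name: "Bug job",         exec: function () { throw "BUG!"; } },
    { name: "Ignored bug job", exec: function () { throw "minor"; } }
];

// Record "BUG!" exceptions, ignore everything else.
var failedJobs = runJobs(jobs, function (args, job, id, deps, exception) {
    return String(exception) == "BUG!";
});
// failedJobs is ["Bug job"]
```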

Sunday, January 12, 2020

Using channel peers

If you use OpenAF channels there are some cases where you would like to connect them across OpenAF scripts.

To achieve this you can use the $ch.expose and the $ch.createRemote functions, which allow you to expose an internal OpenAF channel so that other OpenAF scripts can access it remotely.

But if the original OpenAF script which exposed the channel "dies" all the others will no longer be able to access the corresponding data. This might not be the desired scenario in some cases.

Peering

But by mixing the expose and createRemote functions between several OpenAF scripts you can achieve what is provided by the $ch.peer function.

The $ch.peer function will expose an OpenAF channel and exchange synchronization data with a set of "peer" OpenAF scripts, which just have to execute a similar command:

$ch("myChannel").peer(12340, 
                      "/myURI", 
                      [ "http://script01:12340/myURI", 
                        "http://script02:12341/myURI", 
                        "http://script03:12343/myURI"]);

The signature is:

$ch.peer(aLocalPortOrServer, aPath, aRemoteURL, aAuthFunc, aUnAuthFunc)

The parameters are similar to the $ch.expose function:

Parameter           Type           Description
------------------  -------------  ---------------------------------------------------------------------
aLocalPortOrServer  Object/Number  The local port or a HTTPServer object to hold the HTTP/HTTPs server.
aPath               String         The URI where the channel interaction will be performed.
aRemoteURL          Array          An array of peer URLs.
aAuthFunc           Function       This function will be called with user and password. If it returns true the authentication is successful.
aUnAuthFunc         Function       This function will be called with the http server and the http request.

If you call it again, after the first time, with a different aRemoteURL array, the new array will replace the existing one. If the array includes a reference to the script itself, that reference will be ignored.

Unpeering

To remove a peering you can call:

$ch.unpeer(aRemoteURL)

Testing

Setting up data channels in each peer

First script (on host h1):

$ch("data").peer(9080, "/data", [ "http://h1:9080/data", "http://h1:9090/data", "http://h2:9080/data" ]);

Second script (on host h1):

$ch("data").peer(9090, "/data", [ "http://h1:9080/data", "http://h1:9090/data", "http://h2:9080/data" ]);

Third script (on host h2):

$ch("data").peer(9080, "/data", [ "http://h1:9080/data", "http://h1:9090/data", "http://h2:9080/data" ]);

Changing and retrieving data

  1. On host h2:
> $ch("data").size()
0
  2. On host h1:
> $ch("data").size()
0
> $ch("data").setAll(["canonicalPath"], io.listFiles(".").files);
> $ch("data").size()
25
  3. On host h2:
> $ch("data").size()
25
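The behaviour seen in this test can be pictured with a toy in-memory model in plain Javascript. It is not how $ch.peer is implemented (there is no HTTP and no channel API here): registry, createPeer, set and size are hypothetical names, and the registry object merely stands in for the network.

```javascript
// Toy in-memory model of peering, not how $ch.peer is implemented.
var registry = {};   // url -> peer, stands in for the network

function createPeer(ownUrl, peerUrls) {
    var peer = {
        url: ownUrl,
        data: {},
        // The peer list may include the peer's own URL; it is skipped.
        remotes: peerUrls.filter(function (u) { return u !== ownUrl; }),
        set: function (key, value) {
            peer.data[key] = value;
            peer.remotes.forEach(function (u) {
                // Peers that are not registered (not reachable) are skipped.
                if (registry[u]) registry[u].data[key] = value;
            });
        },
        size: function () { return Object.keys(peer.data).length; }
    };
    registry[ownUrl] = peer;
    return peer;
}

var urls = ["http://h1:9080/data", "http://h1:9090/data", "http://h2:9080/data"];
var first  = createPeer(urls[0], urls);
var second = createPeer(urls[1], urls);
var third  = createPeer(urls[2], urls);

first.set("a", 1);
// third.size() is now 1 and first.remotes has 2 entries
```

Since every peer holds its own copy of the data, the others keep working if one of them goes away, which is the point of peering over a single exposed channel.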
