Spf4j 7.1.4 is out

08:26AM Mar 08, 2015 in category General by Zoltan Farkas

This release contains a lot of enhancements:

Object recycler had a few bugs fixed and should now be production ready.

Added SizedObjectRecyclers which are very useful for recycling buffers. (ByteArrayBuilder can now use a recycled array.)

PipedOutputStream and PipedInputStream: a significantly better implementation than the stock JDK one. Not only is it slightly faster, but the producer controls the byte handover with flush, which gives it buffering semantics. This implementation also supports timeouts by integrating with spf4j Runtime.get|setDeadline().

New UpdateablePriorityQueue implementation.

New Strings utilities for fast UTF-8 encoding and decoding.

The release is available in the central Maven repo.



spf4j + flight recorder

06:49PM Nov 02, 2014 in category General by Zoltan Farkas

Spf4j now has a Flight Recorder profiler integration for JMH. All you need to do to use it is:

        Options opt = new OptionsBuilder()
                .jvmArgs("-XX:+UnlockCommercialFeatures", "-Djmh.stack.profiles="
                        + System.getProperty("jmh.stack.profiles",
         new Runner(opt).run();

As you can see in the example above you can actually use spf4j profiler and flight recorder at the same time.

(not that it makes sense to do that :-) )

 enjoy, cheers!


Jmh + Spf4j

11:27AM Nov 02, 2014 in category General by Zoltan Farkas

I had some time this weekend to code due to bad weather :-), and I have integrated spf4j and JMH so that spf4j can be used to profile benchmarks. This way, when you see a performance degradation you can immediately take a look at the potential cause. All you need to do is look at the generated ssdump files (e.g. the spf4j benchmark profiles).

Spf4j profiler is a better and lower-overhead implementation than the JMH StackProfiler; however, both suffer from safepoint bias, which makes their results less accurate. (A lot of commercial profilers suffer from the same issue; I believe Java Flight Recorder does not.)

Spf4j will contain JMH profiler integration with java Flight recorder in the near future. 


Sleep sort in Zel

11:45AM Nov 01, 2014 in category General by Zoltan Farkas

One of the candidates we have been interviewing recently implemented, as an anecdote, a sleep sort during the interview.

So I thought to myself, this can easily be implemented in ZEL as well, so here it is:

func sleepSort(x) {
  l = x.length;
  if l <= 0 {
    return x
  };
  resChan = channel();
  max = x[0];
  sl = func (x, ch) {sleep x * 10; ch.write(x)};
  sl(max, resChan)&;
  for i = 1; i < l; i++ {
    val = x[i];
    sl(val, resChan)&;
    if (val > max) {
      max = val
    }
  };
  sleep (max + 1) * 10;
  for c = resChan.read(), i = 0; c != EOF; c = resChan.read(), i++ {
     x[i] = c
  };
  return x
}

 and it works like a charm, enjoy!


Generating a Unique ID

12:13PM Oct 26, 2014 in category General by Zoltan Farkas

Most applications I encounter use UUID.randomUUID().toString() to generate unique IDs for things like requests and transactions, which is quite a slow implementation.

Since I implemented a UID generator in SPF4J, I decided to do a little bit of benchmarking with JMH.

Here are the results on my 4-core MacBook Pro:

Benchmark                              Mode  Samples         Score        Error  Units
o.s.c.UIDGeneratorBenchmark.jdkUid    thrpt       60    261797.856 ±  11388.450  ops/s
o.s.c.UIDGeneratorBenchmark.atoUid    thrpt       60   8102280.696 ± 159030.080  ops/s
o.s.c.UIDGeneratorBenchmark.scaUid    thrpt       60  25371629.029 ± 354517.591  ops/s

As you can see, the spf4j UID generator (scaUid) is roughly 100x faster than the JDK one (jdkUid).

It is also about 3x faster than the implementation using atomic instructions (atoUid). In a lot of the code I stumble upon I see a lot of unjustified use of atomics, and the scalability impact is significant.
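To illustrate why a shared atomic counter scales worse, here is a minimal sketch of two counter-based generators. It is not the spf4j implementation: the class name UidSketch, the time-based prefix and the block size of 1024 are all assumptions made for the example. The second method reserves a block of IDs per thread, so the shared counter is touched only once per block.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative only; not the spf4j UID generator. */
final class UidSketch {
    // Assumed prefix so that IDs from different JVM runs are unlikely to clash.
    private static final String PREFIX =
            Long.toString(System.currentTimeMillis(), 36) + '-';
    // Both methods draw from this one counter, so their IDs never overlap.
    private static final AtomicLong SHARED = new AtomicLong();
    private static final int BLOCK = 1024; // assumed block size

    // Per-thread state: [0] = next id to hand out, [1] = end of reserved block.
    private static final ThreadLocal<long[]> LOCAL =
            ThreadLocal.withInitial(() -> new long[] {0, 0});

    /** Every call hits the shared counter, so all threads contend on it. */
    static String atomicUid() {
        return PREFIX + Long.toString(SHARED.getAndIncrement(), 36);
    }

    /** Touches the shared counter only once per BLOCK ids. */
    static String blockUid() {
        long[] range = LOCAL.get();
        if (range[0] == range[1]) { // block exhausted (or never reserved)
            long start = SHARED.getAndAdd(BLOCK);
            range[0] = start;
            range[1] = start + BLOCK;
        }
        return PREFIX + Long.toString(range[0]++, 36);
    }
}
```

The trade-off is that IDs are no longer globally ordered by creation time, and a thread that dies leaves the rest of its block unused.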


SPF4J release 6.5.17 is OUT

07:00AM Sep 21, 2014 in category General by Zoltan Farkas

Release 6.5.17 is out, code and binaries at. Some of the notable changes:

 1) Added 3 measurement stores: tsdbtxt a simple text based format to store measurements. Graphite UDP store, and Graphite TCP store.

 2) ObjectPool is now called RecyclingSupplier, an extension of the Guava Supplier with 2 methods: get() and recycle(object).

 3) Performance enhancements to further reduce the library overhead (and Heisenberg uncertainty principle)

 4) Retry methods in the Callable class have been further refined. A randomized Fibonacci back-off with immediate retries has been introduced as default.

 5) Added Either utility class.

 6) Easy to export JMX operations and attributes. Simply annotate the method or the getters and setters with @JmxExport, register the object with the new Registry class, and you're done.
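The randomized Fibonacci back-off with immediate retries from item 4 can be sketched roughly as follows. This is not the spf4j API: the class and method names, the millisecond unit and the choice to retry only on RuntimeException are assumptions made for illustration.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Supplier;

/** Illustrative retry helper; not the spf4j retry implementation. */
final class FibonacciRetrySketch {

    /**
     * Runs task until it succeeds or maxAttempts is reached.
     * The first immediateRetries retries happen without delay; later retries
     * wait a random time in [0, fib] ms, with fib walking 1, 2, 3, 5, 8...
     */
    static <T> T retry(Supplier<T> task, int maxAttempts, int immediateRetries) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long prev = 1;
        long curr = 1;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure, maybe retry
            }
            boolean moreAttempts = attempt < maxAttempts;
            if (moreAttempts && attempt > immediateRetries) {
                sleep(ThreadLocalRandom.current().nextLong(curr + 1));
                long next = prev + curr; // advance the Fibonacci sequence
                prev = curr;
                curr = next;
            }
        }
        throw last; // all attempts failed
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while backing off", e);
        }
    }
}
```

The randomization spreads out retries from many clients that failed at the same moment, and the Fibonacci growth is gentler than doubling the delay each time.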


Apple Watch makes Yo obsolete (or not)

08:28PM Sep 09, 2014 in category General by Zoltan Farkas

One of the interesting features of apple watch is its instant messaging capability.

It allows you to send a YO in the most efficient way.

Based on popularity of YO, I see this as apple watch's killer feature :-).


spf4j jmx utilities enhanced

01:57PM Aug 22, 2014 in category General by Zoltan Farkas

Added some enhancements to the spf4j library to export attributes and operations.

All you need to do is annotate your getters, setters and operations with @JmxExport and call Registry.export() with your objects.

code @ http://www.spf4j.org



Why does history have to repeat itself?

10:52PM Aug 02, 2014 in category Java by Zoltan Farkas

I wonder if the large loss of life in the Iraq and Afghanistan wars was worth it… and I am pretty sure it was not…

13 years after 9/11, and 10 years after the initial 9/11 commission report

"Al Qaeda–affiliated groups are now active in more countries than before 9/11."

“The struggle against terrorism is far from over—rather, it has entered a new and dangerous phase.”

“A senior national security official told us that the forces of Islamist extremism in the Middle East are stronger than in the last decade.”

“ISIS now controls vast swaths of territory in Iraq and Syria, creating a massive terrorist sanctuary. One knowledgeable former Intelligence Community leader expressed concern that Afghanistan could revert to that condition once most American troops depart at the end of 2014.”

On PBS Frontline on Jul 29 somebody said about the new terrorist threat:

“This is Al Qaeda 6.0, they make Bin Laden’s Al Qaeda look like boy scouts”

I see the same failed strategy being employed by Israel in Gaza…  

The Israeli army is creating the next generation of Extremist that will make the previous one look like boy scouts…

Why does history have to repeat itself?


spf4j alternative java flight recorder

08:56PM Jul 11, 2014 in category General by Zoltan Farkas

With JDK 7 update 40 Oracle released Java Mission Control + Java Flight Recorder:

(for more detail see: http://docs.oracle.com/javase/8/docs/technotes/guides/jfr/)

As with spf4j, you can use it to implement continuous profiling, and there are some pros and cons to using Java Flight Recorder:

Java Flight Recorder has some implementation advantages that in theory will provide better data quality. Oracle calls the impact "Zero performance overhead", which is sales BS; every engineer with an IQ greater than room temperature knows that there is no such thing. However, the overhead can be minimal and potentially lower than that of spf4j, although not significantly lower.

But don't get ready to throw spf4j out of the window: Java Flight Recorder is available only on the Oracle JVM and is free to use in your test environments only; for production environments you will need to buy a license. Meanwhile spf4j runs on any JVM and is free to use in any environment.

Also, some of the visualizations in spf4j are in my view better...

In any case Java flight recorder is a great tool for implementing continuous profiling.



Easily expose attributes and operations via JMX

10:14AM Jul 05, 2014 in category General by Zoltan Farkas

I implemented a small utility to export attributes and operations via jmx.

All you need to do is:

1) Annotate your attribute getter/setter or operation with @JmxExport

2) invoke: Registry.export("test", "Test", testObj1, testObj2...);

 and your attributes and methods will be available via JMX.

This is available in the latest version of spf4j
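For comparison, this is roughly what exporting one attribute and one operation looks like with the plain JDK API, which is the boilerplate an annotation-based approach saves. The sketch uses only javax.management; the Counter class and the ObjectName "test:type=Counter" are hypothetical names chosen for the example.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

/** Plain-JDK Standard MBean registration; no spf4j involved. */
final class PlainJmxSketch {

    /** Standard MBean convention: interface name = class name + "MBean". */
    public interface CounterMBean {
        int getCount(); // exposed as the read-only attribute "Count"

        void reset();   // exposed as the operation "reset"
    }

    public static final class Counter implements CounterMBean {
        private int count = 41;

        @Override
        public synchronized int getCount() {
            return count;
        }

        @Override
        public synchronized void reset() {
            count = 0;
        }
    }

    /** Registers a Counter and reads its attribute back via the MBean server. */
    static Object registerAndRead() {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName("test:type=Counter"); // hypothetical
            if (!server.isRegistered(name)) {
                server.registerMBean(new Counter(), name);
            }
            return server.getAttribute(name, "Count");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Every exported class needs its own public management interface here, which is what makes annotating existing getters and methods attractive.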


Parallel qsort in zel

08:43PM Apr 04, 2014 in category Java by Zoltan Farkas

I had a bit of time to implement some extra features in zel, just enough so that I can write quick sort in zel:

func qSortP(x, start, end) {
  l = end - start;
  if l < 2 {
    return
  };
  pidx = start + l / 2;
  pivot = x[pidx];
  lm1  = end - 1;
  x[pidx] <-> x[lm1];
  npv = start;
  for i = start; i < lm1; i++ {
    if x[i] < pivot {
      x[npv] <-> x[i];
      npv ++
    }
  };
  x[npv] <-> x[lm1];
  qSortP(x, start, npv)&;
  qSortP(x, npv + 1, end)&
};

qSortP(x, 0, x.length)

As you can see it is pretty much the standard implementation, and since it is ZEL it is parallel.

Parallel exec time = 510 ms
Parallel exec time = 470 ms
Parallel exec time = 473 ms
Single Threaded exec time = 1640 ms
Single Threaded exec time = 1528 ms
Single Threaded exec time = 1527 ms

Tests were executed on a quad-core MacBook Pro and show good scalability of the execution engine.

pretty cool, tests are at https://code.google.com/p/spf4j/source/browse/trunk/spf4j-zel/src/test/java/org/spf4j/zel/vm/QSort.java
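For readers who do not know ZEL, here is a rough Java analogue of the same algorithm using the JDK fork/join framework. It is a sketch for comparison only and not part of spf4j; the class names are made up for the example. The partition step mirrors the ZEL code line by line, and the two async calls become an invokeAll on two subtasks.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

/** Java analogue of the ZEL qSortP above; illustrative only. */
final class ParallelQuickSortSketch {

    static final class SortTask extends RecursiveAction {
        private final int[] x;
        private final int start;
        private final int end; // exclusive

        SortTask(int[] x, int start, int end) {
            this.x = x;
            this.start = start;
            this.end = end;
        }

        @Override
        protected void compute() {
            int l = end - start;
            if (l < 2) {
                return;
            }
            int pidx = start + l / 2;
            int pivot = x[pidx];
            int lm1 = end - 1;
            swap(x, pidx, lm1); // park the pivot at the end
            int npv = start;
            for (int i = start; i < lm1; i++) {
                if (x[i] < pivot) {
                    swap(x, npv, i);
                    npv++;
                }
            }
            swap(x, npv, lm1); // move the pivot to its final position
            // Sort both halves in parallel and wait for both to finish.
            invokeAll(new SortTask(x, start, npv), new SortTask(x, npv + 1, end));
        }
    }

    static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }

    static void sort(int[] x) {
        ForkJoinPool.commonPool().invoke(new SortTask(x, 0, x.length));
    }
}
```

A production version would normally fall back to a sequential sort below some size threshold, since forking a task for two or three elements costs more than sorting them.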



ZEL now has channels.

09:22PM Mar 24, 2014 in category General by Zoltan Farkas

Here is a simple program where we have 1 producer and 10 consumers:

        ch = channel();
        func prod(ch) { for i = 0; i < 100 ; i++ { ch.write(i) }; ch.close()};
        func cons(ch, nr) {
            sum = 0;
            for v = ch.read(); v != EOF; v = ch.read() {
                out(v, ","); sum++
            };
            out("fin(", nr, ",", sum, ")")
        };
        prod(ch); // start producer
        for i = 0; i < 10; i++ { cons(ch, i) } //start consumers

As with ZEL futures, channel operations do not block a thread.

ZEL coroutines are multiplexed over a pool of threads, where channel.read and the (transparent) future.get are points where execution can be suspended.

The current channel implementation is an unbounded channel.
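As a point of comparison, the same producer/consumer shape in plain Java could look like the sketch below. It is only an analogue: here every consumer occupies a real thread that blocks in take(), which is exactly what ZEL avoids. The class name, the EOF sentinel value and the "one marker per consumer" way of closing the channel are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;

/** Blocking-thread analogue of the ZEL channel example; illustrative only. */
final class ChannelSketch {
    // Sentinel marking the end of the stream; assumed never to be a real value.
    private static final Integer EOF = -1;

    /** Returns the total number of items seen by all consumers. */
    static int run(int items, int consumers) {
        BlockingQueue<Integer> ch = new LinkedBlockingQueue<>(); // unbounded
        ExecutorService pool = Executors.newFixedThreadPool(consumers + 1);
        try {
            pool.execute(() -> { // producer
                for (int i = 0; i < items; i++) {
                    ch.add(i);
                }
                for (int c = 0; c < consumers; c++) {
                    ch.add(EOF); // "close": one marker per consumer
                }
            });
            Callable<Integer> consumer = () -> {
                int seen = 0;
                for (Integer v = ch.take(); !EOF.equals(v); v = ch.take()) {
                    seen++;
                }
                return seen;
            };
            List<Future<Integer>> counts = new ArrayList<>();
            for (int c = 0; c < consumers; c++) {
                counts.add(pool.submit(consumer));
            }
            int total = 0;
            for (Future<Integer> f : counts) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```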


spf4j release 6.5.2

11:06PM Feb 23, 2014 in category General by Zoltan Farkas

I finally managed to clean up and improve ZEL to make it worthy of being part of spf4j.

You can check out the source and download binaries (from the Maven repo) at www.spf4j.org



zel and replicas

08:44PM Feb 19, 2014 in category General by Zoltan Farkas

Implemented the zel system function "first", which returns the first value returned by a set of async invocations.

This is in general practical for implementing replica invocations, where we care about the first and fastest result.

Here is a dummy example:

replica = func async (x) {
    sleep random() * 1000;
    out(x, " finished\n");
    return x
};
out(first(replica(1), replica(2), replica(3)), " finished first\n");
sleep 1000

returns something like:

3 finished
3 finished first
2 finished
1 finished

As you can see, in this case 3 finishes first. 2 and 1 finish afterwards, but their results are discarded.

Next on my list are exceptions and canceling async tasks whose results are not needed anymore...
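The closest JDK counterpart to "first" is probably ExecutorService.invokeAny, which returns the result of one successfully completed task and cancels the rest. A minimal sketch, with hypothetical class and method names and fixed delays instead of random ones so the winner is predictable:

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** JDK analogue of ZEL's first(...); illustrative only. */
final class FirstReplicaSketch {

    /** A replica that answers with its id after the given delay. */
    static Callable<Integer> replica(int id, long delayMillis) {
        return () -> {
            Thread.sleep(delayMillis);
            return id;
        };
    }

    /** Runs all replicas concurrently and returns the first available result. */
    static int first(List<Callable<Integer>> replicas) {
        ExecutorService pool = Executors.newFixedThreadPool(replicas.size());
        try {
            return pool.invokeAny(replicas);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow(); // interrupt replicas that are still running
        }
    }
}
```

Unlike the ZEL example, the slower replicas do not run to completion here: invokeAny cancels them once a result is available.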


Zel performance part II

08:39PM Feb 13, 2014 in category Java by Zoltan Farkas

Zel's recursive Fibonacci implementation beats the java, c++ and erlang recursive implementations because of its O(n) characteristics.

You can't compensate for a bad algorithm with the language choice.

fib = func det (x) {fib(x-1) + fib(x-2)};
fib(0) = 0;
fib(1) = 1;

However, java, c and c++ significantly outperform zel in most cases.

I decided to compare zel against 2 similar languages: MVEL and SPEL.

Based on my micro-benchmarks, ZEL looks similar in performance to MVEL and significantly faster than SPEL.

latest tests are at , enjoy!



Zel concurrent programming and performance

10:28PM Feb 12, 2014 in category Java by Zoltan Farkas

Let's take my previous chapter example of calculating pi and see how it performs in sync mode (single threaded):

pi = func (x) {
  term = func (k) {4 * (-1 ** k) / (2d * k + 1)};
  for i = 0; i < x; i = i + 1 { parts[i] = term(i) };
  for result = 0, i = 0; i < x; i = i + 1 { result = result + parts[i] };
  return result
}

executes in about 450 ms

In parallel mode it executes in 375 ms; we get a bit of a gain, but we pound the processors a bit more.

I have optimized the parallel implementation to:

piPart = func (s, x) {
  term = func sync (k) {4 * (-1 ** k) / (2d * k + 1)};
  for i = s; i < x; i = i + 1 {
    parts[i] = term(i)
  };
  for result = 0, i = s; i < x; i = i + 1 {
    result = result + parts[i]
  };
  return result
};

pi = func (x, breakup) {
  range = x / breakup;
  l = breakup - 1;
  for i = 0, result = 0, k = 0; i < l; i = i + 1 {
    part[i] = piPart(k, k + range);
    k = k + range
  };
  part[i] = piPart(k, x);
  for i = 0, result = 0; i < breakup; i = i + 1 {
     result = result + part[i]
  };
  return result
};
pi(100000, 5)

and it executes in about 230 ms, about twice as fast as the single threaded implementation.

Tests were executed on a 4 core laptop and are significantly impaired by power management, which modifies the frequency, disables cores ...

Performance is still far away from the single threaded java implementation, which executes in 10 ms...

Will probably be able to get the times closer if I implement ++ and += in zel, but will probably still be far away from java.
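The chunking idea itself is language independent. As a sketch, the same break-up into a few coarse-grained parts could be written in Java as below. This is not the Java benchmark referred to above (that one is single threaded); the class name and the use of CompletableFuture are choices made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Chunked Leibniz series for pi, mirroring piPart/pi above; illustrative only. */
final class PiChunksSketch {

    /** Sum of the terms 4 * (-1)^k / (2k + 1) for k in [from, to). */
    static double piPart(int from, int to) {
        double sum = 0;
        for (int k = from; k < to; k++) {
            sum += (k % 2 == 0 ? 4.0 : -4.0) / (2.0 * k + 1);
        }
        return sum;
    }

    /** Splits the series into breakup chunks and sums them asynchronously. */
    static double pi(int terms, int breakup) {
        int range = terms / breakup;
        List<CompletableFuture<Double>> parts = new ArrayList<>();
        for (int i = 0; i < breakup; i++) {
            int from = i * range;
            // The last chunk also takes the remainder of the division.
            int to = (i == breakup - 1) ? terms : from + range;
            parts.add(CompletableFuture.supplyAsync(() -> piPart(from, to)));
        }
        double result = 0;
        for (CompletableFuture<Double> part : parts) {
            result += part.join();
        }
        return result;
    }
}
```

With one task per chunk instead of one per term, the scheduling overhead is paid 5 times rather than 100000 times, which is the same reason the optimized ZEL version is faster.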


Concurrent programming comparison with GO

10:49PM Feb 11, 2014 in category Java by Zoltan Farkas

Here is the parallel implementation of calculating PI in ZEL:

pi = func (x) {
  term = func (k) {4 * (-1 ** k) / (2d * k + 1)};
  for i = 0; i < x; i = i + 1 { parts[i] = term(i) };
  for result = 0, i = 0; i < x; i = i + 1 { result = result + parts[i] };
  return result
}

The GO implementation looks like:

package main

import (
    "fmt"
    "math"
)

func main() {
    fmt.Println(pi(5000)) // the term count here is an assumed value
}

func pi(n int) float64 {
    ch := make(chan float64)
    for k := 0; k <= n; k++ {
        go term(ch, float64(k))
    }
    f := 0.0
    for k := 0; k <= n; k++ {
        f += <-ch
    }
    return f
}

func term(ch chan float64, k float64) {
    ch <- 4 * math.Pow(-1, k) / (2*k + 1)
}

 ZEL async function calls do make the code more readable by having the concurrency completely out of the way.


Concurrency/async programming in zel

08:45PM Feb 10, 2014 in category Java by Zoltan Farkas

One of the cool things about zel is concurrency.

Here is a simple example:

 f1 = func {sleep 5000; 1};
 f2 = func {sleep 5000; 2};
 f1() + f2()

This program will return 3 after about 5 seconds.

As you can see, currently all functions are executed asynchronously; no cumbersome futures syntax is needed. The language deals with the futures transparently.
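To make the contrast concrete, here is approximately what the same program looks like in Java with explicit futures. This is a sketch for comparison only; the class and method names are made up, and the delay is a parameter so it can be run quickly.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Explicit-futures version of f1() + f2(); illustrative only. */
final class AsyncSumSketch {

    /** Returns value after sleeping for delayMillis. */
    static int slow(int value, long delayMillis) {
        try {
            Thread.sleep(delayMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return value;
    }

    /** Starts both computations at once and adds their results. */
    static int sum(long delayMillis) {
        // Two threads, so both sleeps really do overlap.
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            CompletableFuture<Integer> f1 =
                    CompletableFuture.supplyAsync(() -> slow(1, delayMillis), pool);
            CompletableFuture<Integer> f2 =
                    CompletableFuture.supplyAsync(() -> slow(2, delayMillis), pool);
            return f1.thenCombine(f2, Integer::sum).join();
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling AsyncSumSketch.sum(5000) corresponds to the ZEL program above: the two sleeps overlap, so the total wait is about one delay rather than two.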


ZEL lives again.

10:51AM Feb 08, 2014 in category General by Zoltan Farkas

I have cleaned up my good old ZEL expression evaluator.
The code now is not only cleaner, but I have added new functionality to the language.

New additions are async programming and memoization, which allow for pretty cool implementations.

With this we can implement the fibonacci function like:

fib = func det (x) {fib(x-1) + fib(x-2)};
fib(0) = 0;
fib(1) = 1;

with O(n) time and O(n) space characteristics,

which makes it possible to actually calculate large fibonacci numbers, unlike the closest implementation in java:

    public long fib(final long i) {
        if (i <= 1) {
            return i;
        } else {
            return fib(i - 1) + fib(i - 2);
        }
    }

where fib(40) takes about 5 ms to execute in zel and 500 ms in java.

implementing fibonacci in java so that it actually works for large numbers looks like:

    public BigInteger fibBNr(final int i) {
        if (i <= 1) {
            return BigInteger.valueOf(i);
        } else {
            BigInteger v1 = BigInteger.ZERO;
            BigInteger v2 = BigInteger.ONE;
            BigInteger tmp;
            for (int j = 2; j <= i; j++) {
                tmp = v2;
                v2 = v1.add(v2);
                v1 = tmp;
            }
            return v2;
        }
    }

and it outperforms zel significantly, calculating fib(10000) in 5 ms while zel takes about 1900 ms, 1600 ms, 1200 ms, 1000 ms.

This shows that there is significant overhead in the ZEL execution, but we are not really comparing apples with apples, since zel does memoization, and after computing fib(10000) calling fib(x) where x <= 10000 will return in O(1) time.

If you need a fibonacci implementation with memoization, the ZEL implementation is probably not a bad choice.
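For a closer apples-to-apples comparison, a memoized recursive version in Java could look like the sketch below. It is not what ZEL does internally; the class name and the HashMap-based cache are assumptions made for the example.

```java
import java.math.BigInteger;
import java.util.HashMap;
import java.util.Map;

/** Memoized recursive Fibonacci; illustrative only. */
final class MemoFibSketch {
    private final Map<Integer, BigInteger> memo = new HashMap<>();

    MemoFibSketch() {
        // Base cases, like fib(0) = 0 and fib(1) = 1 in the ZEL version.
        memo.put(0, BigInteger.ZERO);
        memo.put(1, BigInteger.ONE);
    }

    /**
     * O(n) time and O(n) space on a cold cache, O(1) for cached values.
     * The recursion is n frames deep on a cold cache, so very large n can
     * overflow the thread stack unless the cache is warmed up in steps.
     */
    BigInteger fib(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be >= 0");
        }
        BigInteger cached = memo.get(n);
        if (cached != null) {
            return cached;
        }
        BigInteger value = fib(n - 1).add(fib(n - 2));
        memo.put(n, value);
        return value;
    }
}
```

Once fib(n) has been computed, every fib(x) with x <= n is a single map lookup, which is the behaviour described for ZEL above.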

you can download the code from: http://code.google.com/p/spf4j/



New call graph visualization...

09:57AM Aug 31, 2013 in category General by Zoltan Farkas

Hot methods are not really well visible in the flame charts, so I had an idea to improve them...

I really like the result:


The UI is quite usable, I was able to detect and fix several performance issues in real production code already.

There are a few things to be improved on the UI, but overall it is a big step forward!

Other approaches are to use graphviz to visualize call graphs as suggested in:


which seems to be the way they are visualized at Google as well:



Continuous profiling with spf4j

08:21PM Jul 16, 2013 in category General by Zoltan Farkas

One of the nice things you can do with spf4j is continuous profiling,

you can start your process with:

${JAVA_HOME}/bin/java [your jvm args] org.spf4j.stackmonitor.Monitor -ss -si 100 -di 3600000 -df [folder] -dp [dump prefix] -main [your app main class] -- [your app arguments] 

This will sample your stack every 100 ms and dump the collected data to [folder] in files named [dump prefix]_[start time]_[endtime].ssdump

You can also use the Sampler API directly to achieve this (look at the org.spf4j.stackmonitor.Monitor source for an example).

This way you will always have extra detail available to troubleshoot performance issues; it will allow you to proactively address performance issues and even discover bugs in your application.
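To give an idea of what stack sampling means, here is a deliberately naive sampler built only on the JDK. It is not the spf4j Sampler API and does far less (no call trees, no dump files); the class and method names are made up for the example. Since Thread.getAllStackTraces() captures threads at safepoints, a sampler like this is subject to safepoint bias.

```java
import java.util.HashMap;
import java.util.Map;

/** Naive stack sampler counting top-of-stack methods; illustrative only. */
final class NaiveSamplerSketch {
    private final Map<String, Integer> hits = new HashMap<>();

    /** Takes one sample of every live thread and counts its top frame. */
    void sampleOnce() {
        for (StackTraceElement[] stack : Thread.getAllStackTraces().values()) {
            if (stack.length > 0) {
                String top = stack[0].getClassName() + '.' + stack[0].getMethodName();
                hits.merge(top, 1, Integer::sum);
            }
        }
    }

    /** Samples the given number of times, pausing intervalMillis in between. */
    void sample(int samples, long intervalMillis) {
        for (int i = 0; i < samples; i++) {
            sampleOnce();
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    /** Method name mapped to the number of samples in which it was on top. */
    Map<String, Integer> hits() {
        return hits;
    }
}
```

Methods with the highest counts are where threads spent most of their sampled time, which is the raw data a flame chart or call graph is built from.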



spf4j 4.2 release

09:14PM Apr 17, 2013 in category General by Zoltan Farkas

A lot of new ideas have finally materialized in spf4j.

Along with measurements (tsdb), stack sample data can now be serialized to a file (dumped to a file at an interval (ssdump)).

I have developed a simple UI to visualize the data.