Tunguska II

eudoxie · Post by **eudoxie** » 19 Jun 2009 19:13

It isn't really that strange, since modern Java bytecode is compiled to native machine code at runtime by the JIT compiler, which can use all the optimizations available for the host processor. A lot of Java's slowness comes from its rather clunky object and garbage collection system (which my code avoids as much as possible in the speed-critical parts).
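To make the point about objects and garbage collection concrete, here is a minimal sketch of the difference between a boxed and a primitive hot loop. It is not taken from the benchmark sources; the class name `HotLoopSketch`, both methods and the test data are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class HotLoopSketch {

    // Boxed version: the list holds references to Integer objects,
    // and every read goes through auto-unboxing.
    public static long sumBoxed(List<Integer> values) {
        long total = 0;
        for (Integer v : values) {
            total += v;
        }
        return total;
    }

    // Primitive version: one contiguous int[] and no per-element objects,
    // so the loop itself creates nothing for the garbage collector.
    public static long sumPrimitive(int[] values) {
        long total = 0;
        for (int i = 0; i < values.length; i++) {
            total += values[i];
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 1000000; // hypothetical workload size
        int[] primitive = new int[n];
        List<Integer> boxed = new ArrayList<Integer>(n);
        for (int i = 0; i < n; i++) {
            primitive[i] = i;
            boxed.add(i); // boxing happens here
        }
        // Both versions compute the same sum; only the memory behaviour differs.
        System.out.println(sumBoxed(boxed) == sumPrimitive(primitive));
    }
}
```

The two methods return the same result; the primitive one just keeps the inner loop free of object traffic, which is the kind of thing meant by avoiding the object system in speed-critical parts.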

I do have a multicore processor, but (and I've checked to make sure) the Java VM only runs on one core, so it isn't doing some sort of sneaky parallel optimization.

Vectorization is done automatically with -O3 (I compiled the C test code with '-O3 -march=amdfam10'). With some further tweaking, I did manage to cut it down to roughly 3m 20s, but that's still almost a minute slower than Java.

Shaos · Post by **Shaos** » 19 Jun 2009 21:39

Could you please send me the sources of your Java and C benchmarks? I will run them on my Intel Core 2 Duo machine. I don't believe in miracles.

eudoxie · Post by **eudoxie** » 19 Jun 2009 22:03

Actually, I found the problem. A 'volatile' had snuck into the C code from an experiment I did with an inline assembly hack that wasn't worth the added complexity (this volatile of course screwed up optimization).

Now C runs in 2 minutes (Java runs in 2m 30s).

Still uploaded the benchmarking code if you want to have a look: http://www.nedopc.org/ternary/bench.tar.gz

On a completely unrelated side note, I tried compiling the Java code with gcj (again -O3). Interestingly, that was slower than JIT-compiled Java (3 minutes 15 seconds).

Shaos · Post by **Shaos** » 20 Jun 2009 00:33

OK, thanks

My marks on Intel Core 2 Duo E4700 2.6GHz (JDK 1.6.0_14 and GCC 4.2.4):

Java 3m 10s
C-O2 2m 32s
C-O3 2m 03s
C-O3+ 1m 47s (-march=native -funroll-loops -fomit-frame-pointer)
C-O3++ 1m 45s (-march=native -funroll-loops -fomit-frame-pointer -fprefetch-loop-arrays)
C-O3+v 1m 43s (-march=native -funroll-loops -fomit-frame-pointer -fprefetch-loop-arrays -ftree-vectorize)

In all cases only 1 core was utilized

P.S. The modern proprietary JIT compiler is much better than gcj; it's even better than most commercial Java native compilers.

eudoxie · Post by **eudoxie** » 20 Jun 2009 08:06

Hmm, the GCC manual says that -fomit-frame-pointer and -ftree-vectorize are both enabled at -O3 automatically, and -fprefetch-loop-arrays should be enabled at all levels but -Os, so they shouldn't really make any difference.

It's peculiar that your Java benchmark is so slow. You get roughly the same C speeds as me, but much slower Java speeds (by almost a minute). Did you try running in the server VM as well as the client VM?

Shaos · Post by **Shaos** » 20 Jun 2009 08:23

I have Slackware 12.2 and the standard Java distribution.

From the GCC manual: "-O also turns on -fomit-frame-pointer on machines where doing so does not interfere with debugging."

And you are right about -ftree-vectorize and -fprefetch-loop-arrays, but for some reason they gained me a couple of seconds...

eudoxie · Post by **eudoxie** » 20 Jun 2009 10:24

Heh, we run the same operating system version

A 4 second difference is only around 4% on a test that runs for almost 2 minutes. The engineer in me tells me that a difference that small is well within the size of the random measurement errors caused by the operating system.

Shaos · Post by **Shaos** » 20 Jun 2009 14:43

I repeated the C-O3+ test 3 times, and all of the runs took exactly 1m 47s.
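One way to sanity-check whether a gap like this is measurement noise is to compare it with the spread across repeated runs. Here is a minimal sketch using the timings reported in this thread, converted to seconds (1m 47s = 107 s, 1m 43s = 103 s). The class and method names are made up for illustration.

```java
import java.util.Locale;

public class NoiseCheck {

    // Gap between two timings as a percentage of the baseline.
    public static double relativeDiffPercent(double baselineSec, double otherSec) {
        return Math.abs(baselineSec - otherSec) / baselineSec * 100.0;
    }

    // Arithmetic mean of repeated runs of one configuration.
    public static double mean(double[] runsSec) {
        double sum = 0.0;
        for (double r : runsSec) {
            sum += r;
        }
        return sum / runsSec.length;
    }

    // Spread (max minus min) across repeated runs of one configuration.
    public static double spread(double[] runsSec) {
        double min = runsSec[0];
        double max = runsSec[0];
        for (double r : runsSec) {
            min = Math.min(min, r);
            max = Math.max(max, r);
        }
        return max - min;
    }

    // True when the gap to another timing is larger than the run-to-run spread.
    public static boolean gapExceedsSpread(double[] baselineRunsSec, double otherSec) {
        double gap = Math.abs(mean(baselineRunsSec) - otherSec);
        return gap > spread(baselineRunsSec);
    }

    public static void main(String[] args) {
        double[] o3plus = {107.0, 107.0, 107.0}; // three C-O3+ runs at 1m 47s
        double o3v = 103.0;                      // the single C-O3+v run at 1m 43s
        System.out.println(String.format(Locale.ROOT, "gap: %.1f%%",
                relativeDiffPercent(mean(o3plus), o3v)));
        System.out.println("spread: " + spread(o3plus));
        System.out.println("gap exceeds spread: " + gapExceedsSpread(o3plus, o3v));
    }
}
```

With these numbers the gap comes out larger than the spread, but the timings are only given to the nearest second and the C-O3+v figure is a single run, so this is suggestive rather than conclusive.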

eudoxie · Post by **eudoxie** » 20 Jun 2009 15:21

Odd...

I've packaged the stuff I've written so far on the Java version of Tunguska, if anyone wants to poke around in the sources. It's not quite functional yet (I have only implemented around half the instruction set), but it's getting there...

Tunguska.zip

Shaos · Post by **Shaos** » 20 Jun 2009 19:38

PowerBook G4 1.67GHz MacOS X 10.4.11 with Java 1.5.0_13 and GCC 4.0.0:

Java 7m 55s
C-O2 3m 06s
C-O3 2m 32s

Additional options didn't help at all (including -mcpu=G4).

hemuman · Post by **hemuman** » 22 Jun 2009 13:29

by eudoxie on 2009/6/21 3:21:27

Odd...

I've packaged the stuff I've written so far on the Java version of Tunguska, if anyone wants to poke around in the sources. It's not quite functional yet (I have only implemented around half the instruction set), but it's getting there...

Tunguska.zip

I would love to work on those, but only next month.

I'm away from my system.

eudoxie · Post by **eudoxie** » 23 Jun 2009 15:36

No hurry. It won't be operational for a few more weeks. Right now I'm building a makeshift assembler so that I can begin to actually test the instruction code.

I've got a lot of free time now, though, so I get a lot of work done. It's already almost 2500 lines of code

Shaos · Post by **Shaos** » 23 Jun 2009 18:10

eudoxie wrote: No hurry. It won't be operational for a few more weeks. Right now I'm building a makeshift assembler so that I can begin to actually test the instruction code.

I've got a lot of free time now, though, so I get a lot of work done. It's already almost 2500 lines of code

Is it possible to make the input and output interfaces as abstract as possible, so that it is easily portable to Android? Google's Java doesn't have AWT or the regular keyboard/mouse events...

eudoxie · Post by **eudoxie** » 23 Jun 2009 18:32

That will not be a problem, since I'm writing it to be as modular as possible. The only packages I'm importing are from java.util and java.lang.reflect.
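As a rough illustration of what that kind of separation could look like, here is a minimal sketch in which the core only talks to two small interfaces and imports nothing beyond java.util. None of these names come from the Tunguska sources; `KeySource`, `CharSink`, `QueuedKeys`, `RecordingSink` and `Core` are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class IoSketch {

    // Hypothetical input abstraction: the core only asks for the next key code.
    public interface KeySource {
        boolean hasKey();
        int nextKey();
    }

    // Hypothetical output abstraction: the core only reports character codes.
    public interface CharSink {
        void put(int code);
    }

    // Input backed by a queue. An AWT or Android front end would call press()
    // from its own event callbacks; the core never sees those events.
    public static final class QueuedKeys implements KeySource {
        private final Queue<Integer> pending = new ArrayDeque<Integer>();

        public void press(int code) { pending.add(code); }
        public boolean hasKey() { return !pending.isEmpty(); }
        public int nextKey() { return pending.remove(); }
    }

    // Output that just records what it receives, handy for testing.
    public static final class RecordingSink implements CharSink {
        public final List<Integer> received = new ArrayList<Integer>();

        public void put(int code) { received.add(code); }
    }

    // Stand-in for the machine core: it depends on the two interfaces only.
    public static final class Core {
        private final KeySource keys;
        private final CharSink out;

        public Core(KeySource keys, CharSink out) {
            this.keys = keys;
            this.out = out;
        }

        // One step: echo a pending key to the output, if there is one.
        public boolean step() {
            if (!keys.hasKey()) {
                return false;
            }
            out.put(keys.nextKey());
            return true;
        }
    }

    public static void main(String[] args) {
        QueuedKeys keys = new QueuedKeys();
        RecordingSink sink = new RecordingSink();
        Core core = new Core(keys, sink);
        keys.press(65);
        keys.press(66);
        while (core.step()) {
            // keep stepping until the input queue is drained
        }
        System.out.println(sink.received);
    }
}
```

Porting to another platform would then mean writing new implementations of the two interfaces, while the core stays untouched.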

Shaos · Post by **Shaos** » 23 Jun 2009 18:57

eudoxie wrote: That will not be a problem, since I'm writing it as modularized as possible. The only packages I'm importing are from java.util and java.lang.reflect.

http://developer.android.com/reference/ ... mmary.html
http://developer.android.com/reference/ ... mmary.html
