Tunguska II

Balanced Ternary Numeral System - forum was moved from http://ternary.info

Moderator: haqreu

eudoxie
Maniac
Posts: 277
Joined: 17 Sep 2012 13:36
Location: 81.170.128.52

Re: Tunguska II

Post by eudoxie »

It isn't really that strange, since modern Java code is compiled to native machine code at runtime by the JIT compiler, which can use all the optimizations available for the host processor. A lot of Java's slowness comes from the rather clunky object and garbage collection system (which my code avoids as much as possible in the speed-critical parts).

I do have a multicore processor, but (and I've checked to make sure) the java vm only runs on one core, so it isn't doing some sort of sneaky parallel optimization.

Vectorization is done automatically with -O3 (I compiled the C test code with '-O3 -march=amdfam10'). With some further tweaking, I did manage to cut it down to roughly 3m 20s, but that's still almost a minute slower than Java.
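
To illustrate the JIT point, here is a minimal timing sketch. It is not the benchmark from this thread: the class name, the method names and the stand-in workload are all hypothetical. It times the same loop several times in one VM, so that any early rounds spent interpreting or compiling can be compared with later rounds.

```java
public class WarmupTiming {
    /** Hypothetical stand-in workload: sums the balanced trit (i % 3) - 1 over n values. */
    static long workload(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += (i % 3) - 1;
        }
        return sum;
    }

    /** Runs the workload `rounds` times and returns the elapsed nanoseconds of each round. */
    static long[] timeRounds(int rounds, int n) {
        long[] elapsed = new long[rounds];
        long checksum = 0;
        for (int r = 0; r < rounds; r++) {
            long start = System.nanoTime();
            checksum += workload(n);
            elapsed[r] = System.nanoTime() - start;
        }
        // Use the checksum so the VM cannot discard the workload as dead code.
        if (checksum == Long.MIN_VALUE) {
            System.out.println("unreachable for this workload");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        long[] elapsed = timeRounds(10, 50_000_000);
        for (int r = 0; r < elapsed.length; r++) {
            System.out.println("round " + r + ": " + elapsed[r] / 1_000_000 + " ms");
        }
    }
}
```

The per-round times depend on the VM and the machine, so no expected output is given here.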
Shaos
Admin
Posts: 24379
Joined: 08 Jan 2003 23:22
Location: Silicon Valley

Re: Tunguska II

Post by Shaos »

Could you please send me the sources of your Java and C benchmarks? I will run them on my Intel Core 2 Duo machine - I don't believe in miracles ;)
eudoxie
Maniac
Posts: 277
Joined: 17 Sep 2012 13:36
Location: 81.170.128.52

Re: Tunguska II

Post by eudoxie »

Actually, I found the problem. A 'volatile' had snuck into the C code from an experiment I did with an inline assembly hack that wasn't worth the added complexity (this volatile of course screwed up optimization).

Now C runs in 2 minutes (Java runs in 2m 30s).

I still uploaded the benchmarking code if you want to have a look: http://www.nedopc.org/ternary/bench.tar.gz

On a completely unrelated side note, I tried compiling the java code with gcj (again -O3). Interestingly, that was slower than JIT-compiled Java (3 minutes 15 seconds).
Shaos
Admin
Posts: 24379
Joined: 08 Jan 2003 23:22
Location: Silicon Valley

Re: Tunguska II

Post by Shaos »

OK, thanks :)
My results on an Intel Core 2 Duo E4700 2.6GHz (JDK 1.6.0_14 and GCC 4.2.4):

Java 3m 10s
C-O2 2m 32s
C-O3 2m 03s
C-O3+ 1m 47s (-march=native -funroll-loops -fomit-frame-pointer)
C-O3++ 1m 45s (-march=native -funroll-loops -fomit-frame-pointer -fprefetch-loop-arrays)
C-O3+v 1m 43s (-march=native -funroll-loops -fomit-frame-pointer -fprefetch-loop-arrays -ftree-vectorize)

In all cases only 1 core was utilized

P.S. The modern proprietary JIT compiler is much better than gcj; it's even better than most commercial native Java compilers ;)
eudoxie
Maniac
Posts: 277
Joined: 17 Sep 2012 13:36
Location: 81.170.128.52

Re: Tunguska II

Post by eudoxie »

Hmm, the gcc manual says that -fomit-frame-pointer and -ftree-vectorize are both enabled automatically at -O3, and that -fprefetch-loop-arrays should be enabled at all levels but -Os, so they shouldn't really make any difference.


It's peculiar that your Java benchmark is so slow. You get roughly the same C speeds as me, but much slower java speeds (by almost a minute). Did you try running in the server vm as well as client?
Shaos
Admin
Posts: 24379
Joined: 08 Jan 2003 23:22
Location: Silicon Valley

Re: Tunguska II

Post by Shaos »

I have Slackware 12.2 and the standard Java distribution.
The gcc manual says: "-O also turns on -fomit-frame-pointer on machines where doing so does not interfere with debugging."
And you are right about -ftree-vectorize and -fprefetch-loop-arrays, but for some reason they gave me a couple of seconds...
eudoxie
Maniac
Posts: 277
Joined: 17 Sep 2012 13:36
Location: 81.170.128.52

Re: Tunguska II

Post by eudoxie »

Heh, we run the same operating system version :-)

A 4 second speed difference is only around 4% on a test that runs for almost 2 minutes. The engineer in me tells me that a difference that small is well within the size of the random measurement errors caused by the operating system.
Shaos
Admin
Posts: 24379
Joined: 08 Jan 2003 23:22
Location: Silicon Valley

Re: Tunguska II

Post by Shaos »

I repeated the C-O3+ test 3 times and all of the runs took exactly 1m 47s :)
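
As a rough check of how the gap between builds compares with run-to-run noise, the timings posted in this thread can be plugged into a small Java sketch. The class and method names are hypothetical; the numbers are the three C-O3+ runs at 1m 47s and the single C-O3+v run at 1m 43s.

```java
import java.util.Locale;

public class TimingCheck {
    /** Converts a "minutes, seconds" timing to whole seconds. */
    static int toSeconds(int minutes, int seconds) {
        return minutes * 60 + seconds;
    }

    /** Difference between the slowest and fastest run, in seconds. */
    static int spread(int[] runs) {
        int min = runs[0];
        int max = runs[0];
        for (int r : runs) {
            min = Math.min(min, r);
            max = Math.max(max, r);
        }
        return max - min;
    }

    /** How much faster `other` is than `baseline`, in percent of the baseline. */
    static double percentGain(int baseline, int other) {
        return 100.0 * (baseline - other) / baseline;
    }

    public static void main(String[] args) {
        int o3plus = toSeconds(1, 47);             // C-O3+, as posted
        int o3plusV = toSeconds(1, 43);            // C-O3+v, as posted
        int[] repeats = {o3plus, o3plus, o3plus};  // the three repeated C-O3+ runs

        System.out.println("spread: " + spread(repeats) + " s");  // prints "spread: 0 s"
        System.out.printf(Locale.ROOT, "gain: %.1f%%%n",
                percentGain(o3plus, o3plusV));                     // prints "gain: 3.7%"
    }
}
```

A spread of 0 s over three runs suggests the noise is below the one-second resolution of the timings, although three runs is a small sample.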
eudoxie
Maniac
Posts: 277
Joined: 17 Sep 2012 13:36
Location: 81.170.128.52

Re: Tunguska II

Post by eudoxie »

Odd...

I've packaged the stuff I've written so far on the Java version of Tunguska, if anyone wants to poke around in the sources. It's not quite functional yet (I have only implemented around half the instruction set), but it's getting there...

Tunguska.zip
Shaos
Admin
Posts: 24379
Joined: 08 Jan 2003 23:22
Location: Silicon Valley

Re: Tunguska II

Post by Shaos »

Shaos wrote: OK, thanks :)
My marks on Intel Core 2 Duo E4700 2.6GHz (JDK 1.6.0_14 and GCC 4.2.4):

Java 3m 10s
C-O2 2m 32s
C-O3 2m 03s
C-O3+ 1m 47s (-march=native -funroll-loops -fomit-frame-pointer)
C-O3++ 1m 45s (-march=native -funroll-loops -fomit-frame-pointer -fprefetch-loop-arrays)
C-O3+v 1m 43s (-march=native -funroll-loops -fomit-frame-pointer -fprefetch-loop-arrays -ftree-vectorize)

PowerBook G4 1.67GHz, Mac OS X 10.4.11 with Java 1.5.0_13 and GCC 4.0.0:

Java 7m 55s
C-O2 3m 06s
C-O3 2m 32s

No additional options helped at all (including -mcpu=G4).
hemuman

Re: Tunguska II

Post by hemuman »

by eudoxie on 2009/6/21 3:21:27

Odd...

I've packaged the stuff I've written so far on the Java version of Tunguska, if anyone wants to poke around in the sources. It's not quite functional yet (I have only implemented around half the instruction set), but it's getting there...

Tunguska.zip
I would love to work on those, but only next month :(
I'm away from my system :(
eudoxie
Maniac
Posts: 277
Joined: 17 Sep 2012 13:36
Location: 81.170.128.52

Re: Tunguska II

Post by eudoxie »

No hurry. It won't be operational for a few more weeks. Right now I'm building a makeshift assembler so that I can begin to actually test the instruction code.

I've got a lot of free time now, though, so I get a lot of work done. It's already almost 2500 lines of code :-)
Shaos
Admin
Posts: 24379
Joined: 08 Jan 2003 23:22
Location: Silicon Valley

Re: Tunguska II

Post by Shaos »

eudoxie wrote: No hurry. It won't be operational for a few more weeks. Right now I'm building a makeshift assembler so that I can begin to actually test the instruction code.

I've got a lot of free time now, though, so I get a lot of work done. It's already almost 2500 lines of code :-)
Is it possible to make the input and output interfaces as abstract as possible, so that it is easily portable to Android? Google's Java doesn't have AWT or the regular keyboard/mouse events... ;)
eudoxie
Maniac
Posts: 277
Joined: 17 Sep 2012 13:36
Location: 81.170.128.52

Re: Tunguska II

Post by eudoxie »

That will not be a problem, since I'm writing it as modularized as possible. The only packages I'm importing are from java.util and java.lang.reflect.
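
A minimal sketch of what such a separation could look like, importing only from java.util. This is not the actual Tunguska code: every interface, class and method name below is hypothetical. The emulator core only sees two small interfaces, and each front end (AWT, Android, or a test) supplies its own implementations.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class IoSketch {
    /** Hypothetical key input: the front end pushes keys, the core polls them. */
    interface KeyInput {
        boolean hasKey();
        int nextKey();
    }

    /** Hypothetical character output that the core writes to. */
    interface CharOutput {
        void put(int ch);
    }

    /** Queue-backed input that any front end can feed. */
    static final class QueuedKeyInput implements KeyInput {
        private final Queue<Integer> keys = new ArrayDeque<>();

        void push(int key) { keys.add(key); }

        public boolean hasKey() { return !keys.isEmpty(); }

        public int nextKey() { return keys.remove(); }
    }

    /** Output that collects characters in memory, e.g. for headless tests. */
    static final class BufferOutput implements CharOutput {
        private final StringBuilder buffer = new StringBuilder();

        public void put(int ch) { buffer.append((char) ch); }

        String contents() { return buffer.toString(); }
    }

    /** Stand-in for the emulator core: echoes every pending key to the output. */
    static void echoPending(KeyInput in, CharOutput out) {
        while (in.hasKey()) {
            out.put(in.nextKey());
        }
    }

    public static void main(String[] args) {
        QueuedKeyInput in = new QueuedKeyInput();
        BufferOutput out = new BufferOutput();
        in.push('o');
        in.push('k');
        echoPending(in, out);
        System.out.println(out.contents()); // prints "ok"
    }
}
```

The queue here is not thread-safe; a real front end that pushes keys from a UI thread would need some synchronization around it.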
Shaos
Admin
Posts: 24379
Joined: 08 Jan 2003 23:22
Location: Silicon Valley

Re: Tunguska II

Post by Shaos »

eudoxie wrote: That will not be a problem, since I'm writing it as modularized as possible. The only packages I'm importing are from java.util and java.lang.reflect.
http://developer.android.com/reference/ ... mmary.html
http://developer.android.com/reference/ ... mmary.html