I'm not being cryptic, I'm just trolling -- well, a little. I was
suspecting that your C implementation of A was memoized and the Hoon
version wasn't.
You definitely don't need to make your JIT do u3R->pro.nox_d += 1!
That should have a profiling ifdef around it, anyway -- it's simply a
Nock instruction count.
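A minimal sketch of the kind of guard I mean -- the define name here
(U3_CPU_DEBUG) is just illustrative, not necessarily what the tree
actually uses, and this assumes u3R is in scope as it is inside the
interpreter:

    /* count Nock steps only when profiling support is compiled in,
    ** so JITted code (and the interpreter fast path) never pays for it
    */
    #ifdef U3_CPU_DEBUG
    #  define _n_count()  ( u3R->pro.nox_d += 1 )
    #else
    #  define _n_count()  ( (void)0 )
    #endif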
I think one of the most important optimizations in a JIT is to connect
jets together, so that when you call decrement from JITted code, it
calls the decrement jet directly, instead of having to burrow down 30
steps into a core to find the decrement formula, match the jet, and so
on. I am always
surprised by my profiler reporting that Hoon programs are spending
relatively little time doing this (it's the 'f' for 'fragment' part of
the profiling result).
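To make the shape of that optimization concrete, here's a toy sketch in
plain C -- no u3 types or API, every name in it is made up for
illustration -- of the difference between re-resolving the callee
through the tree on every call and binding its jet once when the caller
is compiled:

    #include <stdint.h>
    #include <stdio.h>

    typedef uint64_t noun;                 /* toy stand-in for a noun */
    typedef noun (*jet_fn)(noun);

    static noun toy_dec_jet(noun a) { return a - 1; }

    /* what the interpreter effectively does on every call: walk into
    ** the core, find the formula, match it against the jet dashboard,
    ** then run the matching jet
    */
    static noun call_via_tree(noun a)
    {
      /* ...~30 fragment steps and a jet match elided... */
      return toy_dec_jet(a);
    }

    /* what JITted code can do instead: resolve the jet once at compile
    ** time and emit a direct call through the saved function pointer
    */
    static jet_fn dec_site = toy_dec_jet;  /* bound when the caller is JITted */

    int main(void)
    {
      printf("%llu %llu\n",
             (unsigned long long)call_via_tree(10),
             (unsigned long long)dec_site(10));
      return 0;
    }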
In a way, the fundamental investigation of performance that you're
doing here is germane not only to the JIT, but to the interpreter and
all code that uses Nock. So very interested...
Post by Paul Driver
Which algorithm do you mean? If you mean the C implementation of A vs the
Hoon one I was benchmarking, they're the same. It's a very simple function.
The C one, however, doesn't have to do a pointer chase down into the Hoon
kernel every time it wants to call decrement, for example. It's also not
working on nouns, but on native machine ints (and thus less correct, but
correct enough for A(3,9)). Even accessing locals involves pointer chasing
(*with* an is_cell check, i.e. a bit-shift and a branch). None of that is
happening in the C code. We can optimize a lot of that away, though.
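For reference, the C side of that comparison is essentially the textbook
recursion on native ints -- something like the sketch below (not the
exact benchmark source; it would overflow for larger inputs, but as
noted it's correct enough for A(3,9)):

    #include <stdio.h>

    /* Ackermann on native machine ints: no nouns, no reference counts,
    ** and no pointer chase to find decrement -- just registers and the
    ** C stack.
    */
    static unsigned long
    ack(unsigned long m, unsigned long n)
    {
      if ( 0 == m ) return n + 1;
      if ( 0 == n ) return ack(m - 1, 1);
      return ack(m - 1, ack(m, n - 1));
    }

    int main(void)
    {
      printf("%lu\n", ack(3, 9));   /* 4093 */
      return 0;
    }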
Looking at the generated code, there are also way more calls to gain/lose
than there need to be, and it spends more time doing things like
u3R->pro.nox_d += 1 than it needs to. These are fixable issues.
If you mean some other thing by wondering about the algorithm, though,
please be less cryptic.
Post by Paul Driver
Right now the compiler is extremely naive. I have some ideas kicking
around in my head for optimization. I think I can make it faster. Stay
tuned (but not too tuned).
Post by Curtis Yarvin
Ha! With this many orders of magnitude, one has to wonder about the
algorithm...
real 0m0.008s
user 0m0.007s
sys 0m0.000s
Maybe I'm doing it wrong :)
Post by Curtis Yarvin
Nock is not too bad but it's not magic, so the JIT should win much
harder here. I'd expect a 10x disparity -- what do you think the cause is?
Also, what's the C performance of A(3, 9)? 30% is nice but I'm greedy!
Benchmark: A(3,9) (see https://en.wikipedia.org/wiki/Ackermann_function)
Takes about 25 seconds unjitted, about 17 seconds jitted.
Post by Curtis Yarvin
I'd probably add a million entries to a set or something like that.
If you take an actually jetted routine (like add:in) and do a
three-way comparison between Nock/JIT/C, that's not uninteresting.
All benchmarks are lies, anyway.
I am willing to deal with this species of headache!
Post by Paul Driver
Really, no timing estimates yet. I am actually trying *not* to ambush, you
see. Code bombs are a headache. Also it would be nice to have some code
review in case I'm doing something boneheaded. I've never written a compiler
before, and I'm only *starting* to get a good handle on u3 internals.
I'll play with the profiler and memory debugger soon and get back to you.
Yes, the benchmark is separate, but related, in that there is probably some
opportunity for performance enhancements that I can find with the profiler.
Profilers often tell you interesting things.
Do you have any suggestions though for some code to benchmark?
Post by Curtis Yarvin
A profiler (I almost wrote "brofiler," which is what a brogrammer uses
to optimize his brograms) is not, in my opinion, to be confused with a
benchmark. All benchmarks are lies, but I'd just write some simple
algorithm and time it for a seat-of-the-pants estimate of how big the
win is -- in orders of magnitude, really. Surely already you have
some code that's getting jitted, or you would have spent longer
plotting your ambush!
Our profiler, which really is rather good (just run with -P), will be
used to find the inner loops to target. That's normally automatic in
a JIT, of course, but for us it's manual. Which I for one much
prefer.
To activate the garbage collector, compile with U3_MEMORY_DEBUG and
U3_CELLOC_TOGGLE in include/noun/allocate.h, and run urbit -g. This
will run a tracing garbage collection after every event. The only use
for this is debugging (and possibly it should run occasionally at
runtime just in case).
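Concretely, that means enabling something like the following near the top
of include/noun/allocate.h (a sketch -- check the header itself for where
the toggles actually live and what the second one does), rebuilding, and
then starting the pier with the -g flag:

    /* include/noun/allocate.h */
    #define U3_MEMORY_DEBUG     /* per-allocation tracking for leak checks */
    #define U3_CELLOC_TOGGLE    /* second toggle mentioned above; see the
                                ** header's own comment for details
                                */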
Post by Paul Driver
I don't know! It computes Nock correctly. A fakezod comes up and things
work at the dojo. These things I know :)
I don't know how the profiler or memory leak tester work (yet). We can
talk about it over the next couple of days. Certain things are bound to
be faster, and maybe we can find some easy performance gains. Up to this
point I've been focused on getting it to *work*.
I want to discuss the architecture, too, but it's nice to have some
running code to talk about. So this is definitely *not* a pull request
yet. But look, it runs!
Sorry about the ambush. I've mentioned some things cryptically in :talk,
but you guys have been focused on other important things lately. Can't
wait to see the video from LambdaConf!
Post by Curtis Yarvin
I have no comment, I just like saying "holy kamoly!" I feel ambushed!
In a good way! I mean, there are pull requests, and there are pull
requests...
What's the performance like? Or rather, since this is a hard
question, what are seat-of-pants numbers? Also, have you turned the
garbage collector on to do leak testing?
Post by Paul Driver
I've used libjit to implement a JIT for u3. It compiles the Nock for
fast-hinted core formulas that don't have registered jets.
https://github.com/frodwith/urbit/tree/jit
Feedback and performance testing are welcome. Hopefully this is useful
enough to get integrated into master.
I did have to change the road structure, so this is probably a
breaching change(?), unless someone knows how to use those mysterious
future-proof fields to make it not...