Discussion:
[urbit] Just-in-Time Compiler for u3
Paul Driver
2016-05-30 03:25:11 UTC
Permalink
I've used libjit to implement a jit for u3. It compiles the nock for
fast-hinted core formulas that don't have registered jets.

https://github.com/frodwith/urbit/tree/jit

Feedback and performance testing are welcome. Hopefully this is useful
enough to get integrated into master.

I did have to change the road structure, so this is probably a breaching
change(?), unless someone knows how to use those mysterious future-proof
fields to make it not...
--
You received this message because you are subscribed to the Google Groups "urbit" group.
To unsubscribe from this group and stop receiving emails from it, send an email to urbit-dev+***@googlegroups.com.
To post to this group, send email to urbit-***@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Curtis Yarvin
2016-05-30 03:45:28 UTC
Permalink
I have no comment, I just like saying "holy kamoly!" I feel ambushed!
In a good way! I mean, there are pull requests, and there are pull
requests...

What's the performance like? Or rather, since this is a hard
question, what are seat-of-pants numbers? Also, have you turned the
garbage collector on to do leak testing?
Paul Driver
2016-05-30 05:48:57 UTC
Permalink
I don't know! It computes nock correctly. A fakezod comes up and things
work at the dojo. These things I know :)

I don't know how the profiler or memory leak tester work (yet). We can talk
about it over the next couple of days. Certain things are bound to be
faster, and maybe we can find some easy performance gains. Up to this point
I've been focused on getting it to *work*.

I want to discuss the architecture, too, but it's nice to have some running
code to talk about. So this is definitely *not* a pull request yet. But
look, it runs!

Sorry about the ambush. I've mentioned some things cryptically in :talk,
but you guys have been focused on other important things lately. Can't
wait to see the video from LambdaConf!
Curtis Yarvin
2016-05-30 17:53:45 UTC
Permalink
A profiler (I almost wrote "brofiler," which is what a brogrammer uses
to optimize his brograms) is not in my opinion to be confused with a
benchmark. All benchmarks are lies, but I'd just write some simple
algorithm and time it for a seat-of-the-pants estimate of how big the
win is -- in orders of magnitude, really. Surely already you have
some code that's getting jitted, or you would have spent longer
plotting your ambush!

Our profiler, which really is rather good (just run with -P), will be
used to find the inner loops to target. That's normally automatic in
a JIT, of course, but for us it's manual. Which I for one much
prefer.

To activate the garbage collector, compile with U3_MEMORY_DEBUG and
U3_CELLOC_TOGGLE in include/noun/allocate.h, and run urbit -g. This
will run a tracing garbage collection after every event. The only use
for this is debugging (and possibly it should run occasionally at
runtime just in case).
Paul Driver
2016-05-30 18:19:23 UTC
Permalink
Really, no timing estimates yet. I am actually trying *not* to ambush, you
see. Code bombs are a headache. Also it would be nice to have some code
review in case I'm doing something boneheaded. I've never written a
compiler before, and I'm only *starting* to get a good handle on u3
internals.

I'll play with the profiler and memory debugger soon and get back to you.
Yes, the benchmark is separate, but related, in that there is probably some
opportunity for performance enhancements that I can find with the profiler.
Profilers often tell you interesting things.

Do you have any suggestions though for some code to benchmark?
Curtis Yarvin
2016-05-30 18:42:26 UTC
Permalink
I'd probably add a million entries to a set or something like that.
If you take an actually jetted routine (like add:in) and do a
three-way comparison between Nock/JIT/C, that's not uninteresting.
All benchmarks are lies, anyway.

I am willing to deal with this species of headache!
Paul Driver
2016-05-30 20:55:13 UTC
Permalink
Benchmark: A(3,9) (see https://en.wikipedia.org/wiki/Ackermann_function)

Takes about 25 seconds unjitted, about 17 seconds jitted.
Curtis Yarvin
2016-05-30 22:12:47 UTC
Permalink
Nock is not too bad but it's not magic, so the jit should win much harder here. I'd expect a 10x disparity -- what do you think the cause is? Also, what's the C performance of A(3, 9)? 30% is nice but I'm greedy!

Paul Driver
2016-05-30 22:36:52 UTC
Permalink
C performance:
real 0m0.008s
user 0m0.007s
sys 0m0.000s

Maybe I'm doing it wrong :)
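
For reference, a minimal sketch of what a C version of this benchmark might look like. This is not the program that was actually timed above (that code is not shown in the thread); the names ackermann, ackermann_row3 and ackermann_selfcheck are made up for illustration. It uses the two-argument Ackermann-Peter definition from the Wikipedia page linked earlier, and checks the recursion against the known closed form for the m = 3 row, A(3, n) = 2^(n+3) - 3.

```c
#include <assert.h>
#include <stdint.h>

/* Naive two-argument Ackermann function (Ackermann-Peter form):
 *   A(0, n) = n + 1
 *   A(m, 0) = A(m - 1, 1)
 *   A(m, n) = A(m - 1, A(m, n - 1))
 * Hypothetical helper, not taken from the u3 sources.
 */
static uint64_t ackermann(uint64_t m, uint64_t n)
{
    if (m == 0) {
        return n + 1;
    }
    if (n == 0) {
        return ackermann(m - 1, 1);
    }
    return ackermann(m - 1, ackermann(m, n - 1));
}

/* Closed form for the m = 3 row: A(3, n) = 2^(n+3) - 3.
 * Only valid while n + 3 < 64, which is plenty for this benchmark.
 */
static uint64_t ackermann_row3(uint64_t n)
{
    return (UINT64_C(1) << (n + 3)) - 3;
}

/* Sanity check: the recursion should agree with the closed form
 * for every n up to the benchmark input.
 */
static void ackermann_selfcheck(void)
{
    for (uint64_t n = 0; n <= 9; n++) {
        assert(ackermann(3, n) == ackermann_row3(n));
    }
}
```

With these definitions, ackermann(3, 9) evaluates to 4093. Wall-clock figures like the ones above will depend heavily on compiler and optimization level, so a run of this sketch should not be expected to reproduce them.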
Curtis Yarvin
2016-05-30 23:06:05 UTC
Permalink
Ha! With this many orders of magnitude, one has to wonder about the algorithm...

Sent from my iPhone
Post by Paul Driver
real 0m0.008s
user 0m0.007s
sys 0m0.000s
Maybe I'm doing it wrong :)
Paul Driver
2016-06-01 14:50:00 UTC
Permalink
Right now the compiler is extremely naive. I have some ideas kicking around
in my head for optimization. I think I can make it faster. Stay tuned (but
not too tuned).
Paul Driver
2016-06-01 16:48:50 UTC
Permalink
Which algorithm do you mean? If you mean the C implementation of A vs. the
Hoon one I was benchmarking, they're the same. It's a very simple function.
The C one, however, doesn't have to do a pointer chase down into the Hoon
kernel every time it wants to call decrement, for example. It's also not
working on nouns, but on native machine ints (and thus less correct, but
correct enough for A(3,9)). Even accessing locals involves pointer chasing
(*with* an is_cell check, i.e. a bit-shift, an AND, and a branch). None of
that is happening in the C code. We can optimize a lot of that away, though.
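
For reference, a minimal sketch of what a native-int version of A might look like. This is an assumption about the shape of the C benchmark, not the code that was actually timed; the name `ack` and the choice of `uint64_t` are illustrative only.

```c
#include <stdint.h>

/* Two-argument Ackermann function on native 64-bit integers.
   Not memoized. Arithmetic wraps silently on overflow, which is the
   "less correct" part compared with arbitrary-size atoms, but the
   values reached while computing A(3,9) stay far below that limit. */
static uint64_t ack(uint64_t m, uint64_t n)
{
  if (m == 0) {
    return n + 1;
  }
  if (n == 0) {
    return ack(m - 1, 1);
  }
  return ack(m - 1, ack(m, n - 1));
}
```

Every call here is a direct call with arguments in registers or on the machine stack; there is no cell test, no tree walk, and no reference counting, which is where most of the gap with the Nock version is expected to come from.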

Looking at the generated code, there are also way more calls to gain/lose
than there need to be, and it spends more time doing things like
u3R->pro.nox_d += 1 than it needs to. These are fixable issues.

If you mean some other thing by wondering about the algorithm, though,
please be less cryptic.
Curtis Yarvin
2016-06-01 17:20:44 UTC
Permalink
I'm not being cryptic, I'm just trolling -- well, a little. I was
suspecting that your C implementation of A was memoized and the Hoon
version wasn't.

You definitely don't need to make your JIT do u3R->pro.nox_d += 1!
That should have a profiling ifdef around it, anyway -- it's simply a
Nock instruction count.
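
A rough sketch of what "a profiling ifdef around it" could look like. Only the field name nox_d comes from this thread; the struct, the `NOCK_PROFILE` flag, and the helper names are hypothetical stand-ins, not existing u3 definitions.

```c
#include <stdint.h>

/* Stand-in for the profiling part of the road; hypothetical layout. */
typedef struct {
  struct { uint64_t nox_d; } pro;   /* Nock instruction count */
} road_sketch;

static road_sketch road_instance;
static road_sketch* R = &road_instance;

/* NOCK_PROFILE is a hypothetical build flag. When it is not defined,
   the counter update compiles to nothing. */
#ifdef NOCK_PROFILE
#  define COUNT_NOCK_STEP() (R->pro.nox_d += 1)
#else
#  define COUNT_NOCK_STEP() ((void)0)
#endif

/* Read back the counter (always 0 in a non-profiling build). */
static uint64_t nock_steps(void)
{
  return R->pro.nox_d;
}

/* Toy loop standing in for interpreted or jitted code:
   one counted step per iteration. */
static uint64_t count_up_to(uint64_t n)
{
  uint64_t i = 0;
  while (i < n) {
    COUNT_NOCK_STEP();
    i += 1;
  }
  return i;
}
```

A JIT could make the same decision at code-generation time, emitting the increment only when profiling is requested.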

I think one of the most important optimizations in a JIT is to connect
jets together, so that when you call decrement from JITted code, it
calls decrement, instead of having to burrow down 30 steps into a core
to find the decrement formula, match the jet, and so on. I am always
surprised by my profiler reporting that Hoon programs are spending
relatively little time doing this (it's the 'f' for 'fragment' part of
the profiling result).
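
One way to picture "connecting jets together" is a per-call-site cache: the slow lookup runs once, and later calls from the same site go straight to the resolved function. The sketch below is only an illustration of that idea under simplifying assumptions; the table, the names, and the string-keyed lookup are all hypothetical and do not reflect how u3 registers or matches jets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t (*jet_fn)(uint64_t);

/* A stand-in "jet" for decrement on native ints. */
static uint64_t jet_dec(uint64_t a) { return a - 1; }

typedef struct { const char* name; jet_fn fn; } jet_entry;
static const jet_entry jet_table[] = { { "dec", jet_dec } };

/* Counts how often the slow path runs, for illustration. */
static unsigned slow_lookups = 0;

/* Stand-in for the expensive path: burrow into the core, find the
   formula, match the jet. Here it is just a table scan. */
static jet_fn resolve_slow(const char* name)
{
  slow_lookups += 1;
  for (size_t i = 0; i < sizeof jet_table / sizeof jet_table[0]; i++) {
    if (strcmp(jet_table[i].name, name) == 0) {
      return jet_table[i].fn;
    }
  }
  return NULL;
}

/* One call site in generated code, with its cached target. */
typedef struct { const char* name; jet_fn cached; } call_site;

static uint64_t call_through(call_site* site, uint64_t arg)
{
  if (site->cached == NULL) {
    site->cached = resolve_slow(site->name);   /* first call only */
    assert(site->cached != NULL);              /* sketch assumes a match */
  }
  return site->cached(arg);                    /* direct call afterwards */
}

/* Decrement n to zero through a single call site; returns the step count. */
static uint64_t count_down(uint64_t n)
{
  call_site dec_site = { "dec", NULL };
  uint64_t steps = 0;
  while (n != 0) {
    n = call_through(&dec_site, n);
    steps += 1;
  }
  return steps;
}
```

A real implementation would also need some way to invalidate the cached target if the core it was resolved against changes; that part is left out here.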

In a way, the fundamental investigation of performance that you're
doing here is germane not only to the JIT, but to the interpreter and
all code that uses Nock. So very interested...