
0 points · Athas · 8y ago · 0 comments

APL (and J by extension) is trickier to parallelise than you might expect. The frequent reliance on boxing leads to irregular pointer structures, and the absence of compile-time type information makes it hard to generate code at all. APL performance usually rests on efficient implementations of individual primitives, but that granularity is certainly too fine to be sufficient for bandwidth-starved devices such as GPUs. I contributed to an APL-to-GPU compiler[0], and it was hard to make it work on more than a small (well-behaved) subset.

[0]: https://github.com/melsman/apltail
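
To make the granularity point concrete, here is a rough sketch in plain Python (not APL, and not a description of how apltail works). The function names are made up for illustration. The first version evaluates one primitive at a time, so every step makes a full pass over memory and writes an intermediate array; the second does the same arithmetic in a single pass, which is roughly what a fusing compiler would aim for on a bandwidth-limited device.

```python
def primitive_at_a_time(xs):
    """Hypothetical interpreter-style evaluation: one full pass per primitive."""
    squared = [x * x for x in xs]        # pass 1: read xs, write an intermediate
    shifted = [s + 1 for s in squared]   # pass 2: read it back, write another
    return sum(shifted)                  # pass 3: read the second intermediate


def fused(xs):
    """Same result in one pass over xs, with no intermediate arrays."""
    total = 0
    for x in xs:
        total += x * x + 1
    return total


if __name__ == "__main__":
    data = [1, 2, 3]
    print(primitive_at_a_time(data), fused(data))
```

Both functions compute the same value; the difference is only in how many times the data travels through memory, which is the cost that matters on a GPU.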


1539583023 · 8y ago

Dyalog seem to have done amazing work on this in the last few years. A talk about Dyalog's ongoing performance work: https://video.dyalog.com/Dyalog16/?v=2AeONlTj1aY. Performance info for the latest version: https://www.dyalog.com/dyalog/dyalog-versions/160/performanc...
