Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

In compilers, there's been a recent uptick in industry research and adoption into using equivalency classes and graphs (egraphs) for doing optimization passes in a way that preserves information and solves the phase ordering problem. Egg[0] was one of the big libraries that came out of it, but can only handle term rewriting without guards, and so can't express y/x*x -> y because it's unsound when x=0, or optimization passes that are across control flow points and thus some of the eclasses are only valid in some locations. The Cranelift developers adapted it into a construction they call aegraphs[1] that handles this, and are migrating Cranelift to use these passes for all optimizations, which is (hopefully) faster and achieves more optimization opportunitie; Chris Fallin is presenting about their aegraph work at PLDI this year.

0: https://egraphs-good.github.io/

1: https://github.com/cfallin/rfcs/blob/main/accepted/cranelift...



The other recent research that's been interesting to watch has been in logic programming (Prolog and Datalog), where there's been a lot of papers extending Datalog with semilattice types that are more efficient for computing dataflow-type queries (and even getting lattice types added to Souffle recently), including papers like datafun[0] extending it to sound incremental queries for more efficient execution - and there's even a paper by the Egg authors using it for efficient eclass-based Datalog queries using lattices[1]. It also seems like there has been more work recently in actually integrating logic programming in existing languages and programs, like ascent[2], instead of it being off in it's own world like it seems has historically been the case.

0: http://www.rntz.net/files/seminaive-datafun.pdf

1: https://www.mwillsey.com/papers/egglog

2: https://s-arash.github.io/ascent/cc22main-p95-seamless-deduc...


How does Datalog with e-graphs compare to Datalog with BDDs in terms of what analyses can be performed? bddbddb is a Datalog engine using BDDs, that, according to their 2005 paper[0], can solve problems like context-sensitive pointer analysis for large programs and analyses implemented with it are faster than hand-tuned implementations in dramatically fewer lines of code. I'm not knowledgeable on what modern Datalog engines are based on.

0: https://people.csail.mit.edu/mcarbin/papers/aplas05.pdf


I'm under the impression that BDDs are good for more efficient representation of row membership for values, so that you can query and saturate more facts faster, but eclasses allow for writing rules that "automatically" have transitive connectivity of two rules. If you have X -> Y and Y -> Z, you only need to store one of X or Y :- Z since X and Y have been unioned together and may be treated identically, along with a same(X, Z) rule firing instead of having to have additional transitive same(X, same(Y, Z)) rules. I don't profess to be an expert in BDDs, eqsat, or Datalog, though! I have one friend that did a thesis on eqsat who wasn't that big of a fan of bddbddb but I don't remember exactly why.


Yeah, it's really unfortunate that these amazing advances in datalog don't make it into general purpose languages.


You might be interested in Flix which has first-class Datalog program values:

https://flix.dev/

https://doc.flix.dev/fixpoints.html

(I am one of the developers of Flix)


The documentation could use some proof reading. Grammar etc.

> Flix is strongly typed. Any attempt to use predicate symbol with terms of the wrong type (or with the wrong arity) is caught by the type checker.

*Statically typed


The feature list is impressive


I was about to post this. If only the vscode plugin worked I would use Flix.


What's the issue? Bug reports are most welcome.


The latest version seems to have solved the issue. At least the compiler plugin runs with an empty file now.


Mainstream languages will lag behind others forever. The average programmer in Java or C# neither cares for nor understands 'new' features or concepts. Most people I know in this area have never even heard about prolog, lisp, smalltalk, etc.


The recent work on relational, datalog-inspired egraphs in PLDI this year ("Unifying Datalog and Equality Saturation") is actually interesting because it can solve cases like the y/x*x -> y identity example, by the power of an interval analysis on x (among other things.) Sort of like adding a postulate to a rewrite term and then proving it. But instead it's by adding relations between terms in the graph. See section 6.2 of the paper.

https://github.com/egraphs-good/egglog

https://arxiv.org/pdf/2304.04332.pdf


thanks, this is cool!




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: