
This project reminds me a lot of Dask (https://dask.org/), a library that allows delayed computation on complex dataframes in Python.
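For readers unfamiliar with the idea, here is a rough sketch of what "delayed" computation means, in plain Python. The `Delayed` class and `delayed` decorator below are hypothetical stand-ins written for illustration; they are not Dask's actual implementation or API. The point is only that calling a wrapped function records the call, and nothing runs until `compute()` is invoked.

```python
# Hypothetical Delayed class for illustration only; this is not Dask's API.
class Delayed:
    def __init__(self, func, *args):
        self.func = func
        self.args = args

    def compute(self):
        # Resolve nested Delayed arguments first, then call the function.
        resolved = [a.compute() if isinstance(a, Delayed) else a for a in self.args]
        return self.func(*resolved)


def delayed(func):
    """Wrap func so that calling it records the call instead of running it."""
    def wrapper(*args):
        return Delayed(func, *args)
    return wrapper


calls = []  # records execution order, to show nothing runs before compute()


@delayed
def load(n):
    calls.append("load")
    return list(range(n))


@delayed
def total(values):
    calls.append("total")
    return sum(values)


graph = total(load(5))    # builds a small task graph; nothing has executed yet
calls_before = list(calls)
result = graph.compute()  # runs load first, then total
```

Because the whole graph is known before anything executes, a real scheduler can decide where and in what order to run each step, which is the part this sketch leaves out.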


Hamilton is more similar to the Prosto data processing toolkit which also relies on column operations defined via Python functions:

https://github.com/asavinov/prosto

However, Prosto supports data processing via column operations across many tables (implemented as pandas data frames) by providing column-oriented equivalents of joins and groupby. It therefore has no explicit joins or groupbys, which are known to be quite difficult and to require considerable expertise.

Prosto also provides Column-SQL which might be simpler and more natural in many use cases.

The whole approach is based on the concept-oriented model of data, which makes functions first-class elements of the model, whereas the relational model has only sets.
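To make "column operations defined via Python functions" a bit more concrete, here is a minimal sketch in plain Python. It assumes a convention in which a function's name is the column it produces and its parameter names are the columns it reads; `build_columns` is a hypothetical helper made up for this example and is neither Hamilton's nor Prosto's real API. Tables are plain dicts of lists rather than pandas data frames to keep it self-contained.

```python
import inspect


# Hypothetical helper for illustration; not Hamilton's or Prosto's real API.
def build_columns(table, column_funcs):
    """Add derived columns to `table` (a dict of column name -> list).

    Assumed convention: each function's name is the output column and
    its parameter names are the input columns it is applied to, row by row.
    """
    pending = {f.__name__: f for f in column_funcs}
    while pending:
        progressed = False
        for name, func in list(pending.items()):
            inputs = list(inspect.signature(func).parameters)
            if all(i in table for i in inputs):
                rows = zip(*(table[i] for i in inputs))
                table[name] = [func(*row) for row in rows]
                del pending[name]
                progressed = True
        if not progressed:
            # Remaining functions depend on columns that can never be built.
            raise ValueError(f"unresolvable columns: {sorted(pending)}")
    return table


def amount(price, quantity):
    return price * quantity


def is_large(amount):
    return amount >= 100


orders = {"price": [10, 25, 40], "quantity": [3, 4, 1]}
# The list order does not matter; dependencies come from parameter names.
build_columns(orders, [is_large, amount])
```

Each derived column lives in one small, individually testable function, and the dependency order falls out of the function signatures instead of being spelled out in a script.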


Yes, I can see how you can make that leap. Right now Hamilton isn't about scaling computation per se; it's more about scaling a code base to handle featurization and the people using that code base.

Hamilton wouldn't replace Dask. Instead, Hamilton could use Dask; we even have a prototype PR to do so.



