If you're interested in spreadsheets, don't miss sc, the Standard Unix Spreadsheet™ which nobody has ever heard of (the un-Googleable name doesn't help). It was originally written by James Gosling in the 1980s and is in Debian. It runs in a terminal and uses vi keybindings:
Indeed. In fact I liked it so much that I wrote a program very similar that can read/write such files on Windows[1]. I couldn't find how to compile the original on Windows.
The file format is pretty simple. At least for the spreadsheets I create.
As a side note, I love text-based (or extremely simple) file formats. So easy to write tools for them.
FWIW on Debian-like systems there is "apropos" that searches for binaries based on keywords in their descriptions. My problem is when I need it I can never remember the name, just spent 10 minutes search on variations of "advogato".
https://en.wikipedia.org/wiki/Apropos_%28Unix%29 : "Often a wrapper for the "man -k" command, the apropos command is used to search all manual pages for the string specified. This is often useful if one knows the action that is desired, but does not remember the exact command or page name."
TBH learning now that "man -k" exists (I don't think I can have "man man"-ed in the last decade) is probably a lot more useful for me.
There's nothing broken about that. If you want to do full text search, do `man -K` instead of `man -k`. E.g.
man -K MANOPT
Works for me on Ubuntu 14.04, though slowly so and I still have to ask the pager to search for it within the manpage which is opened. Because of this, I read the manpages when I know where to find what I'm looking for but otherwise just Google.
A far more severe problem IMO is the fact that many Linux distros ship without manpages in their default installs.
I downloaded the Makefile from the same site [1]. I changed CC to gcc (not sure if necessary) then just ran "make jarijyrki". The relevant build command makes it clear why your compile command fails.
I guess there was a limitation on the length on the compilation command too and the winner made use of it fully. You have to run it as "./jarijyrki < sheet1.info > myedits.info", sheet1.info is in the same site[2].
Not exactly: you don't get the O(1) access to the two previous computations with loeb. It abstracts the collection used for memoization, but it doesn't provides "locality".
Emacs comes with a couple of built in spreadsheets: ses-mode supports emacs lisp formula in cells, org-mode tables allow simple formulas in table cells. Furthermore, there are half a dozen or so other packages that users have written that are older or are more specialized in some way. See http://www.emacswiki.org/emacs/SpreadSheet.
All of these spreadsheets are, naturally, written in emacs lisp.
One of my good friends worked with some researcher to offload the work to a GPU ten years ago.
I have stumbled on massive spreadsheets in the wild where most of the cells needed to be recalculated when you changed something, and that could take a long time to recalculate.
It seems like most software out there today does this, which is pretty cool.
I wish there was a interface for data analysis that people could cut and paste any data into it and it outputs a magical spreadsheet that business people can use.
I once had to implement a tiny spreadsheet with cell references and simple ranged formula in Perl given 2 hours. Possibly the most interesting programming test I've had for a job yet.
You'll probably also end up learning a thing or two about JavaScript. Even now, 2 years after I first saw this, I'm seeing new things in it, and I've been doing JS for 20 years.
I hope you are joking. This isn't a spreadsheet, it's a stupid bruteforce algorithm that misses the point of the dataflow paradigm:
- Everything gets recomputed on any minor change.
- If a cell is referenced by N other cells, it will be recomputed N+1 times on every update. Worse, this can be exponential for A2=A1+A1, A3=A2+A2, which leads to A1 being recomputed 5 times. There's no sharing or memoization.
- Circular references are detected because they trigger a "too much recursion" exception.
- An impure cell "=Math.random()" will be observed with different values by its dependencies.
Not that I know of. It exploits some of JavaScript's weird scoping rules to keep everything very short. You're not going to get that in most other languages.
? It can be implemented in Python without any problems. Of the dynamic languages I know, at least PERL, Python, Ruby, Smalltalk, Lua, Io, Racket have support for something similar. And even if not directly supported, any other language with `eval` would probably be hackable enough to implement this. In TCL that's actually a common idiom.
Anyway, I'm not aware of any JS feature which wouldn't be easily replicated in most other dynamic languages. JS was rather unlucky when it comes to language features - its development stalled for years, while in the same time Python, Perl, Ruby, PHP evolved quickly. JS only now gets features implemented in other langs many years ago.
Incidentally, I have been playing around with the concept myself (client work) and it turns out you can hack a semi-working solution in just a handful lines of js:
I know a dirty Spreedsheet for browser. I first seen it in IRC, iirc. I was a greedy copycat to paste it into my site, for the case paste server goes down.
I've been thinking recently it would be nice to have an easy to work with open standard for spreadsheets - maybe based on JSON.
NB I know you can get Google spreadsheet data as JSON, but I was more thinking of something where you could have "headless" operation that could be embedded in other applications.
You can use OpenDocument Format files. These are supported by most spreadheet implementations. It comes in two variants:
1) a zip file with XML inside
2) a single XML file
Nevertheless, removing the "reads" seems more worthwhile than removing "observers". You want to model the data flow efficiently, not the data dependencies. The dependencies are provided indirectly in the code anyways.
Even if you manage to remove the explicit "reads" field, the "code" thunk is still going to point to the parent cells, so that wouldn't help the GC much.
Why do all the simple spreadsheet implementations forbid circular references? A colleague of mine made a very nice solver for temperature and pressure calculations in a transformer which would have been much more complicated without circular references.
Presumably a naive implementation would run the risk of having an infinite loop? Have you got any references of the theory proving them safe and/or removing the references?
I don't know if an arbitrary set of cell formulas could be proven to converge but I do know from experience that it is a useful technique. The trick is in setting the initial conditions; usually there will have to be one or more cells that contain a check for a wildly out of range value that will force the cell value into a reasonable range. Excel, and other spreadsheet programs, have counters that are used to halt circular calculations after an arbitrary limit. This is really no different than using a while loop to compute a value iteratively.
I feel there is an interesting bit (or example) missing: how does the (OCaml) code look that implements the expressions that are evaluated? The code returns a value and the list of cells it read - how is obtaining this list best organised?
Every time I read about reactive patterns, I hope that some data dependency or cells like behavior is included, and every time it's just some thin wrapper around basic pub/sub.
which basically allows to express interdependencies between data (a.k.a. formulas) in a way that updates sink variables when source variables are changed.
At the moment, I think not even Rx extensions for .NET (which I consider to be the most advanced reactive implementation in a mainstream language) do support this style of computation.
I did not take the approach of cells actually observing each other. Instead I had a recursive function that worked from the entered cell to:
-- parsed a string DSL of "=sum(B4:B8)" into a javascript function, that pulls some premade functions like sum, vlookup etc, and makes a string list representing what an arguments getter will need to fetch ("a_r3c1_r7c1" ie array, row 3, col 1 to row 7 col 1). The user still just sees "=sum(B4:B8)"
-- note who "I depend on",
-- note who "depends on me"
-- recurse 'outwards', skipping cells in "depends on me" that are still waiting on 'needs recalc' of their own "I depend on".
The update algorithm is not the hard part. Microsoft also has a very detailed documentation of their own update algorithm online (cant find it at the moment though).
The hardest part was updating the string representations of the formulas when you insert a new column or row, and then re-updating each cell's dependencies arrays.
Definitely not a performance problem, but more of a "how the fuck do I wrap my head around all the different ways someone could want to insert, cut, copy and paster here".
One mistake I made was trying to impliment the undo/redo to be totally reversable at every step. So every command stores the way to go both back and forward. In hindsight, I should have just stored forward commands and rebaked from the beginning when someone wanted to go back in time.
I wish I had more time to work on the project, but I've been busy with other work.
- A lot of people mentioning Handsontable.js. That library has some major design flaws. We used it in our git style version controlled spreadsheet app (http://www.gridhub.xyx). Handsontable only takes simple 'number' or 'string' value for each cell. It should have been an object that could store CSS for that cell, formula for that cell, and the value for that cell. I made a few pull requests, but they largely ignored me. The code is odd (they use labels, continue, and other C style code).Handsontable is great only for presenting tables. Its terrible for implementing a full excel clone.
Any programmer with much experience should be able to follow the structure of that article and translate the concepts to the programming language of their preference.
There is nothing language specific here, just think of the OCaml as pseudo-code.
Totally of topic to spreadsheets but why oh why does blogspot still change the domain based on ip? (Yes, I know that it is related to selectively blocking certain blogs in certain areas but there has to be a better way to do this, seriously.)
http://www.linuxjournal.com/article/10699