I've just been exploring serving large SQLite databases in chunks and querying them with HTTP range requests so you don't have to download the entire database. It's pretty awesome!
I found a really interesting library called sql.js-httpvfs[0] that does pretty much all the work. I chunked up my 350 MB SQLite DB into 43 x 8 MB pieces with the included script and uploaded them with my static files to GitHub, which gets deployed via GitHub Pages.[1]
It's in the very rough early stages but you can check it out here.
I recommend going into the console and network tab to see it in action. It's impressively quick and I haven't even fine-tuned it at all yet. SQLite rules.
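Roughly, the wiring looks like this (going from memory of the sql.js-httpvfs README, so the exact option names may be slightly off; the config path and table name are placeholders):

import { createDbWorker } from "sql.js-httpvfs";

// Worker and wasm files shipped with the library.
const workerUrl = new URL("sql.js-httpvfs/dist/sqlite.worker.js", import.meta.url);
const wasmUrl = new URL("sql.js-httpvfs/dist/sql-wasm.wasm", import.meta.url);

// The splitting script emits a JSON config describing the chunks; point the
// worker at it and it lazily fetches only the pieces a query actually touches.
const worker = await createDbWorker(
  [{ from: "jsonconfig", configUrl: "/data/config.json" }],
  workerUrl.toString(),
  wasmUrl.toString()
);

// Only the pages needed to answer the query get downloaded.
const rows = await worker.db.query(`SELECT * FROM my_table LIMIT 10`);
console.log(rows);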
> Traditionally, coding involves three distinct “time buckets”:
> Why am I doing this? Understanding the business problem and value
> What do I need to do? Designing the solution conceptually
> How am I going to do it? Actually writing the code
> For decades, that last bucket consumed enormous amounts of our time. We’d spend hours, days or weeks writing, debugging, and refining. With Claude, that time cost has plummeted to nearly zero.
That last part is actually the easiest, and if you're spending an inordinate amount of time there, that usually means the first two were not done well or you're not familiar with the tooling (language, library, IDE, test runner, ...).
There's some drudgery involved in manual code editing (renaming variables, extracting functions, ...), but those tasks are already solved in many languages by IDEs and indexers that automate them. And many editors have programmable snippet support. I can genuinely say that in all of my programming projects, I spent more time understanding the problem than writing code. I even spent more time reading library code than writing my own.
The few roadblocks I've hit when writing code were solved by configuring my editor.
I have tried a lot of local models. I have 656GB of them on my computer so I have experience with a diverse array of LLMs. Gemma has been nothing to write home about and has been disappointing every single time I have used it.
Models that are worth writing home about are:
EXAONE-3.5-7.8B-Instruct - It was excellent at taking podcast transcriptions and generating show notes and summaries.
Rocinante-12B-v2i - Fun for stories and D&D
Qwen2.5-Coder-14B-Instruct - Good for simple coding tasks
OpenThinker-7B - Good and fast reasoning
The DeepSeek distills - Able to handle more complex tasks while still being fast
DeepHermes-3-Llama-3-8B - A really good vLLM
Medical-Llama3-v2 - Very interesting but be careful
I found the combination of real-world problems, general SQL advice, and the broad range of topics to be a really good book. It took my SQL from “the database is not much more than a place to persist application data” to “the application is not much more than a way to match commands to the database”. It’s amazing how much bespoke code is doing a job the database can do for you in a couple of lines.
I built a pipeline to automatically cluster and visualize large amounts of text documents in a completely unsupervised manner:
- Embed all the text documents.
- Project to 2D using UMAP which also creates its own emergent "clusters".
- Use k-means clustering with a high cluster count depending on dataset size.
- Feed the ChatGPT API ~10 examples from each cluster and ask it to provide a concise label for the cluster.
- Bonus: Use DBSCAN to identify arbitrary subclusters within each cluster.
It is extremely effective, and I have a theoretical implementation of a more practical use case that uses the same UMAP dimensionality reduction for better inference. There is evidence that current popular text embedding models (e.g. OpenAI ada, which outputs 1536D embeddings) are way too big for most use cases and could be giving poorly specified results for embedding similarity as a result, in addition to higher costs for the entire pipeline.
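To give a flavor of the labeling step (the ChatGPT bullet above), the call is roughly this kind of thing; the model name, prompt wording, and the samples you pass in are all placeholders:

import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Given ~10 representative documents from one cluster, ask for a short label.
async function labelCluster(samples: string[]): Promise<string> {
  const prompt =
    "Here are example documents from one cluster:\n\n" +
    samples.map((s, i) => `${i + 1}. ${s}`).join("\n") +
    "\n\nReply with a concise (2-5 word) label for what these documents have in common.";

  const completion = await client.chat.completions.create({
    model: "gpt-3.5-turbo", // placeholder; any chat model works
    messages: [{ role: "user", content: prompt }],
  });

  return completion.choices[0].message.content?.trim() ?? "unlabeled";
}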
I've started to bring up Admiral Rickover's speech "Doing a Job" in all of the Boeing threads because he is just so relevant. Admiral Rickover was the man responsible for America having nuclear submarines. https://govleaders.org/rickover.htm
The speech is well worth a read in its entirety and it feels prescient in regards to Boeing. I think this paragraph more than any other hits at the core problem at Boeing:
> Unless the individual truly responsible can be identified when something goes wrong, no one has really been responsible. With the advent of modern management theories it is becoming common for organizations to deal with problems in a collective manner, by dividing programs into subprograms, with no one left responsible for the entire effort. There is also the tendency to establish more and more levels of management, on the theory that this gives better control. These are but different forms of shared responsibility, which easily lead to no one being responsible—a problem that often inheres in large corporations as well as in the Defense Department.
To contrast, here is a statement from Calhoun: “We caused the problem. And we understand that. Over these last few weeks, I've had tough conversations with our customers, with our regulators, congressional leaders, and more. We understand why they are angry, and we will work to earn their confidence,” Calhoun said.
That "we" is him failing to take personal responsibility and choosing instead to spread responsibility to all employees, making no one responsible for the state of Boeing.
Boeing needs a leader who will take personal responsibility.
I believe the prevailing wisdom is that tech hiring slowed because interest rates rose.
Because software scales so well, it benefits from speculative effort more than other business types. We see this in venture capital, where they only need 1 out of 100 bets to hit in order to make their money. Large tech companies do something similar internally. They may fund the development of 100 products or features, knowing they only need one of them to hit big in order to fund the company going forward.
When money was essentially free to borrow, it made all the sense in the world to make a large number of bets because the odds were on your side that at least one of them would pay off. Now, however, each bet comes with a real opportunity cost, so companies are making fewer speculative bets and thus need fewer people.
---
The other thing he doesn't talk about is the rise of remote work and the downward pressure that it puts on wages. I know that many companies are forcing employees to return to the office, but I'd speculate that the number of remote workers has still risen significantly. And that opens up the labor market considerably.
I'll tell you that I'm getting overseas talent for roles where 10 years ago I would have hired entry level talent in the US. But since my company is fully remote and distributed, the downside to hiring in LatAm and Eastern Europe has been significantly reduced.
>The Valid method takes a context (which is optional but has been useful for me in the past) and returns a map. If there is a problem with a field, its name is used as the key, and a human-readable explanation of the issue is set as the value.
I used to do this, but ever since reading Lexi Lambda's "Parse, Don't Validate," [0] I've found validators to be much more error-prone than leveraging Go's built-in type checker.
For example, imagine you wanted to defend against the user picking an illegal username. Like you want to make sure the user can't ever specify a username with angle brackets in it.
With the Validator approach, you have to remember to call the validator on 100% of code paths where the username value comes from an untrusted source.
Instead of using a validator, you can do this:
import (
    "errors"
    "strings"
)

type Username struct {
    value string
}

func NewUsername(username string) (Username, error) {
    // Validate the username adheres to our schema, e.g. reject angle brackets.
    if strings.ContainsAny(username, "<>") {
        return Username{}, errors.New("username contains illegal characters")
    }
    return Username{username}, nil
}
That guarantees that you can never forget to validate the username through any codepath. If you have a Username object, you know that it was validated because there was no other way to create the object.
People will never be motivated to go the extra mile by a standardized, bureaucratized process. It's not a problem specifically with OKRs, it's a problem with the whole concept that if HR can just put in this one simple system then doing so will be magically motivational and the whole company will go to ludicrous speed.
There is no replacement for good people. Not in leadership positions, and not in IC positions. Recruit for strengths, hire for culture, train for gaps. No process, least of all OKRs, can make up for recruiting weak people, people who don't fit your culture, or people not interested in personal growth (i.e. filling gaps).
There's some fun music theory lurking inside this project. It turns out that every transition from one chord to another via the algorithm described is either from one chord to itself (e.g., C major to an inversion of itself or adding/removing a 7th), or one of the three basic Neo-Riemannian transformations: P, L, or R.
Go check out the Wikipedia page on Neo-Riemannian theory for more details, but here are a few key facts about P, L, and R: For any chord x, if x is major, then P(x), L(x), and R(x) are all minor; if x is minor, then P(x), L(x), and R(x) are all major. P, L, and R are all inverses of themselves, so that P(P(x)) = x, and so on for L and R. It's possible to reach any major or minor chord from any other major or minor chord by some sequence of P, L, and R transformations. For example, from C major, applying L then R gets you to G major; applying R then P gets you to A major; and applying L then P gets you to E major.
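If you want to play with this, the three transforms are easy to sketch on (root pitch class, quality) pairs. This is just an illustration, not the project's actual code:

type Quality = "major" | "minor";
type Triad = { root: number; quality: Quality }; // root = pitch class 0-11, 0 = C

const mod12 = (n: number) => ((n % 12) + 12) % 12;

// P (parallel): C major <-> C minor
const P = ({ root, quality }: Triad): Triad =>
  ({ root, quality: quality === "major" ? "minor" : "major" });

// R (relative): C major <-> A minor
const R = ({ root, quality }: Triad): Triad =>
  quality === "major"
    ? { root: mod12(root + 9), quality: "minor" }
    : { root: mod12(root + 3), quality: "major" };

// L (leading-tone exchange): C major <-> E minor
const L = ({ root, quality }: Triad): Triad =>
  quality === "major"
    ? { root: mod12(root + 4), quality: "minor" }
    : { root: mod12(root + 8), quality: "major" };

// L then R from C major lands on G major, matching the example above.
console.log(R(L({ root: 0, quality: "major" }))); // { root: 7, quality: "major" }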
Hopping around via Neo-Riemannian transformations is a quick way to use smooth voice leading to get to a "remote" key center (i.e., one that doesn't have many scale tones in common with the key you started in), but I was surprised when listening to the piece how (relatively) stable the harmony seemed. What's interesting here is that because of the way the algorithm is constructed, P transforms are much less common than L or R transforms (or just staying with the same chord) -- and crucially, P transforms are a vital ingredient in quickly moving to remote keys. By my rough calculations (which assume the Markov process has reached steady state and ignore the limits on min/max pitch), only 1/27th of all chord changes are P transforms. It also turns out that in steady state, 7th chords are more common than simple triads by a ratio of 16:11.
First step, I needed to build a MIDI player in JavaScript. At first, I was determined to write one from scratch in JavaScript and use the Web Audio API to synthesize all the instruments in code. I thought this would yield the smallest possible JavaScript file size.
However, I didn’t really have the audio engineering skills to pull this off. So I ended up settling for an approach that uses SoundFonts, which are basically files of instrument voices containing all the possible notes an instrument can play.
BitMidi uses the instrument voices from the General MIDI sound set released by FreePats.
Then I compiled the best MIDI player written in C (libtimidity) to WebAssembly using Emscripten. I put in lots of effort to optimize the build size and include the minimal amount of code. The result of my efforts is available in the npm package timidity (https://github.com/feross/timidity). It’s quite lightweight – just 34 KB of JavaScript and 23 KB of lazy-loaded WebAssembly, smaller than anything I’ve seen on any other site.
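Basic usage looks roughly like this (from memory of the README, so double-check the event names against the docs; the MIDI path is just a placeholder):

import Timidity from 'timidity'

const player = new Timidity()

// Load a MIDI file by URL and start playback once it's ready.
player.load('/example.mid')
player.play()

player.on('playing', () => {
  console.log('duration in seconds:', player.duration)
})

player.on('ended', () => {
  console.log('done')
})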
Then I put a frontend on it, so it’s easy to browse all the files. BitMidi uses all the best techniques that I know about to make it super fast and snappy. The site gets perfect 100s on all categories on Chrome’s Lighthouse Performance benchmark, which is extremely non-trivial in my experience.
I plan to ingest a lot more MIDI files in the future, from sources like the Geocities MIDI archive on the Internet Archive and elsewhere.
- What does our app do?
- Microservices usually don't work well for startups
- Move fast and outsource things
- Consider building reusable things
- Be pragmatic
- Boundaries along sync/async communication patterns
- How we did it and how we would do it next time
- About flexibility
- Predictability
- Rule #1: Every endpoint should tell a story
- Rule #2: Keep business logic in services
- Rule #3: Make services the locus of reusability
- Rule #4: Always sanitize user input, sometimes save raw input, always escape output
- Rule #5: Don't split files by default & never split your URLs file
- Readability
- Rule #6: Each variable's type or kind should be obvious from its name
- Rule #7: Assign unique names to files, classes, and functions
- Rule #8: Avoid *args and **kwargs in user code
- Rule #9: Use functions, not classes
- Rule #10: There are exactly 4 types of errors
- Simplicity
- Rule #11: URL parameters are a scam
- Rule #12: Write tests. Not too many. Mostly integration.
- Rule #13: Treat unit tests as a specialist tool
- Rule #14: Use serializers responsibly, or not at all
- Rule #15: Write admin functionality as API endpoints
- Upgradability
- Rule #16: Your app lives until your dependencies die
- Rule #17: Keep logic out of the front end
- Rule #18: Don't break core dependencies
- Why make coding easier?
- Velocity
- Optionality
- Security
- Diversity
An early explainer of transformers that I found very useful when they were still new to me, and which is a quicker read, is The Illustrated Transformer[1], by Jay Alammar.
A more recent academic but high-level explanation of transformers, very good for detail on the different architectural flavors (e.g. encoder-decoder vs decoder-only), is Formal Algorithms for Transformers[2], from DeepMind.
1. Web3 hired a lot of these people and so they had less time to work on this stuff. Shame to spend that much on a dead end but eh
2. Scala died with Big Data. It is still around and all, but no one cares anymore, which emptied the room. It also happened that the whole implicits experiment for polymorphism, which Scala was really supposed to explore, did not pan out that well.
3. Effects progressed but... mostly out of view. OCaml shipped them with its multicore release, we are seeing good work on the academic side, you see Verse wanting them, etc. Same thing with linear types.
4. Dependent types... never really crossed over into production. And Idris and co are mostly "complete", so work there slowed down.
5. Oh and monad interest, mostly fueled by Scala, died slowly. Effect handlers seem to be a nicer solution in practice to most of this stuff.
6. TypeScript killed a lot of the need for advanced stuff, same with Python and Ruby shipping their own typing stories too. Meanwhile Rust and Elixir showed you did not need the really out-there stuff to get results in prod.
In the end, what happened is that a lot of the highly abstract stuff was driven by "hype domains" that died, while more pragmatic but limited implementations burgeoned and absorbed some of the ideas. The rubber met the road, and that dampened a lot of people's enthusiasm.
There is still work being done, but right now it is more at the "experimental language" stage. Think Rust in the mid-00s.
Oh and Rust mindshare is still growing. A lot. A looooot.
There’s an incredibly straightforward and readable paper by Simon Peyton Jones (one of the creators of Haskell and GHC), "Tackling the Awkward Squad", which explains how Haskell deals with IO, exceptions, and concurrency. It also explains why they settled on this design rather than some other one. In my opinion, it is the best explanation of the IO monad (specifically IO) out there. Even just reading the first 10 or so pages is completely worthwhile.
I love this thread, it has two of my favorite HN topics:
1) People shitting on JavaScript, not realizing that their "obviously better" solution was considered and rejected for good reasons.
2) People shitting on TypeScript not realizing that conditional types and template literal types are awesome. I really like those type-safe routers (https://tanstack.com/router/v1/docs/guide/type-safety) and fully-typed database clients (https://www.edgedb.com/docs/clients/js/index#the-query-build...).
I've used Tailwind extensively at previous companies and inevitably each one creates an abstraction that's akin to:
const headerClasses = [/* list of Tailwind classes here */].join(' ');
<header className={headerClasses}>...</header>
because the complexity of reading and writing all of the classes is just too much. At that point, you've just reinvented CSS classes. Tailwind fans will tell you to not do this but if multiple companies are independently having the same problem and coming up with the same solution, the onus is not on the user anymore, it's on the creator to fix it. @apply can work but again it's really not recommended by Tailwind itself, for whatever reason.
These days I recommend learning CSS really well and then using Vanilla Extract (https://vanilla-extract.style), a CSS in TypeScript library that compiles down to raw CSS, basically using TS as your preprocessor instead of SCSS. For dynamic styles, they have an optional small runtime.
They have a Stitches-like API called Recipes that's phenomenal as well, especially for design systems: you can define your variants and the CSS that needs to be applied for each one, so you can map your design components 1-to-1 with your code:
import { recipe } from '@vanilla-extract/recipes';
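// (Rough sketch from memory of the @vanilla-extract/recipes docs, in a
// .css.ts file; the exact option names may differ slightly.)
export const button = recipe({
  base: { borderRadius: '6px' },
  variants: {
    color: {
      neutral: { background: 'whitesmoke' },
      brand: { background: 'blueviolet' },
    },
    size: {
      small: { padding: '12px' },
      large: { padding: '24px' },
    },
  },
  defaultVariants: { color: 'neutral', size: 'small' },
});

// Usage in a component: button({ color: 'brand', size: 'large' })
// returns the generated class names for that variant combination.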
There is a Japanese carpenter with 50 years of experience on YouTube who documents his construction of entire homes, which I would recommend to anyone who is interested in modern Japanese wood construction:
The article is grouping things together that don't belong in the same categories.
OO, functional, imperative, declarative: these are ways of controlling dispatch.
Monoliths and microservices are both ways to organize codebases and teams of programmers and control whether dispatch is intermediated by the network or not. Either way, both of these options are implemented by some kind of language in the previous category (OO, functional, imperative, or declarative).
Service-oriented architecture applies to both monoliths and microservices, and very few programmers still working in the industry have really seen what an alternative to service-oriented architecture actually looks like.