Beware that bad sectors probably exist on those, and modern filesystems (which falsely assume the hardware will do automagical remapping) make it hard to run `badblocks`.
From the fsck man pages, only ext2, ext3, ext4, and reiserfs support bad-block lists. I hope that `btrfs-convert` handles them correctly, but you must never run `btrfs balance` or delete the backup afterwards (the backup won't waste space if it was a fresh FS). And if further bad blocks develop, you'll have to redo all of this from scratch.
btrfs doesn't support bad blocks. It's written in the documentation.
From experience I can tell you that most USB sticks fail by returning bad data on read-back rather than an error or timeout, and the defective sectors aren't always the same (internal remapping for wear leveling might be the cause of that).
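Since the failures show up as silently wrong data rather than I/O errors, the only reliable check is a write-then-verify pass. Here's a minimal Python sketch (the helper names are mine, not from any tool; a real test would target the raw device and drop caches between the write and the read so you're not just verifying the page cache):

```python
import hashlib


def write_pattern(path, size, seed=b"probe", chunk=1 << 20):
    """Fill `path` with reproducible pseudorandom data; return its SHA-256."""
    h = hashlib.sha256()
    with open(path, "wb") as f:
        written = 0
        counter = 0
        while written < size:
            # Derive a deterministic block from the seed and a counter,
            # then repeat it up to one chunk (or whatever is left).
            block = hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
            block = (block * (chunk // len(block) + 1))[: min(chunk, size - written)]
            f.write(block)
            h.update(block)
            written += len(block)
            counter += 1
    return h.hexdigest()


def read_hash(path, chunk=1 << 20):
    """Re-read the file and hash it; a mismatch with the write-time
    hash means the device silently returned bad data."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            block = f.read(chunk)
            if not block:
                break
            h.update(block)
    return h.hexdigest()
```

If `write_pattern(...)` and a later `read_hash(...)` disagree, the stick corrupted data without ever reporting an error, which is exactly the failure mode described above.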
Advice:
Do md-raid0 on half of them, md-raid0 on the other half, and btrfs raid1 on the two md devices.
If you want to exceed 25MB/s, use different USB controllers, not different ports on the same hub.
Btrfs doesn't directly support bad blocks. But since the conversion is supposed to use only "free space" that's presumably excluding bad blocks (which are presumably counted as allocated).
Bad blocks don't count as allocated unless they are newly discovered - data should be moved asap if it's still recoverable. An allocated block with nothing in it is an error.
I wanted to boot my home TrueNAS box off USB flash, but I also know how flaky USB flash can be, so I use a pair of drives in a ZFS mirror.
I would not recommend this setup; any real drive would be better. I just hated wasting drive bays on the non-critical boot/system partition. In my defense, despite burning through many cheap USB drives, I have not lost the boot yet. I do keep a new spare drive taped to the unit, though.
I had grand plans to make a giant USB ZFS volume from tradeshow thumb drives in bulk.
It lasted a couple evenings while I bumped into apparently every Linux USB driver/chipset issue possible trying to drive a couple hundred drives via cheapo AliExpress USB hubs.
It was fun times, but even more useless than I originally thought it would be. It was amusing to get over 1 GB/s for a minute or two across 256 crappy thumb drives, though!
From 2009, what I think was a famous video in which some kids in the UK put together a 6-terabyte RAID from 24 256-gigabyte Samsung drives: https://youtu.be/26enkCzkJHQ
Is there any interest in bringing back that feature? No wizard here, but I dealt with Log4shell for our product. Reach out if you'd like someone to take a look.
You might want to re-examine prior assumptions: there are in fact companies very interested in pursuing rare diseases. This article[1] discusses why: they can be highly lucrative opportunities. Despite there being very few patients, if any treatment at all exists then many national health care systems or insurance companies are forced to pay for it. That article is ten years old but a basic web search for "world's most expensive drugs" will turn up a number of similar articles. Here's a more recent one[2] and it seems the prices have gone up, as have the size of the companies.
As for how to find/approach such companies, that's a tougher question. It's a great boon for a company to be able to get Orphan Drug designation[3][4], for which KS may qualify. Hurdles are lower and it can help them bring that treatment to approval.
One thought would be to examine company pipelines for drugs targeting this gene/pathway, even if it's for a different condition. For example, here's a press release[5] on a company with preclinical results for Sickle Cell Disease on an EHMT1 inhibitor (probably not what you need but at least related) and here's another[6] on a CDMO contracting to produce GLP protein for a client (it doesn't say who but at least indicates interest).
Another thought would be to look for companies targeting related conditions because at least they have the expertise and may even have a candidate (failed or active!) that could provide some benefit. The idea of a repurposed drug search is a good one if you can find something still under patent since you'll need someone with deep pockets to reach approval.
Note that these might not be pharmaceutical companies but biotechs instead, the distinction being small-molecule (chemical) vs. large-molecule (biologic), respectively, as it's much more difficult for others to make a generic copy of the latter. To that end, you might add monoclonal antibodies to your list of potential modalities.
Sorry, I have no clue on the funding question. You mentioned several other foundations that had success; I'd suggest reaching out to them for ideas and strategy if you haven't already. I do know that some states have Life Science initiatives of various kinds which might offer grants/funding and may have associated incubators and whatnot where smaller players can "band together" to get shared access to equipment, lab space, expertise, etc. Unfortunately I don't have a link for that at hand but can try to dig up something if it's useful.
Thank you! I don't see the companies as evil. I understand how profitable it is for Novartis to sell the SMA drug. But somehow they don't get involved at all until the drug is ready to use. I'll read the articles.
How about compressing the files while building the archive instead of compressing the entire archive itself? That ought to preserve reasonably fast reads. It won't give optimal compression but in practice should do alright (and of course will be smaller than no compression).
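Compressing members individually is exactly what the zip format does, which is why you can extract one file without decompressing the whole archive (unlike tar.gz, where the entire stream is one gzip run). A small Python sketch (function names and file contents are just for illustration):

```python
import zipfile


def build_archive(archive_path, files):
    """Write each member with its own DEFLATE stream.
    `files` maps member name -> bytes; writestr is used for brevity,
    real code would likely use ZipFile.write() on paths."""
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)


def read_one(archive_path, name):
    """Random access: only this member is decompressed,
    not the archive as a whole."""
    with zipfile.ZipFile(archive_path) as zf:
        return zf.read(name)
```

Per-member compression loses a little ratio versus one solid stream (no cross-file redundancy is exploited), which is the trade-off described above.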
Developer mostly working in biotech, pharma, academic research. Open to FT but contract is an option.
Location: Tokyo, Japan
Remote: Yes, or onsite for the right opportunity
Willing to relocate: Yes, for the right opportunity [US only]
Technologies: What do you need? Most of my heavy work is on a Java stack, but I've written major components in R, Python, Perl, whatever it takes. I do lots of data wrangling in R, bash, etc.
Great to see GraalJS is a realistic option. This polyglot VM idea is fascinating, though it's not likely to go anywhere for my personal use until GraalVM Python gets some traction. I check in on it now and then - every time it hits HN - but it seems to be going nowhere. Anyone have any experience with that?
What's the current status of Graal's Python implementation in terms of reaching actual usability? The README on its GitHub repo [1] doesn't inspire much confidence, and whenever I check, that wording hasn't changed.
I've been following Graal for quite some time, both as a former PLDI guy and for my day job. I work in bioinformatics software (mostly cancer genomics research) and our group has a ton of (mostly legacy) code in Java and R, but most of the newly-minted grads coming in lean towards Python.
As one of the guys pulling all this stuff together, the Graal "polyglot" multilingual VM concept is of tremendous interest as you can imagine. It would be great to be able to package the legacy stuff interoperably with the new stuff no matter the language, even setting aside the bonus of better performance. But it has basically no practical use to us without Python (+ packages!) due to the direction and language inclinations of the group.
Is there anything new happening on that front? Or anything we could do to help it along? Is there a more detailed status page anywhere? Any sense of when this might land in a truly usable form, or of what the holdup is?
I'm a bit surprised that the progress with R (with packages) is so far along but the progress with Python (with packages) seems stagnant (at least according to that README). No offense meant to the team, but that's the appearance. Is it the GIL?
[1] https://github.com/graalvm/graalpython, which calls it "early-stage experimental" and "very likely that any Python program that requires any packages at all will hit something unsupported".
I've had decent luck with Nuitka[1] as long as the project is 100% Python. The executables are large but have been mostly portable IME (though some glibc problems can arise).
The largest project I compiled was only ~1000 lines but used the external deps pymysql, jinja2, and ldap3, along with the stdlib's shutil, tempfile, pathlib, and the base os module, without issues. It takes ~30 minutes to compile on a decently powerful machine, though (an i7-8650U with 32 GB of RAM). Most of that time was spent compiling pymysql and jinja2.
Thanks, but I don't see what that has to do with Graal or multi-language interoperability which is the key thing here. We have substantial code in Java, R and Python that could all benefit from being able to call one another from within the same process.
An alternative Python compiler by itself frankly buys us very little. Perhaps Jython, if it weren't targeting 2.7.x.