cohost-dl but downloading a lot more data but less faithfully
website with precompiled binary downloads
- Post pages are not downloaded exactly as they appear on Cohost. Instead, this tool downloads only the data, and then re-creates something close to what they looked like on Cohost.
- Can download your own posts, your liked posts, your dashboard, and tag feeds
- Legal: using this software does not somehow grant you a license to re-publish posts and comments from other people
Usage Notes:
- You can interrupt this at any time, but if it’s doing something where there’s no progress bar, it’ll start over from page 1. This is probably annoying if you were on, like, page 200.
- I am not very good at SQL
Download stages:
- downloading posts
- downloading post comments
- downloading image and audio resources
Files:
- the database: stores all post data
- the output directory: stores all resources like images
- downloader-state.json: remembers what’s already been downloaded so those things are skipped on the next run (can be edited)
Note: if you have used cohost-dl 2 before, you should probably run it again with the try_fix_transparent_shares option.
- compile the post & markdown renderer. this is super jank. it currently requires running cohost-dl 1 as well
  - if ASSC ever ships an open source post renderer, this will be replaced with that (if possible)
  - if you don’t care about serve mode, just make an empty md-render/dist/server-render.js and md-render/dist/client.js file so the Rust code compiles
  - in repo root:
    - rm out/staff/post/7611443-cohost-to-shut-down (if it exists)
      - why? because this post is used to determine the current Cohost version
    - ./run.sh
      - wait for it to download Cohost version a2ecdc59
      - if this is no longer the current Cohost version, then the following build script will need an update
    - cd db/md-render
    - ./build.sh
- cargo run -- download
- cargo run -- serve (can be run in parallel)
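
If you don’t care about serve mode and only want the Rust code to compile, the two empty placeholder files mentioned above could be created like this. This is a minimal sketch: it assumes you run it from the directory that contains md-render/, and it only creates a file when it is missing, so it should not clobber real build output. Only the two file paths come from the notes above; the rest is illustrative.

```shell
#!/bin/sh
# Assumption: the current directory is the one that contains md-render/.
set -eu

# Make sure the output directory exists.
mkdir -p md-render/dist

# Create each placeholder as an empty file, but only if nothing is there yet,
# so an existing renderer build is left untouched.
for f in md-render/dist/server-render.js md-render/dist/client.js; do
    [ -e "$f" ] || : > "$f"
done
```

With the placeholders in place, serve mode will presumably not render posts properly, since the files contain no renderer code.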