BugmenotEncore
08/31/25 04:32PM
pre-alpha 5
UI: Barebones, but slightly less so.
Media/MediaTags/Comments/Notes: Functional, but buggy. No change since last time.
Wiki: Fully functional. No change since last time.
Tags: Fully functional.
Artists: Not implemented.
Pools: Fully functional.
Forum: Not implemented.

Progress report:

- Improved processing speed of tag and pool archivers.

Self-explanatory.

- Pool archiver now saves all previous states of each pool as well as the current ones.

With this, the pool archival process has all its necessary functions, at least until it's integrated with the rest.

- Pool archiver now correctly skips archival operations if it detects the local archive is up to date.

I thought this was implemented previously, but the code for it was a dud. The archiver kept working even though it already had everything. No wonder it was so slow!
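The up-to-date check might look something like this minimal sketch (the file layout and field names are invented for illustration, not the archiver's real schema):

```python
import json
from pathlib import Path

def pool_needs_update(pool: dict, archive_dir: Path) -> bool:
    """Return True if the remote pool is newer than the local copy.

    `pool` is assumed to be the remote metadata dict with an `id` and
    an `updated_at` timestamp; the local copy is assumed to be a JSON
    file named after the pool id. Both are placeholders.
    """
    local = archive_dir / f"{pool['id']}.json"
    if not local.exists():
        return True  # never archived before
    saved = json.loads(local.read_text())
    # Skip the download entirely when nothing changed remotely.
    return pool["updated_at"] > saved.get("updated_at", "")
```

A check like this is what lets repeat runs finish quickly: the expensive fetch only happens when the remote timestamp moves.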

- anonlv000 has joined the team!

Their primary role is R&D, while I will remain in charge of the busywork. They have already been a great help, and should make note archival possible in the near future.

In the upcoming days, I hope to get the (for now, private) GitHub operational.
BugmenotEncore
09/03/25 08:08PM
pre-alpha 6
UI: Barebones, but slightly less so.
Media/MediaTags/Comments/Notes: Functional, but buggy. No change since last time.
Wiki: Fully functional.
Tags: Fully functional. No change since last time.
Artists: In development.
Pools: Fully functional.
Forum: Not implemented.


Progress report

- The GitHub is up and running!

- Aliasing is now slightly less vomit-inducing to look at.

This is just for the benefit of the dev team, but I felt like bragging about it.

- Artist archival is...close to being possible.

Bugfixes in the pool archiver process are holding back these efforts. It does not quite run yet.

- Fixed an oversight that caused the local pool archive test to incorrectly request an update when the pool is empty.

This should save some time and bandwidth on follow-up runs.
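A sketch of the kind of fix described, with an invented up-to-date check; the point is that an empty pool has to compare equal to an empty local list, rather than being special-cased as "out of date":

```python
def archive_is_current(local_post_ids, remote_post_ids):
    """Hypothetical up-to-date test for a single pool's post list.

    The old, buggy version treated "no local posts" as "out of date":

        if not local_post_ids:
            return False  # oops: empty pools re-requested every run

    Comparing the two lists directly handles the empty case for free.
    """
    return list(local_post_ids) == list(remote_post_ids)
```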

- Yet more rare characters are being handled in the pool archival process. Ugh.

Some absolutely hilarious fellows are putting fucking printer control codes into the pool descriptions. Because of course they are.
These are not visible on webpages; the only way to see them is to open the text in a high-end editor.

The sole purpose of them appears to be giving the one person in the world who'd be looking at them that way a headache.
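If the "throw them out" option wins, stripping the invisible characters comes down to a filter over Unicode categories. A sketch, assuming the goal is to drop nonprinting control characters (category Cc, which covers things like ESC) while keeping ordinary whitespace:

```python
import unicodedata

def strip_control_chars(text: str) -> str:
    """Remove nonprinting control characters, keeping normal whitespace.

    Characters such as ESC (0x1B) render as nothing in a browser but
    show up as junk in a capable text editor.
    """
    keep = {"\n", "\t", "\r"}
    return "".join(
        ch for ch in text
        if ch in keep or unicodedata.category(ch) != "Cc"
    )
```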

I am torn between throwing them out entirely and keeping them on the basis that this much trolling has to hold Some artistic merit.

I'm taking a break to mull on this question.
R_of_Tetra
09/03/25 09:37PM
By the way, in the meantime, I took the liberty to download all the material and the metadata the site had (according to the Grabber I used) because, well,
1. Time is of the essence
2. I wanted to see exactly how much space would be necessary to store everything.

If the actual count for media (videos included) is 215,849 items (I started the download roughly 4 days ago), then the final count, including a single .txt file with tags for each item, is 498 GB total.

I'll do a backup on an external drive tomorrow, just to be extra sure nothing gets lost in case, well, my rig decides to go "bzzzzzt" like it did a year ago (RIP my 7090).
BugmenotEncore
09/04/25 04:50AM
How does the data scraped by your method look, if you do not mind my asking?
Perhaps shoot me some screens, and the name of the thing?
R_of_Tetra
09/04/25 08:38PM
BugmenotEncore said:
How does the data scraped by your method look, if you do not mind my asking?
Perhaps shoot me some screens, and the name of the thing?


I'd say clean, but very basic. The media are at the resolution they were uploaded to the Hub, even for the biggest files out there (I tested with mine before starting the bulk run), and the .txt files are just the tags, matched to the image they were saved with. However, it doesn't seem to have saved the author's name for some reason (though it's also possible I didn't pass the proper options).
It also didn't save the title, which was input by the author.

The tool I've used is this one github.com/Bionus/imgbrd-grabber.
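For anyone wanting to consume that output programmatically, here is a minimal sketch; the file layout is assumed from the description above (one .txt of tags per media item, sharing the media file's stem), not taken from Grabber's documentation:

```python
from pathlib import Path

def load_tag_sidecars(folder: Path) -> dict:
    """Map each media file to the tags in its matching .txt sidecar.

    Assumes a flat folder where 12345.png sits next to 12345.txt,
    with the tags space-separated inside the text file.
    """
    tags = {}
    for media in folder.iterdir():
        if media.suffix == ".txt":
            continue  # sidecars themselves are not media
        sidecar = media.with_suffix(".txt")
        if sidecar.exists():
            tags[media.name] = sidecar.read_text().split()
    return tags
```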

I'll send you some screenshots once I come back from my business trip, can't do much else from my phone rn.

Korn333
09/06/25 03:52PM
I've used Grabber before too, to keep a local 80 GB archive of the top-rated images (>150 score) and some tags I enjoy. Like Tetra said, it has everything at the original resolution, as well as artist tags and the score at the time of scraping, after some tweaking of the txt output.

I use the Hydrus client to store them all in a booru-style database, though it misses out on pools and timestamps, sorting by score is a bit of a pain, and it also doesn't have the original uploader. Grabber gives a lot of control over how metadata is written into the text files, and it's a pretty decent scraper in the meantime while this is cooking.
__test__
09/13/25 04:25AM
This is anonlv000
Testing whitespace in usernames for archival purposes.
anonlv000
09/19/25 08:52AM
If anybody knows of a post with multiple pages of tag history, please speak up; I'd like to confirm a few things.

I know for certain that post #228280, at ~60 tag history entries, only has one page.
SpdWd
10/07/25 08:01AM
Any updates on this? Would I be able to run, let's say, e621's software with the Hub's databases? If not, how painful would it be to get that to work? I'm very familiar with sysadmin work. I kind of want to toy with the idea of using my home server to run a local copy that has no AI and way better software... maybe expand that into a full-blown VPS later to make an alternate Hub
BugmenotEncore
10/07/25 09:09AM
Things are still happening. Anonlv000 decided to spin the entire project off into Python due to their dislike of PowerScript. This set me back by a week of work, and losing the help I thought I'd have discouraged me a great deal.

Hence the silence - I would not be very sociable about the updates. I get angry at the best of times.

Development is still ongoing, though it has been more sporadic, since I have not been coming here to keep the update schedule honest, as was the original intent.

I can only speak to my version - You should be able to adapt the output to pretty much anything you want. Media is stored in the original format, and the rest is stored in plaintext.

Anonlv000's version is inherently more finicky, Python being Python, but I will hit them up to ask, if they don't hop in to answer you before we next meet.
anonlv000
10/07/25 09:35AM
SpdWd said:
Would I be able to run, let's say, e621 software with the hub's databases? If not, how painful would it be to get that to work?


I'm using scrapy, so I'd be surprised if it'd work with e621 right out of the box, but I'd also be surprised if you couldn't adapt its output to your needs. I see from e621's GitHub that it uses some PL/pgSQL, and my output is structured as tables (you can configure scrapy to output them in formats such as CSV or JSONL), so I'm assuming you'd be able to load at least some of it into your database.

However, I'm not sure how much of Hypnohub's database schema matches e621's database schema. You'll have to adapt that yourself. I assume how painful that is depends on how comfortable you are with SQL.
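As a sketch of that adaptation step: a scrapy JSON-lines feed can be flattened into a CSV that Postgres bulk-loads with COPY. The field names below are placeholders, not e621's real schema:

```python
import csv
import json

def jsonl_to_csv(jsonl_path, csv_path, columns):
    """Flatten a scrapy JSONL feed into a CSV for Postgres ingestion.

    `columns` is whatever subset of item fields you mapped onto the
    target table; fields not listed are silently dropped.
    """
    with open(jsonl_path) as src, open(csv_path, "w", newline="") as dst:
        writer = csv.DictWriter(dst, fieldnames=columns, extrasaction="ignore")
        writer.writeheader()
        for line in src:
            writer.writerow(json.loads(line))
```

On the Postgres side, something like `COPY posts (id, tags) FROM '/tmp/posts.csv' CSV HEADER;` would then load it; mapping Hypnohub's columns onto e621's tables is the part you'd adapt by hand.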
BugmenotEncore
10/23/25 05:09AM
pre-alpha 7
UI: Barebones, but quite a bit less so.
Media/MediaTags/Comments/Notes: Functional, and a lot less buggy. Notes implemented. Tag integration ongoing.
Wiki: Fully functional.
Tags: Fully functional.
Artists: In development.
Pools: Fully functional.
Forum: Not implemented.

Progress report

- Media archival now supports single tag searches.

Turns out I find necessity a powerful motivator. With a good chunk of possession and corruption images due to be removed soon under the new rules, I threw myself into development and have begun to archive both tags.

The archive of them will be made accessible shortly after I regain access to my home computer.
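A minimal sketch of what a single-tag search might look like against the Moebooru-style `post.json` endpoint that sites like Hypnohub expose; the parameter names are the usual ones, not confirmed from this project's code:

```python
from urllib.parse import urlencode

def tag_search_urls(base, tag, pages, per_page=100):
    """Build the paginated JSON search URLs for one tag.

    The archiver would fetch these in order until a page comes back
    empty; endpoint and parameters are assumptions, check the live API.
    """
    return [
        f"{base}/post.json?" + urlencode(
            {"tags": tag, "page": page, "limit": per_page})
        for page in range(1, pages + 1)
    ]
```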

- Media archival now supports notes.

Much the same story as above - this could save many translations that would otherwise be lost.

- Many minor, undocumented improvements.

The majority of them are fixes and efficiency overhauls to the media page archival script. It is still not perfect, but it is Legible, and a far cry from how it began. That will help a great deal going forward.
BugmenotEncore
10/23/25 07:29PM
pre-alpha 8
UI: Getting there.
Media/MediaTags/Comments/Notes: Functional. Tag integration ongoing.
Wiki: Fully functional.
Tags: Fully functional.
Artists: In development.
Pools: Fully functional.
Forum: Not implemented.

Progress report

- Bugs eliminated

I kept the archiver piling up the archive until the computer ran hot, and found zero problems. Feeling good.

- Compression integration and other conveniences

As larger and larger chunks of the site are backed up, managing the archive itself is becoming an issue to tackle. I am planning to massively improve the UI for the next update.

Until then, with great storage requirement comes great frustration - I've adapted some lossless compression tools I already had to better work with the archiver's output, improving them all the while.
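A sketch of that compression step, using Python's standard lzma in place of whatever tools the post actually refers to; the sidecar-file layout is assumed:

```python
import lzma
from pathlib import Path

def compress_sidecars(folder: Path) -> None:
    """Losslessly compress the archiver's plaintext output in place.

    Each .txt/.json file becomes an .xz alongside it; the original is
    removed only after the compressed copy is written, so a crash
    mid-run never loses data.
    """
    for path in list(folder.iterdir()):
        if path.suffix in {".txt", ".json"}:
            compressed = path.parent / (path.name + ".xz")
            compressed.write_bytes(lzma.compress(path.read_bytes()))
            path.unlink()
```

Text-heavy metadata like tags and descriptions compresses extremely well, so this mostly pays off on the sidecars rather than the (already-compressed) media files.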