2021-01-21: Planned server maintenance on the s2 full image servers has concluded successfully, with 72TB of hard drives added. Donations to the archive would be appreciated to help fund our server hardware & storage drives.   New archive software is currently under development, details here.

## Developer No.4019
BTW if anyone knows any 4chan admins, please tell them to unblock or increase the Cloudflare limits on the desuarchive/rbt.asia and archived.moe scrapers. The method used to hold us over for the past year does not seem to be working well anymore and it is failing to archive threads periodically.

Relevant actions taken by archived.moe admins as FYI, we are also using the same temporary fix: https://archived.moe/talk/thread/14/#q141_200

This reveals that our time is up with the FoolFuuka/Asagi archiver stack. It has not had active development since the collapse of Archive.moe in 2016, and it is too inefficient with bandwidth to function any further. It is not worth feeding it additional RAM and resources when it is too inefficient with requests as well; the stack is already consuming 168GB of RAM as it is. It is only a matter of time before the other archivers meet this fate too.

This is an unexpected development, and a fix could take weeks if not months. The C# .NET developer of Hayden is working hard on it, with massive reductions in resource use and huge improvements in efficiency and request limit compliance, such as cutting RAM use to 100MB, but his time is limited. We wish we could have bought some more time.

To anyone who can help, call up all C# .NET developers and skilled MySQL/Percona DBAs to try and bring the Hayden code up to scratch as a suitable drop-in Asagi replacement. Hayden is already demonstrably more efficient and accurate; however, it is not fully tested and often hits deadlock issues. Once the scraper is replaced, we can also work together with Python developers to build a new frontend replacement for FoolFuuka as described in the previous thread. Using Hayden will allow us to consolidate our operations on s2.desuarchive.org instead of s1.desuarchive.org on a separate continent, immediately saving $90 a month and removing a Sword of Damocles by eliminating s1, which the previous admin (peace be upon his wrists) is likely to default on due to crippling medical bills.

There are some download reliability issues that need testing before Hayden can be used in production, but we wish we had some more time before this happened.


In the future we would like to totally overhaul the FoolFuuka/Asagi stack with a total replacement for the sole use of Desuarchive, but a drop-in Asagi replacement is crucial, as the vast majority of other archiver admins such as archived.moe have lives to live and are unlikely to take up any SQL schema changes.

Feel free to drop by the chat to help brainstorm what can be done.



It is helpful to read our previous thread on the topic as well: https://desuarchive.org/desu/thread/3894/

EDIT: Modified to clarify that the FoolFuuka/Asagi frontend, backend, and inefficient MySQL schema are all to blame for the situation.

This post was modified by Desuarchive Administrator on 2019-12-07
66 posts and 2 images omitted

## Developer No.3894
## TL;DR Regarding the Absolute State of 4chan Archival

At Desuarchive we have long struggled with many issues and unsolved mysteries left by the previous admin (peace be upon his wrists), but we have now set up the archiver on a more stable footing and there is some development going on with the scraper at least, so things are looking up, as you may have seen this past year.

It is imperative for the survival of all 4chan archivers that Java-based Asagi is replaced (especially given the downfall of Fireden) and significant efficiency improvements are made in both excessive HTTP requests and RAM usage, while providing the same reliability and accuracy. As 4chan grows all archivers will be in grave danger of dying under the strain of deep software inefficiency and unsustainable costs if this is not done.
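One common way to cut down on excessive HTTP requests is conditional polling: the 4chan API rules ask clients to send `If-Modified-Since`, so an unchanged thread costs a `304` with no body. The sketch below is only an illustration of that idea, not how Asagi, eve, or Hayden actually work. The function name `poll_once` and the injected `fetch` callable are hypothetical, chosen so the logic can be shown without a real network client.

```python
def poll_once(url, last_modified, fetch):
    """Poll a thread URL once, sending If-Modified-Since when a stamp is known.

    `fetch` is a hypothetical transport: fetch(url, headers) must return
    (status_code, response_headers_dict, body). Any HTTP client can be
    wrapped to fit this shape.
    Returns (new_last_modified, body); body is None when nothing changed.
    """
    headers = {}
    if last_modified is not None:
        headers["If-Modified-Since"] = last_modified

    status, response_headers, body = fetch(url, headers)

    if status == 304:
        # Not modified: the server sent no body, so no bandwidth was spent on it.
        return last_modified, None

    # Changed (or first fetch): remember the new stamp for the next poll.
    return response_headers.get("Last-Modified", last_modified), body
```

A scraper would keep one `last_modified` value per thread and pass it back in on the next poll, so quiet threads stop costing full downloads.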

Archives are not going to be sustainable as seen with Fireden if only one dude has to shoulder the weight of thousands of dollars of equipment and bandwidth usage. The next archiver on the deathwatch appears to be Warosu.

On behalf of all 4chan archives, we need your help with the two scrapers being developed. These scrapers are currently set to be Asagi compatible as a future drop-in replacement for all other archivers, which is no small feat: Asagi regularly uses 40-60GB of RAM at full load, but these could use as little as 30-150MB.

https://github.com/bibanon/eve - Python based scraper. We currently actively test it in production with /wsg/ scraping.

https://github.com/bbepis/Hayden - C# .NET light scraper, still needs testing for evaluation. But it's doing real great.

As such, we hope to build a brand new archival stack based on these that dispenses with the inefficiencies of past scrapers, using PostgreSQL JSONB to store threads exactly as they come from the 4chan API (NoSQL style). While we are not frontend developers, we can sidestep this by building middleware that emits a 4chan compatible API, so that 4chan-X can be used as the JavaScript webapp, and Android apps (Chanu, Clover) and iPhone apps could be modified with a few lines of code to work with the archive.
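To make the "store threads exactly as they are" idea concrete, here is a minimal sketch. It assumes the usual 4chan API thread shape (a `posts` array whose first entry is the OP) and uses Python's built-in `sqlite3` with a TEXT column purely as a stand-in for a PostgreSQL JSONB column, so it runs without a database server. The table layout and the names `open_store`, `save_thread`, and `emit_thread` are hypothetical, not part of any existing archiver.

```python
import json
import sqlite3


def open_store():
    """In-memory stand-in for the real database: one row per thread."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE thread ("
        " board TEXT NOT NULL,"
        " no INTEGER NOT NULL,"
        " doc TEXT NOT NULL,"  # would be a JSONB column in PostgreSQL
        " PRIMARY KEY (board, no))"
    )
    return db


def save_thread(db, board, raw_json):
    """Store the API response verbatim, keyed by board and OP post number."""
    doc = json.loads(raw_json)         # also validates that it is JSON
    thread_no = doc["posts"][0]["no"]  # first post in the array is the OP
    db.execute(
        "INSERT OR REPLACE INTO thread (board, no, doc) VALUES (?, ?, ?)",
        (board, thread_no, raw_json),
    )
    return thread_no


def emit_thread(db, board, thread_no):
    """Middleware side: hand back the stored document unchanged, or None."""
    row = db.execute(
        "SELECT doc FROM thread WHERE board = ? AND no = ?",
        (board, thread_no),
    ).fetchone()
    return row[0] if row else None
```

Because nothing is reshaped on the way in, the middleware can serve the stored document back as a 4chan-style thread response without any translation layer. Search and ghostposting would still need their own design.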

In support of research and onboarding for this effort, 4plebs has generously developed partial 4chan API compatibility for the FoolFuuka frontend, which is slowly being rolled out. This will allow Android and iPhone applications to view the FoolFuuka archivers (but not ghostpost yet). If you are a PHP developer we need your help here.


They also developed 4plebs X, which uses 4chan-X to function as a webapp frontend and may be able to utilize this 4chan API to replace the user facing part of the PHP HHVM FoolFuuka stack with a familiar alternative. It has flaws such as the lack of search and ghostposting, but hopefully developers could try to step in regarding that.


Demo: https://test.4plebs.org (to use disable 4chan-X to avoid conflicts).

If you know any third party 4chan app devs, please refer them to us so we can direct them on how to set the proper configurations for their app to access FoolFuuka archives. (There was an old FoolFuuka API already, but it predates the 4chan API so it is not directly compatible; best to move off it.)

We are willing to provide support and troubleshooting for better understanding of FoolFuuka/Asagi instances for the construction of new ones or development of replacement scrapers, or if anyone wants to pick up the boards of Fireden. We have institutional knowledge and experience running many major archival websites gathered over 2 years, so don't hesitate to drop by.



Our guide could use some work but it will guide you there with some hiccups.


## Regarding the Absolute State of Fireden and /v/ and /vg/ archival

Fireden is infamous in the community for never reaching out for help or advice, and never acting on anything other than abuse emails. I don't think they ever planned to operate for this long: they were set up on a whim in 2015 after archive.moe died, so they probably just had enough. It costs a lot to operate a site that can scrape /v/ and /vg/ images. But if the Fireden admin is reading this, be the prodigal son: we can provide any assistance or backup you need so that your hard work is not in vain.

The next archiver I expect to collapse under pressure is Warosu. As for us we are pretty stable after a $500 chassis upgrade and hot spare SSDs, but it really sucks to be one of the few people in the world who puts a large amount of capital into 4chan archival.

We refuse to pump more money in to bail out any more archivers for barely any returns. We have had to bail out 4 of them already and have paid $7000 to date out of pocket, plus $200 a month. Can't someone else pony up?

The best bet is for a large capital investment to be made on arch.b4k.co so it can be significantly upgraded to our standards to match the levels of Fireden, providing /vg/ scraping and full images for both. It will actually not cost too much to start out: maybe only 5x10TB drives for $700, $300 for a new case, maybe $600 for a new AMD Ryzen with 160GB of RAM for Asagi and MySQL, and $100 for colocation. We will probably never see the Fireden images ever again, so that saves a lot of space.
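For anyone weighing up that investment, the quoted figures can be tallied as below. This is only back-of-the-envelope arithmetic on the numbers in the post. It assumes each dollar amount is a total for its line item and treats colocation as a single $100 entry, although the post does not say whether that charge recurs.

```python
# Figures quoted in the post; treating each as a line-item total is an assumption.
starter_costs_usd = {
    "5x10TB drives": 700,
    "new case": 300,
    "AMD Ryzen with 160GB RAM": 600,
    "colocation": 100,  # assumption: counted once; it may recur
}

total_usd = sum(starter_costs_usd.values())
raw_capacity_tb = 5 * 10  # five 10TB drives, before any redundancy overhead

print(f"Estimated starting cost: ${total_usd}")
print(f"Raw capacity before redundancy: {raw_capacity_tb} TB")
```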

4plebs refuses to take on any more boards as they are barely able to handle the ones they have.

## Basic Details about the Maintenance Done

This weekend we managed to do a major case upgrade for $500 for our backend image server to allow it to host more services such as scrapers and frontend content. All SSDs were moved out of the internal bay and into hotswap bays, and a hot spare SSD for booting was added. Without those, the server was really difficult to service, which made it hard to consider using it to host databases safely. It may be possible to attach at least 6 more 3.5" drives, which will be necessary as only 10TB of storage is available.

This may make it possible to halve the costs of the cloud servers and bandwidth that we currently use by consolidating services into a single server.

1 drive with bad sectors was replaced safely for $150 and a ZFS resilver completed. The other drives do not appear to have issues, but we continue to monitor the situation.

Tests done with the bibanon/eve scraper for scraping /wsg/ have been extremely promising, though development is still ongoing to put it on par with the Asagi scraper. It is possible that any new deployment of the scraper will utilize either this or Hayden, but proper testing will still be necessary.
94 posts and 2 images omitted

## Admin No.3026
Welcome to /desu/. Use this board to report issues, request features, and for other discussions regarding desuarchive.org & rbt.asia. Other posts will be removed.

When reporting a technical issue, be sure to include the full URL of the page/image.

Do not use this board for removal requests, which must be emailed to [email protected] Other rule violations can be reported by clicking the "Report" button on the post.

No.4468
is it just on my side or has the archiving of new posts/thread stopped like 4 hours ago?

No.4454
How to search for special characters like "[":
Search -> "[chair"
That will include these characters?
Also is there a way to force case sensitivity?

No.4435
i cant even open photos or reverse image search them on google from here, also some files refuse to load or are really slow
this has nearly gone on for a week now, pls fix
5 posts and 2 images omitted

No.4423
>Posting from your IP range has been blocked due to abuse. [More Info]
>4chan Pass users can bypass this block. [Learn More]

No.4448

Image errors

No.4439
First time using desuarchive and when I'm looking at some archives, like 9/10 of the images dont load and clicking on them gives me a 522 error (s2.desu-usergeneratedcontent.xyz
Host Error), is there a way around this?

Why don't you show the number of replies a thread has

No.4429
I really don't want to open threads with less than 100 replies, I like long tirades of autism

Im sorta stupid but

No.4425
How do i find image hash to use it for searching? i know i can just drag it but i want to know how it works
what do you guys use?
also, whats the software you guys use for the website? ive seen it on rebecca's archive as well