>>59932411,11
It's gone, thanks for the report.
I'm in a classroom on an iPad, so I'll answer ad hoc later at home.
Some numbers to give an idea:
We are using 250 GB per month for the full images. We have a 2 TB hard disk, but I'd dedicate about 1 TB of it to the archive, flexibly.
We're using 1.5 TB of bandwidth a month out of 10 TB, but that includes the fetching and our manga reader, so about 1 TB is generated by users.
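To put rough numbers on that, here is a back-of-envelope sketch. It assumes image growth stays flat at 250 GB a month and that the 10 TB figure is a monthly allowance; both are assumptions, and the function names are made up for illustration.

```python
# Back-of-envelope figures from the numbers above.
# Assumptions: growth stays flat at 250 GB/month, and 10 TB is a monthly allowance.
IMAGE_GROWTH_GB_PER_MONTH = 250
ARCHIVE_BUDGET_GB = 1000          # the ~1 TB set aside for the archive
BANDWIDTH_USED_TB = 1.5
BANDWIDTH_ALLOWANCE_TB = 10


def months_of_full_images(budget_gb, growth_gb_per_month):
    """How many months of full images fit in the storage budget."""
    return budget_gb / growth_gb_per_month


def bandwidth_share(used_tb, allowance_tb):
    """Fraction of the bandwidth allowance currently in use."""
    return used_tb / allowance_tb


if __name__ == "__main__":
    months = months_of_full_images(ARCHIVE_BUDGET_GB, IMAGE_GROWTH_GB_PER_MONTH)
    share = bandwidth_share(BANDWIDTH_USED_TB, BANDWIDTH_ALLOWANCE_TB)
    print(f"full images kept: about {months:.0f} months")
    print(f"bandwidth in use: {share:.0%} of the allowance")
```

At a flat rate that is roughly four months of full images in 1 TB, which is why disk space matters more than bandwidth here.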
In the last 3 months the number of users more than doubled (now 15k users daily, 1.8M pageviews a month), though these are already the most recent figures. The trend might continue, but we're still a bit too far from that to worry yet. The issue is mostly the hard disk space, and getting rid of illegal images. I don't think it's worth saving everything, but surely it's worth saving the 1-2%.
As for performance, the number of users is hardly an issue: we could take on 3000 users at the same time, and we hardly reach 200 during 4chan downtimes. A bigger issue is probably that the server is in France, so some users might have higher ping, resulting in about 1 s slower loading.
I'm putting together a rating system from all the suggestions, and one thing is sure: it's going to be a lot of work. It would let me detect spam with more precision, build trends and selections of popular threads, and offer a safe-for-work filter. This could be brilliant if implemented in a discreet way, a disaster if it becomes FEATURES.
We'll have to process threads after their death, and over time all the threads in the archive. That's going to take a long time, but we already have a cron job function. Even if it took a month or two it would be fine.
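A minimal sketch of how that backlog could be worked through in small batches, one batch per cron run. Everything here is hypothetical: the `Thread` record, the batch size, and the `process` callback are placeholders for illustration, not the archive's actual code.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class Thread:
    # Hypothetical record: thread number, whether it has died, whether it was processed.
    num: int
    dead: bool
    processed: bool = False


def next_batch(threads, batch_size):
    """Pick up to batch_size dead, unprocessed threads, lowest number first."""
    pending = [t for t in threads if t.dead and not t.processed]
    pending.sort(key=lambda t: t.num)
    return pending[:batch_size]


def run_once(threads, batch_size, process):
    """One cron invocation: process a single batch and mark it done."""
    batch = next_batch(threads, batch_size)
    for t in batch:
        process(t)
        t.processed = True
    return len(batch)


def runs_needed(backlog, batch_size):
    """How many cron runs it takes to clear a backlog of the given size."""
    return ceil(backlog / batch_size)
```

Live threads are skipped until they die, so each run only touches a bounded amount of work and the whole archive gets covered gradually.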
Fun fun.