THERE'S A VERY SIMPLE WORKAROUND FOR THE FILENAME EMOJI PROBLEM, if upgrading the DB is causing you trouble:
1) Convert emojis in scraped filenames to Unicode hex values, wrapped in a prefix and a suffix that each contain a forward slash character.
For example, the grinning face emoji (U+1F600, see http://www.fileformat.info/info/unicode/char/1f600/index.htm)
would become "/UF09F9880/" in the filename (if you wanted to encode to hex UTF-8).
2) Notice that it's IMPOSSIBLE for the original 4chan filename to contain a forward slash. That means you can use it as a reserved character, i.e. you can be certain that any forward slashes in DB filenames are ones that you've inserted.
That way, when you eventually upgrade your DB, you can just run through each record and convert the filenames back by scanning for the prefix "/U" (marking the start of an encoded emoji) and looking for the next forward slash (marking its end).
3) Also notice that when any major web browser encounters a filename with an illegal character for that client's filesystem (e.g. a forward slash), it just replaces it with a legal character like a space. So there's very little inconvenience to users. Or at least, it's a much better situation than having threads missing hundreds of posts.
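To make steps 1 and 2 concrete, here's a rough sketch in Python. The function names are made up for illustration, and it assumes the characters your DB chokes on are the ones that need 4 bytes in UTF-8 (code points above U+FFFF), so adjust the test if your DB rejects something else.

```python
import re

PREFIX = "/U"  # reserved marker: a slash can't appear in an original filename
SUFFIX = "/"

def encode_filename(name: str) -> str:
    """Replace each 4-byte UTF-8 character with /U<hex UTF-8 bytes>/."""
    out = []
    for ch in name:
        # Assumption: only code points above U+FFFF cause trouble in the DB.
        if ord(ch) > 0xFFFF:
            out.append(PREFIX + ch.encode("utf-8").hex().upper() + SUFFIX)
        else:
            out.append(ch)
    return "".join(out)

# A token is "/U", exactly 8 hex digits (4 UTF-8 bytes), then "/".
_TOKEN = re.compile(r"/U([0-9A-F]{8})/")

def decode_filename(name: str) -> str:
    """Reverse encode_filename once the DB can store the real characters."""
    return _TOKEN.sub(
        lambda m: bytes.fromhex(m.group(1)).decode("utf-8"),
        name,
    )

if __name__ == "__main__":
    original = "cat\U0001F600.jpg"
    stored = encode_filename(original)
    print(stored)
    print(decode_filename(stored) == original)
```

After the DB upgrade you'd loop over the filename column and run each value through decode_filename. Filenames without any emoji pass through both functions unchanged.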