►Recent Highlights from the Previous Thread:
>>98764931
--Chat demo and info - ~240t/s 70b, ~460t/s 8x7b on Groq ML Accelerator cards:
>>98765082 >>98765152 >>98765177 >>98765181 >>98765326 >>98765473 >>98765467 >>98765373 >>98765443
--Paper: KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization:
>>98768726 >>98768756 >>98768758 >>98768779 >>98768808
--Transformers/NNs and determinism:
>>98767149 >>98767389
--TaskWeaver - multi-stage chain-of-thought style tasks:
>>98765584
--RWKV-5 Twitter post and demo, Anon tests it:
>>98768853 >>98768911 >>98768956 >>98769014 >>98769029 >>98769065 >>98769069
--llava-v1.6-34b translates game dialogue, Anon links surya OCR software:
>>98769936 >>98769960 >>98769976 >>98770143 >>98770151 >>98770371 >>98770593 >>98770445 >>98770521
--miquella-120b GGUF quants:
>>98768216 >>98768237 >>98768251 >>98768273
--OpenAI post about LLMs assisting in the creation of biological threats:
>>98767625 >>98767724 >>98768600 >>98768651 >>98768766
--Git Guide: How to use Quadratic Sampling in ooba and ST in the interim:
>>98766039
--exllamav2 quadsampling open PR:
>>98770879
--Elf hands Anon off:
>>98765982
--Rock paper scissors test:
>>98770553 >>98770658 >>98770805 >>98770856 >>98770857
--LLMs and emojis:
>>98771479 >>98771490 >>98771495 >>98771523 >>98771560 >>98771563 >>98771807
--Logs: routersex:
>>98765070
--Logs: toastersex:
>>98765844
--Mikus and Rins (free space):
>>98768546 >>98767216 >>98768604 >>98767520 >>98767433 >>98767520 >>98771170 >>98771204 >>98772016
►Recent Highlight Post from the Previous Thread:
>>98765002