Revolutionary AI Breakthrough: Shrink LLMs by 50x with Attention Matching!

SUMMARY

AI-Generated Content
  • MIT’s "Attention Matching" cuts LLM memory requirements by a factor of 50.
  • The technique compresses a 1 GB document context to 20 MB, boosting throughput 5x.
  • AI developers can expect lower inference costs ($0.50 per million tokens) and wider access.