New RNN Architecture Surpasses Transformers: Each Hidden State Is a Model, and the First Author Says It Fundamentally Changes Language Models
A new RNN architecture outperforms Transformers by treating each hidden state as an independent model. The first author claims this approach fundamentally changes language modeling.