Generative modeling with sparse transformers
We've developed the Sparse Transformer, a deep neural network which sets new records at predicting what comes next in a sequence, whether text, images, or sound. It uses an algorithmic improvement of the attention mechanism to extract patterns from sequences 30x longer than previously possible.
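The improvement comes from factorizing attention so that each position attends only to a sparse subset of earlier positions (for example, a local window plus a set of strided positions) rather than to every prior position. Below is a minimal NumPy sketch of one such strided pattern; it is an illustration under our own function names and toy sizes, not the released implementation.

```python
# Minimal sketch (not the released code) of a strided sparse attention
# pattern: each position attends to a local causal window plus every
# `stride`-th earlier position, instead of all prior positions.
import numpy as np

def strided_sparse_mask(n, stride):
    """Boolean (n, n) mask: True where query i may attend to key j."""
    mask = np.zeros((n, n), dtype=bool)
    for i in range(n):
        # Local part: the previous `stride` positions (causal window).
        start = max(0, i - stride + 1)
        mask[i, start:i + 1] = True
        # Strided part: earlier positions j with j == i (mod stride).
        mask[i, np.arange(i % stride, i + 1, stride)] = True
    return mask

def masked_attention(q, k, v, mask):
    """Standard scaled dot-product attention restricted by a sparsity mask."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -1e9)           # block disallowed pairs
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

if __name__ == "__main__":
    n, d, stride = 64, 16, 8                        # toy sizes for illustration
    rng = np.random.default_rng(0)
    q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
    out = masked_attention(q, k, v, strided_sparse_mask(n, stride))
    print(out.shape)                                # (64, 16)
```

Because each query touches on the order of sqrt(n) keys instead of n, the per-position cost of attention drops accordingly, which is what makes much longer sequences practical.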