gradientai / Llama-3-8B-Instruct-262k

Submitted by Style Pass, 2024-04-26 00:30:05

Gradient incorporates your data to deploy autonomous assistants that power critical operations across your business. To learn more or collaborate on a custom model, drop us a message at contact@gradient.ai.

This model, developed by Gradient with compute sponsored by Crusoe Energy, extends Llama-3 8B's context length from 8k to > 160K tokens. It demonstrates that SOTA LLMs can learn to operate on long context with minimal training (< 200M tokens) by appropriately adjusting RoPE theta.
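To make the role of RoPE theta more concrete, here is a minimal sketch of how the base controls the rotation frequencies of rotary position embeddings. It uses the standard RoPE frequency formula, theta^(-2i/d), with the Llama-3 8B head dimension (128) and default base (500000). The larger theta value is purely illustrative and is not the value used to train this model; the helper names are hypothetical.

```python
import math

def rope_inv_freqs(head_dim, theta):
    """Rotation frequency of each dimension pair: theta^(-2i/d), i = 0 .. d/2 - 1."""
    return [theta ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]

def longest_wavelength(head_dim, theta):
    """Number of token positions the slowest-rotating pair needs for one full turn."""
    slowest = rope_inv_freqs(head_dim, theta)[-1]
    return 2.0 * math.pi / slowest

HEAD_DIM = 128                 # Llama-3 8B attention head dimension
BASE_THETA = 500_000.0         # Llama-3 default RoPE base
LARGER_THETA = 50_000_000.0    # illustrative assumption, NOT this model's actual value

for theta in (BASE_THETA, LARGER_THETA):
    wavelength = longest_wavelength(HEAD_DIM, theta)
    print(f"theta={theta:,.0f}: longest wavelength is about {wavelength:,.0f} tokens")
```

Raising theta slows every rotation, so positions far beyond the original training window still map to rotation angles that stay distinguishable. This only shows the positional-encoding side; the model still needs the long-context fine-tuning described above to make use of it.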

We build on top of the EasyContext Blockwise RingAttention library [3] to train scalably and efficiently on contexts of up to 262144 tokens on Crusoe Energy's high-performance L40S cluster.

Gradient is accelerating AI transformation across industries. Our AI Foundry incorporates your data to deploy autonomous assistants that power critical operations across your business.

[1] Peng, Bowen, et al. "Yarn: Efficient context window extension of large language models." arXiv preprint arXiv:2309.00071 (2023).
