Fine-tuning LLaMA 3 on a million-scale dataset on a consumer GPU using QLoRA and DeepSpeed

I’m a full-time Software Engineer 2 at the core of our platform team. In my scarce free time, I explore various aspects of the machine learning world, with interests in tabular data, NLP, and sound. What I’m sharing here are scraps from all over the internet consolidated into one place. I have decent experience training small NLP models and have submitted a solution to a Kaggle competition using DeBERTa v3, scoring in the top 50%, but I have never worked with large language models before. This is my first time, so please let me know if there are any oversights. Yes, this is also my first blog post; writing it will definitely help me, and hopefully it will be useful for readers as well.

Who doesn’t know about this long-necked creature revolutionizing the AI field since its birth? Jokes apart, the release of LLaMA is what kicked off the whole OSS-powered LLM revolution, and it doesn’t seem to be stopping anytime soon.

To learn more about LLaMA in depth and on the technical side, do check out this Post | LinkedIn; it is one of the most technically simplified explanations I could find anywhere on the internet. They implemented a few very cool things in the architecture, such as Grouped Query Attention, KV-Cache, and Rotary Positional Embeddings (RoPE), but these are out of scope for this article. They have continued releasing new versions of LLaMA, with the latest one arriving just a few days ago, and this time with massive amounts of data compacted into a few GBs of parameters.
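Before diving in, here is a minimal sketch of what the QLoRA recipe from the title looks like in code, using the Hugging Face transformers, peft, and bitsandbytes libraries. The model id and every hyperparameter below are my illustrative assumptions, not the exact configuration used later in this post.

```python
# A minimal QLoRA sketch: load LLaMA 3 quantized to 4-bit with
# bitsandbytes, then attach small trainable LoRA adapters with peft.
# Model id and hyperparameters are assumptions for illustration only.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                 # quantize the frozen base model to 4-bit
    bnb_4bit_quant_type="nf4",         # NF4 is the QLoRA-paper default
    bnb_4bit_use_double_quant=True,    # quantize the quantization constants too
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",      # assumed model id for illustration
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(
    r=16,                              # illustrative adapter rank; tune per task
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()     # only the LoRA adapters are trainable
```

The point of this setup is that the quantized base weights stay frozen and only the tiny adapter matrices are trained, which is what makes a million-scale fine-tune fit on a consumer GPU; DeepSpeed then comes in on the training-loop side to stretch memory further.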
