
Metal FlashAttention 2.0: Pushing Forward On-Device Inference & Training on Apple Silicon

Submitted by
Style Pass
2025-01-07 18:30:04

Metal FlashAttention underpins Draw Things’ claim of fastest image generation inside the Apple ecosystem. It conserves system memory, it is fast, and it supports a wide array of devices, the oldest being the iPhone 12 from more than four years ago.
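The memory savings come from FlashAttention's core trick: instead of materializing the full N x N attention matrix, it streams keys and values in blocks and maintains a running softmax per query row. A minimal NumPy sketch of that tiling idea (illustrative only, not the Metal kernel; function names and block size are ours) looks like this:

```python
import numpy as np

def naive_attention(Q, K, V):
    # Materializes the full N x N score matrix -- memory grows O(N^2).
    S = Q @ K.T / np.sqrt(Q.shape[-1])
    P = np.exp(S - S.max(axis=-1, keepdims=True))
    P /= P.sum(axis=-1, keepdims=True)
    return P @ V

def flash_attention(Q, K, V, block=64):
    # Streams K/V in blocks, keeping a running (max, denominator, output)
    # per query row -- peak memory is O(N * block), never O(N^2).
    N, d = Q.shape
    scale = 1.0 / np.sqrt(d)
    O = np.zeros((N, d))
    m = np.full(N, -np.inf)   # running row maximum (for numerical stability)
    l = np.zeros(N)           # running softmax denominator
    for j in range(0, K.shape[0], block):
        Kj, Vj = K[j:j + block], V[j:j + block]
        S = Q @ Kj.T * scale                 # only an N x block score tile
        m_new = np.maximum(m, S.max(axis=-1))
        correction = np.exp(m - m_new)       # rescale previous partial sums
        P = np.exp(S - m_new[:, None])
        l = l * correction + P.sum(axis=-1)
        O = O * correction[:, None] + P @ Vj
        m = m_new
    return O / l[:, None]
```

Both functions compute the same result; only the peak memory differs, which is why the kernel can run on devices as constrained as an iPhone.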

Back in September, Philip Turner and I released Draw Things with Metal FlashAttention 2.0. Since then, we’ve integrated not only the forward pass (useful for inference) but also the experimental backward pass (useful for training). Combined, these make Draw Things the only efficient application on macOS / iOS that supports both inference and fine-tuning of FLUX.1 [dev], an 11B-parameter, state-of-the-art image generation model. This major version upgrade delivers substantial performance gains.

Translating these gains into real-world numbers: on M3 / M4 devices, we see up to 20% faster inference for both FLUX.1 and SD3 / AuraFlow models. On older hardware, SD3 / AuraFlow models see similar improvements, while FLUX.1 models gain around 2%.

Compared to other implementations, FLUX.1 inside Draw Things is up to 25% faster per iteration than the mflux implementation on the M2 Ultra, and faster still end-to-end; it is up to 94% faster than ggml-based implementations (also known as the gguf format). SD3.5 Large inside Draw Things is up to 163% faster per iteration than the DiffusionKit implementation (on the M2 Ultra).
