Tracking source locations

submitted by

Style Pass

2025-07-31 05:00:06

Futhark is a programming language meant for writing fast programs, but as is the case for every programming language meant for writing fast programs, it inevitably happens that a programmer will use it to write a program that is not fast. When this happens, the programmer will likely want to know why their program is not fast, and how to make it faster. A useful tool for answering such questions is a profiler - a tool that tells you how long the different parts of your program take to run. This post is about how profiling in Futhark became slightly more useful with the most recent release.

Initially, Futhark had no real profiling support, except for some semi-documented support for dumping a report of GPU operations. Eventually we added futhark profile, which allows the machine-readable profiling data produced by futhark bench to be turned into human-readable reports. Specifically, the Futhark runtime system will tally up the time spent in various cost centres, which for the GPU backends are GPU kernels and other operations such as copies, and put it in a table. However, the information you get out still looks like this:

Now a user may reasonably object: “Hold on! I don’t remember my program containing anything called main.segmap_23494!” And indeed, these cost centres refer to compiler-generated names. You can squint to get some meaning out of them: segscan is certainly some kind of scan operation, and segmap is a map. But due to inlining, it can be difficult to guess which functions result in which GPU operations, and optimisations may obscure the relation between source code and generated code - indeed, those segmap_intrablock operations are actually mainly (nested) scans that are then turned into block-level scans via incremental flattening. Clearly, this information is still not easy to use. The profiler will usually just tell the programmer that their program spends all its time executing code with a name the programmer cannot possibly recognise. What is missing is a way to relate generated code to the original source code. I decided to call such information provenance, in the sense of “the ultimate origin of something”. The problem is then to attach provenance to every bit of generated code, and in particular, to the generated GPU kernels.
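To make the idea concrete, here is a minimal sketch in Python of what attaching provenance buys you when reading a profile. It is not how the Futhark compiler or runtime is implemented. The event list, the cost centre names other than main.segmap_23494, the prog.fut source locations and the tally helper are all hypothetical, made up purely for illustration.

```python
from collections import defaultdict

# Hypothetical profiling events: (cost centre name, runtime in microseconds).
# Only main.segmap_23494 is a name taken from the text; the rest are invented.
events = [
    ("main.segmap_23494", 120.0),
    ("main.segscan_23510", 300.0),
    ("main.segmap_23494", 80.0),
    ("main.segmap_intrablock_23600", 500.0),
]

# Hypothetical provenance table: compiler-generated name -> source location.
# The location format is an assumption, not Futhark's actual output.
provenance = {
    "main.segmap_23494": "prog.fut:12:5-40",
    "main.segscan_23510": "prog.fut:20:3-25",
    "main.segmap_intrablock_23600": "prog.fut:20:3-25",
}


def tally(events, key=lambda name: name):
    """Sum the runtime of all events, grouped by key(cost centre name)."""
    totals = defaultdict(float)
    for name, usec in events:
        totals[key(name)] += usec
    return dict(totals)


# Without provenance: totals per compiler-generated name.
by_name = tally(events)

# With provenance: totals per source location, so two kernels that stem
# from the same piece of source code are reported together.
by_source = tally(events, key=lambda name: provenance.get(name, "<unknown>"))

for location, usec in sorted(by_source.items(), key=lambda kv: -kv[1]):
    print(f"{location:20} {usec:10.1f} us")
```

In this sketch the scan and the intrablock map both point back to the same made-up source location, so the report attributes their combined time to a line the programmer can actually find in their program.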
