parallel tree running

submitted by
Style Pass
2022-05-16 03:00:04

For the first version of where’s all the code, I did a simple tree walk to find all the code, one file at a time. But I wrote it in Go, a supposedly concurrency-friendly language, so the obvious next step seemed to be walking the tree in parallel. This went great, until it didn’t, but then it ended up okay.

intro

We want to walk a tree in parallel. We don’t know how big or deep it is. We will discover new nodes of unknown branchiness as we go. As we progress, we will end up performing two types of work. Sometimes we find a directory and recurse deeper. Sometimes we find a file and have to count the lines.

Ideally, our code will balance IO and CPU demands. I would also prefer that it not completely hammer my computer. I’m already running a web browser. I don’t need a second application that behaves like I bought my computer for the sole purpose of running it.

As a bonus, it’d be nice to have some way to balance tasks, like two tree walkers and two line counters, or to decide in which order things get done. Generally, more control over what happens would be better than leaving it to the whims of some other component. If you know me, you’ll know that bad things happen when magic is involved.

naive
