This method is quite generic for reverse problems so can probably be use to discover some heuristics for SAT problems, or optimization problems, or ca

Search code, repositories, users, issues, pull requests...

submited by
Style Pass
2024-10-14 11:30:04

This method is quite generic for reverse problems so can probably be use to discover some heuristics for SAT problems, or optimization problems, or calibration of ill-defined models. Viewed from afar it's kind of training a global optimizer for your class of problem.

One alternative view is viewing the diffusion model as a degenerate case of RL game, where the reward is greedy (every step gets a reward corresponding to the distance to the solution). The temporal credit assignement is skipped to make the problem easier. To obtain more performance solving the inverse problem as a game with RL should be even better.

On the first run it will train a diffusion model, and save it for subsequent run. (The model is 126M so not sharing it on github and not trained for very long (less than 1hr on a RTX4090) (you can reduce or increase nbiter (number of training iterations) for faster training at the price of less performance ).

It will generate a random state, compute the next state, and then reverse this next state so we know there is at least a solution (not a garden of eden). Have not computed success statistics for but probably > 80% for 10x10 worlds with less than 100 iterations.

Leave a Comment