SatCLIP trains location and image encoders via contrastive learning by matching images to their corresponding locations. This is analogous to the CLIP approach, which matches images to their corresponding text. Through this process, the location encoder learns the characteristics of a location, as represented by satellite imagery. For more details, check out our paper.
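To give a rough sense of the CLIP-style objective, here is a minimal sketch of a symmetric contrastive loss between image and location embeddings. This is illustrative only and not the repository's implementation; the random tensors simply stand in for encoder outputs.

```python
# Minimal sketch of a CLIP-style contrastive objective between image and
# location embeddings. The inputs here are placeholders, not SatCLIP encoders.
import torch
import torch.nn.functional as F

def contrastive_loss(image_emb, location_emb, temperature=0.07):
    # L2-normalize both embedding sets so the dot product is cosine similarity.
    image_emb = F.normalize(image_emb, dim=-1)
    location_emb = F.normalize(location_emb, dim=-1)

    # Pairwise similarity logits: entry (i, j) compares image i with location j.
    logits = image_emb @ location_emb.t() / temperature

    # Matching image/location pairs lie on the diagonal; apply cross-entropy
    # in both directions and average.
    targets = torch.arange(logits.size(0), device=logits.device)
    loss_img_to_loc = F.cross_entropy(logits, targets)
    loss_loc_to_img = F.cross_entropy(logits.t(), targets)
    return (loss_img_to_loc + loss_loc_to_img) / 2

# Random embeddings standing in for image-encoder and location-encoder outputs.
imgs = torch.randn(8, 256)
locs = torch.randn(8, 256)
print(contrastive_loss(imgs, locs))
```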

To train SatCLIP models, set the paths correctly, adapt the training config in clip/configs/default.yaml, and train SatCLIP by running the training script, as sketched below:
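The following is a hedged sketch of the "adapt the config, then launch training" workflow. The config key (`data_dir`) and the entry-point script name (`main.py`) are assumptions for illustration; check the repository for the actual names.

```python
# Sketch only: adjust the training config programmatically, then launch a run.
# The config key "data_dir" and the "main.py" entry point are placeholders.
import subprocess
import yaml  # pip install pyyaml

config_path = "clip/configs/default.yaml"

with open(config_path) as f:
    cfg = yaml.safe_load(f)

# Hypothetical key: point the loader at your local copy of the training data.
cfg["data_dir"] = "/path/to/s2-100k"

with open(config_path, "w") as f:
    yaml.safe_dump(cfg, f)

# Launch training with the adapted config ("main.py" is a placeholder name).
subprocess.run(["python", "main.py"], check=True)
```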

The S2-100K dataset consists of 100,000 multi-spectral satellite images sampled from Sentinel-2 via the Microsoft Planetary Computer. The Copernicus Sentinel data was captured between January 1, 2021 and May 17, 2023. Images are sampled approximately uniformly over landmass, and only cloud-free images are included. The dataset is available for research purposes only. If you use the dataset, please cite our paper, which also contains more information on the dataset.
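As an illustration of how such imagery can be retrieved (this is not the authors' S2-100K sampling pipeline; the query point and cloud-cover threshold below are arbitrary), here is a sketch of querying cloud-free Sentinel-2 scenes from the Microsoft Planetary Computer STAC API:

```python
# Illustrative only: query near-cloud-free Sentinel-2 scenes from the
# Microsoft Planetary Computer STAC API for a given point and date range.
# pip install pystac-client planetary-computer
import planetary_computer
import pystac_client

catalog = pystac_client.Client.open(
    "https://planetarycomputer.microsoft.com/api/stac/v1",
    modifier=planetary_computer.sign_inplace,  # signs asset URLs for access
)

search = catalog.search(
    collections=["sentinel-2-l2a"],
    intersects={"type": "Point", "coordinates": [13.4, 52.5]},  # lon, lat
    datetime="2021-01-01/2023-05-17",
    query={"eo:cloud_cover": {"lt": 1}},  # keep near-cloud-free scenes only
)

items = list(search.items())
print(f"Found {len(items)} matching Sentinel-2 scenes")
```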

We provide six pretrained SatCLIP models, trained with different vision encoders and different values of the spatial resolution hyperparameter $L$, which sets the number of Legendre polynomials used in the spherical harmonics location encoding (please refer to our paper for more details). The pretrained models can be downloaded as follows:
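A hedged example of fetching a checkpoint and embedding coordinates is shown below. The Hugging Face repo id, checkpoint filename, and the `get_satclip` loading helper are assumptions used for illustration; follow the repository's instructions for the actual download links and loading code.

```python
# Illustrative only: fetch a pretrained checkpoint and embed coordinates.
# The repo id, filename, and loading helper below are assumptions; see the
# SatCLIP repository for the actual download links and loader.
import torch
from huggingface_hub import hf_hub_download  # pip install huggingface_hub
from load import get_satclip  # stand-in for the repository's loading helper

ckpt_path = hf_hub_download(
    repo_id="microsoft/SatCLIP-ViT16-L40",  # hypothetical hosting location
    filename="satclip-vit16-l40.ckpt",      # hypothetical filename
)

model = get_satclip(ckpt_path, device="cpu")
model.eval()

# Encode a batch of (lon, lat) coordinates into location embeddings.
coords = torch.tensor([[13.4, 52.5], [-74.0, 40.7]], dtype=torch.float64)
with torch.no_grad():
    embeddings = model(coords)
print(embeddings.shape)
```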
