
v0.5.0

microsoft/torchgeo

Release date: 2023-10-01 05:53:35

Latest microsoft/torchgeo release: v0.6.1 (2024-10-11 02:22:20)

TorchGeo 0.5.0 Release Notes

0.5.0 encompasses over 8 months of hard work and new features contributed by 20 users from around the world. Below, we detail specific features worth highlighting.

Highlights of this release

New command-line interface

TorchGeo has always had tight integration with PyTorch Lightning, including datamodules for common benchmark datasets and trainers for most computer vision tasks. TorchGeo 0.5.0 introduces a new command-line interface for model training based on LightningCLI. It can be invoked in two ways:

# If torchgeo has been installed
torchgeo
# If torchgeo has been installed, or if it has been cloned to the current directory
python3 -m torchgeo

It supports configuration through command-line options or YAML/JSON config files. Valid options are listed in the help messages:

# See valid stages
torchgeo --help
# See valid trainer options
torchgeo fit --help
# See valid model options
torchgeo fit --model.help ClassificationTask
# See valid data options
torchgeo fit --data.help EuroSAT100DataModule

Using the following config file:

trainer:
  max_epochs: 20
model:
  class_path: ClassificationTask
  init_args:
    model: "resnet18"
    in_channels: 13
    num_classes: 10
data:
  class_path: EuroSAT100DataModule
  init_args:
    batch_size: 8
  dict_kwargs:
    download: true

we can see the script in action:

# Train and validate a model
torchgeo fit --config config.yaml
# Validate-only
torchgeo validate --config config.yaml
# Calculate and report test accuracy
torchgeo test --config config.yaml
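
Because the CLI accepts JSON as well as YAML config files, the same settings can also be generated from Python, which may be convenient when sweeping over hyperparameters. The following is a minimal sketch using only the standard library; the `build_config` and `write_config` helpers and the `config.json` file name are illustrative assumptions, not part of TorchGeo.

```python
import json


def build_config(max_epochs: int = 20, batch_size: int = 8) -> dict:
    """Return the same settings as the YAML config above, as a plain dict."""
    return {
        "trainer": {"max_epochs": max_epochs},
        "model": {
            "class_path": "ClassificationTask",
            "init_args": {
                "model": "resnet18",
                "in_channels": 13,
                "num_classes": 10,
            },
        },
        "data": {
            "class_path": "EuroSAT100DataModule",
            "init_args": {"batch_size": batch_size},
            "dict_kwargs": {"download": True},
        },
    }


def write_config(path: str, config: dict) -> None:
    """Serialize the config dict to a JSON file (hypothetical helper)."""
    with open(path, "w") as f:
        json.dump(config, f, indent=2)
```

Calling `write_config("config.json", build_config(max_epochs=5))` would produce a file that could then be passed as `torchgeo fit --config config.json`.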

It can also be imported and used in a Python script if you need to extend it to add new features:

from torchgeo.main import main

main(["fit", "--config", "config.yaml"])

See the Lightning documentation for more details.
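
As one way to build on the `main` entry point shown above, a script can assemble the argument list itself before handing it over. This is a sketch under assumptions: `build_args` and `run` are hypothetical helpers, only the three stages demonstrated above are allowed, and the dotted `--key=value` overrides rely on the usual LightningCLI convention that options given after `--config` take precedence over the file.

```python
from typing import Optional


def build_args(stage: str, config: str, overrides: Optional[dict] = None) -> list:
    """Assemble the arguments that would otherwise be typed on the command line."""
    if stage not in {"fit", "validate", "test"}:
        raise ValueError(f"unsupported stage: {stage}")
    args = [stage, "--config", config]
    # Dotted keys such as "trainer.max_epochs" become "--trainer.max_epochs=5".
    for key, value in (overrides or {}).items():
        args.append(f"--{key}={value}")
    return args


def run(stage: str, config: str, overrides: Optional[dict] = None) -> None:
    """Invoke the TorchGeo CLI programmatically (requires torchgeo)."""
    # Imported lazily so that build_args works without torchgeo installed.
    from torchgeo.main import main

    main(build_args(stage, config, overrides))
```

For example, `run("fit", "config.yaml", {"trainer.max_epochs": 5})` would be expected to behave like `torchgeo fit --config config.yaml --trainer.max_epochs=5`.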

Self-supervised learning and Landsat


Self-supervised learning has become a dominant technique for model pre-training, especially in domains (like remote sensing) that are rich in data but lacking in large labeled datasets. The 0.5.0 release adds powerful trainers for the following SSL techniques:

large unlabeled datasets for multiple satellite platforms:

and the first ever models pre-trained on Landsat imagery. See our SSL4EO-L paper for more details.

Utilities for splitting GeoDatasets

In prior releases, the only way to create train/val/test splits of GeoDatasets was to use a Sampler roi. This limited the types of splits you could perform, and was unintuitive for users coming from PyTorch where the dataset can be split into multiple datasets. TorchGeo 0.5.0 introduces new splitting utilities for GeoDatasets in torchgeo.datasets, including:

Splitting with a Sampler roi is not yet deprecated, but users are encouraged to adopt the new dataset splitting utility functions.
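
To illustrate the general idea of splitting a dataset into several datasets rather than restricting a sampler, the sketch below randomly assigns bounding boxes to train/val/test groups. It is a conceptual, standard-library illustration only: `assign_boxes_randomly` and the `Box` tuple layout are hypothetical and do not reproduce TorchGeo's own utilities or their signatures.

```python
import random
from typing import List, Sequence, Tuple

# Hypothetical box layout: (minx, maxx, miny, maxy)
Box = Tuple[float, float, float, float]


def assign_boxes_randomly(
    boxes: Sequence[Box], fractions: Sequence[float], seed: int = 0
) -> List[List[Box]]:
    """Shuffle the boxes and cut them into groups of the requested proportions."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    shuffled = list(boxes)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducible splits
    splits = []
    start = 0
    for i, fraction in enumerate(fractions):
        if i == len(fractions) - 1:
            # The last split takes the remainder so no box is lost to rounding.
            end = len(shuffled)
        else:
            end = start + round(fraction * len(shuffled))
        splits.append(shuffled[start:end])
        start = end
    return splits
```

Each returned group could then back its own dataset, so that train, validation, and test sets never share a box.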

GeoDatasets now accept lists as input

Previously, each GeoDataset accepted a single root directory as input. Now, users can pass one or more directories, or a list of files they want to include. At first glance, this doesn't seem like a big deal, but it actually opens a lot of possibilities for how users can construct GeoDatasets. For example, users can use custom filters:

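import glob

from torchgeo.datasets import Landsat8

files = []
for file in glob.glob("*.tif"):
    # check pixel QA band or metadata file
    # get_cloud_cover is a hypothetical user-supplied helper, not a TorchGeo function
    cloud_cover = get_cloud_cover(file)
    if cloud_cover < 20:  # select images with minimal cloud cover
        files.append(file)
ds = Landsat8(files)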

or use remote files from S3 buckets or Azure blob storage. Basically, as long as GDAL knows how to read the file, TorchGeo supports it, wherever the file lives.

Note that some datasets may not accept a list of files when automatic download is also requested, because TorchGeo needs to know which directory to download into.
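
For remote files, GDAL exposes virtual file systems such as `/vsis3/` (Amazon S3) and `/vsicurl/` (plain HTTP/HTTPS, which also covers publicly readable blob URLs). The sketch below converts common URL forms into those GDAL-style paths before collecting them into a list. The `to_gdal_vsi_path` helper and the example URLs are hypothetical, and whether a given TorchGeo dataset accepts such paths directly is an assumption to verify for your version.

```python
from urllib.parse import urlparse


def to_gdal_vsi_path(url: str) -> str:
    """Map an s3:// or http(s):// URL to a GDAL virtual file system path."""
    parsed = urlparse(url)
    if parsed.scheme == "s3":
        # s3://bucket/key -> /vsis3/bucket/key
        return f"/vsis3/{parsed.netloc}{parsed.path}"
    if parsed.scheme in ("http", "https"):
        # GDAL reads arbitrary HTTP(S) resources through /vsicurl/
        return f"/vsicurl/{url}"
    # Anything else (e.g. a local path) is returned unchanged.
    return url


# Hypothetical inputs: one S3 object and one local file
urls = ["s3://my-bucket/scenes/a.tif", "local/b.tif"]
files = [to_gdal_vsi_path(u) for u in urls]
```

The resulting `files` list has the same shape as the list built in the filtering example above.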

Building a community

With over 50 contributors from around the world, we needed a better way to discuss ideas and share announcements. TorchGeo now has a public Slack channel! Join us and say hello 👋

Now that the majority of the features we've needed have been implemented, one of our goals for the next release is to improve our documentation and tutorials. Expect to see TorchGeo tutorials at all the popular ML/RS conferences next year! We're excited to meet our users in person and learn more about their unique use cases and needs.

Backwards-incompatible changes

Dependencies

Datamodules

New datamodules:

Changes to existing datamodules:

New base classes:

Changes to existing base classes:

Datasets

New datasets:

Changes to existing datasets:

Changes to existing base classes:

New utility functions:

Models

Changes to existing models:

New pre-trained model weights:

Changes to existing pre-trained model weights:

Samplers

Changes to existing samplers:

Trainers

New trainers:

Changes to existing trainers:

New base classes:

Transforms

New transforms:

Scripts

New scripts:

Documentation

Testing

Contributors

This release is thanks to the following contributors:

@AABNassim @adamjstewart @adrianboguszewski @adriantre @ashnair1 @briktor @burakekim @calebrob6 @dkosm @estherrolf @isaaccorley @nilsleh @nsutezo @ntw-au @pmandiola @shradhasehgal @Tarandeep97 @urbanophile @wangyi111 @yichiac
