1. Dataset

This page provides the datasets and software released by the Digital Typhoon project. The datasets are useful not only for machine learning but also for quantitative analysis in meteorological studies.

Dataset

The dataset is a typhoon-centered image dataset created from hourly infrared channel images taken by meteorological satellites. Images from successive generations of the Himawari meteorological satellites since 1978 are converted to brightness temperatures and calibrated to account for differences between satellite sensors, resulting in a uniform spatio-temporal dataset spanning more than 40 years.

  1. Western North Pacific Dataset: 1978-2022 - 54GB
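Because the images store brightness temperatures, a common first step before machine learning is to clip and rescale the values to a fixed range. The following is a minimal sketch of that idea, not the project's own preprocessing: the function name `scale_brightness_temperature` and the bounds of 170 K and 300 K are assumptions chosen for illustration, and the input is a synthetic array rather than real data.

```python
import numpy as np

# Hypothetical clipping bounds in kelvin (assumed for illustration,
# not taken from the dataset documentation).
T_MIN_K = 170.0
T_MAX_K = 300.0


def scale_brightness_temperature(image_k, t_min=T_MIN_K, t_max=T_MAX_K):
    """Clip a brightness-temperature array (kelvin) and scale it to [0, 1]."""
    image_k = np.asarray(image_k, dtype=np.float32)
    clipped = np.clip(image_k, t_min, t_max)
    return (clipped - t_min) / (t_max - t_min)


# Synthetic 2x2 "image" standing in for a real typhoon-centered frame.
sample = np.array([[150.0, 170.0], [235.0, 310.0]])
scaled = scale_brightness_temperature(sample)
print(scaled)
```

Values outside the chosen bounds saturate at 0 or 1, so the bounds should be checked against the actual value range of the data before use.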

"Digital Typhoon Dataset" (National Institute of Informatics) is licensed under a Creative Commons Attribution 4.0 International License (CC BY). When you use the dataset, please cite the dataset as follows.

Digital Typhoon Dataset (National Institute of Informatics), doi:10.20783/DIAS.664

Most of the Himawari meteorological satellite imagery was purchased from the Japan Meteorological Business Support Center, but some of it was received at the Institute of Industrial Science, University of Tokyo. For details, please see Sources of Various Data.

Paper

The following paper describes this dataset in detail. Please read it first.

Asanobu Kitamoto, Jared Hwang, Bastien Vuillod, Lucas Gautier, Yingtao Tian, Tarin Clanuwat, "Digital Typhoon: Long-term Satellite Image Dataset for the Spatio-Temporal Modeling of Tropical Cyclones", NeurIPS 2023 Datasets and Benchmarks (Spotlight), 2023

The same paper is also available from arXiv as Digital Typhoon: Long-term Satellite Image Dataset for the Spatio-Temporal Modeling of Tropical Cyclones.

Please browse the list of publications to find other references on typhoons (filter by 'typhoon').

Software

We release software for machine learning on the Digital Typhoon datasets.

  1. kitamoto-lab/digital-typhoon @ GitHub
  2. pyphoon2

Model

We release deep learning models and software code for the Digital Typhoon Dataset.

  1. Kitamoto Lab @ Hugging Face

Data Repository

The Digital Typhoon Dataset is also available from DIAS (Data Integration and Analysis System). DIAS is a Japanese data repository for earth science and environmental datasets, and it assigns each dataset a DOI (Digital Object Identifier) as a persistent identifier. The DOI of this dataset is doi:10.20783/DIAS.664.

  1. Digital Typhoon Dataset
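A DOI written in the `doi:` form can be turned into a resolvable link by appending it to the public resolver at `https://doi.org/`. The helper below is a small sketch of that conversion; the function name `doi_to_url` is hypothetical and not part of any project software.

```python
def doi_to_url(doi: str) -> str:
    """Convert a DOI such as 'doi:10.20783/DIAS.664' to a doi.org resolver URL."""
    doi = doi.strip()
    # Drop an optional 'doi:' prefix (case-insensitive).
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    return "https://doi.org/" + doi


print(doi_to_url("doi:10.20783/DIAS.664"))
```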

Acknowledgment

Many people have contributed to the development of the Digital Typhoon dataset. In particular, the following internship students of the Kitamoto laboratory have been involved in research on machine learning algorithms and in the development of software libraries for the dataset.

Danlan Chen, Lucas Rodes Guirao, Alexander Grishin, Clément Playout, Izabela Horvath, Jean-Paul Lam, Jared Hwang, Bastien Vuillod, Lucas Limos Gautier

In addition, the pyphoon2 library for managing the dataset was inspired by the first version of pyphoon (main contributor: Lucas Rodes Guirao) and has been developed to handle new tasks and data formats.