Stream input data#

import warnings

import cartopy
import cartopy.crs as ccrs
import matplotlib as mpl
import matplotlib.pyplot as plt

import tams

warnings.filterwarnings("ignore", category=cartopy.io.DownloadWarning)

GPM precipitation and brightness temperature#

With the help of earthaccess, we can stream these data without downloading the files to disk.

%%time

pr = tams.data.get_imerg("2024-06-01 02:30")["pr"]
pr
CPU times: user 620 ms, sys: 199 ms, total: 819 ms
Wall time: 9.59 s
<xarray.DataArray 'pr' (lat: 1800, lon: 3600)> Size: 26MB
[6480000 values with dtype=float32]
Coordinates:
    time     datetime64[ns] 8B 2024-06-01T02:30:00
  * lon      (lon) float32 14kB -179.9 -179.9 -179.8 ... 179.8 179.8 179.9
  * lat      (lat) float32 7kB -89.95 -89.85 -89.75 -89.65 ... 89.75 89.85 89.95
Attributes:
    units:        mm/hr
    long_name:    precipitation rate
    description:  Complete merged microwave-infrared (gauge-adjusted) precipi...
%%time

tb = tams.data.get_mergir(pr.time.item())["tb"]
tb
CPU times: user 76.5 ms, sys: 28.8 ms, total: 105 ms
Wall time: 17.1 s
<xarray.DataArray 'tb' (lat: 3298, lon: 9896)> Size: 131MB
[32637008 values with dtype=float32]
Coordinates:
  * lon      (lon) float32 40kB -180.0 -179.9 -179.9 ... 179.9 179.9 180.0
  * lat      (lat) float32 13kB -59.98 -59.95 -59.91 ... 59.91 59.95 59.98
    time     datetime64[ns] 8B 2024-06-01T02:30:00
Attributes:
    units:          K
    standard_name:  brightness_temperature
    long_name:      brightness temperature
%%time

fig = plt.figure(figsize=(12, 5), layout="constrained")
ax = fig.add_subplot(projection=ccrs.PlateCarree())

ax.coastlines(color="magenta")

cmap = plt.get_cmap("gist_gray_r")
cmap.set_bad("red")
tb.plot(x="lon", cmap=cmap, robust=True, ax=ax)
pr.plot(x="lon", norm=mpl.colors.LogNorm(0.01, pr.quantile(0.99)), alpha=0.85, ax=ax);
CPU times: user 7.64 s, sys: 3.49 s, total: 11.1 s
Wall time: 16 s
../_images/4cfe4ae13e3c91655d5d250a23f7c294d6ab38502063919a8241c6b014cf02cc.png

The red dots in the brightness temperature field above represent missing data. For most CE identification methods, this will influence results. Since they are mostly scattered points, not large regions, a reasonable way to fill in the missing data is a nearest-neighbor interpolation.

%%time

kws = dict(method="nearest", fill_value="extrapolate", assume_sorted=True)
tb_ = tb.interpolate_na("lat", **kws).interpolate_na("lon", **kws)

print(f"{tb.isnull().sum().item() / tb.size:.3%} -> {tb_.isnull().sum().item() / tb_.size:.3%} null")
2.231% -> 0.000% null
CPU times: user 1.1 s, sys: 97.4 ms, total: 1.2 s
Wall time: 10.4 s

We zoom in to the system off the coast of Chile to demonstrate the impact.

box = dict(lon=slice(-115, -80), lat=slice(-53, -20))

fig, [ax1, ax2] = plt.subplots(
    2, 1,
    figsize=(5, 7),
    sharex=True, sharey=True,
    subplot_kw=dict(projection=ccrs.Mercator()),
    layout="constrained",
)

tb.sel(box).plot(cmap=cmap, robust=True, ax=ax1)
ax1.set_title("")

tb_.sel(box).plot(cmap=cmap, robust=True, ax=ax2)
ax2.set_title("");
../_images/81ffda06a0118b5773be95d237f3103097146c7c61d84b46d22551e33c24a9a0.png