As a climate scientist, I consider Google Earth Engine (GEE) a powerful tool in my toolkit. No more downloading heavy satellite images to my computer.
GEE's main API is JavaScript, although Python users can also access a powerful API to perform similar tasks. Unfortunately, there are fewer materials for learning GEE with Python.
However, I love Python. Since I learned that GEE has a Python API, I have imagined a world of possibilities combining GEE's powerful cloud-processing capabilities with Python frameworks.
The five lessons in this article come from my most recent project, which involved analyzing water balance and drought in a water basin in Ecuador. Nevertheless, the tips, code snippets and examples could apply to any project.
The story presents each lesson following the sequence of any data analysis project: data preparation (and planning), analysis, and visualization.
It is also worth mentioning that I provide some general advice that is independent of the language you use.
This article for GEE beginners assumes an understanding of Python and some geospatial concepts.
If you know Python but are new to GEE (like me some time ago), you should know that GEE has optimized functions for processing satellite images. We won't delve into the details of these functions here; you should check the official documentation.
However, my advice is to check first whether GEE can perform the analysis you want to conduct. When I first started using GEE, I used it as a catalogue for finding data, relying only on its basic functions. I would then write Python code for most of the analyses. While this approach can work, it often leads to significant challenges. I will discuss these challenges in later lessons.
Don't limit yourself to learning only the basic GEE functions. If you know Python (or coding in general), the learning curve for these functions is not very steep. Try to use them as much as possible; it's worth it in terms of efficiency.
A final note: GEE functions even support machine learning tasks. These GEE functions are easy to implement and can help you solve many problems. Only when you cannot solve your problem with these functions should you consider writing Python code from scratch.
As an example for this lesson, consider the implementation of a clustering algorithm.
Example code with GEE functions
import ee  # assumes ee.Initialize() has run and that clustering_image and galapagos_aoi are defined

# Sample the image to create input for clustering
sample_points = clustering_image.sample(
    region=galapagos_aoi,
    scale=30,          # Scale in meters
    numPixels=5000,    # Number of points to sample
    geometries=False   # Don't include geometry to save memory
)

# Apply k-means clustering (unsupervised)
clusterer = ee.Clusterer.wekaKMeans(5).train(sample_points)

# Cluster the image
result = clustering_image.cluster(clusterer)
Example code with Python
import rasterio
import numpy as np
from osgeo import gdal, gdal_array

# Tell GDAL to throw Python exceptions and register all drivers
gdal.UseExceptions()
gdal.AllRegister()

# Open the .tiff file
img_ds = gdal.Open('Sentinel-2_L2A_Galapagos.tiff', gdal.GA_ReadOnly)
if img_ds is None:
    raise FileNotFoundError("The specified file could not be opened.")

# Prepare an empty array to store the image data for all bands
img = np.zeros(
    (img_ds.RasterYSize, img_ds.RasterXSize, img_ds.RasterCount),
    dtype=gdal_array.GDALTypeCodeToNumericTypeCode(img_ds.GetRasterBand(1).DataType),
)

# Read each band into the corresponding slice of the array
for b in range(img_ds.RasterCount):
    img[:, :, b] = img_ds.GetRasterBand(b + 1).ReadAsArray()
print("Shape of the image with all bands:", img.shape)  # (height, width, num_bands)

# Reshape for processing
new_shape = (img.shape[0] * img.shape[1], img.shape[2])  # (num_pixels, num_bands)
X = img.reshape(new_shape)
print("Shape of reshaped data for all bands:", X.shape)  # (num_pixels, num_bands)
The first block of code is not only shorter, it also handles large satellite datasets more efficiently, because GEE functions are designed to scale across the cloud.
While GEE's functions are powerful, understanding the limitations of cloud processing is crucial when scaling up your project.
Access to free cloud computing resources to process satellite images is a blessing. However, it's not surprising that GEE imposes limits to ensure fair resource distribution. If you plan to use it for a non-commercial large-scale project (e.g. researching deforestation in the Amazon region) and intend to stay within the free-tier limits, you should plan accordingly. My general guidelines are:
- Limit the sizes of your regions, divide them, and work in batches. I didn't need to do this in my project because I was working with a single small water basin. However, if your project involves large geographical areas, this would be the first logical step.
- Optimize your scripts by prioritizing GEE functions (see Lesson 1).
- Choose datasets that let you optimize computing power. For example, in my last project, I used the Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS). The original dataset has a daily temporal resolution. However, it offers an alternative version called "PENTAD", which provides data every five days. Each value corresponds to the sum of precipitation for those five days. Using this dataset allowed me to save computing power by processing the compacted version without sacrificing the quality of my results.
- Examine the description of your dataset, as it might reveal scaling factors that could save computing power. For instance, in my water balance project, I used Moderate Resolution Imaging Spectroradiometer (MODIS) data, specifically the MOD16 dataset, which is a readily available evapotranspiration (ET) product. According to the documentation, I could multiply my results by a scaling factor of 0.1. Scaling factors help reduce storage requirements by adjusting the data type.
- If worst comes to worst, be prepared to compromise. Reduce the resolution of the analyses if the standards of the study allow it. For example, the "reduceRegion" GEE function lets you summarize the values of a region (sum, mean, etc.). It has a parameter called "scale" which allows you to change the scale of the analysis. For instance, if your satellite data has a resolution of 10 m and GEE can't process your analysis, you can adjust the scale parameter to a lower resolution (e.g. 50 m).
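The first guideline (divide large areas and work in batches) can be sketched in plain Python. The helper below, `split_bbox`, is a hypothetical name and not part of the GEE API; it splits a bounding box given in degrees into a grid of smaller boxes. Each tile could then be turned into a geometry (for example with `ee.Geometry.Rectangle`) and processed as a separate request. The bounding box values are illustrative only.

```python
def split_bbox(west, south, east, north, n_cols, n_rows):
    """Split a bounding box into n_cols * n_rows equally sized tiles.

    Returns a list of (west, south, east, north) tuples, ordered row by row
    starting at the south-west corner.
    """
    if n_cols < 1 or n_rows < 1:
        raise ValueError("n_cols and n_rows must be at least 1")
    if east <= west or north <= south:
        raise ValueError("expected west < east and south < north")

    width = (east - west) / n_cols
    height = (north - south) / n_rows

    tiles = []
    for row in range(n_rows):
        for col in range(n_cols):
            tile_west = west + col * width
            tile_south = south + row * height
            tiles.append(
                (tile_west, tile_south, tile_west + width, tile_south + height)
            )
    return tiles


# Illustrative bounding box (not a real study area), split into 2 x 2 tiles
tiles = split_bbox(-80.0, -4.0, -78.0, -2.0, n_cols=2, n_rows=2)
for tile in tiles:
    print(tile)
```

How many tiles you need depends on the dataset resolution and on the limits you run into, so treat the grid size as something to tune rather than a fixed recipe.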
As an example from my water balance and drought project, consider the following block of code:
# Reduce the collection to a single image (mean MSI over the time period)
MSI_mean = MSI_collection.select('MSI').mean().clip(pauteBasin)

# Use reduceRegion to calculate the min and max
stats = MSI_mean.reduceRegion(
    reducer=ee.Reducer.minMax(),  # Reducer to get min and max
    geometry=pauteBasin,          # Specify the ROI
    scale=500,                    # Scale in meters
    maxPixels=1e9                 # Maximum number of pixels to process
)

# Get the results as a dictionary
min_max = stats.getInfo()

# Print the min and max values
print('Min and Max values:', min_max)
In my project, I used a Sentinel-2 satellite image to calculate a moisture soil index (MSI). Then, I applied the "reduceRegion" GEE function, which calculates a summary of values in a region (mean, sum, etc.).
In my case, I needed to find the maximum and minimum MSI values to check if my results made sense. The following plot shows the MSI values spatially distributed in my study region.
The original image has a 10 m resolution, and GEE struggled to process the data. Therefore, I used the scale parameter and lowered the resolution to 500 m. After changing this parameter, GEE was able to process the data.
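A quick back-of-the-envelope calculation shows why this change helps. The number of pixels covering an area is roughly the area divided by the square of the pixel size, so going from 10 m to 500 m cuts the pixel count by a factor of 2,500. The area below is a made-up placeholder, not the real size of my study region, and `approx_pixel_count` is a hypothetical helper.

```python
def approx_pixel_count(area_km2, scale_m):
    """Approximate number of square pixels of side scale_m covering area_km2."""
    area_m2 = area_km2 * 1_000_000  # 1 km^2 = 1,000,000 m^2
    return area_m2 / (scale_m ** 2)


area_km2 = 5000  # placeholder area, for illustration only

pixels_10m = approx_pixel_count(area_km2, 10)
pixels_500m = approx_pixel_count(area_km2, 500)

print(f"10 m scale:  {pixels_10m:,.0f} pixels")
print(f"500 m scale: {pixels_500m:,.0f} pixels")
print(f"Reduction factor: {pixels_10m / pixels_500m:,.0f}x")
```

The reduction factor depends only on the ratio of the two scales, (500 / 10) squared, so it holds whatever the actual area is.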
I'm obsessed with data quality. As a result, I use data but rarely trust it without verification. I like to invest time in ensuring the data is ready for analysis. However, don't let image corrections paralyze your progress.
My tendency to invest too much time in image corrections stems from learning remote sensing and image corrections "the old way". By this, I mean using software that assists in applying atmospheric and geometric corrections to images.
Nowadays, scientific agencies supporting satellite missions can deliver images with a high level of preprocessing. In fact, a great feature of GEE is its catalogue, which makes it easy to find ready-to-use analysis products.
Preprocessing is the most time-consuming task in any data science project. Therefore, it must be appropriately planned and managed.
The best approach before starting a project is to establish data quality standards. Based on your standards, allocate enough time to find the best product (which GEE facilitates) and apply only the necessary corrections (e.g. cloud masking).
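As a rough illustration of what cloud masking involves, the sketch below decodes a Sentinel-2 `QA60` quality value in plain Python. It assumes the bit layout described in the GEE catalogue for Sentinel-2 (bit 10 for opaque clouds, bit 11 for cirrus); check the dataset page for the collection and period you use, since the availability of this band has changed over time. `is_clear` is a hypothetical helper; in GEE the same test would be applied to the whole band with image operations rather than pixel by pixel.

```python
OPAQUE_CLOUD_BIT = 1 << 10  # assumed position of the opaque-cloud flag
CIRRUS_BIT = 1 << 11        # assumed position of the cirrus flag


def is_clear(qa60_value):
    """Return True when neither the opaque-cloud nor the cirrus bit is set."""
    cloudy = qa60_value & OPAQUE_CLOUD_BIT
    cirrus = qa60_value & CIRRUS_BIT
    return cloudy == 0 and cirrus == 0


# Try a few quality values: no flags, cloud only, cirrus only, both
for value in (0, 1024, 2048, 3072):
    print(value, is_clear(value))
```

A mask like this is often the only correction a ready-to-use product needs, which is the point of setting quality standards before you start.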
If you love programming in Python (like me), you might often find yourself coding everything from scratch.
As a PhD student (just starting with coding), I wrote a script to perform a t-test over a study region. Later, I discovered a Python library that performed the same task. When I compared my script's results with those from the library, my results were correct. However, using the library from the start could have saved me time.
I'm sharing this lesson to help you avoid these silly mistakes with GEE. I'll mention two examples from my water balance project.
Example 1
To calculate the water balance in my basin, I needed ET data. ET is not an observed variable (like precipitation); it must be calculated.
The ET calculation is not trivial. You can look up the equations in textbooks and implement them in Python. However, some researchers have published papers related to this calculation and shared their results with the community.
This is where GEE comes in. The GEE catalogue not only provides observed data (as I initially thought) but also many derived products or modelled datasets (e.g. reanalysis data, land cover, vegetation indices, etc.). Guess what? I found a ready-to-use global ET dataset in the GEE catalogue. It was a lifesaver!
Example 2
I also consider myself a Geographic Information System (GIS) professional. Over the years, I've acquired a substantial amount of GIS data for my work, such as water basin boundaries in shapefile format.
In my water balance project, my intuition was to import my water basin boundary shapefile into my GEE project. From there, I transformed the file into a Geopandas object and continued my analysis.
In this case, I wasn't as lucky as in Example 1. I lost precious time trying to work with this Geopandas object, which I couldn't integrate well with GEE. Ultimately, this approach didn't make sense. GEE does have in its catalogue a product for water basin boundaries that's easy to handle.
Thus, a key takeaway is to keep your workflow within GEE whenever possible.
As mentioned at the beginning of this article, integrating GEE with Python libraries can be incredibly powerful.
However, even for simple analyses and plots, the integration doesn't seem straightforward.
This is where Geemap comes in. Geemap is a Python package designed for interactive geospatial analysis and visualization with GEE.
Additionally, I found that it can help with creating static plots in Python. I made plots using GEE and Geemap in my water balance and drought project. The images included in this story were made with these tools.
GEE is a powerful tool. However, as a beginner, pitfalls are inevitable. This article provides tips and tricks to help you start on the right foot with the GEE Python API.
European Space Agency (2025). Harmonized Sentinel-2 MSI: MultiSpectral Instrument, Level-2A.
Friedl, M., Sulla-Menashe, D. (2022). MODIS/Terra+Aqua Land Cover Type Yearly L3 Global 500m SIN Grid V061 [Data set]. NASA EOSDIS Land Processes Distributed Active Archive Center. Accessed 2025-01-15 from https://doi.org/10.5067/MODIS/MCD12Q1.061
Lehner, B., Verdin, K., Jarvis, A. (2008). New global hydrography derived from spaceborne elevation data. Eos, Transactions, AGU, 89(10): 93–94.
Lehner, B., Grill, G. (2013). Global river hydrography and network routing: baseline data and new approaches to study the world's large river systems. Hydrological Processes, 27(15): 2171–2186. Data is available at www.hydrosheds.org