Exploring Land Cover Data (Impact Observatory)

General Exploration Standard Python




This notebook introduces the manipulation and exploratory analysis of classified land use and land cover data, using example data created by Impact Observatory from ESA Sentinel-2 imagery.

Dataset description

Many classified (categorical) land cover data products are now freely available and useful for Environmental Data Science.

These products are provided as 2D rasters (spatial) or 3D data cubes (spatio-temporal). The number and definition of discrete land cover classes vary between products, but even the most basic products distinguish between broad land covers such as ‘crops’, ‘forest’ and ‘built-up’. Because the data are nominal (categorical), only some types of analysis are appropriate: class counts, proportions and modes are meaningful, whereas arithmetic on the class codes is not.
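To illustrate why the nominal character of the data matters, here is a minimal sketch using a small made-up raster. The class codes, names and array values are hypothetical and are not taken from any real product; the point is only that counts, proportions and the mode are sensible summaries of class codes, while their arithmetic mean is not.

```python
import numpy as np

# Hypothetical 4x4 land cover raster; codes and names are illustrative only
class_names = {1: "crops", 2: "forest", 3: "built-up"}
lc = np.array([
    [1, 1, 2, 2],
    [1, 2, 2, 2],
    [3, 3, 2, 1],
    [3, 1, 1, 1],
])

# Pixel counts per class are meaningful for nominal data
codes, counts = np.unique(lc, return_counts=True)
proportions = {
    class_names[int(code)]: int(n) / lc.size for code, n in zip(codes, counts)
}

# The mode (most frequent class) is a valid summary of a nominal variable
modal_class = class_names[int(codes[np.argmax(counts)])]

# The mean of the codes can be computed, but it does not correspond to any class
meaningless_mean = lc.mean()

print(proportions)
print("Modal class:", modal_class)
print("Mean of class codes (not interpretable):", meaningless_mean)
```

The same reasoning applies when resampling or aggregating categorical rasters: a nearest-neighbour or modal rule preserves valid class codes, whereas averaging or interpolating would invent values that belong to no class.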

This notebook uses data created by Impact Observatory. The data are a time series for 2017-2021 of annual global land use and land cover (LULC) maps at 10 m spatial resolution. The data are derived from ESA Sentinel-2 imagery, with each annual map assigning every pixel to one of 9 LULC classes. The Impact Observatory LULC model uses deep learning methods to infer a single annual LULC class for each pixel in a Sentinel-2 image. Each annual global LULC map is produced by aggregating multiple inferences for images from across a given year, which requires processing approximately 2 million images per annual map.
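The idea of combining several per-image inferences into one annual class per pixel can be sketched with a simple per-pixel majority vote. This is an assumption made purely for illustration: the text above does not state which aggregation rule Impact Observatory uses, and the function name `majority_vote`, the class codes 0-8 and the example arrays are all hypothetical.

```python
import numpy as np

def majority_vote(stack, n_classes):
    """Return the most frequent class per pixel for a (time, y, x) stack of integer codes."""
    # Count how often each class code occurs at every pixel location
    counts = np.stack([(stack == c).sum(axis=0) for c in range(n_classes)])
    # argmax selects the most frequent class; ties go to the lowest class code
    return counts.argmax(axis=0)

# Three hypothetical per-image inferences covering the same 2x2 area
inferences = np.array([
    [[1, 4], [4, 6]],
    [[1, 4], [2, 6]],
    [[2, 4], [2, 0]],
])

annual = majority_vote(inferences, n_classes=9)
print(annual)
```

A majority vote keeps the result within the set of valid classes, which is why it is a natural choice for categorical data; a real production pipeline may weight inferences or handle clouds and missing observations differently.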




  • James Millington (author), Dept of Geography, King’s College London, @jamesdamillington

  • Anne Fouilloux (reviewer), Dept of Geosciences, University of Oslo, @annefou

  • Amandine Debus (reviewer), Dept of Geography, University of Cambridge, @aedebus

Dataset originator/creator

The data are available under a Creative Commons BY-4.0 license.

Dataset reference and documentation


Load libraries

import os
import warnings

# data handling
import pystac_client
import odc.stac
from pystac.extensions.item_assets import ItemAssetsExtension

import geopandas as gpd
import rasterio as rio
import numpy as np
import pandas as pd
from shapely.geometry import Polygon
import xarray as xr
import rioxarray

import matplotlib.pyplot as plt
import matplotlib.colors as mplc
import holoviews as hv
import hvplot.pandas  # registers the .hvplot accessor on pandas objects
from holoviews import opts, dim

# data analysis
from sklearn import metrics  # for confusion matrix
from rasterstats import zonal_stats