Signal Processing
Basic info
The Area Monitoring system depends on a vast amount of full-area, multi-temporal indirect signals from remote sensing data, which are translated into interpretable markers and scenarios derived over features of interest (FOIs). The initial inputs of the Area Monitoring system are therefore the farmers’ data from the Geospatial Aid Application (GSAA) dataset, which provides information about parcel boundaries, attributes and measures, and the available satellite imagery. We extract the reflectance values from the imagery pixels that lie completely within the FOI boundaries. The figure below shows a true-colour visualisation of four different Sentinel-2 observations for a typical FOI with corn. The FOI’s boundaries are shown in yellow, and the boundary of all non-border Sentinel-2 pixels within it is shown in red.
The reflectances are then converted to vegetation indices, which are statistically summarised per FOI (mean, standard deviation, minimum and maximum). Furthermore, we need to apply appropriate data filtering to obtain high-quality data for detecting changes triggered by agricultural activities. The filtering removes cloudy observations and other invalid observations.
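The conversion and per-FOI summary described above can be sketched in a few lines. This is a minimal illustration and not the production implementation: the function names, the input layout (a list of red and near-infrared reflectance pairs) and the sample values are all assumptions. It uses NDVI as the index, computed as (NIR - Red) / (NIR + Red), which for Sentinel-2 usually means bands B08 and B04.

```python
from statistics import mean, pstdev


def ndvi(nir, red):
    """Return (NIR - Red) / (NIR + Red), or None when the denominator is zero."""
    total = nir + red
    return (nir - red) / total if total != 0 else None


def summarise_foi(pixels):
    """Summarise NDVI over one FOI for one observation date.

    pixels: list of (red, nir) reflectance pairs for the pixels that lie
    completely within the FOI (hypothetical input layout).
    Returns None when no pixel yields a defined NDVI value.
    """
    values = [v for v in (ndvi(nir, red) for red, nir in pixels) if v is not None]
    if not values:
        return None
    return {
        "mean": mean(values),
        "std": pstdev(values),  # population standard deviation over the FOI pixels
        "min": min(values),
        "max": max(values),
    }


# Made-up reflectances for four pixels of a single FOI.
pixels = [(0.05, 0.45), (0.06, 0.42), (0.04, 0.40), (0.10, 0.30)]
stats = summarise_foi(pixels)
print(stats)
```

Repeating this for every observation date gives one summary per date, which together form the per-FOI time-series used later.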
The time-series of the mean Normalized Difference Vegetation Index (NDVI) for the same FOI is shown as a green dashed and solid line in the figure below. Sudden drops of NDVI are due to invalid observations, which were identified and filtered out with Sentinel Hub’s s2cloudless cloud masking algorithm and the observation outlier detection algorithm. The red vertical line on the time-series plot corresponds to the first Sentinel-2 image on the left. The green dots indicate all remaining valid observations, which are used for the calculation of the markers.
Copernicus Sentinel data are, due to their radiometric characteristics, multi-temporal richness and affordability over large areas, the main data source for Area Monitoring. Sentinel-2 can be used to monitor the majority of the agricultural area, which is why the following sections describe the processing of its data. For parcels that cannot be monitored with Sentinel-2, we also use commercial very-high-resolution Planet Fusion data. You can read more about the challenges of small parcels in this blog post.
Further info
Download and processing of satellite images over large areas
If we want to obtain the signals from satellite images over larger areas, e.g. for a whole country over several months, we need to do this efficiently. For this purpose, we use Sentinel Hub’s Batch Processing API, which splits the area of interest into manageable smaller chunks, downloads various indices and raw bands for each available date, and then creates a harmonized time-series feature by filtering out cloudy data and interpolating values to uniform temporal periods. The outputs of batch processing are stored in object storage. The image below gives a workflow overview of the Batch Processing commands and the different statuses that can be triggered.
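The harmonization step (dropping cloudy data and interpolating to uniform temporal periods) can be illustrated with a small stand-alone sketch. It does not call the Batch Processing API. The function name, the tuple layout, the 5-day step and the choice of linear interpolation are assumptions made for illustration only.

```python
from datetime import date, timedelta


def resample_uniform(observations, start, end, step_days):
    """Interpolate valid observations linearly onto a regular date grid.

    observations: list of (date, value, is_valid) tuples sorted by date,
    with distinct dates (hypothetical layout). Grid dates outside the range
    of valid observations take the nearest valid value.
    """
    valid = [(d.toordinal(), v) for d, v, ok in observations if ok]
    if not valid:
        return []

    result = []
    current = start
    while current <= end:
        t = current.toordinal()
        if t <= valid[0][0]:
            value = valid[0][1]
        elif t >= valid[-1][0]:
            value = valid[-1][1]
        else:
            # Find the two valid observations that bracket the grid date.
            for (t0, v0), (t1, v1) in zip(valid, valid[1:]):
                if t0 <= t <= t1:
                    value = v0 + (v1 - v0) * (t - t0) / (t1 - t0)
                    break
        result.append((current, value))
        current += timedelta(days=step_days)
    return result


# Made-up mean NDVI values; the second observation is flagged as cloudy.
observations = [
    (date(2021, 5, 1), 0.30, True),
    (date(2021, 5, 6), 0.05, False),
    (date(2021, 5, 11), 0.50, True),
    (date(2021, 5, 21), 0.70, True),
]
resampled = resample_uniform(observations, date(2021, 5, 1), date(2021, 5, 21), 5)
for day, value in resampled:
    print(day.isoformat(), round(value, 3))
```

The cloudy value on 6 May is ignored, and the grid date falling on that day receives a value interpolated from its valid neighbours instead.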
Cloud detection
Cloud detection is the most crucial step in the pre-processing of optical satellite images. Failure to mask out clouds will have a significant negative impact on any subsequent analysis. We use a pixel-based cloud mask on Sentinel-2 imagery, computed with the in-house developed machine learning algorithm s2cloudless, which has also become one of the state-of-the-art algorithms for cloud detection.
s2cloudless assigns each pixel a cloud probability based on the pixel’s ten Sentinel-2 band values. Cloud probabilities and masks are available through the Sentinel Hub services when requesting L1C or L2A data for the entire Sentinel-2 archive. The screenshot below, from EO Browser, shows a simple custom script that masks out the clouds using the cloud mask information from the Sentinel Hub service.
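To show how per-pixel cloud probabilities can be turned into a per-observation decision, here is a minimal sketch. It does not use the s2cloudless package or the Sentinel Hub services. It assumes the probabilities have already been retrieved and scaled to the range 0 to 1, and the function names, the 0.4 probability threshold and the rule of rejecting an observation when any FOI pixel is cloudy are illustrative assumptions.

```python
def cloud_mask(probabilities, threshold=0.4):
    """Return one boolean per pixel: True where the pixel counts as cloudy."""
    return [p > threshold for p in probabilities]


def is_cloudy_observation(probabilities, threshold=0.4, max_cloudy_fraction=0.0):
    """Decide whether an observation of one FOI should be discarded.

    probabilities: cloud probabilities (0 to 1) of the pixels inside the FOI.
    The observation is rejected when the share of cloudy pixels exceeds
    max_cloudy_fraction (assumed default: reject on any cloudy pixel).
    """
    mask = cloud_mask(probabilities, threshold)
    if not mask:
        return True  # no pixels available, so treat the observation as unusable
    return sum(mask) / len(mask) > max_cloudy_fraction


# Made-up cloud probabilities for three observation dates of one FOI.
observations = {
    "2021-05-01": [0.02, 0.05, 0.01],
    "2021-05-06": [0.90, 0.85, 0.30],
    "2021-05-11": [0.10, 0.03, 0.08],
}
valid_dates = [
    day for day, probs in observations.items() if not is_cloudy_observation(probs)
]
print(valid_dates)
```

A stricter or looser `max_cloudy_fraction` trades the number of retained observations against the risk of cloud contamination in the per-FOI statistics.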
Outlier detection
Once the cloudy observations have been filtered out, we are left with a mixture of valid observations and a set of undetected anomalous observations. The latter are mostly caused by cloud shadows, snow and haze. Below, see some examples of outliers.
We have developed a supervised machine learning model, which was trained on hand-labelled data of agricultural parcels over Slovenia, collected in 2019. The output of the model is binary: an observation is classified as an outlier if the pseudo-probability is above some threshold. The figure below shows the NDVI time-series of an FOI with the outlier observations filtered out; the marked observations correspond to the outlier examples from the previous image.
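The binary decision described above amounts to thresholding the model output. The sketch below shows only that last step. The pseudo-probabilities are made-up numbers standing in for the output of the trained model, and the function name, tuple layout and the 0.5 threshold are assumptions.

```python
def split_outliers(observations, threshold=0.5):
    """Split observations into valid ones and outliers.

    observations: list of (date_label, ndvi_value, pseudo_probability) tuples
    (hypothetical layout). An observation counts as an outlier when its
    pseudo-probability is above the threshold.
    """
    valid, outliers = [], []
    for date_label, ndvi_value, score in observations:
        target = outliers if score > threshold else valid
        target.append((date_label, ndvi_value))
    return valid, outliers


# Made-up NDVI values and model scores for one FOI.
observations = [
    ("2019-06-01", 0.72, 0.03),
    ("2019-06-06", 0.31, 0.91),  # e.g. a cloud shadow missed by the cloud mask
    ("2019-06-11", 0.75, 0.08),
]
valid, outliers = split_outliers(observations)
print(valid)
print(outliers)
```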
Links
Blog post about large-scale data preparation
Blog post about data handling
Blog post about development of the cloud detection algorithm
Blog post about cloud masks available on Sentinel Hub
Blog post about observation outlier detection