Focus on the Dataset

The dataset used on this website comes from the FAO Technical Platform on the Measurement and Reduction of Food Loss and Waste, the leading international resource on food loss along the supply chain.

Data origin

FAO compiles data from over 700 scientific publications, institutional reports, national studies, and global databases (World Bank, IFPRI, FAOSTAT…). The latest version includes more than 29,000 data points and is continuously updated as a “living” database.

How the data are collected

FAO uses two methods:

  • Manual review and extraction from scientific and institutional sources;
  • Automated collection through the FAO Data Lab (web scraping, language detection and automatic extraction of products, places, supply chain stages and values).

Each record includes the methodology used to produce it.

What the data includes

The original dataset contains:

  • The following columns: m49 code, country, region, cpc code, commodity, year, loss percentage, loss quantity (filled in for about one in seven rows, mostly without units of measurement), activity, supply chain stage, treatment, cause of loss, sample size, method of data collection, reference, url and notes.
  • Losses expressed as a percentage, by weight, or in economic value. FAO harmonizes units when possible and uses the data to calculate global indicators such as the Food Loss Index (SDG 12.3.1.a).

A rich but incomplete dataset

Some countries, products and supply chain stages have limited or uneven data, and causes of loss are often missing. These gaps reflect the complexity of monitoring global food losses, not a lack of effort from FAO, which has carried out extensive and valuable work aggregating highly diverse sources.

How we processed the dataset

To make the dataset clear and consistent for visual exploration, we:

  • kept only the country, commodity, year, loss percentage, food supply stage and cause of loss columns;
  • standardized country names;
  • grouped food items into 16 broader categories;
  • grouped causes of loss into 14 categories;
  • calculated the average loss percentage by commodity, country, and year;
  • identified the supply chain stage with the highest loss for each commodity, noting the primary cause of loss where possible.

The final dataset is cleaner, more readable and optimized for interactive visualization.
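
The processing steps listed above can be sketched with pandas. This is a minimal illustration, not the code used for this website: the column names, the alias and grouping dictionaries, and the function name process are all hypothetical, and "primary cause" is assumed here to mean the most frequently reported cause for a given commodity and stage.

```python
import pandas as pd

# Hypothetical column names; the real file may label them differently.
KEEP = ["country", "commodity", "year", "loss_percentage",
        "food_supply_stage", "cause_of_loss"]

# Illustrative mappings only; the real ones cover 16 food and 14 cause categories.
COUNTRY_ALIASES = {"Viet Nam": "Vietnam"}
COMMODITY_GROUPS = {"Rice": "Cereals", "Maize (corn)": "Cereals",
                    "Tomatoes": "Vegetables"}
CAUSE_GROUPS = {"Rodents": "Pests", "Insects": "Pests"}


def process(raw: pd.DataFrame) -> tuple[pd.DataFrame, pd.DataFrame]:
    # Keep only the columns used by the visualization.
    df = raw[KEEP].copy()

    # Standardize country names.
    df["country"] = df["country"].str.strip().replace(COUNTRY_ALIASES)

    # Group food items and causes of loss into broader categories.
    df["category"] = df["commodity"].map(COMMODITY_GROUPS).fillna("Other")
    df["cause_of_loss"] = df["cause_of_loss"].replace(CAUSE_GROUPS)

    # Average loss percentage by commodity, country and year.
    avg = df.groupby(["category", "commodity", "country", "year"],
                     as_index=False)["loss_percentage"].mean()

    # Supply chain stage with the highest mean loss for each commodity.
    stage_mean = df.groupby(["commodity", "food_supply_stage"],
                            as_index=False)["loss_percentage"].mean()
    worst = stage_mean.loc[
        stage_mean.groupby("commodity")["loss_percentage"].idxmax()
    ]

    # Most frequently reported cause per commodity and stage, where one exists.
    known = df.dropna(subset=["cause_of_loss"])
    causes = (known.groupby(["commodity", "food_supply_stage"])["cause_of_loss"]
              .agg(lambda s: s.value_counts().index[0])
              .rename("primary_cause")
              .reset_index())
    worst = worst.merge(causes, on=["commodity", "food_supply_stage"], how="left")
    return avg, worst
```

Records with no reported cause are left with an empty primary_cause rather than being filled in, so the gaps in the source remain visible downstream.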

How does it work? Project and Visualization

This project has a simple yet complex aim:

to turn a dataset on food waste into a visual experience that highlights both the data and its gaps.

What we want users to discover

  • How much food is lost around the world.
  • How food loss varies from country to country.
  • Where along the supply chain most losses occur, from harvest to processing.
  • How extensive the information gaps are: missing data, unmonitored countries, commodities without measurements.
  • That missing data is part of the problem: without measurement, effective policies can’t exist.

How to navigate the website

  • Start from the central matrix: each cell represents the intersection of a country and a commodity in a given year.
  • Countries in the matrix are ordered by how much data they have, with the best-documented countries displayed first.
  • The size of the symbols shows the percentage of loss; empty cells reveal missing information.
  • Loss percentages are relative to the total quantity of the commodity produced in a given country in a given year.
  • Caution! Because we don't have the quantity (in kg) that each percentage refers to, loss percentages can't be compared directly between countries: the same percentage can correspond to very different quantities. For example, a 10% loss is 100 tonnes if production is 1,000 tonnes, but 100,000 tonnes if production is 1,000,000 tonnes.
  • Click a cell to open a detailed view of that country, showing losses by food category.
  • Explore different years to observe changes or anomalies.

Every interaction helps you see not only how much food is lost, but also how much remains invisible.
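
The matrix described above can be sketched as a pivot table in which missing combinations stay empty. This is only an illustration under stated assumptions: the function name build_matrix and the column names are hypothetical, and "amount of data" is assumed to mean the number of averaged records a country has across all years, which keeps the row order stable when the year changes.

```python
import pandas as pd


def build_matrix(avg: pd.DataFrame, year: int) -> pd.DataFrame:
    """Country x commodity matrix of mean loss percentage for one year.

    Cells with no measurement are NaN, which the site would draw as empty.
    """
    # Assumption: rank countries by their record count over all years.
    order = avg["country"].value_counts().index
    commodities = sorted(avg["commodity"].unique())

    one_year = avg[avg["year"] == year]
    matrix = one_year.pivot_table(index="country", columns="commodity",
                                  values="loss_percentage", aggfunc="mean")

    # Reindex so countries and commodities without data in this year still
    # appear, as fully or partly empty rows and columns.
    return matrix.reindex(index=order, columns=commodities)
```

Keeping the empty rows and columns, instead of dropping them, is what lets the matrix show the gaps as well as the measurements.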
