Published March 10, 2026 | Version v2
Dataset Open

GeoTree3D - Synthetic Trees with Aligned Orthophotos, DSMs, 3D Point Clouds, and L-Systems

  • TU Wien

Description

GeoTree3D is a synthetic dataset for learning-based reconstruction of 3D tree geometry from sparse top-down geospatial data. It consists of procedurally generated trees with aligned RGB orthophotos, Digital Surface Models (DSMs), colored 3D point clouds, and procedural L-system strings (L-strings). Orthophotos include realistic canopy appearance and shadows under varying illumination, while DSMs encode tree height and crown extent consistent with airborne elevation data. The L-strings compactly encode the explicit topological structure and branching rules of the generated geometry, enabling structural analysis and exact procedural reproduction of each tree. GeoTree3D supports supervised learning and controlled evaluation of tree reconstruction methods from minimal geospatial input.

The dataset is organized into four main folders:

  1. DSM, containing .mat files with heightmaps corresponding to individual trees.

  2. ORTHOPHOTOS, containing one folder per tree. Each folder includes a subfolder named rendering, which contains images view_000.png to view_009.png rendered under different illumination conditions, along with a light_directions.txt file specifying the corresponding light directions.

  3. TREES, containing .mat files with colored 3D point clouds for each tree and a species_log.txt file mapping tree identifiers to species labels.

  4. LSTRINGS, containing .txt files of the procedural generation rules (L-systems) used to create each tree. These strings compactly encode the topological structure, branching patterns, and physiological growth rules of the trees.
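The ORTHOPHOTOS layout described above can be checked with a short script. The following is a minimal sketch, not part of the released scripts: it assumes the archive has been extracted to a local root folder and that `tree_id` is the name of one per-tree folder (both are placeholders chosen for illustration). It only builds and checks paths and makes no assumption about the contents of light_directions.txt.

```python
from pathlib import Path

NUM_VIEWS = 10  # view_000.png ... view_009.png, as listed in the description


def orthophoto_paths(root, tree_id):
    """Return (list of view image paths, light-direction file path) for one tree.

    `root` (extraction folder) and `tree_id` (per-tree folder name) are
    hypothetical placeholders, not names defined by the dataset.
    """
    rendering = Path(root) / "ORTHOPHOTOS" / tree_id / "rendering"
    views = [rendering / f"view_{i:03d}.png" for i in range(NUM_VIEWS)]
    return views, rendering / "light_directions.txt"


def missing_files(root, tree_id):
    """List the expected orthophoto files that are absent on disk."""
    views, lights = orthophoto_paths(root, tree_id)
    return [p for p in views + [lights] if not p.exists()]
```

Calling `missing_files(root, tree_id)` on an extracted copy should return an empty list for a complete tree folder.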

All .mat files are binary MATLAB v5 files storing standard numeric arrays (e.g., heightmaps, 3D coordinates, RGB values). They do not require MATLAB and can be opened using common scientific computing tools, such as Python via scipy.io.loadmat.
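As a rough illustration of opening these files without MATLAB, the sketch below loads one .mat file with scipy.io.loadmat and reports which variables it stores and their array shapes. The variable names inside the files are not stated here, so the code discovers them rather than assuming them; the helper names `summarize_variables` and `inspect_mat` are introduced for this example only and are not taken from mat_loader.py.

```python
def summarize_variables(mat_dict):
    """Map each stored variable name to its array shape (None if it has no shape)."""
    return {
        name: getattr(value, "shape", None)
        for name, value in mat_dict.items()
        if not name.startswith("__")  # skip loadmat metadata such as __header__
    }


def inspect_mat(path):
    """Load a MATLAB v5 .mat file and summarize the variables it contains."""
    # SciPy is imported here so the summary helper can be used without it.
    from scipy.io import loadmat

    return summarize_variables(loadmat(path))
```

Running `inspect_mat` on one file from DSM and one from TREES is a quick way to see which keys hold the heightmap, the 3D coordinates, and the RGB values before writing any further processing code.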

To facilitate long-term reuse, we additionally provide a loading_data_scripts/ folder with two example Python scripts: mat_loader.py, which loads DSMs and point clouds, and lstrings_visualizer.py, which parses and renders the procedural L-system structures. A requirements.txt file specifying the Python library versions used in our experiments is also included. The loading scripts were tested with Python 3.10 and newer.

To improve reproducibility, the scripts run in two separate execution environments: (1) a standard Python environment with the dependencies listed in requirements.txt for running mat_loader.py, and (2) Blender’s Python environment for L-string visualization with lstrings_visualizer.py, because that script relies on Blender’s Python API (bpy), which is provided by Blender’s bundled interpreter rather than by requirements.txt.

L-strings can be visualized by running blender --python loading_data_scripts/lstrings_visualizer.py from the terminal.
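For users who prefer to start the visualizer from a Python script, the command above can be wrapped as follows. This is a sketch under stated assumptions: the Blender executable is assumed to be on PATH under the name `blender`, the working directory is assumed to be the folder that contains loading_data_scripts/, and the helper names are hypothetical. The `inside_blender` check only reports whether bpy is importable in the current interpreter, which helps tell the two environments apart.

```python
import importlib.util
import shutil
import subprocess

VISUALIZER = "loading_data_scripts/lstrings_visualizer.py"


def inside_blender():
    """True when Blender's Python API (bpy) is importable in this interpreter."""
    return importlib.util.find_spec("bpy") is not None


def blender_command(script=VISUALIZER, blender="blender"):
    """Build the argument list for `blender --python <script>`."""
    return [blender, "--python", script]


def launch_visualizer():
    """Start Blender with the visualizer script, failing early if Blender is absent."""
    if shutil.which("blender") is None:
        raise FileNotFoundError("blender executable not found on PATH")
    return subprocess.run(blender_command(), check=True)
```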

The loading scripts are released under the MIT license.

 

Changelog

Version 2 (current release)

  • Added the LSTRINGS/ folder containing the procedural L-system strings used to generate each tree.

  • Added the lstrings_visualizer.py script in loading_data_scripts/ for parsing and visualizing L-strings in Blender.

  • Added documentation and execution instructions for the L-string visualization workflow.

Version 1

  • Initial release of the GeoTree3D dataset, including DSM heightmaps, orthophotos with multiple illumination conditions, and colored 3D tree point clouds.

  • Provided mat_loader.py and requirements.txt for loading DSMs and point cloud data.

 

Files

loading_data_scripts.zip

Files (3.4 GiB)

Checksum                                Size
md5:7f3161aae5f30d9d81acf2004aa30f63    5.1 KiB
md5:e569dff64a6814b72851d4a9503dcaa8    3.4 GiB