WheatGOAT: Measurement System
Introduction
WheatGOAT has five main parts: a backbone to extract features, an Object-Background Selector (OBS) to separate wheat ears from the background, an Object Perception Learner (OPL) that uses vision-language models to better identify small or overlapping ears, a Hierarchical Refiner (HR) to improve detail using multi-scale context, and a Prediction Module (PM) that creates a precise density map for counting. This design boosts accuracy and works well in varied field conditions and crop types.
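As a rough sketch of how these five parts might fit together in PyTorch (the module bodies below are simple placeholders standing in for the repository's real layers, not the actual design):

```python
import torch.nn as nn

class WheatGOATSketch(nn.Module):
    """Illustrative five-part pipeline: backbone -> OBS -> OPL -> HR -> PM.
    Every module body here is a placeholder; see the repository code for
    the real implementations."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Conv2d(3, 64, 3, padding=1)         # feature extraction
        self.obs = nn.Conv2d(64, 64, 1)                        # Object-Background Selector
        self.opl = nn.Conv2d(64, 64, 1)                        # Object Perception Learner
        self.hr = nn.Conv2d(64, 64, 3, padding=2, dilation=2)  # Hierarchical Refiner (multi-scale context)
        self.pm = nn.Conv2d(64, 1, 1)                          # Prediction Module -> density map

    def forward(self, x):
        f = self.backbone(x)
        f = self.obs(f)       # suppress background responses
        f = self.opl(f)       # strengthen cues for small/overlapping ears
        f = self.hr(f)        # refine details with multi-scale context
        return self.pm(f)     # 1-channel density map; count = map.sum()
```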
Dependencies
- CUDA 11.8
- Python 3.8 (or later)
- torch==1.13.1
- torchaudio==0.13.1
- torchcam==0.3.2
- torchgeo==0.4.1
- torchmetrics==0.11.4
- torchvision==0.14.1
- numpy==1.21.6
- Pillow==9.2.0
- einops==0.6.0
- opencv-python==4.6.0.66
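One way to install the pinned packages (a sketch; choose the torch wheel index that matches your local CUDA setup):

```bash
pip install torch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 \
    torchcam==0.3.2 torchgeo==0.4.1 torchmetrics==0.11.4 \
    numpy==1.21.6 Pillow==9.2.0 einops==0.6.0 opencv-python==4.6.0.66
```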
GitHub: https://github.com/GZU-SAMLab/WheatGOAT
Code and Data
The code can be downloaded from the GitHub repository linked above. We use the Global Wheat Head Detection 2021 (GWHD_2021) dataset for training. The density-map annotations are provided in the annotation.zip archive.
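A sketch of how a density-map annotation is used once extracted. The filename and .npy format here are assumptions, so check the contents of annotation.zip for the actual layout; the key point is that a density map integrates to the object count:

```python
import numpy as np

# Hypothetical filename/format: inspect the files extracted from
# annotation.zip for the naming and format this repository actually uses.
density = np.load("annotations/example_image.npy")

# By construction, a density map sums to the object count, so the
# ground-truth ear count for the image is the sum over the map.
gt_count = float(density.sum())
print(f"ground-truth wheat ear count: {gt_count:.1f}")
```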
Pre-trained models
Provided here are the trained WheatGOAT weights, the weights for the first ten layers of VGG16, and the fine-tuned YOLOv8 weights.
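In counting networks, the "first ten layers of VGG16" usually means its first ten convolutional layers used as a frontend. A sketch of extracting them with the pinned torchvision (whether the repository loads the ImageNet checkpoint or its own released weight file is an assumption):

```python
import torch.nn as nn
from torchvision import models

# Load ImageNet-pretrained VGG-16 and keep the first ten conv layers:
# features[:23] ends right after the tenth conv's ReLU, before the
# fourth max-pool -- the usual CSRNet-style counting frontend.
vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
frontend = nn.Sequential(*list(vgg.features.children())[:23])
```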
Get Started
Training
Step 1: Split the dataset according to the provided CSV file (see the sketch after these steps).
Step 2: Change the dataset path in train.py to the location of your dataset.
Step 3: Run:
python train.py
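A sketch of what Step 1 could look like. The CSV filename (split.csv), its column names (image_name, split), and the directory paths are all assumptions to be matched against the CSV actually shipped with the dataset:

```python
import csv
import shutil
from pathlib import Path

SRC = Path("gwhd_2021/images")  # assumed location of the GWHD_2021 images
DST = Path("data")              # produces data/train, data/val, data/test

# Hypothetical CSV layout: one row per image with its split label.
with open("split.csv", newline="") as f:
    for row in csv.DictReader(f):
        out_dir = DST / row["split"]  # e.g. "train" / "val" / "test"
        out_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy(SRC / row["image_name"], out_dir / row["image_name"])
```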
Testing
Step 1: Download the weights into the current directory.
Step 2: Update the relevant paths in test.py.
Step 3: Run:
python test.py
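Counting models of this kind are conventionally scored with MAE and RMSE over per-image counts, where each count is the sum of a density map. A minimal evaluation sketch (model and loader are placeholders, not objects from test.py):

```python
import torch

@torch.no_grad()
def evaluate(model, loader, device="cuda"):
    """Compute MAE/RMSE over per-image counts (count = density map sum)."""
    model.eval()
    abs_err, sq_err, n = 0.0, 0.0, 0
    for image, gt_density in loader:        # placeholder loader yielding pairs
        pred = model(image.to(device))      # predicted density map, [B, 1, H, W]
        pred_count = pred.sum(dim=(1, 2, 3)).cpu()
        gt_count = gt_density.sum(dim=(1, 2, 3))
        abs_err += (pred_count - gt_count).abs().sum().item()
        sq_err += ((pred_count - gt_count) ** 2).sum().item()
        n += image.size(0)
    return abs_err / n, (sq_err / n) ** 0.5  # MAE, RMSE
```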
Results
Quantitative Analysis
As shown in Figure 1, WheatGOAT delivers superior accuracy, robustness, and generalization in complex field conditions, which we attribute to its object-aware perception and hierarchical refinement strategy.
Figure 1: Quantitative comparison of wheat ear counting performance across different models on the GWHD_2021 dataset.
Visualization of Model Decision-Making
Figure 2: Visualization of model attention using Grad-CAM across various wheat ear counting methods.
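For reference, an attention map like those in Figure 2 can be produced for a density-regression model with plain PyTorch hooks, as sketched below. The choice of target layer is an assumption, and this is a generic Grad-CAM adaptation rather than necessarily the repository's exact procedure (which may rely on the pinned torchcam package):

```python
import torch
import torch.nn.functional as F

def grad_cam_for_count(model, image, target_layer):
    """Grad-CAM adapted to counting: the predicted count (density-map sum)
    is the scalar whose gradients weight the target layer's activations."""
    acts, grads = {}, {}
    h1 = target_layer.register_forward_hook(
        lambda mod, inp, out: acts.update(v=out.detach()))
    h2 = target_layer.register_full_backward_hook(
        lambda mod, gin, gout: grads.update(v=gout[0].detach()))

    model.zero_grad()
    model(image).sum().backward()  # scalar objective: total predicted ears
    h1.remove()
    h2.remove()

    weights = grads["v"].mean(dim=(2, 3), keepdim=True)  # GAP of gradients
    cam = F.relu((weights * acts["v"]).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear",
                        align_corners=False)
    return cam / (cam.max() + 1e-8)  # heatmap in [0, 1] to overlay on the image
```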
