Date of Award

5-2026

Degree Type

Masters Thesis

Degree Name

Master of Science (MS)

School

Computing Sciences and Computer Engineering

Committee Chair

Dr. Rabab Abdelfattah

Committee Chair School

Computing Sciences and Computer Engineering

Committee Member 2

Dr. Sarah Lee

Committee Member 2 School

Computing Sciences and Computer Engineering

Committee Member 3

Dr. Ahmed Sherif

Committee Member 3 School

Computing Sciences and Computer Engineering

Abstract

Crack detection and segmentation are fundamental tasks in structural health monitoring, enabling early identification of damage in critical infrastructure such as roads, bridges, and buildings. While deep learning models, particularly convolutional encoder-decoder architectures, have achieved high accuracy in pixel-level crack segmentation, their deployment in real-world environments introduces significant reliability challenges. In practical scenarios, especially in UAV-based inspection, visual conditions such as illumination variation, motion blur, low resolution, and occlusions can severely degrade segmentation performance. Moreover, the absence of ground-truth annotations during deployment makes conventional evaluation metrics, such as Intersection-over-Union and Dice score, inapplicable, creating a critical gap between model performance and operational trust.
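For readers unfamiliar with the two metrics named above, the following is a minimal sketch of their standard definitions on binary masks. It is not taken from the thesis; the function name `iou_and_dice` and the nested-list mask format are assumptions made for illustration. Note that both metrics take a ground-truth mask as input, which is why they cannot be computed during deployment.

```python
def iou_and_dice(pred, truth):
    """Return (IoU, Dice) for two binary masks of the same shape.

    pred, truth: nested sequences of 0/1 values (hypothetical input format).
    Both metrics need the ground-truth mask, so they are unavailable
    when no annotations exist at deployment time.
    """
    inter = union = pred_sum = truth_sum = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            p, t = bool(p), bool(t)
            inter += p and t       # pixel marked as crack in both masks
            union += p or t        # pixel marked as crack in either mask
            pred_sum += p
            truth_sum += t
    # Convention assumed here: two empty masks count as a perfect match.
    iou = inter / union if union else 1.0
    total = pred_sum + truth_sum
    dice = 2 * inter / total if total else 1.0
    return iou, dice


pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
iou, dice = iou_and_dice(pred, truth)
print(iou, round(dice, 4))
```

In this small example the masks share 2 crack pixels out of 4 in their union, and each mask contains 3 crack pixels.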

This thesis reframes crack segmentation from a purely accuracy-driven task to a reliability-centered problem. First, it provides a comprehensive analysis of crack segmentation methods, highlighting limitations in thin-structure preservation, generalization, and robustness under real-world conditions. Building on these insights, the thesis introduces a novel semantic monitoring framework based on the LLM-as-Judge paradigm. In this framework, a lightweight crack segmentation model operates onboard a UAV, while a multimodal large language model evaluates segmentation outputs using visual reasoning, producing a quality score, confidence estimate, and explanatory feedback without requiring ground-truth annotations.

To ensure trustworthiness, a rigorous evaluation methodology is proposed, defining repeatability and sensitivity as key reliability criteria. Extensive experiments under controlled perturbations demonstrate that the proposed framework achieves stable, consistent, and perceptually meaningful evaluations. This work establishes a new direction for crack segmentation by enabling reliable, interpretable, and deployment-ready assessment in safety-critical environments.
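The abstract does not state how repeatability and sensitivity are measured, so the sketch below is only one plausible reading, not the thesis's method. It assumes repeatability is the spread of judge scores across repeated evaluations of the same input, and sensitivity is the average score drop as a controlled perturbation grows more severe. The function names and the example scores are hypothetical.

```python
import statistics


def repeatability(scores):
    """Population standard deviation of repeated judge scores on one fixed input.

    Assumed definition: 0.0 means the judge returned the same score every time.
    """
    return statistics.pstdev(scores)


def sensitivity(scores_by_severity):
    """Mean score drop per severity step.

    scores_by_severity: list of score lists, ordered from the clean input
    to the most degraded one (assumed layout). A positive value means the
    judge's scores fall as the perturbation gets stronger.
    """
    means = [statistics.fmean(s) for s in scores_by_severity]
    drops = [a - b for a, b in zip(means, means[1:])]
    return statistics.fmean(drops)


# Made-up scores for illustration only; these are not results from the thesis.
repeated = [8, 8, 8]
by_blur_level = [[9, 9], [7, 7], [4, 4]]
print(repeatability(repeated))
print(sensitivity(by_blur_level))
```

Under these assumptions, a trustworthy judge would show a low repeatability value and a clearly positive sensitivity value.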

ORCID ID

0009-0000-1699-5493

Recommended Citation

Hasan, Murad, "LLM-as-Judge for Reliable Crack Segmentation in Edge-Based Structural Inspection" (2026). Master's Theses. 1196.
https://aquila.usm.edu/masters_theses/1196

Download

Available for download on Monday, May 31, 2027

Included in

Artificial Intelligence and Robotics Commons, Data Science Commons

Master's Theses

LLM-as-Judge for Reliable Crack Segmentation in Edge-Based Structural Inspection
