Comparison of State-of-the-Art and Traditional Networks for Image Segmentation on the ISPRS Potsdam Dataset
Final_Project

Introduction
This project focuses on deep learning methods and provides scripts and notebooks that help build, train, and analyze models efficiently. The structure includes various components such as configuration settings, model creation tools, and utilities for model training and visualization.
Project Structure
- 00-Proposal: Contains the project proposal and related documents.
- 01-Presentation: Materials for the final presentation, such as:
  - DL_Presentation_Template.pptx: PowerPoint presentation template used for showcasing the project.
- 02-Report: Folder for the final report detailing the project and results.
- 03-Code: All the essential code for the project.
  - configs.py: Configuration settings for the project (e.g., hyperparameters, file paths).
  - fmain.py: Main script to execute specific functionalities of the project.
  - main.py: The primary file to launch the project's main tasks.
  - tools: Additional tools and utilities used in the project.
  - README.md: This document, providing an overview of the project.
  - requirements.txt: A list of required packages and dependencies for the project.
  - bases
    - __init__.py
    - postdam_dataset.py
    - utils.py
    - wandbhelper.py
  - nets
    - __init__.py
    - feature_extractor.py
    - pertrain_net.py
    - tansunet.py
    - unet_attention_model.py
    - unet_Kan.py
    - unet_Mamba.py
    - unet_model.py
    - unet_parts.py
  - tests
    - EDA.ipynb
    - test_datagernerator.ipynb
    - test_dataset.py
    - test_model.ipynb
    - tooltest.ipynb
    - unet.ipynb
  - tools
    - callback.py
    - datagernerator.py
    - datasplitor.py
    - training_loop.py
Authors
- Menghua, Xie
- Leung, Yiu Chung
Dataset
ISPRS Potsdam Dataset

The dataset includes 37 images per type (IRRG, digital surface model, label). Each image is 6000×6000×3.
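The 6000×6000 tiles are too large to feed to a network directly, so they are typically split into smaller patches before training. Below is a minimal sketch of such tiling; the 512-pixel patch size and the `tile_image` helper are assumptions for illustration, not the repository's actual data generator:

```python
import numpy as np

def tile_image(img: np.ndarray, patch: int = 512) -> np.ndarray:
    """Split an (H, W, C) image into non-overlapping patch x patch tiles.

    Tiles that would run past the border are dropped, so a 6000x6000
    image with patch=512 yields 11 x 11 = 121 tiles.
    """
    h, w, c = img.shape
    rows, cols = h // patch, w // patch
    return (
        img[: rows * patch, : cols * patch]
        .reshape(rows, patch, cols, patch, c)   # split both axes
        .swapaxes(1, 2)                         # group the tile grid first
        .reshape(rows * cols, patch, patch, c)  # flatten to a tile batch
    )

# Example with a Potsdam-sized IRRG image
img = np.zeros((6000, 6000, 3), dtype=np.uint8)
print(tile_image(img).shape)  # (121, 512, 512, 3)
```

A real loader would also tile the label map with the same offsets so image and mask patches stay aligned.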
Model(s)
We used UNet, TransUNet, UMamba, and UKan, as shown in the figure.
Results
Regarding the effects of the data sources:
- No improvement from adding DSM when training UNet.
- Noticeable improvement for larger networks, e.g. UKan and TransUNet.
Regarding loss functions:
- The UKan experiments showed no significant difference between HybridLoss, CrossEntropy, and FocalLoss.
- The failure of MultiClassDiceLoss might be due to the class imbalance in the dataset.
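The class-imbalance explanation can be illustrated with a small example: under a multi-class Dice loss, a class occupying only a few pixels can swing its per-class score from near 1 to near 0 on a handful of mispredictions, so the averaged loss is dominated by rare classes. A numpy sketch (the `multiclass_dice` helper is illustrative, not the repository's MultiClassDiceLoss):

```python
import numpy as np

def multiclass_dice(pred: np.ndarray, target: np.ndarray,
                    n_classes: int, eps: float = 1e-6) -> np.ndarray:
    """Per-class Dice score for two integer label maps of the same shape."""
    scores = []
    for c in range(n_classes):
        p = pred == c
        t = target == c
        inter = np.logical_and(p, t).sum()
        scores.append((2 * inter + eps) / (p.sum() + t.sum() + eps))
    return np.array(scores)

# A 100x100 scene where class 2 covers only 5 pixels: missing those
# 5 pixels drives its Dice score to ~0 while the majority classes
# stay near 1.0.
target = np.zeros((100, 100), dtype=int)
target[:, 50:] = 1
target[0, :5] = 2            # rare class: 5 pixels
pred = target.copy()
pred[0, :5] = 0              # mispredict every rare pixel
print(multiclass_dice(pred, target, 3))
```

Averaging these per-class scores gives the loss a very high sensitivity to the rare class, which can destabilize training on imbalanced data.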
Regarding the effects of learning rate:
- For UKan and UNet, there is a significant improvement in training mIoU but not in validation mIoU.
- This is a sign of overfitting; 1e-3 should be a better learning rate.
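For reference, mIoU averages the per-class intersection-over-union, and the train/validation gap above is measured with this metric. A minimal sketch of the standard confusion-matrix computation (the `mean_iou` helper is illustrative, not the project's evaluation code):

```python
import numpy as np

def mean_iou(pred: np.ndarray, target: np.ndarray, n_classes: int) -> float:
    """mIoU from a confusion matrix; classes absent from both maps are skipped."""
    cm = np.bincount(n_classes * target.ravel() + pred.ravel(),
                     minlength=n_classes ** 2).reshape(n_classes, n_classes)
    inter = np.diag(cm)                          # correctly labeled pixels
    union = cm.sum(0) + cm.sum(1) - inter        # pred + target - overlap
    valid = union > 0
    return float((inter[valid] / union[valid]).mean())

# Tiny 2x2 example: class 0 has IoU 1/2, class 1 has IoU 2/3
target = np.array([[0, 0], [1, 1]])
pred   = np.array([[0, 1], [1, 1]])
print(round(mean_iou(pred, target, 2), 3))  # 0.583
```

Tracking this metric separately on the training and validation splits is what exposes the overfitting pattern described above.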