| Plant leaf disease detection using vision transformers for precision agriculture |
Murugavalli S, Gopi R |
2025, Scientific Reports |
PLA-ViT: Vision Transformer with multi-head self-attention, data augmentation, bilateral filtering, transfer learning |
Accuracy: 98.7%. Dataset: New Plant Diseases Dataset (Kaggle) |
Strengths: Outperforms CNN baselines; efficient inference (12 ms). Limitations: Attention blocks underperform on certain tasks; multi-label classification is left as future work.
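The core operation behind PLA-ViT and the other ViT-based models in this table is multi-head self-attention over patch embeddings. The following is a minimal numpy sketch of that mechanism, not PLA-ViT's actual implementation; dimensions and weight shapes are illustrative assumptions.

```python
import numpy as np

def multi_head_self_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Scaled dot-product multi-head self-attention over patch embeddings.

    x: (seq_len, d_model); each weight matrix: (d_model, d_model).
    Illustrative only -- real ViTs add bias terms, residuals, and LayerNorm.
    """
    seq_len, d_model = x.shape
    d_head = d_model // num_heads

    # Project, then split the embedding dimension into heads: (heads, seq, d_head).
    def split(w):
        return (x @ w).reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(w_q), split(w_k), split(w_v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)   # (heads, seq, seq)
    scores -= scores.max(axis=-1, keepdims=True)          # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)              # row-wise softmax
    out = (attn @ v).transpose(1, 0, 2).reshape(seq_len, d_model)
    return out @ w_o                                      # output projection
```

Each head attends over the full patch sequence, which is what gives ViTs the global receptive field that the CNN baselines lack.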
| Multispectral Plant Disease Detection with Vision Transformer–Convolutional Neural Network Hybrid Approaches |
De Silva M, Brown D |
2023, Sensors |
Hybrid CNN-ViT: Compares CNNs (Xception, ResNet) against ViT-B16 on multispectral images
Accuracy: 83.3%. F1-Score: Highest for ViT-B16. Dataset: Custom balanced multispectral (2652 images) |
Strengths: Comprehensive comparison of models; creates a new balanced multispectral dataset. Limitations: Small dataset size limited performance of larger models; misclassifications between species with similar leaf shapes. |
| A hybrid framework for plant leaf disease detection and classification using convolutional neural networks and vision transformer |
Aboelenin S, et al. |
2025, Complex & Intelligent Systems |
Hybrid CNN + Ensemble + ViT: Ensemble of VGG16, Inception-V3, and DenseNet201 for local features, combined with a ViT for global features
Accuracy: 99.24% (Apple), 98% (Corn). Dataset: PlantVillage (Apple and Corn subsets). |
Strengths: High accuracy by combining strengths of multiple CNNs and a ViT. Limitations: Evaluated on lab-based PlantVillage data; real-world performance is unverified. |
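The ensemble step in hybrid frameworks like this one is commonly implemented as soft voting: each model's class-probability vector is averaged before taking the argmax. A minimal sketch (the paper's exact fusion rule and weights are not specified here, so uniform weighting is an assumption):

```python
import numpy as np

def soft_vote(prob_list, weights=None):
    """Fuse per-model class probabilities by (weighted) averaging.

    prob_list: list of (n_classes,) probability vectors, one per model.
    weights: optional per-model weights; defaults to a uniform average.
    """
    probs = np.stack(prob_list)                 # (n_models, n_classes)
    if weights is None:
        weights = np.full(len(prob_list), 1.0 / len(prob_list))
    fused = weights @ probs                     # weighted average per class
    return fused / fused.sum()                  # renormalize
```

Soft voting lets the ViT's global view compensate for individual CNN errors, which is the intuition behind the accuracy gains reported above.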
| A Deep Features Extraction Model Based on the Transfer Learning Model and Vision Transformer "TLMViT" for Plant Disease Classification |
Tabbakh A, Barpanda S |
2023, IEEE Access |
TLMViT: A sequential hybrid model using a pre-trained CNN for feature extraction followed by a ViT for classification. |
Accuracy: Not stated in this summary; the paper demonstrates the effectiveness of ViT-based processing of deep CNN features. Dataset: Not specified here, likely a standard plant disease dataset.
Strengths: Demonstrates a clear and effective hybrid architecture. Limitations: The sequential hybrid approach is becoming a common pattern, potentially limiting novelty. |
| Basil plant leaf disease detection using amalgam based deep learning models |
Mane D, et al. |
2024, Journal of Autonomous Intelligence |
Hybrid CNN+SVM: A CNN is used for feature extraction, and a Support Vector Machine (SVM) with an RBF kernel performs the classification. |
Accuracy: 95.02%. Dataset: Custom-created dataset of 803 basil leaf images across 5 classes. |
Strengths: Addresses lack of a standard dataset by creating a new one; hybrid model outperforms standalone CNN. Limitations: Small dataset size; uses a classical classifier (SVM) instead of a more modern Transformer head. |
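The classification head in this hybrid is a classical RBF-kernel SVM fitted on CNN feature vectors. A minimal scikit-learn sketch of that second stage, with random vectors standing in for the CNN embeddings (the paper's hyperparameters are not given here, so `C` and `gamma` defaults are assumptions):

```python
import numpy as np
from sklearn.svm import SVC

def train_feature_svm(features, labels, C=1.0, gamma="scale"):
    """Fit an RBF-kernel SVM on feature vectors.

    features: (n_samples, n_features) array -- in the paper's pipeline these
    would be CNN embeddings; labels: (n_samples,) class ids.
    """
    clf = SVC(kernel="rbf", C=C, gamma=gamma)
    clf.fit(features, labels)
    return clf
```

The RBF kernel gives a non-linear decision boundary in feature space, which is why this hybrid can outperform the standalone CNN's linear softmax head on a small dataset.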
| Enhanced leaf disease detection: UNet for segmentation and optimized EfficientNet for disease classification |
Kotwal J, et al. |
2024, Software Impacts |
UNet + Optimized EfficientNet: UNet segments the disease region, followed by an optimized EfficientNet (AD-ENet) for classification. |
Accuracy: 99.91%. Precision: 99.87%. Recall: 99.81%. Dataset: PlantVillage and a custom Indian Soybean dataset. |
Strengths: Explicit segmentation step improves accuracy; optimization addresses overfitting and gradient issues. Limitations: High performance is on datasets that may not fully represent 'in-the-wild' complexity. |
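The segment-then-classify pipeline hinges on one simple step: the UNet's binary mask is used to suppress background pixels before the classifier runs. A hedged numpy sketch of that hand-off (the paper's actual preprocessing may differ, e.g. cropping to the mask's bounding box instead):

```python
import numpy as np

def mask_leaf_region(image, mask):
    """Zero out background pixels using a binary segmentation mask.

    image: (H, W, C) float array; mask: (H, W) of 0/1 values (e.g. a UNet
    output after thresholding). The masked image is what the classifier sees.
    """
    return image * mask[..., None]   # broadcast mask across channels
```

Removing background clutter this way is what lets the downstream EfficientNet focus on disease texture rather than scene context.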
| EMSAM: enhanced multi-scale segment anything model for leaf disease segmentation |
Li J, et al. |
2025, Frontiers in Plant Science |
EMSAM: Hybrid ViT-CNN architecture based on Segment Anything Model (SAM) for joint segmentation and classification. Fuses global features (ViT) and local features (CNN). |
Accuracy: 87.86%. Dice: 79.25%. IoU: 69.87%. Dataset: A new annotated subset of PlantVillage (PSD, 5200 images). |
Strengths: State-of-the-art segmentation performance; robust across different disease severities; establishes a new benchmark. Limitations: Higher computational cost; relies on PlantVillage data, limiting real-world generalization. |
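The Dice and IoU figures reported for EMSAM are standard overlap metrics on binary masks; for reference, they can be computed as follows (a generic sketch, not EMSAM's evaluation code):

```python
import numpy as np

def dice_iou(pred, target, eps=1e-7):
    """Dice coefficient and IoU for binary segmentation masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    dice = 2.0 * inter / (pred.sum() + target.sum() + eps)
    iou = inter / (union + eps)
    return dice, iou
```

Dice weights the intersection twice, so it is always at least as large as IoU on the same prediction, consistent with the 79.25% Dice vs 69.87% IoU gap above.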
| Leveraging deep learning for plant disease and pest detection: a comprehensive review and future directions |
Shoaib M, et al. |
2025, Frontiers in Plant Science |
Review Paper: Surveys deep learning models (CNNs, FCNs, U-Nets, Mask R-CNN) for plant disease classification, detection, and segmentation. |
Accuracy: Notes that classification models often exceed 95% and segmentation models exceed 90% precision on benchmark datasets. |
Strengths: Comprehensive overview of the field. Limitations: Highlights key challenges like data scarcity, environmental variability, and the lab-to-field gap. |
| Vision Transformer with Mixture of Experts for addressing the lab-to-field gap in plant disease classification |
Salman Z, et al.
2025, Frontiers in Plant Science |
ViT + Mixture of Experts (MoE): A ViT backbone combined with an MoE framework where specialized experts are trained for different data aspects (e.g., imaging conditions). |
Accuracy: 20% improvement over baseline ViT; 68% accuracy on cross-domain (PlantVillage to PlantDoc) evaluation. Dataset: PlantVillage, PlantDoc. |
Strengths: Directly tackles the lab-to-field generalization problem; MoE enhances adaptability and robustness. Limitations: Increased model complexity. |
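The MoE idea is that a learned gate routes each input toward the experts best suited to it (here, different imaging conditions). A dense-gating numpy sketch, not the paper's architecture; expert functions and gate shapes are illustrative assumptions:

```python
import numpy as np

def moe_forward(x, experts, gate_w):
    """Dense mixture-of-experts: a softmax gate weights every expert's output.

    x: (d_in,) input feature vector; experts: list of callables
    (d_in,) -> (d_out,); gate_w: (d_in, n_experts) gating weights.
    """
    logits = x @ gate_w
    g = np.exp(logits - logits.max())
    g /= g.sum()                                  # softmax gate
    outputs = np.stack([f(x) for f in experts])   # (n_experts, d_out)
    return g @ outputs                            # gated combination
```

Because the gate is input-dependent, a field image and a lab image can be handled by different expert mixtures, which is the mechanism behind the cross-domain robustness claimed above.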
| Tulsi Leaf Disease Detection Using AI & Machine Learning |
Not specified |
Not specified, Aislyn Project Page |
Standard CNNs: Compares InceptionV3, ResNet50, and VGG16 for classification. |
Accuracy: Not reported; the stated goal is to select the most accurate of the three models. Dataset: TulsiDoc dataset from Mendeley (1000 samples).
Strengths: Addresses the specific, under-studied domain of Tulsi (holy basil) leaves. Limitations: Not a peer-reviewed research paper; uses foundational CNN models without novel contributions.