
§ Crop Monitoring: SAM generates segmentation masks for crops,
weeds, and pests from point or box prompts [170].
§ Livestock Tracking: SAM-based object tracking monitors animal
behavior (e.g., broiler bird movement patterns [171]).
§ Challenges:
§ Complex outdoor environments (e.g., occlusions, variable lighting).
§ Limited annotated agricultural datasets.
§ Implementation:
§ Domain-Specific Tuning: Fine-tuning SAM on agricultural
datasets (e.g., drone-captured crop images).
§ Multi-Modal Fusion: Integrating weather data or soil sensors with
SAM’s visual prompts for predictive analytics.
o Robotics: SAM for real-time object manipulation [112].
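The point and box prompts mentioned under Crop Monitoring can be made concrete with a small sketch. It assumes the common layout for promptable segmenters: points as (x, y) pixel coordinates with label 1 for foreground and 0 for background, and boxes as (x_min, y_min, x_max, y_max). The `VisualPrompt` class and `box_from_points` helper are hypothetical names introduced for illustration; they are not part of SAM, and the call into a segmentation model is deliberately left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Point = Tuple[float, float]               # (x, y) in pixel coordinates
Box = Tuple[float, float, float, float]   # (x_min, y_min, x_max, y_max)


@dataclass
class VisualPrompt:
    """Hypothetical container for the point/box prompts of one object."""
    points: List[Point] = field(default_factory=list)
    labels: List[int] = field(default_factory=list)  # 1 = foreground, 0 = background
    box: Optional[Box] = None

    def add_point(self, x: float, y: float, foreground: bool = True) -> None:
        """Record one click and whether it marks the object or the background."""
        self.points.append((x, y))
        self.labels.append(1 if foreground else 0)

    def validate(self, width: int, height: int) -> None:
        """Raise ValueError if any prompt lies outside the image."""
        for x, y in self.points:
            if not (0 <= x < width and 0 <= y < height):
                raise ValueError(f"point ({x}, {y}) is outside the image")
        if self.box is not None:
            x0, y0, x1, y1 = self.box
            if not (0 <= x0 < x1 <= width and 0 <= y0 < y1 <= height):
                raise ValueError(f"box {self.box} is invalid for this image")


def box_from_points(points: List[Point], margin: float = 0.0) -> Box:
    """Smallest axis-aligned box around the points, grown by `margin` pixels."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs) - margin, min(ys) - margin, max(xs) + margin, max(ys) + margin)


# Example (made-up coordinates): two clicks on a weed, one click on nearby soil.
prompt = VisualPrompt()
prompt.add_point(120, 80)
prompt.add_point(140, 95)
prompt.add_point(300, 200, foreground=False)
prompt.box = box_from_points(prompt.points[:2], margin=10)
prompt.validate(width=640, height=480)
```

Keeping foreground and background clicks in one structure reflects how a background click on soil can help separate a weed from its surroundings; whether that helps in a given field image is an empirical question.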
6. Conclusion
o Summary: Visual prompt engineering is pivotal for AGI, enabling flexible,
efficient, and human-aligned model behavior.
o Future Work: Address generalization gaps, integrate interdisciplinary methods
(NLP + CV), and expand real-world deployments.
7. Methodology Breakdown
1. Computational Modeling (Algorithm Development)
o Definition: Designing and optimizing algorithms or architectures (e.g., vision
models, prompt-tuning methods) to address specific tasks.
o Examples:
§ Visual Prompt Tuning (VPT) [95]: Introduced a small set of task-specific
learnable prompts in the input space, so that a pre-trained vision model can be
fine-tuned efficiently while its backbone stays frozen.
§ SAM (Segment Anything Model) [53]: A promptable, general-purpose
segmentation model trained on a large and diverse dataset (SA-1B).
§ Multi-modal Prompt Learning (MaPLe) [119]: Combined text and
image prompts for cross-modal understanding.
o Limitations: Heavy reliance on large-scale datasets (e.g., CLIP’s 400M image-text
pairs), high computational cost, and overfitting risks (e.g., CoOp’s reduced
generalization on new data).
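The idea behind prompt tuning in input space can be sketched in a few lines. The sketch assumes the shallow variant, in which learnable prompt tokens are inserted between the class token and the patch embeddings and only the prompts (plus a task head) are trained. All function names and sizes below are illustrative assumptions, not values taken from [95]; no real encoder is involved.

```python
import random
from typing import List

Vector = List[float]


def make_prompts(num_prompts: int, dim: int, seed: int = 0) -> List[Vector]:
    """Randomly initialised prompt tokens; these are the only trainable inputs."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_prompts)]


def prepend_prompts(cls_token: Vector, prompts: List[Vector],
                    patches: List[Vector]) -> List[Vector]:
    """Build the sequence [CLS, P_1..P_p, E_1..E_n] passed to the frozen encoder."""
    dim = len(cls_token)
    if any(len(tok) != dim for tok in prompts + patches):
        raise ValueError("all tokens must share the embedding dimension")
    return [cls_token] + prompts + patches


def trainable_fraction(num_prompts: int, dim: int,
                       backbone_params: int, head_params: int) -> float:
    """Share of parameters updated when only the prompts and the head are trained."""
    trainable = num_prompts * dim + head_params
    return trainable / (backbone_params + trainable)


# Toy sizes for illustration: 4 patch embeddings, dimension 8, 2 prompt tokens.
dim = 8
cls_token = [0.0] * dim
patches = [[1.0] * dim for _ in range(4)]
prompts = make_prompts(num_prompts=2, dim=dim)
sequence = prepend_prompts(cls_token, prompts, patches)
print(len(sequence))  # 1 CLS + 2 prompts + 4 patches = 7
```

Because the backbone weights never change, the trainable share computed by `trainable_fraction` stays small even for large models, which is the efficiency argument made above.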
2. Experimental Evaluation
o Definition: Testing models on benchmark datasets or real-world applications to
measure performance.
o Examples:
§ SAM in medical imaging [98, 100]: Evaluated SAM’s zero-shot
segmentation accuracy against manual clinical delineation.
§ SAM-Adapter [149]: Tested domain-specific knowledge infusion for
improved segmentation in camouflaged object detection.
o Limitations: Context-specific performance (e.g., SAM struggles in low-contrast
environments [Page 13]) and dataset biases (e.g., remote sensing image orientation
challenges [Page 14]).
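Comparing a predicted mask against a manual delineation usually comes down to an overlap score. The source does not say which metric the cited studies report; the sketch below assumes the Dice coefficient and intersection over union (IoU), two common choices, and uses small made-up binary masks.

```python
from typing import List, Tuple

Mask = List[List[int]]  # binary mask: 1 = object pixel, 0 = background


def _counts(pred: Mask, ref: Mask) -> Tuple[int, int, int]:
    """Return (intersection, predicted area, reference area) in pixels."""
    if len(pred) != len(ref) or any(len(p) != len(r) for p, r in zip(pred, ref)):
        raise ValueError("masks must have the same shape")
    inter = sum(p & r for prow, rrow in zip(pred, ref) for p, r in zip(prow, rrow))
    return inter, sum(map(sum, pred)), sum(map(sum, ref))


def dice(pred: Mask, ref: Mask) -> float:
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|); 1.0 when both masks are empty."""
    inter, a, b = _counts(pred, ref)
    return 1.0 if a + b == 0 else 2 * inter / (a + b)


def iou(pred: Mask, ref: Mask) -> float:
    """Intersection over union: |A ∩ B| / |A ∪ B|; 1.0 when both masks are empty."""
    inter, a, b = _counts(pred, ref)
    union = a + b - inter
    return 1.0 if union == 0 else inter / union


# Made-up example: the prediction misses one column of the manual delineation.
predicted = [[0, 1, 1, 0],
             [0, 1, 1, 0],
             [0, 0, 0, 0]]
manual = [[0, 1, 1, 1],
          [0, 1, 1, 1],
          [0, 0, 0, 0]]
print(f"Dice = {dice(predicted, manual):.3f}")  # Dice = 0.800
print(f"IoU  = {iou(predicted, manual):.3f}")   # IoU  = 0.667
```

Dice is never lower than IoU for the same pair of masks, so the two should not be compared across studies without noting which one is reported.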
3. Systematic Literature Review