Surface cracks in concrete are a key indicator of structural deterioration and reduced safety. Regular inspection of structures and monitoring of surface cracks are crucial to maintaining the structural health and reliability of buildings. Manual surface inspection is time-consuming and can yield inconsistent results because inspectors differ in experience. Deep learning methods for the visual assessment of surface cracks on civil structures have therefore attracted considerable attention in the field of structural health monitoring. However, these vision-based algorithms depend on powerful computing resources and require high-quality photographs as inputs for image classification. This work therefore compares several deep learning models. Inception-v3 is a 48-layer deep convolutional neural network. A pretrained version of the network, trained on more than a million images from the ImageNet database, is available. The pretrained network can classify images into 1000 object categories, such as keyboard, mouse, pencil, and many animals. As a result, the network has learned rich feature representations for a wide range of images. The network accepts input images with a resolution of 299 by 299 pixels. The proposed approach facilitates the use of deep learning methods on low-power computing devices for convenient monitoring of civil structures. The effectiveness of the proposed model is compared with that of other popular deep learning methods, such as VGG16 and a simple CNN. The proposed model was found to achieve a minimum accuracy of 99.8%. Even with a shallow layer stack chosen to reduce computation, a CNN architecture with well-tuned hyperparameters achieves higher accuracy. The evaluation results show that the proposed method can be used on autonomous devices, including unmanned aerial vehicles, for real-time surface crack inspection with less computation.
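Because the network accepts only 299 by 299 inputs, inspection photographs of arbitrary size have to be brought to that resolution first. The following is a minimal sketch of the geometry involved, not the preprocessing used in this work: the helper names (center_crop_box, resize_factor, scale_pixels) are hypothetical, the choice of a center crop is an assumption, and the mapping of 8-bit intensities to the range -1 to 1 is assumed here because it is a common convention for Inception-style networks.

```python
from typing import List, Tuple

INPUT_SIZE = 299  # input resolution stated in the text for Inception-v3


def center_crop_box(width: int, height: int) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered square.

    Assumption: a center crop is used to obtain a square region before
    resizing; the source does not say how images are made square.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


def resize_factor(width: int, height: int, target: int = INPUT_SIZE) -> float:
    """Scale factor that maps the cropped square onto target x target pixels."""
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    return target / min(width, height)


def scale_pixels(row: List[int]) -> List[float]:
    """Map 8-bit intensities 0..255 to -1..1 (assumed normalization)."""
    return [value / 127.5 - 1.0 for value in row]
```

For a hypothetical 4000 by 3000 photograph, center_crop_box(4000, 3000) selects the central 3000 by 3000 region, and resize_factor(4000, 3000) gives the factor by which that region is shrunk to reach 299 pixels per side.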