Detecting dim, small ground targets in remote sensing images is difficult because the target information is scarce and mixed with other information. To address these problems, this article proposes CC-YOLO, a detection algorithm based on multi-level feature fusion. First, a deep convolutional neural network extracts features from the target image stage by stage, yielding a spatial pyramid of high-level and low-level features. Next, cross-level channel feature fusion is performed on the spatial pyramid, and features are aggregated along two spatial directions; a newly added coordinate attention (CA) module is combined with this fusion to retain the accurate location information of dim small targets. Finally, end-to-end target detection is performed on the dual feature maps generated by the aggregation, and the detection results are output by combining multi-channel detection information. To address the lack of image data for the experiments, this article also establishes the ground-based dim small target dataset (GDSTD) of remote sensing images. Experimental results show that the proposed algorithm achieves 42.3% AP0.5:0.95 and 94.6% AP0.5 at a detection speed of 58.8 frames/s, indicating a degree of robustness and real-time performance.
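
As a minimal illustration of what aggregating features along two spatial directions can look like, the following Python sketch pools a single-channel feature map along its width and along its height separately, then uses the two resulting vectors to gate each position. It is a simplified stand-in under stated assumptions, not the CC-YOLO implementation: the function names are hypothetical, only one channel is handled, and the learned convolution layers that an attention module would normally contain are omitted.

```python
import math


def directional_pool(fmap):
    """Average a single-channel H x W feature map along each spatial direction.

    Returns (pooled_h, pooled_w): pooled_h[i] is the mean of row i,
    pooled_w[j] is the mean of column j.
    """
    h, w = len(fmap), len(fmap[0])
    pooled_h = [sum(row) / w for row in fmap]
    pooled_w = [sum(fmap[i][j] for i in range(h)) / h for j in range(w)]
    return pooled_h, pooled_w


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def simplified_directional_attention(fmap):
    """Gate every position by its row descriptor and its column descriptor.

    Assumption: the learned transforms between pooling and gating are left
    out, so the gates are just sigmoids of the pooled means.
    """
    pooled_h, pooled_w = directional_pool(fmap)
    gate_h = [sigmoid(v) for v in pooled_h]
    gate_w = [sigmoid(v) for v in pooled_w]
    return [
        [fmap[i][j] * gate_h[i] * gate_w[j] for j in range(len(fmap[0]))]
        for i in range(len(fmap))
    ]


# Hypothetical toy input: one bright pixel standing in for a dim small target.
demo_map = [
    [0.0, 0.0, 0.0],
    [0.0, 4.0, 0.0],
    [0.0, 0.0, 0.0],
]
demo_out = simplified_directional_attention(demo_map)
```

Because each gate depends on only one coordinate, the row and column of the bright pixel receive the largest weights, which is the intuition behind keeping location information while pooling.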