【Deep Learning OCR Series·10】OCR dataset construction and annotation
High-quality datasets are the foundation for training strong OCR models. This article walks through the full process of OCR data collection, annotation tooling, quality control, and data augmentation, and explains how to build domain-specific datasets.
📅 2025-08-19
👁️ 1326 reads
【Deep Learning OCR Series·9】End-to-end OCR system design
An end-to-end OCR system jointly optimizes text detection and recognition for higher overall performance. This article details system architecture design, joint training strategies, multi-task learning, and performance optimization methods.
📅 2025-08-19
👁️ 1340 reads
【Deep Learning OCR Series·8】Detailed explanation of text detection algorithms
A detailed introduction to text detection algorithms, including mainstream methods such as EAST, DBNet, and PSENet. Explores how to accurately locate text regions in complex scenes.
📅 2025-08-19
👁️ 1698 reads
【Deep Learning OCR Series·7】CTC Loss Function and Training Techniques
The principles, implementation, and training techniques of the CTC loss function, the core technique for solving the sequence alignment problem. Explores the forward-backward algorithm, decoding strategies, and optimization methods.
📅 2025-08-19
👁️ 1412 reads
【Deep Learning OCR Series·6】In-depth analysis of CRNN architecture
A detailed analysis of the CRNN architecture, including CNN feature extraction, RNN sequence modeling, and a complete implementation of the CTC loss function. Explores how CNNs and RNNs are combined.
📅 2025-08-19
👁️ 1534 reads
【Deep Learning OCR Series·5】Principle and Implementation of Attention Mechanism
Covers the mathematical principles of attention mechanisms, multi-head attention, self-attention, and their specific applications in OCR. Includes a detailed analysis of attention weight calculation, positional encoding, and performance optimization strategies.
📅 2025-08-19
👁️ 1384 reads
【Deep Learning OCR Series·4】Recurrent Neural Networks and Sequence Modeling
Explores the application of RNNs, LSTMs, and GRUs in OCR. Includes a detailed analysis of the principles of sequence modeling, solutions to vanishing and exploding gradient problems, and the advantages of bidirectional RNNs.
📅 2025-08-19
👁️ 1223 reads
【Deep Learning OCR Series·3】Detailed explanation of the application of convolutional neural networks in OCR
This article introduces the principles of convolutional neural networks and their applications in OCR, including core techniques such as feature extraction, pooling operations, and network architecture design.
📅 2025-08-19
👁️ 1432 reads
【Deep Learning OCR Series·2】Deep learning mathematical fundamentals and neural network principles
The mathematical foundations of deep learning OCR include linear algebra, probability theory, optimization theory, and the basic principles of neural networks. This article lays the theoretical groundwork for the later technical articles in the series.
📅 2025-08-19
👁️ 1198 reads
【Deep Learning OCR Series·1】Basic concepts and development history of deep learning OCR
The basic concepts and development history of deep learning OCR. This article details the evolution of OCR technology, the transition from traditional methods to deep learning methods, and the current mainstream deep learning OCR architectures.
📅 2025-08-19
👁️ 1274 reads