jiangnanboy/README.md

👋 Hello, I'm jiangnanboy

I'm an AI application developer focused on bringing artificial intelligence into industrial production. My current research and practice center on LLM applications, RAG question answering, multi-agent systems, and lightweight model inference deployment, backed by extensive hands-on experience in NLP, OCR, document intelligence, and knowledge graphs.




🔭 Core Focus

  • 🤖 LLM / RAG / Multi-Agent Systems (LangChain & LangGraph)
  • 📝 Natural Language Processing & Text Intelligence
  • 🖼️ Document Image Processing & Computer Vision
  • 📊 Data Mining, Search Engine & Personalized Recommendation
  • 🧩 Knowledge Graph Construction & Intelligent Q&A

🚀 Online Projects

| Project Name | Introduction | Online Address |
| --- | --- | --- |
| TalkSheet | Excel Intelligent Analysis | 🔗 |
| RAG QA System | Industrial Intelligent Q&A | 🔗 |
| Multi-Agent System | Office collaborative intelligent agent platform | 🔗 |



📦 My Projects

🖥️ Desktop Software & Open Source Libraries

| Project | Description |
| --- | --- |
| Java-OCR Engine Library | Printed & handwritten text recognition, layout detection, table analysis |
| OCR & Table Tool | All-in-one OCR, table recognition, screenshot capture & export |
| Video to PPT/PDF Converter | Generate editable PPT from video with deduplication & clip extraction |
| Document Comparison Tool | Multi-format document comparison & intelligent report generation |
| Document Image Processor | Image enhancement, distortion correction, shadow removal & edge cropping |

🧠 LLM & Agent Applications

| Project | Description |
| --- | --- |
| clinic-agent | Intelligent customer service system for clinic appointments and registration |
| LangChain & LangGraph Agent System | End-to-end multi-agent design and industrial practice |
| Medical Diagnosis Assistant | Domain-specific assistant model for medical consultation |
| Food Safety Analysis | Intelligent detection of ingredient nutrition & food safety |
| LLM Inference Benchmark | Performance evaluation and stress testing for LLM services |
| ChatExcel | Natural language data analysis & Q&A for spreadsheets |
| Contract Review System | Intelligent contract risk identification and content auditing |
| AI Writing & Summarization | Article rewriting, summarization and content creation |
| LLM Knowledge Graph | Automatic KG construction from text, images and PDFs |
| LLM Dataset Generation | High-quality corpus synthesis for model training |
| Customer Service Assistant | Lightweight intelligent customer service bot |
| PDF Multimodal RAG | Cross-page retrieval and multimodal document Q&A |
| DataFine | Text filtering, data cleaning and content governance |
| Pediatric LLM QA | Small specialized model for pediatric consultation scenarios |
| LLM Corpus Quality Control | Chinese corpus filtering and quality assessment for pre-training |
| LLM Input Security | Prompt risk detection and harmful content filtering |
| Math Reasoning Agent | Improves LLM arithmetic and logical reasoning |
| Movie Graph Agent | LLM + Neo4j for graph retrieval and knowledge Q&A |
| Image Key Information Extraction | Hybrid OCR + LLM structured information extraction |

📝 Natural Language Processing

| Project | Description |
| --- | --- |
| Text Security Audit | Semantic sensitive-content detection for multiple risk scenarios |
| PDF & OFD Invoice Parser | Intelligent structured extraction from invoice documents |
| BERT ONNX Classification | Lightweight Java-side NLP inference based on pre-trained models |
| Java Text Security Detection | Lightweight content risk inspection for enterprise Java systems |
| Chinese Offensive Language Detection | ONNX-deployed recognition of abusive and negative text |
| Ad Content Recognition | TextCNN-based advertisement identification model |
| Java Ad Filtering | Industrial advertisement text filtering component |
| TextCNN ONNX Java | Cross-platform deployment of a text classification model |
| AutoText Tool | Batch text standardization and automatic processing |
| Template Grammar Correction | Rule-based Chinese grammar error correction |
| T5 ONNX Corrector | Efficient Chinese spelling correction with a lightweight model |
| Punctuation Restoration | Automatic punctuation restoration for raw text |
| Intent & Slot Joint Model | Core module for task-oriented dialogue systems |
| Text Intent Classification | User intent recognition across multiple scenarios |
| Education Knowledge Tagging | Automatic labeling of educational knowledge points |
| Chinese Chatbot | Everyday dialogue bot with a hybrid strategy |
| Simple Classification Chatbot | Lightweight DNN-based text dialogue |
| Chinese Text Deduplication | Efficient article deduplication and similarity matching |
| ALBERT-LSTM-CRF NER | High-precision Chinese entity recognition |
| Word Semantic Similarity | Lexical semantic similarity calculation |
| News Abstract Generation | Key information extraction and news summarization |
| Event & Triple Extraction | Complex event analysis and semantic triple extraction |
| Chinese Text Corrector | Practical open-source Chinese error correction toolkit |
| Model to ONNX | Convert transformer models for edge deployment |
| MacBERT Java Inference | Java-integrated Chinese spelling correction service |
| RoBERTa Java Inference | Cross-platform Java NLP inference solution |
| Entity Linking Prediction | Entity disambiguation and knowledge base linking |
| Geographic Location Parser | Administrative division extraction and code matching |
| Sentence Paraphrasing | Diversified rewriting and semantic expansion |
| Chinese Sentence Generation | Linguistics-enhanced text rephrasing |
| Relation Extraction | Entity relationship mining from unstructured text |
| Semantic Role Labeling | Deep parsing of sentence semantic structure |
| General Chinese NER | General-purpose entity recognition for everyday text |
| Condition Text Generation | Short text generation conditioned on keywords & titles |
| Protein Interaction Prediction | GNN applied to biological data mining |

🧬 Knowledge Graph & Intelligent QA

| Project | Description |
| --- | --- |
| Intelligent Medical KG | Medical knowledge graph, symptom retrieval and diagnosis Q&A |
| K12 Education Knowledge Graph | Education-oriented KG visualization and knowledge reasoning |
| Movie Knowledge Graph | Entertainment-domain KG with entity & relation queries |
| Java Movie KG QA | Lightweight knowledge question answering implemented in Java |
| Text Event Grapher | Convert unstructured text into an event relationship graph |
| easyKG | Summary of knowledge graph construction and application techniques |

🖼️ Image Processing & OCR

| Project | Description |
| --- | --- |
| Lightweight Table Recognition | Ultra-light bordered-table detection and structure analysis |
| JiaJiaOCR | Self-developed OCR engine in pure Java |
| Document Image Tool | Scanned document enhancement and batch optimization |
| Swin-Unet Table Recognition | Deep-learning-based segmentation of complex table structures |
| DBNet & CRNN Java | Java-integrated text detection and recognition pipeline |
| C++ Layout Detection | High-performance document layout region segmentation |
| Java Table OCR | Image table detection, cell segmentation & content recognition |
| Java Document Layout Analysis | Chinese document layout classification (Java) |
| Python Document Layout Analysis | Chinese document layout detection (Python) |
| License Plate Recognition | End-to-end vehicle plate detection and recognition |
| OCR PDF to DOCX | Convert scanned PDFs into editable documents |
| DJL PaddleOCR | Industrial Java OCR based on ONNX models |
| SpringBoot PaddleOCR | Enterprise OCR service integrated with SpringBoot |
| JNI PaddleOCR | High-performance OCR via native calls to a C++ DLL |

📊 Search, Recommendation & Data Mining

| Project | Description |
| --- | --- |
| Learning to Rank | Search ranking model optimization and algorithm practice |
| Spark Data Mining | Distributed big data processing and feature mining |
| Recommendation Algorithm Library | Implementations of classic personalized recommendation models |
| Entropy-based Relevance | Query-document matching based on information entropy |
| Lightweight Search Engine | Simple and efficient text retrieval framework |
| Semantic Matching Model | Cross-text semantic similarity calculation & matching |

🛠️ AI Basic Tools

| Project | Description |
| --- | --- |
| jiajia-search | Multilingual, lightweight & high-performance hybrid search engine |
| DocImg Tool | Practical toolset for everyday document image processing |
| micrograd4j | Lightweight automatic differentiation engine in Java |
| j4nlp | Common NLP utility library for Java |
| CNN4IE | Information extraction with multiple CNN structures |
| RNN4IE | Sequence entity extraction based on recurrent networks |
| GNN4LP | Graph neural network for entity link prediction |

Popular repositories

  1. learning_to_rank (Public)

    Learning to rank with LightGBM, including data processing, model training, model decision visualization, model interpretability, and prediction.

    Python · 279 stars · 71 forks

  2. movie_knowledge_graph_app (Public)

    Movie knowledge graph covering entity recognition, entity query, relation query, graph display, and intelligent question answering.

    JavaScript · 142 stars · 30 forks

  3. albert_lstm_crf_ner (Public)

    ALBERT + LSTM + CRF named entity recognition, implemented in PyTorch. The main entity types recognized are person names, place names, organization names, and times.

    Python · 136 stars · 32 forks

  4. education_knowledge_graph_app (Public)

    K12 education subject knowledge graph: graph display, knowledge point tracking, intelligent question answering, and knowledge point prediction for questions.

    JavaScript · 134 stars · 25 forks

  5. Doc-Image-Tool (Public)

    Document image processing tool, including bleaching, text orientation correction, sharpening, handwritten-note denoising and beautification, shadow removal, distortion correction, and edge cropping enhancement (DocBleach / TextOrientationCorrection / DocSharpening / HandwritingDenoisingBeautifying / DocShadowRemoval…

    Python · 130 stars · 19 forks

  6. intelligent_medical (Public)

    Intelligent medical (smart healthcare), including disease search, related recommendations, medical Q&A about diseases, and intelligent disease diagnosis.

    Java · 91 stars · 21 forks