
System Diagrams

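This page gathers the component layout, data flows, infrastructure topology, and deployment pipeline of SeMu-GPT 2026 in one place. For terminology and component responsibilities, see the Architecture Overview page.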


1. Overall component diagram

graph TD
    subgraph Users["Users"]
        U["Web browser<br/>desktop / mobile"]
    end
    subgraph EdgeDev["dev edge"]
        CFW["Cloudflare Workers<br/>semu-chat-dev"]
        CFT["Cloudflare Tunnel<br/>semu-gpt-dev"]
    end
    subgraph EdgeProd["prod edge (soft launch)"]
        ALB["AWS ALB<br/>semu-gpt-alb"]
        CFLEG["CloudFront<br/>legacy apex/www"]
    end
    subgraph FE["Frontend (Next.js 16)"]
        APP["App Router<br/>chat · payment · admin"]
        SSE["SSE Client<br/>receives RAG stream"]
    end
    subgraph BE["Backend (Spring Boot 3.1)"]
        CTRL["Controllers<br/>Auth·Account·Conversation·<br/>Toss·Admin·Reference"]
        SVC["Services<br/>StreamingConversationService<br/>StreamingRagProcessor<br/>HydeService"]
        REPO["Repositories<br/>JPA + ES"]
        SEC["Security<br/>JWT TokenProvider/Parser"]
    end
    subgraph Stores["Data stores"]
        MY[("MySQL<br/>account·membership·payment·<br/>conversation·coupon")]
        ES[("Elasticsearch 8.17<br/>tax-law·tax-precedent·<br/>tax-tribunal·tax-counsel·tax-threeway")]
        RD[("Redis 7<br/>cache")]
    end
    subgraph Pipeline["Data Pipeline (Python CLI)"]
        COL["Collectors<br/>National Tax Service · Ministry of Government Legislation · 찾아줘세무사 · Tax Tribunal"]
        IDX["Indexers<br/>ES bulk + embedding"]
    end
    subgraph Ext["External services"]
        OAI["OpenAI<br/>GPT-5 / Embedding"]
        LF[Langfuse]
        TOSS[Toss Payments]
        SENS["Naver Cloud SENS<br/>SMS"]
        SEN[Sentry]
    end
    U --> CFW
    U --> ALB
    U --> CFLEG
    CFW --> APP
    ALB --> APP
    APP <--> SSE
    APP -- HTTP REST --> CTRL
    SSE -- SSE --> CTRL
    CFT -- :8080 --> CTRL
    ALB -- :8080 --> CTRL
    CTRL --> SEC
    CTRL --> SVC
    SVC --> REPO
    REPO --> MY
    REPO --> ES
    SVC --> RD
    SVC -- prompt fetch --> LF
    SVC -- chat / embedding --> OAI
    CTRL -- confirm/webhook --> TOSS
    SVC -- SMS --> SENS
    APP -- errors --> SEN
    COL -- JSONL --> IDX
    IDX -- bulk index --> ES
    IDX -- embedding --> OAI

2. RAG question-answering flow (SSE streaming)

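End-to-end flow for a call to POST /conversations/stream or POST /conversations/{id}/turns/stream. For the reference search, filtering, and supplementation logic, see the "RAG 참고자료 아키텍처" (RAG reference architecture) section of CLAUDE.md in the project root.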

sequenceDiagram
    autonumber
    participant U as User
    participant FE as Frontend
    participant API as Backend (StreamingConversationController)
    participant SVC as StreamingConversationService
    participant RAG as StreamingRagProcessor
    participant LF as Langfuse
    participant LLM as OpenAI
    participant ES as Elasticsearch
    participant DB as MySQL

    U->>FE: Enter question
    FE->>API: POST /conversations/stream<br/>Authorization: Bearer
    API->>SVC: streamAnswer(question, sessionId?)
    SVC->>DB: Create or look up ConversationSession
    SVC->>LF: getPromptWithConfig(hyde-generator)
    SVC->>LLM: Generate HyDE hypothetical answer
    LLM-->>SVC: Hypothetical answer text
    SVC->>LLM: text-embedding-3-large
    LLM-->>SVC: vector(3072)
    SVC->>RAG: search(question, vector, taxCategory)
    RAG->>ES: BM25 + kNN (RRF fusion)<br/>tax-law / precedent / tribunal / counsel
    ES-->>RAG: hits per index
    RAG->>RAG: dedup → category backfill →<br/>tier sort → drop peripheral laws → max 5/type
    RAG-->>SVC: ranked references
    SVC->>LF: getPromptWithConfig(rag-final-answer)
    SVC->>LLM: ChatCompletion (stream=true)
    loop Token stream
        LLM-->>SVC: token chunk
        SVC-->>FE: SSE: data chunk
        FE-->>U: Incremental rendering
    end
    LLM-->>SVC: Answer complete + id1,id2
    SVC->>ES: supplementThreewayReferences(article_keys)
    SVC->>ES: supplementReferences(extracted from answer text)
    SVC->>SVC: reorderByCitations()
    SVC-->>FE: SSE: citation_update
    SVC->>DB: ConversationTurn persist (question, answer, refs)
    SVC-->>FE: SSE: done
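
The `BM25 + kNN (RRF fusion)` step and the `max 5/type` cap in the diagram can be illustrated with a minimal Python sketch. This is not the code of StreamingRagProcessor (the backend is Spring Boot): the function names, the constant `k = 60`, and the sample document ids are assumptions made for illustration, and the sketch leaves out dedup across indices, category backfill, and tier sorting.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: score(d) = sum over rankings of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def cap_per_type(doc_ids, doc_type, limit=5):
    """Keep at most `limit` documents per reference type, preserving order."""
    kept, counts = [], defaultdict(int)
    for doc_id in doc_ids:
        t = doc_type[doc_id]
        if counts[t] < limit:
            counts[t] += 1
            kept.append(doc_id)
    return kept


# Hypothetical hits: one BM25 ranking and one kNN ranking for the same query.
bm25_hits = ["law-1", "law-2", "prec-1"]
knn_hits = ["law-2", "prec-1", "law-3"]
fused = rrf_fuse([bm25_hits, knn_hits])
```

Documents that appear in both rankings accumulate two reciprocal-rank terms, so they rise above documents that only one retriever found, without any need to normalize BM25 scores against vector similarities.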

3. Payment flow (Toss V2: one-time payment + webhook)

For detailed business rules, see the Payment (Toss V2) page.

sequenceDiagram
    autonumber
    participant U as User
    participant FE as Frontend
    participant TW as Toss Widget (browser SDK)
    participant API as Backend
    participant TOSS as Toss Server
    participant DB as MySQL

    U->>FE: Select membership plan
    FE->>TW: tossPayments.requestPayment(...)<br/>NEXT_PUBLIC_TOSS_CLIENT_KEY
    TW->>U: Payment window (card / bank transfer / easy pay)
    U->>TW: Enter payment method
    TW->>FE: success → /payment/success?paymentKey&orderId&amount
    FE->>API: POST /payments/confirm<br/>{ paymentKey, orderId, amount }
    API->>TOSS: POST /v1/payments/confirm<br/>Basic auth (TOSS_SECRET_KEY)
    TOSS-->>API: { status: "DONE", method, totalAmount, ... }
    API->>DB: Payment.save(SUCCESS)
    API->>DB: Membership.activate(plan, period)
    API-->>FE: { paymentId, membership }
    FE-->>U: Payment complete screen
    Note over TOSS,API: Asynchronous webhook (with retries)
    TOSS->>API: POST /webhooks/tosspayments<br/>(TossWebhookController)
    API->>API: Verify signature
    API->>DB: Sync Payment status<br/>(cancel, partial cancel, failure handling)
    API-->>TOSS: 200 OK
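
The server-to-server confirm call (`POST /v1/payments/confirm` with Basic auth) can be sketched as below. Toss Payments expects the secret key as the Basic-auth username with an empty password, which means base64 of `<secret key>:`. Everything else is an assumption for illustration: the helper name `build_confirm_request`, the `expected_amount` check against the order stored on the server, and the sample key are hypothetical, and the sketch only builds the request without sending it.

```python
import base64
import json

TOSS_CONFIRM_URL = "https://api.tosspayments.com/v1/payments/confirm"


def build_confirm_request(secret_key, payment_key, order_id, amount, expected_amount):
    """Return (url, headers, body) for the Toss confirm call without sending it."""
    # Assumed server-side guard: the amount in the success redirect comes from
    # the browser, so compare it with the amount stored for this order.
    if amount != expected_amount:
        raise ValueError("amount mismatch")
    # Basic auth: secret key as username, empty password -> base64("<key>:").
    token = base64.b64encode(f"{secret_key}:".encode()).decode()
    headers = {
        "Authorization": f"Basic {token}",
        "Content-Type": "application/json",
    }
    body = json.dumps(
        {"paymentKey": payment_key, "orderId": order_id, "amount": amount}
    )
    return TOSS_CONFIRM_URL, headers, body


# Hypothetical values for illustration only.
url, headers, body = build_confirm_request(
    "test_sk_example", "pay_key_1", "order_1", 9900, expected_amount=9900
)
```

Checking the amount before calling Toss keeps a tampered redirect from activating a membership at the wrong price; the webhook path then only has to reconcile later status changes.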

4. Data pipeline indexing flow

This is an offline CLI job. For the detailed steps, see the Data Pipeline page.

sequenceDiagram
    autonumber
    participant DEV as Developer (local)
    participant COL as Collector (Python)
    participant SRC as External sites<br/>(National Tax Service, Ministry of Government Legislation, etc.)
    participant FS as Local filesystem<br/>(data/{source}/*.jsonl)
    participant IDX as Indexer (Python)
    participant OAI as OpenAI Embedding
    participant ES as Elasticsearch

    DEV->>COL: uv run semugpt-collect {source}<br/>--max-items N
    loop Per page
        COL->>SRC: HTTP / Selenium fetch
        SRC-->>COL: HTML / JSON / PDF
        COL->>COL: parse + normalize
        COL->>FS: append JSONL + save .progress
    end
    DEV->>IDX: uv run semugpt-index bulk<br/>--index {source} --es-url ... --embed
    IDX->>FS: read JSONL (resumable)
    loop batch
        IDX->>OAI: text-embedding-3-large<br/>(content)
        OAI-->>IDX: vector(3072)
        IDX->>ES: _bulk index<br/>tax-{source}
    end
    ES-->>IDX: indexed count
    IDX-->>DEV: Progress + stats
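
The `read JSONL (resumable)` and `_bulk index` steps can be sketched in Python as follows. This is an illustration, not the semugpt-index implementation: the function names, the `skip` offset used for resuming, and the `id` and `embedding` field names are assumptions. The NDJSON layout (one action line followed by one source line, with a trailing newline) is the standard Elasticsearch `_bulk` request format.

```python
import json
from itertools import islice


def read_batches(lines, batch_size, skip=0):
    """Yield lists of parsed JSONL records, skipping `skip` already-indexed lines."""
    it = islice(lines, skip, None)
    while True:
        batch = [json.loads(line) for line in islice(it, batch_size)]
        if not batch:
            return
        yield batch


def to_bulk_body(index_name, docs, vectors):
    """Build an Elasticsearch _bulk NDJSON body: an action line, then the source line."""
    out = []
    for doc, vec in zip(docs, vectors):
        out.append(json.dumps({"index": {"_index": index_name, "_id": doc["id"]}}))
        out.append(json.dumps({**doc, "embedding": vec}, ensure_ascii=False))
    return "\n".join(out) + "\n"  # _bulk requires a trailing newline


# Hypothetical records standing in for data/{source}/*.jsonl lines.
sample_lines = [
    '{"id": "a", "content": "x"}',
    '{"id": "b", "content": "y"}',
    '{"id": "c", "content": "z"}',
]
resumed = list(read_batches(sample_lines, batch_size=2, skip=1))
bulk_body = to_bulk_body("tax-law", resumed[0], [[0.1], [0.2]])
```

Using the document id as `_id` makes a re-run after an interruption overwrite records instead of duplicating them, so a resume offset that is slightly stale stays harmless.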

5. Infrastructure topology (dev)

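The backend (semu-gpt-dev.bootalk.co.kr) and the frontend (semu-chat-dev.bootalk.co.kr) run on separate infrastructure: the frontend on Cloudflare Workers, the backend on an AWS Lightsail instance reached through a Cloudflare Tunnel.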

graph TD
    USER["Developer / client<br/>web browser"]
    subgraph CF["Cloudflare"]
        CFDNS["DNS: bootalk.co.kr zone"]
        CFW["Workers<br/>semugpt-frontend-dev"]
        CFKV["KV<br/>Next.js static assets"]
        CFTUN["Tunnel: semugpt-backend<br/>id 078d8083-..."]
    end
    subgraph AWS["AWS Lightsail (account 023888247019, ap-northeast-2a)"]
        LS["Instance: semugpt-backend<br/>medium_3_0 / 4GB / 80GB SSD<br/>Static IP 3.39.17.132"]
        subgraph SystemD["systemd services"]
            CFD[cloudflared]
            BE["semugpt-backend<br/>Spring Boot :8080"]
            HM[health-monitor cron]
            DA[disk-alarm cron]
        end
        subgraph Docker["Docker Compose"]
            MY[("mysql:8.0<br/>:3306")]
            ES[("semugpt-es:8.17.0-nori<br/>:9200")]
            RD[("redis:7-alpine<br/>:6379")]
        end
        SLACK["Slack Webhook<br/>sends alerts"]
    end
    USER -- semu-chat-dev --> CFW
    CFW --> CFKV
    CFW -- NEXT_PUBLIC_API_URL --> CFTUN
    USER -- semu-gpt-dev --> CFTUN
    CFDNS -.-> CFW
    CFDNS -.-> CFTUN
    CFTUN -- :8080 --> CFD
    CFD --> BE
    BE --> MY
    BE --> ES
    BE --> RD
    HM --> SLACK
    DA --> SLACK

6. Infrastructure topology (prod, soft launch, Issue #151)

The legacy tax-gpt and the new semugpt-2026 run in parallel, sharing the same ALB and RDS instance.

graph TD
    LEGUSER["Legacy users<br/>semugpt.co.kr / www / api / pro"]
    NEWUSER["New users<br/>new.semugpt.co.kr<br/>api-new.semugpt.co.kr"]
    R53["Route 53<br/>semugpt.co.kr zone"]
    subgraph LegFront["Legacy frontend (unchanged)"]
        CFRONT["CloudFront × 2<br/>EQH9... / EMX0..."]
        S3[("S3 buckets<br/>semugpt.co.kr<br/>semugpt-hosting")]
    end
    subgraph ALBLayer["ALB layer (reused)"]
        ALB["semu-gpt-alb<br/>HTTPS:443 ACM *.semugpt.co.kr"]
        TGLEG["TG semu-gpt-instance<br/>:80"]
        TGNEWBE["TG-backend-2026<br/>:8080 priority 100"]
        TGNEWFE["TG-frontend-2026<br/>:3000 priority 110"]
    end
    subgraph LegBE["Legacy backend"]
        EC2["EC2 i-07aea... t2.medium<br/>tax-gpt Spring Boot :80"]
    end
    subgraph NewBE["New Lightsail prod (planned)"]
        LSP["semugpt-prod<br/>large_3_0 / 8GB / $44/mo"]
        SBE["systemd: semugpt-backend :8080"]
        SFE["systemd: semugpt-frontend :3000"]
        DES[("Docker: ES + Redis")]
    end
    subgraph RDSLayer["RDS (shared)"]
        RDS[("tax-gpt MySQL 8.0.44<br/>db.t3.micro / 20GB")]
        DBOLD[("database tax_gpt<br/>legacy live")]
        DBNEW[("database semugpt_2026<br/>account 324 + phone 351 + membership 642")]
    end
    LEGUSER --> R53
    NEWUSER --> R53
    R53 -- apex/www --> CFRONT
    CFRONT --> S3
    R53 -- api/pro --> ALB
    R53 -- new/api-new --> ALB
    ALB --> TGLEG
    ALB --> TGNEWBE
    ALB --> TGNEWFE
    TGLEG --> EC2
    TGNEWBE --> SBE
    TGNEWFE --> SFE
    SBE --> DES
    SBE --> RDS
    EC2 --> RDS
    RDS --- DBOLD
    RDS --- DBNEW
    LSP --- SBE
    LSP --- SFE
    LSP --- DES

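Warning: the legacy RDS security group sg-09b20a06... allows 0.0.0.0/0 on port 3306, so the database is exposed to the internet. It will be locked down after the hard cutover (see "결정/미해결 사항" (Decisions / open issues) in CLAUDE.md).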
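
The `priority 100` and `priority 110` labels on the new target groups refer to ALB listener rules, which are evaluated from the lowest priority number upward, with the default rule applied last. The sketch below models that evaluation order in Python. The host-to-target-group mapping (api-new to the backend, new to the frontend) and the legacy target group as the default action are assumptions inferred from the diagram, not confirmed listener configuration.

```python
# (priority, host header, target group): assumed host conditions, see lead-in.
RULES = [
    (110, "new.semugpt.co.kr", "TG-frontend-2026"),
    (100, "api-new.semugpt.co.kr", "TG-backend-2026"),
]
DEFAULT_TARGET = "TG semu-gpt-instance"  # assumed default action (legacy backend)


def route(host, rules=RULES, default_target=DEFAULT_TARGET):
    """Return the target group for a Host header, lowest priority number first."""
    for _priority, rule_host, target in sorted(rules, key=lambda r: r[0]):
        if host == rule_host:
            return target
    # No rule matched: the listener's default action applies.
    return default_target
```

Under these assumptions, legacy hostnames such as api.semugpt.co.kr match no new rule and keep reaching the legacy EC2 target group, which is what lets both stacks share one ALB during the soft launch.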


7. Deployment pipeline

graph LR
    DEV[Developer]
    GH["GitHub<br/>uitiorg/semugpt-2026"]
    subgraph FrontPath["Frontend deployment (automatic)"]
        GHA["GitHub Actions<br/>deploy-frontend-dev.yml"]
        BUILD["pnpm build:cf<br/>opennextjs-cloudflare"]
        DEPLOY[wrangler deploy]
        WORKER["Cloudflare Worker<br/>semugpt-frontend-dev"]
    end
    subgraph BackPath["Backend deployment (manual)"]
        SSH[ssh semugpt-aws]
        PULL[git pull origin develop]
        GRADLE["Gradle bootJar (automatic)"]
        SYSCTL["sudo systemctl restart<br/>semugpt-backend"]
    end
    subgraph DataPath["Data pipeline (local CLI)"]
        UVCOL[uv run semugpt-collect ...]
        UVIDX[uv run semugpt-index bulk ...]
        ESDIRECT[("Elasticsearch<br/>dev / prod")]
    end
    DEV -- git push develop --> GH
    GH -- "push event<br/>paths apps/frontend/**" --> GHA
    GHA --> BUILD
    BUILD --> DEPLOY
    DEPLOY --> WORKER
    DEV --> SSH
    SSH --> PULL
    PULL --> GRADLE
    GRADLE --> SYSCTL
    DEV --> UVCOL
    UVCOL --> UVIDX
    UVIDX --> ESDIRECT

For details on deployment triggers, environment variables, and credentials, see the Operations Deployment and Development Environment pages.


Related documents