Two-Stage Retrieval Pipeline
Stage 1: Bi-Encoder (Fast) - Retrieve top-100 candidates
Stage 2: Cross-Encoder (Accurate) - Re-rank to get top-10
Why Do We Need Re-ranking?
Bi-Encoder limitations:
- Pre-computed embeddings = no interaction between query and document
- Limited by embedding quality
- Fast, but accuracy has a ceiling
Cross-Encoder advantages:
- Query and document are encoded together
- Full attention between all tokens
- Much higher accuracy
How Cross-Encoders Work
Input: [CLS] query [SEP] document [SEP]
Output: a single relevance score (higher = more relevant)
Model sees both texts together → understands relationships better.
Popular Re-rankers
| Model | Speed | Accuracy | Notes |
|---|---|---|---|
| Cohere Rerank | Fast | Very High | API-based |
| BGE-reranker-v2 | Medium | High | Open-source |
| cross-encoder/ms-marco | Slow | High | Classic choice |
| Jina Reranker | Fast | High | Multilingual |
Implementation
```python
from sentence_transformers import CrossEncoder

# Load the re-ranking model
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

# Get initial candidates (vector_search is your stage-1 retriever)
candidates = vector_search(query, top_k=100)

# Score each (query, document) pair jointly
pairs = [(query, doc.content) for doc in candidates]
scores = reranker.predict(pairs)

# Sort by score, keep the best 10
ranked = sorted(zip(candidates, scores), key=lambda x: x[1], reverse=True)
top_10 = [doc for doc, score in ranked[:10]]
```
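Score ranges vary by model and library version: some re-rankers emit 0-1 probabilities, others raw logits. If yours returns logits and you want scores in (0, 1), a sigmoid is the usual fix (sketch; `to_probability` is our helper name):

```python
import math

def to_probability(logit):
    # Squash an unbounded relevance logit into the (0, 1) range.
    return 1.0 / (1.0 + math.exp(-logit))
```

This only rescales scores; it never changes the ranking order.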
With the Cohere API
```python
import cohere

co = cohere.Client(api_key)

# The API returns results sorted by relevance; each result carries
# the index of the original document.
results = co.rerank(
    model="rerank-english-v3.0",
    query=query,
    documents=[doc.content for doc in candidates],
    top_n=10,
)
top_10 = [candidates[r.index] for r in results.results]
```
Pro Tips
- Limit candidates: Re-rank the top 50-100 results, not the entire corpus
- Batching: Cross-encoders are slow, so batch your predictions
- Caching: Cache re-rank results for frequent queries
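The batching and caching tips can be sketched together. Here `score_fn` stands in for whatever scoring call your re-ranker exposes (e.g. `CrossEncoder.predict`), and the module-level dict cache is illustrative only:

```python
# Sketch of the batching + caching tips; a production system would use
# an LRU cache or Redis instead of a plain module-level dict.
_cache = {}

def rerank_cached(score_fn, query, docs, batch_size=32):
    key = (query, tuple(docs))
    if key in _cache:  # serve repeated queries from the cache
        return _cache[key]
    pairs = [(query, d) for d in docs]
    scores = []
    # Cross-encoders are slow: score in fixed-size batches
    for i in range(0, len(pairs), batch_size):
        scores.extend(score_fn(pairs[i:i + batch_size]))
    ranked = sorted(zip(docs, scores), key=lambda p: p[1], reverse=True)
    _cache[key] = ranked
    return ranked
```

The cache key includes the candidate set, so a query is only re-scored when its candidates change.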
💡 Production: use Cohere Rerank for simplicity, or self-host BGE-reranker if you need privacy.
