The Problem with Full Fine-tuning
Full fine-tuning a 7B model requires:
- Memory: 28GB+ VRAM for fp32 model weights, plus 56GB+ for gradients and optimizer states
- Storage: each checkpoint is 14GB (fp16 weights)
- Hardware: 4x A100 80GB at minimum
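A quick back-of-envelope sketch of where these numbers come from (assuming fp32 weights, gradients, and Adam moment buffers; activation memory is excluded, so this is a lower bound):

```python
# Rough VRAM estimate for full fine-tuning a 7B model with Adam.
# Assumes fp32 weights, fp32 gradients, and two fp32 Adam moment
# buffers per parameter; activations are ignored.
P = 7e9  # parameters

weights = P * 4        # 28 GB
grads = P * 4          # 28 GB
adam_m_v = P * 4 * 2   # 56 GB (first and second moments)

total = weights + grads + adam_m_v
print(f"total: {total / 1e9:.0f} GB+ before activations")  # 112 GB+
```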
LoRA: Low-Rank Adaptation
Instead of updating all the weights, LoRA adds small trainable matrices:
Original: W (d × k)
LoRA: W' = W + BA, where B (d × r), A (r × k), and r ≪ d, k
Example: d=4096, k=4096, r=8
- Original params: 16.8M
- LoRA params: 65K (~256x smaller!)
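A minimal NumPy sketch of the idea, using the same `W`, `B`, `A`, and shapes as above (this shows the unscaled update; real LoRA also multiplies `BA` by `alpha / r`):

```python
import numpy as np

d, k, r = 4096, 4096, 8

W = np.random.randn(d, k)         # frozen pretrained weight
B = np.zeros((d, r))              # LoRA matrix, initialized to zero
A = np.random.randn(r, k) * 0.01  # LoRA matrix, small random init

x = np.random.randn(k)

# Adapted forward pass: W + BA is never materialized; the low-rank
# path B @ (A @ x) is computed separately and added.
y = W @ x + B @ (A @ x)

print(W.size)           # 16,777,216 params if fully fine-tuned
print(A.size + B.size)  # 65,536 trainable params with LoRA (256x fewer)
```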
QLoRA: Quantized LoRA
It combines:
- 4-bit quantization: model weights are stored in 4-bit (NF4) instead of 16-bit
- LoRA adapters: kept trainable in 16-bit
- Double quantization: the quantization constants are themselves quantized, saving roughly 0.4 bits per parameter
Result: you can fine-tune a 65B model on a single 48GB GPU!
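The arithmetic behind that claim, as a rough sketch (LoRA adapters, activations, and optimizer state are extra, but they are small compared to the weights):

```python
# Why 65B fits on one 48 GB GPU: 4-bit weights cost half a byte each.
P = 65e9
fp16_gb = P * 2 / 1e9    # 130 GB -- impossible on a single card
nf4_gb = P * 0.5 / 1e9   # ~32.5 GB -- leaves headroom on a 48 GB card
print(f"fp16: {fp16_gb:.0f} GB, 4-bit: {nf4_gb:.1f} GB")
```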
Implementation with PEFT
```python
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
import torch

# Load the base model in 4-bit (QLoRA-style: NF4 + double quantization)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Configure LoRA
lora_config = LoraConfig(
    r=16,                                 # Rank
    lora_alpha=32,                        # Scaling factor
    target_modules=["q_proj", "v_proj"],  # Which layers to adapt
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

# Apply LoRA
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
# Prints the trainable parameter count: a few million, roughly 0.1% of 8B
```
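After training, only the adapter needs saving. A sketch using PEFT's `save_pretrained` and `PeftModel.from_pretrained` (the directory name `llama3-lora-adapter` is just an example):

```python
from peft import PeftModel

# Save only the LoRA adapter weights -- megabytes, not gigabytes
model.save_pretrained("llama3-lora-adapter")

# Later: reload the base model and attach the adapter
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B", device_map="auto"
)
model = PeftModel.from_pretrained(base, "llama3-lora-adapter")

# Optionally fold the adapter into the base weights for deployment
model = model.merge_and_unload()
```

Merging removes the adapter indirection at inference time; skip it if you want to keep swapping adapters on the same base model.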
Hyperparameter Tuning
| Parameter | Description | Typical Range |
|---|---|---|
| r (rank) | LoRA rank (adapter capacity) | 8-64 |
| lora_alpha | Scaling factor (update scaled by alpha/r) | 2x rank |
| target_modules | Which layers to adapt | q_proj, k_proj, v_proj, o_proj (+ MLP projections) |
| lora_dropout | Regularization | 0.05-0.1 |
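If you are unsure which module names a given architecture exposes, you can inspect its linear layers directly; a small sketch (module names vary by model family):

```python
import torch.nn as nn

# Collect the distinct names of all nn.Linear submodules to see what
# target_modules can be set to for this architecture.
linear_names = {
    name.split(".")[-1]
    for name, module in model.named_modules()
    if isinstance(module, nn.Linear)
}
print(sorted(linear_names))
# On Llama-style models this typically includes q_proj, k_proj, v_proj,
# o_proj, gate_proj, up_proj, down_proj (and lm_head, which is usually
# left out of target_modules).
```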
Pro Tips
- Higher r means more capacity, but also more parameters and slower training
- Target all linear layers for best results
- Use gradient checkpointing to reduce memory further (see the sketch after this list)
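For gradient checkpointing, a minimal sketch using the standard transformers methods (note that `prepare_model_for_kbit_training` in the snippet above already enables this by default):

```python
# Trade compute for memory: recompute activations in the backward pass
model.gradient_checkpointing_enable()

# With a frozen base model, ensure gradients can flow through the
# checkpointed blocks into the LoRA adapters
model.enable_input_require_grads()
```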
💡 Start with r=16, alpha=32. Increase r if the model underfits.
