Prompt Details
Model
(gemini-3.1-pro)
Token size
275
Example input
[Target Model]: Llama-3-8B
[Available VRAM]: 24GB
[Dataset Format]: JSONL (Alpaca format)
[Training Goal]: Korean Financial Domain Adaptation
Example output
### ■ Memory Strategy & LoRA Parameters
* Estimated Footprint: Base model (4-bit) ~5.5GB + Gradients/Optimizer ~4GB. Total ~9.5GB (Extremely safe for 24GB VRAM).
* LoRA Config: r=32, lora_alpha=64, target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"] (Deep fine-tuning for complex financial terminology).
* Precision: 4-bit NormalFloat (NF4) with double quantization.
### ■ Advanced Hyperparameter Config
* Optimizer: `paged_adamw_8bit` (Prevents memory spikes).
* Batching: per_device_train_batch_size=4, gradient_accumulation_steps=4 (Effective Batch=16).
* Gradient Checkpointing: ENABLED. Flash Attention 2: ENABLED.
### ■ Python SFTTrainer Script (Copy & Paste Ready)
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model
from trl import SFTTrainer
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4", bnb_4bit_use_double_quant=True, bnb_4bit_compute_dtype=torch.bfloat16)
model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B", quantization_config=bnb_config, attn_implementation="flash_attention_2", device_map="auto")
model.gradient_checkpointing_enable()
peft_config = LoraConfig(r=32, lora_alpha=64, target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"], bias="none", task_type="CAUSAL_LM")
training_args = TrainingArguments(output_dir="./llama3-finance", per_device_train_batch_size=4, gradient_accumulation_steps=4, optim="paged_adamw_8bit", learning_rate=2e-4, bf16=True, max_steps=1000)
dataset = load_dataset("json", data_files="finance_data.jsonl")["train"]
trainer = SFTTrainer(model=model, train_dataset=dataset, peft_config=peft_config, args=training_args)
trainer.train()
By purchasing this prompt, you agree to our terms of service
GEMINI-3.1-PRO
Stop guessing your hyperparameter settings. This advanced prompt acts as a Senior AI Engineer, calculating VRAM footprints and generating a complete, copy-paste ready PyTorch/PEFT SFTTrainer Python script. It optimizes for 4-bit QLoRA, Flash Attention 2, and PagedAdamW to prevent OOM errors across any consumer or enterprise GPU setup. Perfect for AI researchers and developers.
...more
Added over 1 month ago
