LLM = Large Language Model
- AI trained on massive amounts of text data
- Can understand and generate human language
Analogy:
- Regular person: has read 100 books
- LLM: has "read" the entire internet
Result:
- Regular person: can answer various questions
- LLM: can do more and handle complex tasks
Input: the entire internet's text (trillions of tokens)
Training goal: given the previous words, predict the next word
Example:
- Input: "Today's weather is"
- Output: "great" (learned from "Today's weather is great...")
After repeating this trillions of times, the AI:
- learns language rules
- learns knowledge
- learns reasoning
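The "predict the next word" objective can be illustrated with a toy sketch. This is not a real LLM, just word-pair counting over a made-up three-sentence corpus; the corpus text and function names are assumptions for illustration only.

```python
from collections import Counter, defaultdict

# Toy corpus (made up for this sketch) mirroring the example above.
corpus = "today's weather is great . today's weather is nice . today's weather is great"

# "Training": count which word follows which.
follows = defaultdict(Counter)
words = corpus.split()
for prev, nxt in zip(words, words[1:]):
    follows[prev][nxt] += 1

def predict_next(word: str) -> str:
    """Predict the most frequent word seen after `word` in the corpus."""
    return follows[word].most_common(1)[0][0]

print(predict_next("is"))  # → "great" (seen twice, vs "nice" once)
```

A real LLM replaces these raw counts with a neural network over billions of parameters, but the objective — pick the likeliest continuation given what came before — is the same.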
User input: "What is blockchain?"
AI thought process:
1. Understand the question: "The user wants a definition of blockchain"
2. Retrieve knowledge: "Blockchain is..."
3. Organize language: explain it in an accessible way
4. Generate the response: "Blockchain is a type of..."
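The four steps above can be sketched as a plain function pipeline. Everything here (the step functions and the canned "knowledge" dictionary) is an illustrative assumption, not how a real LLM works internally — a real model does all of this implicitly in one pass of next-token prediction.

```python
# Stand-in "knowledge" learned during training (assumption for this sketch).
KNOWLEDGE = {"blockchain": "a distributed, append-only ledger shared across many computers"}

def understand(question: str) -> str:
    # Step 1: identify the topic the user is asking about.
    return question.lower().rstrip("?").split()[-1]

def retrieve(topic: str) -> str:
    # Step 2: look up what we "know" about the topic.
    return KNOWLEDGE.get(topic, "something not covered here")

def respond(question: str) -> str:
    # Steps 3-4: organize the retrieved knowledge into an accessible sentence.
    topic = understand(question)
    return f"{topic.capitalize()} is {retrieve(topic)}."

print(respond("What is blockchain?"))
```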
Token = the smallest unit in which the AI processes text
Example:
- "hello world" → ["hello", " world"] → 2 tokens
Chinese characters:
- "你好世界" → might be 4 tokens (one character ≈ one token)
Rough estimates:
- 1000 tokens ≈ 750 English words
- 1000 tokens ≈ 400-500 Chinese characters
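The rules of thumb above can be turned into quick back-of-the-envelope estimators. The ratios (0.75 words per token for English, ~0.45 characters per token for Chinese) are approximations from the figures above, not real tokenizer output; actual counts vary by model.

```python
def estimate_tokens_english(text: str) -> int:
    """~750 English words per 1000 tokens => tokens ≈ words / 0.75."""
    return round(len(text.split()) / 0.75)

def estimate_tokens_chinese(text: str) -> int:
    """~400-500 Chinese characters per 1000 tokens => tokens ≈ chars / 0.45."""
    return round(len(text) / 0.45)

print(estimate_tokens_english("hello world"))  # 2 words → estimate 3 tokens
print(estimate_tokens_chinese("你好世界"))       # 4 chars → estimate 9 tokens
```

For exact counts you would use the model's own tokenizer; these estimators are only for rough budgeting.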
Context window = how much text the AI can "remember" at once
Examples:
- GPT-3.5: 4K tokens
- GPT-4 Turbo: 128K tokens
- Claude 3: 200K tokens
Analogy:
- The context window is the AI's "short-term memory"
- Exceed it, and earlier content is forgotten
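"Exceed it and it forgets" is exactly what chat applications implement by hand: drop the oldest messages until the history fits the token budget. A minimal sketch, assuming a crude word-count proxy for token counting (a real app would use the model's tokenizer):

```python
def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per word.
    return len(text.split())

def fit_to_window(messages: list[str], max_tokens: int) -> list[str]:
    """Keep the most recent messages whose total token estimate fits the budget."""
    kept, total = [], 0
    for msg in reversed(messages):            # walk from newest to oldest
        cost = estimate_tokens(msg)
        if total + cost > max_tokens:
            break                             # oldest messages are "forgotten"
        kept.append(msg)
        total += cost
    return list(reversed(kept))               # restore chronological order

history = ["first question", "a long answer about blockchain", "follow up"]
print(fit_to_window(history, max_tokens=7))   # drops "first question"
```

This is why a long conversation eventually loses its beginning: the window slides forward, and whatever falls off the front is no longer visible to the model.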