An attempt to craft my own wiki for the AI era.
Theory and Foundation Layer
- math fundamentals
- CS and programming fundamentals
- AI fundamentals
- LLM fundamentals
Basic Theory and Pattern Evolution
| Stage | Technical Focus | Capability Shift | Core Architectural Constraint |
|---|---|---|---|
| Stage 1 | Transformers / MoE | Large-scale language processing | Lack of intent alignment or reasoning |
| Stage 2 | Instruction Fine-Tuning | Improved alignment with user goals | Brittle across diverse or novel tasks |
| Stage 3 | RLHF | Human-centric value alignment | Highly dependent on human evaluation |
| Stage 4 | Tool Integration | Active capability via external APIs | Lack of autonomous planning/memory |
| Stage 5 | RAG | Real-time factual grounding | Static knowledge and grounding issues |
| Stage 6 | Single-Agent Autonomy | Autonomous planning and execution | Limited to sequential, linear problem-solving |
| Stage 7 | Multi-Agent Collaboration | Distributed, specialized orchestration | High coordination and state complexity |
| Stage 8 | Persistent Expert Agents | Long-term learning and domain expertise | Ongoing research into self-evolving memory |
LLM Ops
- MLOps: Abstract out the common computing/storage layer, taking care of capacity, scheduling, scaling, and load balancing.
- compute layer: GPU cluster setup and management to fully utilize the hardware
- Scaling: Automatic and seamless scaling up and down, from on-premise to cloud when needed
- Scaling: Support multiple models with dynamic model loading
- Operations: Monitor usage of computing resources, and status of training and inference jobs
- Operations: Generate data for usage stats and metrics dashboard, and alert when anomaly detected
- Platform: Training / Fine-tuning: improve training throughput, reliability and efficiency
- Platform: Inference: Leverage the latest and most efficient open source framework for LLM inference to reduce latency and improve throughput
- Platform: Evaluation and Benchmarking: automatically evaluate model performance on datasets of interest
- Platform: A/B Testing: capability for online A/B testing to compare features
- Unified AI Gateway: Abstract out the common API/SDK layer, taking care of authentication, authorization, rate limiting, error handling, logging, monitoring, and alerts.
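The gateway layer above can be sketched as a thin wrapper that checks API keys and applies per-key rate limiting before forwarding to any backend. A minimal sketch; the class and method names (`AIGateway`, `TokenBucket`) are illustrative, not from any real SDK:

```python
import time
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("gateway")

class TokenBucket:
    """Token-bucket rate limiter: refills `rate` tokens/sec up to `capacity`."""
    def __init__(self, rate: float, capacity: int):
        self.rate, self.capacity = rate, capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

class AIGateway:
    """Fronts many model backends behind one API: auth, rate limits, logging."""
    def __init__(self, api_keys: set[str], rate: float = 5.0, burst: int = 10):
        self.api_keys = api_keys
        self.buckets: dict[str, TokenBucket] = {}
        self.rate, self.burst = rate, burst

    def call(self, api_key: str, prompt: str, backend) -> str:
        if api_key not in self.api_keys:
            raise PermissionError("unknown API key")
        bucket = self.buckets.setdefault(api_key, TokenBucket(self.rate, self.burst))
        if not bucket.allow():
            raise RuntimeError("rate limit exceeded")
        log.info("routing %d-char prompt", len(prompt))
        return backend(prompt)
```

In a real deployment the same choke point is also where authorization, error normalization, monitoring, and alerting would hook in.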
LLM Train
- pre-training
- post-training
- LLM knowledge distillation
LLM Inference
- GPU resource management
- API / SDK encapsulation
- rate limiting
- error handling
- logging
- monitoring
- alerts
- notifications
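For the error-handling bullet above, a common SDK-encapsulation pattern is retry with exponential backoff and jitter around every model call. A minimal sketch with illustrative defaults:

```python
import time
import random

def with_retries(max_attempts: int = 3, base_delay: float = 0.01):
    """Decorator: retry transient failures with exponential backoff plus jitter."""
    def decorate(fn):
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the real error
                    # back off: base * 2^(attempt-1), with up to 10% jitter
                    time.sleep(base_delay * 2 ** (attempt - 1) * (1 + random.random() * 0.1))
        return wrapper
    return decorate
```

Production wrappers usually retry only on specific transient errors (timeouts, 429, 5xx) rather than on every exception as this toy version does.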
LLM Fine-tune
- prefix tuning, prompt tuning, and variants
- SFT
- RLHF / RLAIF / DPO variants
- LoRA and QLoRA variants
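The core idea of LoRA above is that the frozen weight W gets a trainable low-rank update: W' = W + (alpha/r) * A @ B, where A is d x r and B is r x k with r small. A framework-free sketch in plain Python (real implementations live inside a deep-learning framework; shapes and alpha here are illustrative):

```python
def matmul(A, B):
    """Multiply two matrices represented as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

class LoRALinear:
    """y = x @ (W + (alpha/r) * A @ B): W stays frozen, only A and B are trained."""
    def __init__(self, W, A, B, alpha: float):
        self.W, self.A, self.B = W, A, B
        self.scale = alpha / len(B)  # r = number of rows of B

    def effective_weight(self):
        delta = matmul(self.A, self.B)  # low-rank update, d x k
        return [[w + self.scale * d for w, d in zip(w_row, d_row)]
                for w_row, d_row in zip(self.W, delta)]

    def forward(self, x):
        return matmul(x, self.effective_weight())
```

QLoRA follows the same structure but stores the frozen W in 4-bit quantized form, dequantizing on the fly, so only the tiny A and B matrices need full precision.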
LLM RAG
basic patterns
- dense vector-based RAG
- sparse vector-based RAG
- graph-based RAG
SOP
- ingest documents, then chunk and embed them (including structured data) with suitable strategies
- recall with hybrid search
- format, references and citations
- re-rank, query-rewriting, multi-hop, graph or table augmentation
- composable and modular RAG system architecture
- domain-specific retrieval pipelines; continuous ingestion
- quality metrics, evals and quality dashboards
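The first two SOP steps (chunking, then recall with hybrid search) can be sketched end to end. This toy version uses a bag-of-words cosine as a stand-in for dense embedding similarity and keyword overlap as the sparse signal; the fusion weight `alpha` and chunk size are illustrative:

```python
import math
from collections import Counter

def chunk(text: str, size: int = 40):
    """Naive fixed-size chunking by words (real pipelines use semantic splits)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def bow(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_search(query: str, chunks: list[str], alpha: float = 0.5, k: int = 2):
    """Fuse a dense-like cosine score with a sparse keyword-overlap score."""
    q = bow(query)
    scored = []
    for c in chunks:
        d = bow(c)
        dense = cosine(q, d)                             # stand-in for embeddings
        sparse = len(set(q) & set(d)) / max(len(q), 1)   # keyword overlap
        scored.append((alpha * dense + (1 - alpha) * sparse, c))
    return [c for _, c in sorted(scored, reverse=True)[:k]]
```

Re-ranking, query rewriting, and the other SOP steps would then operate on the top-k chunks this stage returns.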
LLM Prompt Engineering
- prompt engineering best practices for humans -> write the best prompts for your tasks
- classic patterns: one / few shots, chain-of-thought, self-consistency, ReAct etc.
- prompt management (version, testing, validation, safety, etc.)
- AI-driven prompt optimization (prompts refined and optimized automatically by AI)
- DSPy, textGuard, promptWizard, GRAD-SUM, ell, StarGo ...
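The classic patterns above compose naturally: a few-shot prompt is just examples prepended to the query, and zero-shot chain-of-thought is a trigger phrase appended to the answer slot. A minimal sketch (the task text and trigger phrase are illustrative):

```python
def build_prompt(task: str, examples: list[tuple[str, str]], query: str,
                 chain_of_thought: bool = True) -> str:
    """Assemble a few-shot prompt, optionally appending a CoT trigger phrase."""
    lines = [f"Task: {task}", ""]
    for q, a in examples:          # few-shot demonstrations
        lines += [f"Q: {q}", f"A: {a}", ""]
    lines.append(f"Q: {query}")
    if chain_of_thought:
        lines.append("A: Let's think step by step.")  # zero-shot CoT trigger
    else:
        lines.append("A:")
    return "\n".join(lines)
```

Tools like DSPy treat exactly this kind of template as a parameter to optimize, searching over instructions and demonstrations instead of hand-editing them.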
Agentic System Context Engineering
- write context
- memories
- state
- scratch-pad
- ...
- select context
- tools retrieval
- docs / knowledge retrieval
- memory retrieval
- mem0 example for long-term memory management
- ...
- compress context
- prompt compression (information compression)
- summary by LLM
- trim by rules
- ...
- isolate context
- in state
- hold in environment / sandbox
- partition among agents
- ...
Make the agent select tools to organize and manage its runtime context (CRUD operations are maintained by the agent itself)
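The "compress context" strategy above, in its trim-by-rules form, can be sketched as: always keep the system message, then keep the newest turns that fit a token budget. The word-count token proxy is a deliberate simplification; real systems use the model's tokenizer:

```python
def count_tokens(text: str) -> int:
    """Crude proxy: one token per whitespace-separated word."""
    return len(text.split())

def trim_context(messages: list[dict], budget: int) -> list[dict]:
    """Rule-based compression: keep the system message, then keep the most
    recent turns that still fit inside the token budget."""
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]
    kept, used = [], sum(count_tokens(m["content"]) for m in system)
    for m in reversed(turns):  # walk newest-first
        cost = count_tokens(m["content"])
        if used + cost > budget:
            break
        kept.append(m)
        used += cost
    return system + list(reversed(kept))
```

LLM summarization is the complementary strategy: instead of dropping the oldest turns, they are replaced by a model-written summary message.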
LLM Select
- pick and compose right LLMs for the task
- model family selection
- open-source LLMs family
- commercial LLMs family
- latency, cost, throughput, quality, etc.
- LLM parameters (tokens, top-p, temperature, etc.)
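Picking the right LLM along the latency / cost / quality axes above often reduces to a constrained choice: the cheapest model that clears a quality floor and a latency ceiling. A sketch with an entirely hypothetical catalog (names and numbers are illustrative, not real pricing):

```python
# Hypothetical catalog: cost per 1K tokens (USD), p50 latency (s), quality score (0-1)
MODELS = {
    "small-fast":   {"cost": 0.0002, "latency": 0.3, "quality": 0.60},
    "mid-balanced": {"cost": 0.0020, "latency": 0.8, "quality": 0.80},
    "large-best":   {"cost": 0.0200, "latency": 2.0, "quality": 0.95},
}

def select_model(min_quality: float, max_latency: float) -> str:
    """Return the cheapest model satisfying the quality and latency constraints."""
    candidates = [(spec["cost"], name) for name, spec in MODELS.items()
                  if spec["quality"] >= min_quality and spec["latency"] <= max_latency]
    if not candidates:
        raise ValueError("no model satisfies the constraints")
    return min(candidates)[1]
```

In practice the same router can also compose models: a cheap model triages, and only hard requests escalate to the expensive one.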
LLM Agentic Systems
Core Patterns
- reasoning: CoT · BDI (Belief, Desire, Intention) · ReAct
- goal: passive goal creator · proactive goal creator
- planning: single / multi-path plan generator · plan and execute framework · graph-based control flow
- retrieval: RAG · knowledge and RAG enhancements
- reflection: self-reflection and refinement · cross-reflection · human reflection
- cooperation: voting / role / debate based · tool / agent registry
- execution: serial vs. parallel tool execution · tool execution sandbox · agent evaluator · multi-modal guardrails
- optimization: prompt / response optimizer
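The ReAct reasoning pattern listed above interleaves Thought / Action / Observation until the model emits a final answer. A runnable sketch with a scripted stand-in for the LLM and a toy calculator tool (the transcript format is one common convention, not a fixed standard):

```python
import re

def calculator(expr: str) -> str:
    """Toy tool: evaluate a simple arithmetic expression (eval is for demo only)."""
    if not re.fullmatch(r"[\d+\-*/(). ]+", expr):
        raise ValueError("unsafe expression")
    return str(eval(expr))

TOOLS = {"calculator": calculator}

def scripted_llm(transcript: str) -> str:
    """Stand-in for a real model: act first, answer once an Observation exists."""
    if "Observation:" in transcript:
        obs = transcript.rsplit("Observation: ", 1)[1].splitlines()[0]
        return f"Final Answer: {obs}"
    return "Thought: I should compute it.\nAction: calculator[2*(3+4)]"

def react_loop(question: str, llm, max_steps: int = 5) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript)
        transcript += step + "\n"
        if "Final Answer:" in step:
            return step.split("Final Answer:", 1)[1].strip()
        m = re.search(r"Action: (\w+)\[(.+)\]", step)
        if m:  # run the tool and feed the observation back into the context
            tool, arg = m.group(1), m.group(2)
            transcript += f"Observation: {TOOLS[tool](arg)}\n"
    return "no answer"
```

Swapping `scripted_llm` for a real model call turns this into the standard single-agent ReAct loop; plan-and-execute frameworks replace the per-step Thought with an upfront plan.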
reference practical patterns:
Memory
- short-term vs. long-term memory
- storage backends: vector store · graph DB · relational DB · file systems
- structure: graph-based vs. tree-based
- A-MEM: Dynamic and Self-Evolving memory
- context-sizing control
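The short-term vs. long-term split above can be sketched as a bounded rolling window that promotes evicted items into a durable store with keyword retrieval. The keyword index is a stand-in; the storage backends listed above (vector store, graph DB) would replace it in a real system:

```python
from collections import deque

class AgentMemory:
    """Short-term: bounded rolling window. Long-term: naive keyword index."""
    def __init__(self, short_term_size: int = 3):
        self.short_term = deque(maxlen=short_term_size)
        self.long_term: list[str] = []

    def remember(self, fact: str):
        if len(self.short_term) == self.short_term.maxlen:
            # the item about to be evicted is promoted to long-term storage
            self.long_term.append(self.short_term[0])
        self.short_term.append(fact)

    def recall(self, query: str, k: int = 2) -> list[str]:
        """Rank all memories by keyword overlap with the query."""
        q = set(query.lower().split())
        pool = list(self.short_term) + self.long_term
        scored = sorted(pool, key=lambda f: -len(q & set(f.lower().split())))
        return scored[:k]
```

Context-sizing control then amounts to choosing `short_term_size` and how many recalled items to inject back into the prompt.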
Tools & Skills
- tool-call and skills management
- code execution · html / web-page generation · browser-use · VM use · web search
- multi-step workflow
- traditional multi-step workflow
- agent skills (fixed patterns as sub-agents): skills best practices for engineering
Agentic Flow & Interface
- agentic-flow prompting
- ReAct agent
- reflection × planning × action
- RPA loop: perception × reasoning × action
- Effective HITL (Human in the Loop)
- user-interface customization
- continuous learning loop (telemetry → evals → prompt / knowledge updates)
Reliability & Safety
- human-in-the-loop (HITL) · basic principles for agent building
- hallucination prevention and mitigation
- safety, security, compliance, governance
- content filters · PII redaction · secure key management
- prompt injection defenses · retrieval hygiene · tool permissioning
- policy layers (allow/deny lists) · sensitive actions with human approval
- compliance: data retention · audit trails · red-team exercises
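Two of the safety layers above (PII redaction and tool permissioning with human approval) are simple enough to sketch directly. The regex patterns and tool names here are illustrative examples, not a complete PII taxonomy:

```python
import re

PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace common PII patterns before text reaches logs or a model."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label.upper()}]", text)
    return text

class ToolPolicy:
    """Allow-list plus a set of sensitive tools that require human approval."""
    def __init__(self, allowed: set[str], needs_approval: set[str]):
        self.allowed, self.needs_approval = allowed, needs_approval

    def check(self, tool: str, approved: bool = False) -> bool:
        if tool not in self.allowed:
            return False  # deny anything not explicitly allow-listed
        if tool in self.needs_approval and not approved:
            return False  # sensitive action blocked until a human signs off
        return True
```

Prompt-injection defenses sit one layer up: retrieved or tool-returned text is treated as untrusted data, never as instructions, before it ever reaches these checks.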
Performance & Cost
- metrics: cost · latency · throughput · prompting logs · tool-call logs
- token budgeting · caching · short prompts · prompt cache
- reranking before generation · response compression · approximate search tuning
- distillation / routing to small models · speculative decoding
- SLAs with adaptive quality tiers · cost/perf dashboards
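The prompt-cache item above is essentially memoization keyed on (model, prompt), which is safe when decoding is deterministic (temperature 0). A minimal sketch with illustrative names:

```python
import hashlib

class PromptCache:
    """Memoize deterministic (model, prompt) completions to cut cost and latency."""
    def __init__(self):
        self.store: dict[str, str] = {}
        self.hits = self.misses = 0

    def _key(self, model: str, prompt: str) -> str:
        # hash keeps keys fixed-size regardless of prompt length
        return hashlib.sha256(f"{model}\x00{prompt}".encode()).hexdigest()

    def complete(self, model: str, prompt: str, backend) -> str:
        key = self._key(model, prompt)
        if key in self.store:
            self.hits += 1
            return self.store[key]
        self.misses += 1
        result = backend(prompt)  # only pay for the first occurrence
        self.store[key] = result
        return result
```

The hit/miss counters feed directly into the cost/perf dashboards mentioned above; provider-side prefix caching applies the same idea to shared prompt prefixes rather than whole prompts.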
Multi-Agent Systems (MAS)
- topology: centralized vs. decentralized · hierarchical vs. flat · serial vs. parallel · supervisor vs. peer
- memory sharing: Blackboard Model · state-based vs. memory-based
- storage: vector store · graph DB · relational DB
- communication protocol: end-to-end · broadcast · shared-memory channels
- tool invocation protocol: MCP (Model Context Protocol)
- human roles in the agentic loop: supervisor · loop participant · meta-agent
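The Blackboard Model listed above is the simplest memory-sharing topology: agents post partial results to shared state, and each agent acts whenever the board contains something it can use. A sketch with two hypothetical agents and a centralized controller loop:

```python
class Blackboard:
    """Shared memory: agents post partial results; others react to what they find."""
    def __init__(self):
        self.facts: dict[str, object] = {}

    def post(self, key: str, value):
        self.facts[key] = value

def researcher(board: Blackboard):
    # acts only when a question exists and notes are still missing
    if "question" in board.facts and "notes" not in board.facts:
        board.post("notes", f"notes about: {board.facts['question']}")

def writer(board: Blackboard):
    # acts only once the researcher has contributed notes
    if "notes" in board.facts and "draft" not in board.facts:
        board.post("draft", board.facts["notes"].replace("notes about", "report on"))

def run(board: Blackboard, agents, rounds: int = 3):
    """Centralized controller: each round, every agent inspects the board."""
    for _ in range(rounds):
        for agent in agents:
            agent(board)
```

Because agents coordinate only through the board, their invocation order matters less than in pipeline topologies, but the shared state itself becomes the coordination bottleneck the table's last column warns about.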
LLM Product Engineering
Classic Protocols
- MCP (Model Context Protocol)
- A2A (Agent-to-Agent Protocol) with ADK
- A2UI Protocol: widgets and components rendered by AI
- Ag-UI (Agentic UI Protocol)
- Agent to Editor (Client) Protocol
Frameworks
- ai-sdk (node / javascript)
- LangChain (python) / LangGraph (python)
- AutoGPT
- AgentOps
- MetaGPT
- CrewAI
- ...
| Feature | CrewAI | LangGraph | AutoGen |
|---|---|---|---|
| Primary Approach | Role-based / Team structure | Graph-based / State machine | Conversation-based interaction |
| State Management | Central Orchestrator | Strong-typed Stateful Graphs | Contextual Memory Engine |
| Task Allocation | Bidding Mechanism / Role | Predefined Node Transitions | Iterative Agent Dialogue |
| Complexity Level | Intuitive / Low-to-Moderate | Advanced / High Control | Modular / Moderate-to-High |
| Best Use Case | Cross-functional projects | Supply chain / Data pipelines | Software development / Coding |
Platforms
Model Services Vendors:
- Open Router
- Claude / Gemini / Grok / OpenAI / DeepSeek / ...
LLM Orchestration Platforms:
- OpenAI Agent Builder
- Dify / Coze
- n8n
- Gumloop (AgentHub)
Observability: monitoring real-time agent actions, including tool usage and reasoning paths.
- LangSmith
Test and Evaluation
- Langfuse
- PromptFoo
LLM Deep Scenarios
AI First product systems
VibeCoding
- basic principles and manifesto
OpenSource research:
- Gemini CLI
- Cursor
Manus - General Agentic System
patterns:
- monolithic
- pipeline sub-systems
- multi-agent sub-systems (MoA)
- hybrid mixed
info resources:
- domain-specific / public information retrieval
context:
- memory management
- context management / compress and optimize
plan strategies
- static workflow
- intent to plan
- unified intent planning
OpenSource research:
- OpenManus
DeepResearch
- OpenResearch
NoteBook
- Google NotebookLM
MultiModal
- Gen Image
- Gen Video
- Gen Audio
- Gen 3D objects
Reference
trace
- (26-01-04) add more products and frameworks to the wiki
- (26-02-07) add more details about LLM Ops and Infra.
- (26-03-17) provide clean structure and content for the agentic section.