đź’  AI First Product Engineer Wiki

March 25, 2026

An attempt to craft my own wiki for the AI era.

Theory and Foundation Layer

  • math fundamentals
  • CS and programming fundamentals
  • AI fundamentals
  • LLM fundamentals

Basic Theory and Patterns Evolution

| Stage | Technical Focus | Capability Shift | Core Architectural Constraint |
|---|---|---|---|
| Stage 1 | Transformers / MoE | Large-scale language processing | Lack of intent alignment or reasoning |
| Stage 2 | Instruction Fine-Tuning | Improved alignment with user goals | Brittle across diverse or novel tasks |
| Stage 3 | RLHF | Human-centric value alignment | Highly dependent on human evaluation |
| Stage 4 | Tool Integration | Active capability via external APIs | Lack of autonomous planning/memory |
| Stage 5 | RAG | Real-time factual grounding | Static knowledge and grounding issues |
| Stage 6 | Single-Agent Autonomy | Autonomous planning and execution | Limited to sequential, linear problem-solving |
| Stage 7 | Multi-Agent Collaboration | Distributed, specialized orchestration | High coordination and state complexity |
| Stage 8 | Persistent Expert Agents | Long-term learning and domain expertise | Ongoing research into self-evolving memory |

LLM Ops

  • MLOps: Abstract out the common computing/storage layer, taking care of capacity, scheduling, scaling, and load balancing.
    • compute layer: GPU cluster setup and management to fully utilize the hardware
    • Scaling: Automatic and seamless scaling up and down, from on-premise to cloud when needed
    • Scaling: Support multiple models with dynamic model loading
    • Operations: Monitor usage of computing resources, and status of training and inference jobs
    • Operations: Generate data for usage stats and metrics dashboard, and alert when anomaly detected
    • Platform: Training / Fine-tuning: improve training throughput, reliability and efficiency
    • Platform: Inference: Leverage the latest and most efficient open source framework for LLM inference to reduce latency and improve throughput
    • Platform: Evaluation and Benchmarking: automatically evaluate models' performance on datasets of interests
    • Platform: A/B Testing: capability for online A/B testing to compare features
  • Unified AI Gateway: Abstract out the common API/SDK layer, taking care of authentication, authorization, rate limiting, error handling, logging, monitoring, and alerts.
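
A minimal sketch of the unified-gateway idea, assuming hypothetical names (`UnifiedGateway`, `model_fn`): one wrapper owns authentication, a token-bucket rate limit, latency logging, and error surfacing, so every provider call goes through the same chokepoint.

```python
import time
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("ai-gateway")

class RateLimitError(Exception):
    pass

class UnifiedGateway:
    """Hypothetical gateway: auth + token-bucket rate limiting + logging around any model call."""

    def __init__(self, api_keys, rate_per_sec=5, burst=10):
        self.api_keys = set(api_keys)
        self.rate, self.burst = rate_per_sec, burst
        self.tokens, self.last = float(burst), time.monotonic()

    def _take_token(self):
        # Refill the bucket based on elapsed time, then spend one token per call.
        now = time.monotonic()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens < 1:
            raise RateLimitError("rate limit exceeded")
        self.tokens -= 1

    def call(self, api_key, model_fn, prompt):
        if api_key not in self.api_keys:
            raise PermissionError("unknown API key")
        self._take_token()
        t0 = time.monotonic()
        try:
            return model_fn(prompt)  # provider-specific call plugged in here
        finally:
            log.info("model call took %.3fs", time.monotonic() - t0)

gw = UnifiedGateway(api_keys={"k1"})
print(gw.call("k1", lambda p: p.upper(), "hello"))  # → HELLO
```

The same `call` boundary is a natural place to hang monitoring, alerting, and usage metering later.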

LLM Train

  • pre-training
  • post-training
  • LLM knowledge distillation

LLM Inference

  • GPU resource management
  • API / SDK encapsulation
  • rate limiting
  • error handling
  • logging
  • monitoring
  • alerts
  • notifications

LLM Fine-tune

  • prefix tuning, prompt tuning, and their variants
  • SFT
  • RLHF / RLAIF / DPO variants
  • LoRA and QLoRA variants
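
The LoRA idea above fits in a few lines: freeze the pretrained weight `W` and learn a low-rank update `A @ B` scaled by `alpha / r`. A minimal NumPy sketch (toy shapes, no training loop):

```python
import numpy as np

def lora_linear(x, W, A, B, alpha=16):
    """Forward pass of a LoRA-adapted linear layer: y = x @ (W + (alpha/r) * A @ B)."""
    r = A.shape[1]
    return x @ (W + (alpha / r) * (A @ B))

rng = np.random.default_rng(0)
d_in, d_out, r = 8, 4, 2
W = rng.normal(size=(d_in, d_out))     # frozen pretrained weight
A = rng.normal(size=(d_in, r)) * 0.01  # small random init
B = np.zeros((r, d_out))               # B starts at zero, so the adapter is a no-op initially

x = rng.normal(size=(1, d_in))
# Zero-init B ⇒ output identical to the base model at the start of fine-tuning.
assert np.allclose(lora_linear(x, W, A, B), x @ W)
```

Only `A` and `B` (here 8×2 + 2×4 = 24 params) are trained, versus 32 in `W`; the saving grows with layer size, and QLoRA adds quantization of the frozen `W` on top.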

LLM RAG

basic patterns

  • dense vector-based RAG
  • sparse vector-based RAG
  • graph-based RAG

SOP

  • ingest documents; chunking and embedding strategies (including structured data)
  • recall with hybrid search
  • format, references and citations
  • re-rank, query-rewriting, multi-hop, graph or table augmentation
  • composable and modular RAG system architecture
  • domain-specific retrieval pipelines; continuous ingestion
  • quality metrics, evals and quality dashboards
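
The "recall with hybrid search" step usually merges a dense and a sparse ranking. One common merge is Reciprocal Rank Fusion; a self-contained sketch (doc ids are made up):

```python
from collections import defaultdict

def rrf(rankings, k=60):
    """Reciprocal Rank Fusion: each list contributes 1/(k + rank) per doc; sum and re-sort."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense  = ["d1", "d2", "d3"]  # ranking from the vector index (hypothetical ids)
sparse = ["d1", "d3", "d4"]  # ranking from BM25 / keyword search
print(rrf([dense, sparse]))  # docs ranked high in both lists win
```

RRF needs only ranks, not comparable scores, which is why it is a popular default before a dedicated re-ranker is added.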

LLM Prompt Engineering

  • prompt engineering best practices for humans → write the best prompts for your tasks
    • classic patterns: one-/few-shot, chain-of-thought, self-consistency, ReAct, etc.
  • prompt management (versioning, testing, validation, safety, etc.)
  • AI-driven prompt optimization (prompts refined automatically by AI)
    • DSPy, textGuard, promptWizard, GRAD-SUM, ell, StarGo ...
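
The few-shot and chain-of-thought patterns above are just prompt assembly; a minimal sketch (the template wording is illustrative, not a standard):

```python
def build_prompt(task, examples, question, cot=True):
    """Assemble a few-shot prompt; optionally append a chain-of-thought trigger."""
    lines = [task, ""]
    for q, a in examples:  # few-shot demonstrations
        lines += [f"Q: {q}", f"A: {a}", ""]
    lines.append(f"Q: {question}")
    # The classic zero-shot CoT trigger phrase; drop it for a direct answer.
    lines.append("A: Let's think step by step." if cot else "A:")
    return "\n".join(lines)

p = build_prompt(
    "Answer the arithmetic question.",
    [("2 + 2", "4"), ("3 * 3", "9")],
    "7 * 6",
)
print(p)
```

Tools like DSPy automate exactly this layer: the demonstrations and instructions become optimizable parameters rather than hand-written strings.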

Agentic System Context Engineering

Why it matters: attention cost grows with context length; when noise dominates, decision quality drops—often called context rot. Many “model weakness” issues trace to how context is packed, not raw window size.

  • Layer by stability and frequency (keep each layer doing one job):
    • Resident: identity, project rules, hard prohibitions—short, stable, executable every turn.
    • On-demand: skills and domain playbooks—index in prompt, load full text only when matched.
    • Runtime inject: time, channel IDs, user prefs—append after stable prefixes.
    • Memory: cross-session facts (e.g. MEMORY.md)—retrieve, do not dump everything by default.
    • System / hooks: deterministic checks (linters, guards)—not repeated prose in the prompt.
  • Write context: memories · state · scratch-pad · file-backed artifacts for large tool JSON (filesystem as the context interface).
  • Select context: tools retrieval · docs / knowledge retrieval · memory retrieval
    • mem0 example for long-term memory management
  • Compress context (pick strategy for the failure mode):
    • sliding window (cheap, loses early decisions)
    • LLM summary / branch summarization (keep architecture decisions, open work, constraints)
    • tool-result compaction (replace bulky outputs with pass/fail + pointers; preserve identifiers verbatim)
  • Prompt caching: stable prefixes (system prompt, tool defs, long docs) cache best; put volatile content after stable blocks; volatile tool sets hurt hit rate.
  • Skills descriptors: treat them as routing conditions, not marketing copy—Use when / Don’t use when, concrete counterexamples; load one skill when clearly matched.
  • Isolate context: in state · environment / sandbox · partitions among agents

Make agents select tools to organize and manage runtime context (CRUD can be agent-driven, but deterministic rules stay in code or hooks).
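
A minimal sketch of the layering above, with hypothetical content: stable resident text first (cache-friendly prefix), at most one matched skill, then volatile runtime facts and retrieved memory after the stable blocks.

```python
RESIDENT = "You are the project assistant. Never delete files outside the workspace."

SKILLS = {  # on-demand layer: indexed by name, full text loaded only when matched
    "deploy": "SKILL deploy: run tests, build, then push the release tag.",
    "review": "SKILL review: check diffs for style and security issues.",
}

def assemble_context(user_msg, runtime_facts, memory_snippets):
    parts = [RESIDENT]                                   # stable prefix: best for prompt caching
    matched = [s for name, s in SKILLS.items() if name in user_msg.lower()]
    parts += matched[:1]                                 # load one skill when clearly matched
    parts += [f"[runtime] {f}" for f in runtime_facts]   # volatile content after stable blocks
    parts += [f"[memory] {m}" for m in memory_snippets]  # retrieved, not dumped wholesale
    parts.append(f"[user] {user_msg}")
    return "\n\n".join(parts)

ctx = assemble_context("please deploy v2", ["time=2026-03-25"], ["user prefers dark mode"])
print(ctx.splitlines()[0])
```

Keyword matching stands in for whatever routing the skill descriptors drive; the point is the ordering and the "one job per layer" discipline.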

LLM Select

  • pick and compose right LLMs for the task
    • model family selection
      • open-source LLMs family
      • commercial LLMs family
    • latency, cost, throughput, quality, etc.
  • LLM parameters (tokens, top-p, temperature, etc.)
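
Temperature and top-p interact at the sampling step; a self-contained nucleus-sampling sketch over toy logits makes the mechanics concrete:

```python
import numpy as np

def sample_top_p(logits, top_p=0.9, temperature=0.7, rng=None):
    """Nucleus sampling: temperature-scale logits, keep the smallest set of tokens
    whose cumulative probability ≥ top_p, renormalize, then sample from that set."""
    rng = rng or np.random.default_rng()
    scaled = logits / temperature
    probs = np.exp(scaled - np.max(scaled))  # stable softmax
    probs /= probs.sum()
    order = np.argsort(probs)[::-1]          # tokens by descending probability
    cum = np.cumsum(probs[order])
    keep = order[: np.searchsorted(cum, top_p) + 1]  # the "nucleus"
    p = probs[keep] / probs[keep].sum()
    return int(rng.choice(keep, p=p))

logits = np.array([2.0, 1.0, 0.2, -1.0])
token = sample_top_p(logits, top_p=0.9, temperature=0.7, rng=np.random.default_rng(0))
print(token)  # one of the two high-probability tokens
```

Lower temperature sharpens the distribution before truncation; lower top-p shrinks the nucleus. Providers expose both, so it helps to know which failure (repetition vs. incoherence) each knob addresses.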

LLM Agentic Systems

Runtime shape: loop, workflow, and control

  • Minimal agent loop: perceive → decide → act → feedback until the model stops with plain text. In mature stacks, the loop stays thin; new behavior is added via tools + handlers, prompt structure, and externalized state (files/DB), not by bloating the loop into a hand-written state machine. Let the model reason; let the harness own boundaries and state.
  • Workflow vs. agent: if execution paths are fixed in code, you have a workflow; if the LLM chooses the next step, you have an agent. Labels are often blurred in products—pick the control model that fits risk and clarity, not hype.
  • Common control patterns (usually combined): prompt chaining (linear stages + optional code gates); routing (classify input → specialized handlers/models); parallelization (shard work or run multiple samples for consensus); orchestrator–workers (decompose, delegate, merge); evaluator–optimizer (generate → score → revise until a quality bar is met).
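
The minimal loop described above can be sketched in a few lines, with a stub model standing in for a real LLM (the `{"tool": ..., "args": ...}` shape is an assumption, not any provider's wire format):

```python
import json

def run_agent(model, tools, user_msg, max_turns=8):
    """Thin loop: the model returns plain text (done) or a tool call;
    the harness executes the tool and feeds the result back."""
    messages = [{"role": "user", "content": user_msg}]
    for _ in range(max_turns):
        reply = model(messages)
        if isinstance(reply, str):
            return reply  # plain text ⇒ the agent stops
        result = tools[reply["tool"]](**reply["args"])
        messages.append({"role": "tool",
                         "content": json.dumps({"tool": reply["tool"], "result": result})})
    return "max turns reached"

# Stub model: calls the add tool once, then answers in plain text.
def stub_model(messages):
    if messages[-1]["role"] == "user":
        return {"tool": "add", "args": {"a": 2, "b": 3}}
    return "the sum is " + json.loads(messages[-1]["content"])["result"]

print(run_agent(stub_model, {"add": lambda a, b: str(a + b)}, "add 2 and 3"))  # → the sum is 5
```

Note what the loop does not contain: no task-specific branches. New behavior arrives via the `tools` dict and prompt structure, which is the "keep the loop thin" point above.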

Core Patterns

  • reasoning: CoT · BDI (Belief, Desire, Intention) · ReAct
  • goal: passive goal creator · proactive goal creator
  • planning: single / multi-path plan generator · plan and execute framework · graph-based control flow
  • retrieval: RAG · knowledge and RAG enhancements
  • reflection: self-reflection and refinement · cross-reflection · human reflection
  • cooperation: voting / role / debate based · tool / agent registry
  • execution: serial vs. parallel tool execution · tool execution sandbox · agent evaluator · multi-modal guardrails
  • optimization: prompt / response optimizer

Reference practical patterns:

Memory

  • Functional layers (not just storage media):
    • Working memory: current messages[] / window—tight, actively curated.
    • Procedural memory: skills and SOPs—loaded on demand, not all at once.
    • Episodic memory: append-only session logs (e.g. JSONL)—full trace for replay and search.
    • Semantic memory: durable facts the agent curates (e.g. MEMORY.md)—injected when relevant.
  • short-term vs. long-term memory
    • storage backends: vector store · graph DB · relational DB · file systems
    • structure: graph-based vs. tree-based
  • Consolidation: when summarizing or compacting, archive originals and only advance pointers—failed consolidation should be recoverable, not a silent loss of evidence.
  • A-MEM: Dynamic and Self-Evolving memory
  • context-sizing control
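
A toy sketch of two of the layers above, with in-memory stand-ins for the files: an append-only episodic log (JSONL-per-event) plus curated semantic facts recalled by naive keyword overlap (a real system would use embeddings).

```python
import json, io

class Memory:
    """Sketch: episodic log (append-only JSONL) + semantic facts retrieved on demand."""

    def __init__(self):
        self.episodic = io.StringIO()  # stands in for a session .jsonl file
        self.semantic = []             # durable facts, e.g. lines of MEMORY.md

    def log_event(self, event):
        # Full trace kept for replay and search, never rewritten in place.
        self.episodic.write(json.dumps(event) + "\n")

    def remember(self, fact):
        self.semantic.append(fact)

    def recall(self, query):
        # Naive keyword overlap; inject only matching facts, not the whole store.
        words = set(query.lower().split())
        return [f for f in self.semantic if words & set(f.lower().split())]

m = Memory()
m.log_event({"turn": 1, "tool": "search"})
m.remember("user prefers Python for examples")
print(m.recall("which language does the user prefer"))
```

The separation matters for consolidation: summaries are derived from the episodic log, so a bad summary can always be rebuilt from the archived originals.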

Harness, verification, and autonomy

  • Harness (often beats “just use a bigger model” for code-like tasks): acceptance baselines (what “done” means), execution boundaries (sandbox, paths, permissions), feedback signals (tests, linters, traces), and rollback / retry paths. Push work toward clear goals + automatable checks; ambiguous goals with strong automation just fail faster in the wrong direction.
  • Agent-first engineering habits (OpenAI-style): keep ground truth in the repo (short AGENTS.md index + deep docs), encode rules in CI/linters/types instead of hoping prompts are read, aim for end-to-end autonomous repair loops where the agent can verify its own changes against telemetry.
  • Long tasks: externalize progress—structured files (JSON feature lists, progress logs), initializer vs. coding agent splits, one in_progress task at a time, resume from disk after crashes. Slow I/O: offload to background work + inject results between turns instead of blocking the core loop.
  • Security before features: allowlists, workspace path checks, audited shell, prompt-injection aware design (mark untrusted content, minimize dangerous tools, confirm sensitive sinks), provider fallbacks for outages.
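
The workspace-path and allowlist checks above are small but easy to get wrong (`../` escapes). A minimal sketch, assuming a hypothetical workspace root:

```python
from pathlib import Path

WORKSPACE = Path("/tmp/agent-workspace").resolve()  # hypothetical sandbox root
ALLOWED_CMDS = {"ls", "cat", "git"}                 # allowlist, not denylist

def check_path(p):
    """Reject paths that escape the workspace (including ../ tricks) before any file tool runs."""
    resolved = (WORKSPACE / p).resolve()
    if WORKSPACE not in [resolved, *resolved.parents]:
        raise PermissionError(f"path escapes workspace: {p}")
    return resolved

def check_cmd(cmd):
    if cmd.split()[0] not in ALLOWED_CMDS:
        raise PermissionError(f"command not allowlisted: {cmd}")

print(check_path("notes/plan.md"))
try:
    check_path("../../etc/passwd")
except PermissionError as e:
    print("blocked:", e)
```

These checks belong in code or hooks, not in the prompt: a deterministic guard cannot be talked out of its rules by injected content.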

Tools & Skills

  • Tool design (ACI / agent-computer interface): shape tools around agent goals, not raw REST surface area—fewer, higher-level actions beat many micro-calls. Pair schemas with concrete examples; return structured errors with fix hints, not opaque strings.
  • Evolving tool stacks: static giant tool dumps → tool search / discovery → programmatic orchestration (code glues tools; intermediate data stays out of the LLM) → example-rich definitions for reliability.
  • Debugging order: when tools misfire, fix descriptions and boundaries first, then revisit model choice. Trim tools that are better as shell, static docs, or skills.
  • Framework vs. LLM messages: keep rich internal events out of the model transcript; filter to standard roles/content before each API call.
  • tool-call and skills management
    • code execution · html / web-page generation · browser-use · VM use · web search
  • multi-step workflow

Agentic Flow & Interface

  • agentic-flow prompting
    • ReAct agent
    • reflection × planning × action
    • RPA loop: perception × reasoning × action
    • Effective HITL (Human in the Loop)
  • user-interface customization
  • continuous learning loop (telemetry → evals → prompt / knowledge updates)

Reliability & Safety

  • human-in-loop (HITL) · basic principles for agent build
  • hallucination prevention and mitigation
  • Evaluation discipline
    • Objects: task (what to do) · trial (one run) · grader (how to score); separate transcript (what was said/done in the loop) from outcome (what changed in the environment). Cover both to catch “talked success” vs. real effects.
    • Pass@k (capability probing with multiple samples) vs Pass^k (regression-style repeated checks)—don’t mix interpretations.
    • Prefer code graders when answers are checkable; use model/human judges where semantics matter; calibrate automated judges with human spot checks.
    • If scores move oddly, debug the harness first (flaky environments, grader bugs, stale tasks) before rewriting the agent—bad evals send you chasing ghosts.
  • safety, security, compliance, governance
    • content filters · PII redaction · secure key management
    • prompt injection defenses · retrieval hygiene · tool permissioning
    • policy layers (allow/deny lists) · sensitive actions with human approval
    • compliance: data retention · audit trails · red-team exercises
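
The Pass@k vs. Pass^k distinction above is easy to pin down in code. Pass@k uses the standard unbiased estimator (given `c` of `n` sampled attempts passed); Pass^k asks that all `k` runs pass:

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased Pass@k estimator: P(at least one of k sampled attempts passes),
    given that c of n independent attempts passed."""
    if n - c < k:
        return 1.0  # cannot draw k all-failing attempts
    return 1.0 - comb(n - c, k) / comb(n, k)

def pass_pow_k(p, k):
    """Pass^k: P(ALL of k independent runs pass) — a regression-style reliability bar."""
    return p ** k

print(round(pass_at_k(n=10, c=3, k=5), 3))  # → 0.917 (capability rises with more samples)
print(round(pass_pow_k(p=0.9, k=5), 3))     # → 0.59 (reliability falls as k grows)
```

The same 30% single-shot solve rate looks strong under Pass@5 and weak under Pass^5, which is why mixing the two interpretations misleads.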

Performance & Cost

  • metrics: cost · latency · throughput · prompting logs · tool-call logs
  • token budgeting · caching · short prompts · prompt cache
  • reranking before generation · response compression · approximate search tuning
  • distillation / routing to small models · speculative decoding
  • SLAs with adaptive quality tiers · cost/perf dashboards
  • Tracing & observability: persist full prompts, messages, tool calls/results, optional reasoning traces, tokens, and latency per run. Emit events (tool_start / tool_end / turn_end) once and fan out to logs, UI, and eval queues. Blend human sampling (to learn failure modes) with LLM-based trace scoring (for scale), using the former to calibrate the latter.
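
A minimal sketch of the "emit once, fan out" tracing pattern, with list-based sinks standing in for logs, UI streams, and eval queues:

```python
import time

class Tracer:
    """Emit each event exactly once; fan it out to any number of sinks."""

    def __init__(self, *sinks):
        self.sinks = sinks  # callables: log writer, UI stream, eval queue, ...

    def emit(self, kind, **fields):
        event = {"kind": kind, "ts": time.time(), **fields}
        for sink in self.sinks:
            sink(event)

log_sink, eval_queue = [], []  # stand-ins for real sinks
tracer = Tracer(log_sink.append, eval_queue.append)

tracer.emit("tool_start", tool="search", args={"q": "llm evals"})
tracer.emit("tool_end", tool="search", latency_ms=120, tokens=350)
tracer.emit("turn_end", outcome="answered")

print(len(log_sink), len(eval_queue))  # → 3 3
```

Because every sink sees the same events, human sampling and LLM-based trace scoring read from one stream, which keeps the calibration between them honest.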

Anti-patterns (engineering)

  • mega system prompt as the knowledge base instead of skills/files
  • tool sprawl and overlapping names → routing confusion
  • no verifiable “done” definition per task class
  • multi-agent without isolation, protocols, or worktrees—un-debuggable state drift
  • skipping memory consolidation on long sessions
  • shipping changes without evals; letting the suite saturate without harder cases
  • constraints only in prose—use hooks, tools, and automated checks

Multi-Agent Systems (MAS)

  • topology: centralized vs. decentralized · hierarchical vs. flat · serial vs. parallel · supervisor vs. peer
  • Collaboration mechanics: agree on a structured protocol (append-only JSONL inboxes, explicit statuses) + task graph + isolation (worktrees) before optimizing parallelism. Sub-agents should return summaries to parents; keep search/debug chatter in the child context to avoid cross-agent hallucination cascades—add cross-checks (second agent, tests, compilers) where stakes are high.
  • memory sharing: Blackboard Model · state-based vs. memory-based
    • storage: vector store · graph DB · relational DB
  • communication protocol: end-to-end · broadcast · shared-memory channels
  • tool invocation protocol: MCP (Model Context Protocol)
  • human roles in the agentic loop: supervisor · loop participant · meta-agent
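
The append-only JSONL inbox with explicit statuses can be sketched directly (an in-memory list stands in for the `.jsonl` file; field names are illustrative):

```python
import json

class Inbox:
    """Append-only JSONL inbox: agents communicate via messages with explicit statuses."""

    def __init__(self):
        self.lines = []  # stands in for an inbox.jsonl file

    def post(self, sender, task_id, status, summary):
        # Sub-agents return short summaries to parents; debug chatter stays in the child.
        self.lines.append(json.dumps(
            {"from": sender, "task": task_id, "status": status, "summary": summary}))

    def read(self, status=None):
        msgs = [json.loads(line) for line in self.lines]
        return [m for m in msgs if status is None or m["status"] == status]

inbox = Inbox()
inbox.post("researcher", "t1", "done", "found 3 relevant papers")
inbox.post("coder", "t2", "in_progress", "implementing retriever")
print([m["task"] for m in inbox.read(status="done")])  # → ['t1']
```

Append-only messaging keeps state drift debuggable: nothing is overwritten, so any agent's view can be replayed from the log.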

LLM Product Engineering

Classic Protocols

  • MCP (Model Context Protocol)
  • A2A (Agent-to-Agent Protocol) with ADK
  • A2UI Protocol: widgets and components rendered from AI
  • AG-UI (Agent-User Interaction Protocol)
  • Agent to Editor (Client) Protocol

Frameworks

  • ai-sdk (node / javascript)
  • LangChain (python) / LangGraph (python)
  • AutoGPT
  • AgentOps
  • MetaGPT
  • CrewAI
  • ...
| Feature | CrewAI | LangGraph | AutoGen |
|---|---|---|---|
| Primary Approach | Role-based / Team structure | Graph-based / State machine | Conversation-based interaction |
| State Management | Central Orchestrator | Strong-typed Stateful Graphs | Contextual Memory Engine |
| Task Allocation | Bidding Mechanism / Role | Predefined Node Transitions | Iterative Agent Dialogue |
| Complexity Level | Intuitive / Low-to-Moderate | Advanced / High Control | Modular / Moderate-to-High |
| Best Use Case | Cross-functional projects | Supply chain / Data pipelines | Software development / Coding |

Platforms

Model Services Vendors:

  • Open Router
  • Claude / Gemini / Grok / OpenAI / DeepSeek / ...

LLM Orchestration Platforms:

  • OpenAI Agent Builder
  • Dify / Coze
  • n8n
  • Gumloop (AgentHub)

Observation: Monitoring real-time agent actions, including tool usage and reasoning paths.

  • LangSmith

Test and Evaluation

  • Langfuse
  • PromptFoo

LLM Deep Scenarios

AI First product systems

VibeCoding

  • basic principles and manifesto

OpenSource research:

  • Gemini CLI
  • Cursor

Arno's BP for VibeCoding

Manus - General Agentic System

patterns:

  • monolithic
  • pipeline sub-systems
  • multi-agent sub-systems (MoA)
  • hybrid mixed

info resources:

  • domain-specific / public information retrieval

context:

  • memory management
  • context management / compress and optimize

plan strategies:

  • static workflow
  • intent to plan
  • unified intent planning

OpenSource research:

  • OpenManus

DeepResearch

  • OpenResearch

NoteBook

  • Google NotebookLM

MultiModal

  • Gen Image
  • Gen Video
  • Gen Audio
  • Gen 3D objects

Reference


trace

  • (26-01-04) add more products and frameworks to the wiki
  • (26-02-07) add more details about LLM Ops and Infra.
  • (26-03-17) provide clean structure and content for the agentic section.
  • (26-03-25) absorb agent architecture notes: loop vs workflow, harness, context layers, ACI tools, memory/consolidation, evals/traces, multi-agent protocols. post from X

Arno Crafting Apps

ELABORATION STUDIO 🦄

Elaborate your ideas and solve your problems with AI in a fully boosted-context way ~