💠 AI First Product Engineer Wiki

February 7, 2026 (1mo ago)

try to craft my own wiki of AI era.

Theory and Foundation Layer

  • math fundamentals
  • CS and programming fundamentals
  • AI fundamentals
  • LLM fundamentals

Basic Theory and Patterns Evolution

StageTechnical FocusCapability ShiftCore Architectural Constraint
Stage 1Transformers / MoELarge-scale language processingLack of intent alignment or reasoning
Stage 2Instruction Fine-TuningImproved alignment with user goalsBrittle across diverse or novel tasks
Stage 3RLHFHuman-centric value alignmentHighly dependent on human evaluation
Stage 4Tool IntegrationActive capability via external APIsLack of autonomous planning/memory
Stage 5RAGReal-time factual groundingStatic knowledge and grounding issues
Stage 6Single-Agent AutonomyAutonomous planning and executionLimited to sequential, linear problem-solving
Stage 7Multi-Agent CollaborationDistributed, specialized orchestrationHigh coordination and state complexity
Stage 8Persistent Expert AgentsLong-term learning and domain expertiseOngoing research into self-evolving memory

LLM Ops

  • MLOps: Abstract out the common computing/storage layer, taking care of capacity, scheduling, scaling, and load balancing.
    • compute layer: GPU cluster setup and management to fully utilize the hardware
    • Scaling: Automatic and seamless scaling up and down, from on-premise to cloud when needed
    • Scaling: Support multiple models with dynamic model loading
    • Operations: Monitor usage of computing resources, and status of training and inference jobs
    • Operations: Generate data for usage stats and metrics dashboard, and alert when anomaly detected
    • Platform: Training / Fine-tuning: improve training throughput, reliability and efficiency
    • Platform: Inference: Leverage the latest and most efficient open source framework for LLM inference to reduce latency and improve throughput
    • Platform: Evaluation and Benchmarking: automatically evaluate models' performance on datasets of interests
    • Platform: A/B Testing: capability for online A/B testing to compare features
  • Unified AI Gateway: Abstract out the common API/SDK layer, taking care of authentication, authorization, rate limiting, error handling, logging, monitoring, and alerts.

LLM Train

  • pre-training
  • post-training

  • LLM knowledge distillation

LLM Inference

  • GPU resource management
  • API / SDK encapsulation
  • rate limiting
  • error handling
  • logging
  • monitoring
  • alerts
  • notifications

LLM Fine-tune

  • prefix fine-tuning, prompt tuning, variants
  • SFT
  • RLHF / RLAIF / DPO variants
  • LoRA and QLoRA variants

LLM RAG

basic patterns

  • dense vector-based RAG
  • sparse vector-based RAG
  • graph-based RAG

SOP

  • ingest documents, chunking and embedding (structured data) with strategies
  • recall with hybrid search
  • format, references and citations
  • re-rank, query-rewriting, multi-hop, graph or table augmentation
  • composable and modular RAG system architecture
  • domain-specific retrieval pipelines; continuous ingestion
  • quality metrics, evals and quality dashboards

LLM Prompting Engineering

  • prompting engineering BP for human -> write the best prompts for your tasks
    • classic patterns: one / few shots, chain-of-thought, self-consistency, reAct etc.
  • prompt management (version, testing, validation, safety, etc.)
  • AI driven prompting optimization (prompting refine by AI and auto.)
    • DSPy, textGuard, promptWizard, GRAD-SUM, ell, StarGo ...

Agentic System Context Engineering

  • write context
    • memories
    • state
    • scratch-pad
    • ...
  • select context
    • tools retrieval
    • docs / knowledge retrieval
    • memory retrieval
      • memo0 example for long-term memory management
    • ...
  • compress context
    • prompt compression (information compression)
    • summary by LLM
    • trim by rules
    • ...
  • isolate context
    • in state
    • hold in environment / sandbox
    • partition among agents
    • ...

Make agent select tools to organize and mange its runtime context (CURD operation is maintained by agent itself)

LLM Select

  • pick and compose right LLMs for the task
    • model family selection
      • open-source LLMs family
      • commercial LLMs family
    • latency, cost, throughput, quality, etc.
  • LLM parameters (tokens, top-p, temperature, etc.)

LLM Agentic Systems

Core Patterns

  • reasoning: CoT · BDI (Belief, Desire, Intention) · ReAct
  • goal: passive goal creator · proactive goal creator
  • planning: single / multi-path plan generator · plan and execute framework · graph-based control flow
  • retrieval: RAG · knowledge and RAG enhancements
  • reflection: self-reflection and refinement · cross-reflection · human reflection
  • cooperation: voting / role / debate based · tool / agent registry
  • execution: serial vs. parallel tool execution · tool execution sandbox · agent evaluator · multi-modal guardrails
  • optimization: prompt / response optimizer

reference practical patterns:

Memory

  • short-term vs. long-term memory
    • storage backends: vector store · graph DB · relational DB · file systems
    • structure: graph-based vs. tree-based
  • A-MEM: Dynamic and Self-Evolving memory
  • context-sizing control

Tools & Skills

  • tool-call and skills management
    • code execution · html / web-page generation · browser-use · VM use · web search
  • multi-step workflow

Agentic Flow & Interface

  • agentic-flow prompting
    • ReAct agent
    • reflection × planning × action
    • RPA loop: perception × reasoning × action
    • Effective HITL (Human in the Loop)
  • user-interface customization
  • continuous learning loop (telemetry → evals → prompt / knowledge updates)

Reliability & Safety

  • human-in-loop (HITL) · basic principles for agent build
  • hallucination prevention and mitigation
  • safety, security, compliance, governance
    • content filters · PII redaction · secure key management
    • prompt injection defenses · retrieval hygiene · tool permissioning
    • policy layers (allow/deny lists) · sensitive actions with human approval
    • compliance: data retention · audit trails · red-team exercises

Performance & Cost

  • metrics: cost · latency · throughput · prompting logs · tool-call logs
  • token budgeting · caching · short prompts · prompt cache
  • reranking before generation · response compression · approximate search tuning
  • distillation / routing to small models · speculative decoding
  • SLAs with adaptive quality tiers · cost/perf dashboards

Multi-Agent Systems (MAS)

  • topology: centralized vs. decentralized · hierarchical vs. flat · serial vs. parallel · supervisor vs. peer
  • memory sharing: Blackboard Model · state-based vs. memory-based
    • storage: vector store · graph DB · relational DB
  • communication protocol: end-to-end · broadcast · shared-memory channels
  • tool invocation protocol: MCP (Model Context Protocol)
  • human roles in the agentic loop: supervisor · loop participant · meta-agent

LLM Product Engineering

Classic Protocols

  • MCP (Model Context Protocol)
  • A2A (Agentic to Agentic Protocol) with ADK
  • A2UI Protocol widgets and components render from AI
  • Ag-UI (Agentic UI Protocol)
  • Agent to Editor (Client) Protocol

Frameworks

  • ai-sdk (node / javascript)
  • LangChain (python) / LangGraph (python)
  • AutoGPT
  • AgentOps
  • MetaGPT
  • CrewAI
  • ...
FeatureCrewAILangGraphAutoGen
Primary ApproachRole-based / Team structureGraph-based / State machineConversation-based interaction
State ManagementCentral OrchestratorStrong-typed Stateful GraphsContextual Memory Engine
Task AllocationBidding Mechanism / RolePredefined Node TransitionsIterative Agent Dialogue
Complexity LevelIntuitive / Low-to-ModerateAdvanced / High ControlModular / Moderate-to-High
Best Use CaseCross-functional projectsSupply chain / Data pipelinesSoftware development / Coding

Platforms

Model Services Vendors:

  • Open Router
  • Claude / Gemini / Grok / OpenAI / DeepSeek / ...

LLM Orchestration Platforms:

  • OpenAI Agent Builder
  • Dify / Coze
  • n8n
  • Gumloop (AgentHub)

Observation: Monitoring real-time agent actions, including tool usage and reasoning paths.

  • LangSmith

Test and Evaluation

  • Langfuse
  • PromptFoo

LLM Deep Scenarios

AI First product systems

VibeCoding

  • basic principles and manifesto

OpenSource research:

  • Gemini CLI
  • Cursor

Arno's BP for VibeCoding

Manus - General Agentic System

patterns:

  • monolithic
  • pipeline sub-systems
  • multi-agent sub-systems (MoA)
  • hybrid mixed

info resources:

  • domain-specific / public information retrieval

context:

  • memory management
  • context management / compress and optimize

plan strategies

  • static workflow
  • intent to plan
  • unified intent planning

OpenSource research:

  • OpenManus

DeepResearch

  • OpenResearch

NoteBook

  • Google Notebook ML

MultiModal

  • Gen Image
  • Gen Video
  • Gen Audio
  • Gen 3D objects

Reference


trace

  • (26-01-04) add more products and frameworks to the wiki
  • (26-02-07) add more details about LLM Ops and Infra.
  • (26-03-17) provide clean structure and content for the agentic section.

Arno Crafting Apps

ELABORATION STUDIO 🦄

Elaborate your ideas and solve your problems with AI in fully boosted context way ~