
Nine Principles After Reviewing the Hottest Prompt Engineering Practices

Nine efficient principles summarized from four prompt engineering whitepapers: from efficiency formulas to pipeline splitting, from XML tags to parameter configuration—transforming prompts from one-time conversations into reusable assets.

3/1/2026 8 min read


If you treat prompts as a writing contest, you’ll get answers that occasionally work.

If you treat prompts as efficiency engineering, you’ll get stable, reusable deliverables.

This is the core conclusion I’ve drawn after reviewing the four most popular prompt engineering tutorials online.

For many people now, the most time-consuming part isn't failing to understand the model, but writing prompts from scratch every time: the prompts get longer, the rework piles up, and effort ends up being spent on retries instead of output.

So this article does one thing only: teaches you how to write prompts efficiently, so that first-shot accuracy increases, revision cycles decrease, and template reusability improves.

1. Define Efficiency First: Not About Writing More, But Writing Executable

Efficient prompts can be understood with a highly engineering-focused formula:

Efficiency = First-shot Hit Rate × Reusability Rate ÷ Revision Cycles

The four whitepapers listed in the references all converge on this formula. The details vary, but the underlying logic is consistent: complex tasks should be broken down, input information should be compartmentalized, action language should be explicit, examples should provide bidirectional constraints, and parameters should be configured according to task fault tolerance.

Each principle below directly corresponds to one of the variables in this formula: some improve hit rate, some reduce revision cycles, and some enable templates to be used repeatedly.

If your prompt can’t be directly reused by colleagues, can’t continue being used for similar tasks next week, and can’t reduce revisions from 4 cycles to 1-2 cycles, then it’s not an efficient prompt—it’s just a one-time conversation.

2. From Mega Prompt to Pipeline: The First Step in Efficiency Improvement

Inefficient writing dumps intent recognition, data extraction, structure organization, and style control into the model all at once. Every step the model takes can introduce noise, and a deviation at one point pollutes all subsequent output.

Efficient writing splits the task into a 3-stage pipeline:

  • Stage 1: Establish task context, introduce background information.
  • Stage 2: Define task execution actions, handle only logical sequence.
  • Stage 3: Define task output format and related boundary constraints.

The value of this approach is that failures become localizable, rollbackable, and optimizable: when something goes wrong, only the current stage needs fixing, not the entire prompt. This is usually where the largest efficiency gap appears.
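The three stages above can be sketched as separate, independently retryable steps. The `ask` function below is a placeholder for whatever model API you actually use; everything else is illustrative:

```python
# Sketch of the 3-stage pipeline. `ask` is a stand-in for a real model
# call; here it just echoes its input so the flow is visible.
def ask(prompt: str) -> str:
    return f"[model output for: {prompt[:40]}...]"

def stage1_context(background: str) -> str:
    # Stage 1: establish task context, introduce background information.
    return ask(f"Read this background and extract the key facts:\n{background}")

def stage2_actions(facts: str) -> str:
    # Stage 2: define execution actions, handling only logical sequence.
    return ask(f"Organize these facts into a logical outline:\n{facts}")

def stage3_format(outline: str) -> str:
    # Stage 3: enforce output format and boundary constraints.
    return ask(f"Write the final text from this outline, no marketing tone:\n{outline}")

def run_pipeline(background: str) -> str:
    facts = stage1_context(background)
    outline = stage2_actions(facts)
    return stage3_format(outline)
```

Because each stage is a separate call, a deviation at stage 2 is fixed by re-running stage 2 alone, without touching the other stages.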

3. Use XML Tags for Boundary Management

Many articles discuss prompts without actually writing out XML tags—it’s like explaining “structured isolation” without providing executable structure.

Below is a reference XML skeleton for isolating rules, context, examples, and output format:

<prompt>
  <system_role>You are a senior AI editor aiming to produce high-quality English column content ready for publishing</system_role>
  <goal>Organize input materials into a structured, in-depth article</goal>
  <instructions>
    1. First extract facts, then organize structure, finally generate body text
    2. Do not fabricate data, do not omit key conclusions
    3. Strictly follow output format
  </instructions>
  <context>
    Paste raw materials, interview records, meeting notes here
    Note: Context should be pruned—not the more the better. Irrelevant information dilutes the weight of key content; only include paragraphs directly relevant to the task.
  </context>
  <examples>
    <good>Example: clear conclusions + corresponding evidence + complete structure</good>
    <bad>Example: empty opening + no evidence + jumping structure</bad>
  </examples>
  <constraints>
    Word count 1800-2200; provide one sentence of explanation when a term first appears; do not use marketing tone
  </constraints>
  <output_format>
    Title
    Opening (problem definition)
    Body (3-5 sections of logical progression)
    Closing (actionable recommendations)
  </output_format>
</prompt>

Each XML tag here represents a semantic unit. Separating different parts of the prompt through semantics makes it easier for AI to understand as well.

The core of this structure is separating “rules that must be followed” from “reference materials,” avoiding context pollution and significantly reducing deviation caused by prompt injection.
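The skeleton can also be kept as a reusable template and filled per task. A minimal sketch using Python's standard `string.Template` (the tag names match the skeleton above; the helper itself is illustrative):

```python
from string import Template

# Abbreviated version of the XML skeleton, with $placeholders for the
# parts that change per task.
PROMPT_SKELETON = Template("""<prompt>
  <system_role>$role</system_role>
  <goal>$goal</goal>
  <context>$context</context>
  <constraints>$constraints</constraints>
</prompt>""")

def build_prompt(role: str, goal: str, context: str, constraints: str) -> str:
    # substitute() raises KeyError if a placeholder is left unfilled,
    # which catches incomplete prompts before they reach the model.
    return PROMPT_SKELETON.substitute(
        role=role, goal=goal, context=context, constraints=constraints
    )
```

The fixed skeleton carries the rules; only the variable content changes between tasks, which is exactly what makes the prompt reusable.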

4. Efficient Ordering for Long Context: Materials First, Question Last

Among the four whitepapers, there’s an easily overlooked but practical consensus: for long-text tasks, placing materials first and questions last typically works more stably than “questions first.”

The reason isn’t mystical—it’s determined by the model’s generation mechanism. Models generate left-to-right, word by word, and are more sensitive to information at the tail. Placing the question at the tail means converging the model’s attention at the last moment—at this point, the “distance” between the question and generated content is shortest, effectively alleviating the forgetting problem of middle content.

So when doing document Q&A, codebase analysis, or long report summaries, don’t just think about “adding more prompts.” Getting the input order right first often yields greater returns.
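The ordering rule is easy to enforce mechanically. A minimal sketch that always places materials first and the question last, with each document in its own delimited block (the tag name is an assumption, not a requirement):

```python
def long_context_prompt(materials: list[str], question: str) -> str:
    # Materials first, each in its own delimited block...
    blocks = "\n\n".join(
        f'<document index="{i}">\n{doc}\n</document>'
        for i, doc in enumerate(materials, start=1)
    )
    # ...question last, so it sits closest to the generated output.
    return f"{blocks}\n\n{question}"
```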

5. Precise Action Language: Write Prompts with “Verb + Object + Constraint”

Common expressions in inefficient prompts are “take a look,” “optimize a bit,” “write more professionally”—these words have no execution boundaries, so the model can only guess.

Efficient writing unifies into one sentence pattern:

  • Verb: extract, compare, summarize, rewrite, deduce.
  • Object: viewpoints, data, paragraphs, solutions, code.
  • Constraint: word count, format, evidence, tone, prohibitions.

Example:

Compare the differences between Plan A and Plan B in terms of cost, risk, and benefit—provide 2 pieces of evidence for each, output as a three-column table, with conclusions not exceeding 100 words.

You’ll find these types of prompts have no fancy rhetoric, but significantly higher execution hit rates. That’s true efficiency.
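One way to make the sentence pattern habitual is to fill it from three explicit slots. A hypothetical helper, just to show the shape:

```python
def action_prompt(verb: str, obj: str, constraints: list[str]) -> str:
    # Verb + object first, then constraints as explicit boundaries.
    return f"{verb} {obj}. Constraints: " + "; ".join(constraints) + "."
```

Filling the slots forces you to name the verb, the object, and the boundaries before the model ever sees the prompt.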

6. Examples Should Be Bidirectional: Give Both Good/Bad, and Revisions Will Clearly Decrease

When a task has strict format or quality requirements, giving only one correct example isn’t enough, because the model only knows “what the best looks like,” not “which paths to avoid.”

The efficient approach is to give 3-5 sets of bidirectional examples:

  • Good: Standard output demonstration.
  • Bad: Common erroneous output.
  • Why Bad: Explanation of why the error occurred.

Two points to note:

First, don’t exceed 5 sets of examples—too many will cause the model to overfit the example format, ignoring your actual instructions.

Second, this method yields the most obvious benefits in high-constraint tasks like factual extraction, analytical reports, and code refactoring; these are the places where it saves the most time.
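The good/bad/why triples can be assembled into an examples block programmatically, with the 5-set cap enforced in code rather than by memory. A sketch (tag names follow the XML skeleton earlier; the helper is illustrative):

```python
def examples_block(triples: list[tuple[str, str, str]]) -> str:
    # Cap at 5 sets: more tends to make the model overfit the example
    # format and ignore the actual instructions.
    if len(triples) > 5:
        raise ValueError("use at most 5 example sets")
    parts = []
    for good, bad, why in triples:
        parts.append(
            f"<good>{good}</good>\n<bad>{bad}</bad>\n<why_bad>{why}</why_bad>"
        )
    return "<examples>\n" + "\n".join(parts) + "\n</examples>"
```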

7. Positive Instructions First, Negative Rules as Supplement

When you tell the model “don’t be wordy,” the model still has to guess what “wordy” means. When you tell it “each point should not exceed 30 characters, keeping only conclusions and evidence,” it can execute directly.

So the order for efficient prompts should be: first write what to do (positive actions), then write what not to do (negative boundaries).

This isn’t a matter of wording preference but an engineering choice that aligns with the model’s forward generation: establishing a positive framework first and then adding restrictions is easier for the model to execute than starting from “don’t.”

8. Parameters Are Not Mystical: Configure Temperature/Top-P Based on Task Fault Tolerance

All major model APIs expose a temperature setting. If you only use packaged products like Claude Code, you may never have encountered this setting and can skip this section.

The consensus among the four whitepapers on parameters is clear: parameters should serve the task, not be maxed out by feel. Different temperatures bring different effects—here’s a practical quick reference:

Task Type | Temperature | Top-P
Fact-checking, code generation, mathematical reasoning | Close to 0 | Tight
Regular summaries, business writing, process texts | 0.2 - 0.5 | Moderate constraint
Creative writing, brainstorming, marketing copy | 0.7 - 0.9 | Relaxed

When rigor is required, never increase temperature—otherwise revisions will definitely increase. This is a pitfall many have fallen into.
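The quick reference can live in code as a lookup keyed by task type. The numeric top-p values below are assumptions standing in for "tight / moderate / relaxed"; the helper is generic and not tied to any specific SDK:

```python
# Parameter presets mirroring the quick-reference table above.
# The top_p numbers are illustrative interpretations of tight/moderate/relaxed.
PRESETS = {
    "factual":  {"temperature": 0.0, "top_p": 0.1},   # fact-checking, code, math
    "business": {"temperature": 0.3, "top_p": 0.8},   # summaries, process texts
    "creative": {"temperature": 0.8, "top_p": 0.95},  # brainstorming, copy
}

def params_for(task_type: str) -> dict:
    # Default to the strictest preset when unsure: low temperature never
    # hurts rigor, while high temperature on a rigorous task adds revisions.
    return PRESETS.get(task_type, PRESETS["factual"])
```

Defaulting to the strict preset encodes the rule from the paragraph above: when in doubt, never increase temperature.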

9. Make Prompts into Assets: From “Can Write” to “Scalable”

The last commonly overlooked efficiency point is assetization.

Don’t scatter prompts across chat records—instead, build template libraries organized by scenario, such as information processing templates, decision analysis templates, and content production templates. Each template has a fixed skeleton, with only variables being replaced.

Going further, templates should undergo version control like code: record the reason for each modification and comparison of effects. Otherwise, a “good version” iterated out can quietly become invalid due to a casual change, and you won’t even know from which iteration it started degrading.
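A minimal sketch of such a versioned template library, using only the standard library (the class, scenario names, and method signatures are illustrative, not a prescribed design):

```python
from string import Template

class PromptLibrary:
    """Templates organized by scenario, each keeping a version history."""

    def __init__(self):
        # name -> list of (reason_for_change, template_body)
        self._templates: dict[str, list[tuple[str, str]]] = {}

    def save(self, name: str, body: str, reason: str) -> int:
        # Record why each version was added, so degradation can be traced
        # back to the change that caused it.
        self._templates.setdefault(name, []).append((reason, body))
        return len(self._templates[name])  # 1-based version number

    def render(self, name: str, version: int = -1, **vars) -> str:
        # version=-1 means "latest"; otherwise use the 1-based version.
        index = version if version == -1 else version - 1
        _, body = self._templates[name][index]
        return Template(body).substitute(**vars)
```

Because old versions stay addressable, an effect comparison between version N and N+1 is a matter of rendering both against the same inputs.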

Charlie Munger once said: As long as you deeply learn and can apply 80-90 important models, you possess 90% of the cognitive framework for understanding world affairs.

Applying this to AI prompt assetization: as long as you iterate 30-50 prompts you actually need, you can solve over 90% of repetitive work.

A 10-Minute Self-Check List You Can Execute Immediately

Quick review before publishing:

  • Is the goal singular, or are three things stuffed into one request
  • Are XML or clear blocks used, rather than everything mixed together
  • Are verbs executable, with objects and constraints attached
  • Are there Good/Bad bidirectional examples (no more than 5 sets)
  • Are parameter configurations matched to task fault tolerance
  • Can this prompt be directly reused next week

The essence of efficient prompt writing is not about writing more, but about making task expression into a system that is executable, verifiable, and reusable.

References

  1. Google Prompt Engineering White Paper
  2. Prompting Best Practices - Claude API Docs
  3. Prompt Engineering Guide - OpenAI API
  4. Advanced Prompt Engineering Handbook for Content Creators - freeCodeCamp