Prompt Engineering with Anthropic Claude

Tips on how to prompt Claude more effectively. Take-aways from a talk by Anthropic’s “Prompt Doctor” (Zack Witten).

The “Prompt Doctor” helping the audience debug prompts. Full talk here.

Tips from Anthropic’s “Prompt Doctor” at the AI Engineer 2024 conference

There are nuances to prompting different models. To become a great prompt engineer, you need to develop a feel for GPT, Claude, Gemini, or whatever model you are using.

I recently attended the AI Engineer 2024 conference. Zack Witten, a senior prompt engineer at Anthropic, gave a workshop on prompt engineering with Claude. Below are some of the things I learned.

Tip #1: Use XML tags

Claude's training data included XML tags, so using tags like <example>, <document>, etc. to structure your prompts can help guide Claude's output.

If you have seen any of Claude's leaked system prompts, you know that Anthropic uses XML tags heavily in its own prompts.

For example, instead of doing this:

You are a helpful AI fortune teller. Predict how long a user will live. 
 
For example, if the user says they are old and fat... 
tell them they don't have much time left! 
 
Or if the user says they are healthy, tell them they have lots of time.

You can format your prompt using XML tags like this:

You are a helpful AI fortune teller. Predict how long a user will live. 
 
<examples> 
 
<example> 
<user>old and fat</user> 
<answer>not much time left</answer> 
</example> 
 
<example> 
<user>healthy</user> 
<answer>lots of time</answer> 
</example> 
 
</examples>

We might still need some prompt tuning to get the style right, but the XML structure gives Claude a much clearer template to follow.
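As a rough sketch (assuming the anthropic Python SDK; the model name and user input here are illustrative placeholders), the structured prompt above could be sent like this:

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

fortune_teller_prompt = """You are a helpful AI fortune teller. Predict how long a user will live.

<examples>
<example>
<user>old and fat</user>
<answer>not much time left</answer>
</example>

<example>
<user>healthy</user>
<answer>lots of time</answer>
</example>
</examples>"""

response = client.messages.create(
    model="claude-3-5-sonnet-20240620",  # illustrative model name
    max_tokens=200,
    messages=[
        {
            "role": "user",
            "content": fortune_teller_prompt + "\n\n<user>I run marathons and eat my vegetables.</user>",
        }
    ],
)
print(response.content[0].text)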

Tip #2: Be specific rather than saying “be concise”

Instead of saying “be concise”, it’s better to give a specific sentence range like “Limit your response to 2–3 sentences”. This gives Claude clearer guidance.
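For instance, the difference can be as small as one line (a made-up summarization prompt):

Vague:    Summarize the article below. Be concise. 
Specific: Summarize the article below in 2–3 sentences, using plain language.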

Generally, you should approach prompt engineering like you are talking to a really dimwitted intern. Be specific. The more you leave ambiguous, the more room there is for the intern to screw up. Don't rely on the AI to guess what you mean.

This applies not just to telling the model to be concise; it is true for practically every type of prompt engineering.

This is also why I highly recommend building modular prompts that do one thing and only one thing. Not only does this make them easier to unit test, it actually makes the prompts perform better.

Tip #3: Mimic the desired tone and style

Claude will often match the tone and style of the prompt. So if you want an academic tone, phrase your prompt in more formal, academic language. If you want a casual, conversational response, write your prompt that way.

Remember, all an LLM is doing at the core is guessing the next token. That means the LLM is just continuing the conversation.

Some examples:

  • If a conversation is verbose, it will likely continue to be verbose.
  • If your prompt is in English, it will be harder to make the LLM respond reliably in Spanish.
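As a made-up illustration, the same question phrased in two registers will pull the answer toward that register:

Casual: hey can you give me a quick rundown of how photosynthesis works? 
Formal: Please provide a concise overview of the light-dependent and light-independent reactions of photosynthesis.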

Tip #4: Proofread your prompts

Cleaning up things like capitalization and grammar in your prompts can improve the quality of Claude’s outputs. Taking care with your prompt yields better results.

This just feels like good prompt hygiene. Half the challenge of prompt engineering is organization. The cleaner your prompts, the easier they are to maintain. Using a prompt CMS or management tool will help with this.

Tip #5: Write prompts in a language you know well

Prompts written in your native or strongest language tend to work best. Avoid using languages you aren’t very confident in, as errors or unnatural phrasing can throw off the model.

Everything is probabilistic in the world of LLMs. Unnatural language idioms will change your output.

Tip #6: Pre-fill the beginning of Claude’s response

You can guide Claude's output by pre-filling the beginning of its response, a technique called "prefilling". This is especially useful when you want Claude to output structured formats like JSON.

For example, if I have the following prompt that converts a data blob into JSON:

Please convert the unstructured data into JSON using our schema. 
 
<data> 
{data} 
</data>

I can specify a prefill that forces Claude to respond with JSON. Every JSON object begins with {, so let's force Claude to begin that way too:

Here is the JSON: 
<json> 
{

Using this, Claude is primed to complete the JSON. Anthropic has more details on prefilling in their docs.

To get even more advanced, you can set a stop sequence of </json> so Claude stops generating as soon as it closes the JSON.
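Here is a rough sketch of both ideas together, assuming the anthropic Python SDK and an illustrative model name and data blob: the final "assistant" turn is the prefill, and the stop sequence ends generation at the closing tag.

import anthropic

client = anthropic.Anthropic()

data = "name: Ada Lovelace / born: 1815 / job: mathematician"  # placeholder blob

response = client.messages.create(
    model="claude-3-5-sonnet-20240620",  # illustrative model name
    max_tokens=500,
    stop_sequences=["</json>"],  # stop once the JSON is closed
    messages=[
        {
            "role": "user",
            "content": f"Please convert the unstructured data into JSON using our schema.\n\n<data>\n{data}\n</data>",
        },
        # The last "assistant" turn acts as the prefill; Claude continues from here.
        {"role": "assistant", "content": "Here is the JSON:\n<json>\n{"},
    ],
)

# The response continues after the prefill, so re-attach the opening brace.
json_text = "{" + response.content[0].text
print(json_text)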

Tip #7: Avoid negative prompting

Telling Claude too forcefully what not to do can sometimes backfire and actually encourage that behavior through a kind of reverse psychology effect. Use negative prompting sparingly and with a light touch.

This has a similar explanation to Tip #3: LLMs simply continue a conversation the same way it started.
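A made-up example of reframing a negative instruction as a positive one:

Instead of: Do NOT go off topic. NEVER mention the competitor's products. 
Try:        Keep your answer focused on our product line and the customer's question.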

Tip #8: Rely on human messages more than system messages

Claude follows instructions in the human messages (the actual user prompts) better than those in the system message. Use the system message mainly for high-level scene setting, and put most of your instructions in the human prompts.

System messages seem to be the most useful for role setting and tool definitions.
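A minimal sketch of that split, again assuming the anthropic Python SDK and an illustrative model name: a short system message for role setting, with the detailed instructions in the human turn.

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-3-5-sonnet-20240620",  # illustrative model name
    max_tokens=300,
    system="You are a customer-support assistant for an online bookstore.",  # high-level scene setting only
    messages=[
        {
            "role": "user",
            "content": (
                "Answer the customer question below.\n"
                "Limit your response to 2-3 sentences and mention the returns policy if it is relevant.\n\n"
                "<question>My order arrived damaged. What should I do?</question>"
            ),
        }
    ],
)
print(response.content[0].text)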

Overall Takeaways

One thing you should always remember: avoid LLMs if you can.

Avoid relying on the LLM for tasks that can be handled in code, especially things like formatting or evaluation logic. As Zack put it: “If I don’t have to make this long pilgrimage to the Oracle [Claude] to ask them, I shouldn’t”. It’s often more efficient and reliable to handle those steps outside the LLM.
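For example (a made-up sketch), rather than asking Claude to check or fix its own JSON formatting, you can validate it in plain code:

import json

def parse_prediction(model_output: str) -> dict:
    """Validate the model's JSON output in code instead of with another LLM call."""
    record = json.loads(model_output)  # the parser handles the formatting check
    if "prediction" not in record:     # "prediction" is just an illustrative field name
        raise ValueError("missing 'prediction' field")
    return record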

Another key point was that effective prompting is a blend of art and science. Tips and best practices help, but there’s no substitute for iterative experimentation and development of an intuition for what works. Prompt engineering has elements of both creative writing and systematic coding.

Other big take-aways:

  • Favor shorter, concise sentences with common words
  • Thoroughly test prompts with tools like spreadsheets of test cases
  • Remember that responses mimic prompts

Anthropic is working hard to make Claude a powerful and steerable AI assistant, and evolving the art and science of prompt engineering along the way. If you found this helpful, follow Zack and check out the full talk!


PromptLayer is a CMS for prompt engineering. Teams use it to collaboratively build AI applications with domain knowledge.

Visually iterate on prompts, A/B test versions, debug historical logs, view analytics, build datasets, and run evaluations.

Made in NYC 🗽 Sign up for free at www.promptlayer.com 🍰