
"Okay, but I want Gemini 3 to perform 10x for my specific use case" - here's how.

[HubSpot Claude Connector Prompt Library direct link](https://offers.hubspot.com/thank-you/claude-connector-prompt-library)

Unlocking the True Potential of Gemini 3 and Claude: A Masterclass in Prompt Engineering

Google’s release of Gemini 3 has shattered expectations, particularly regarding its coding and frontend capabilities. It outperforms almost everything seen before. However, many users fail to realize that Gemini 3 is fundamentally a reasoning model, and this distinction requires a completely different approach to prompting compared to older LLMs.

[0:11.800] [Gemini API Docs] Note: Placeholder for the visual at 0:11.800 displaying the Gemini API documentation highlighting reasoning models.

According to Google’s official documentation, because Gemini 3 operates as a reasoning model, the way we structure prompts must evolve. A critical component of this architecture is the generation of reasoning tokens: before answering, the model produces internal reasoning of its own. Unlike previous models, where users had to “force-feed” context and logic into a densely packed prompt, reasoning models can get confused by too much noise. In fact, overly complex prompts often degrade Gemini 3’s performance.
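As a concrete sketch, here is roughly what a request to a reasoning model looks like at the payload level. The `generationConfig`/`thinkingConfig` field names follow the public Gemini REST API; the model id and budget value are illustrative assumptions, not from the article.

```python
# Sketch of a generateContent request body for a reasoning model.
# Field names follow the Gemini REST API (generationConfig/thinkingConfig);
# the model id and thinkingBudget value are assumptions for illustration.
request_body = {
    "model": "gemini-3-pro-preview",  # hypothetical model id
    "contents": [
        {"role": "user", "parts": [{"text": "Build a hello world page."}]}
    ],
    "generationConfig": {
        # Reasoning models spend part of the budget "thinking" before they
        # answer; the prompt itself should stay concise and leave room for it.
        "thinkingConfig": {"thinkingBudget": 1024},
    },
}

prompt_text = request_body["contents"][0]["parts"][0]["text"]
```

Note how short the user prompt is: the model's own reasoning tokens do the heavy lifting that a densely packed prompt used to do.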

[0:43.000] [Prompting Best Practices] Note: Placeholder for the visual at 0:43.000 showing Google’s ‘Prompting best practices’ documentation.

The best practice for these next-generation models is to provide precise instructions. If a prompt is overly complex, the model may over-analyze or become limited by the specific process outlined in the prompt. The goal is to be concise. However, while the model needs brevity, it is also extremely sensitive and steerable.

[0:56.200] [Concise Prompt List] Note: Placeholder for the visual at 0:56.200 listing ‘Needs concise prompt’ and ‘Extremely sensitive & steerable’.

This sensitivity means that a single keyword can drastically alter the output. For example, a standard request to “build a hello world page” might return a generic design. However, adding a specific stylistic instruction changes the entire output vector.

[1:13.600] [Linear Style Prompt] Note: Placeholder for the visual at 1:13.600 showing the prompt ‘help me build a hello world page with Linear style’.

help me build a hello world page with Linear style

By simply attaching an image reference or a style keyword, the UI details and quality improve dramatically. This raises the question: How do we consistently craft the “right” prompt to maximize the capabilities of models like Gemini 3?
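To make the steerability point concrete, compare the two prompts from the example above. The only difference is a short suffix, yet with a highly steerable model that suffix shifts the entire output distribution.

```python
# Two prompts that differ by a single style keyword. With a steerable
# model, this small change is enough to shift the whole output toward
# a specific visual style.
base_prompt = "help me build a hello world page"
styled_prompt = "help me build a hello world page with Linear style"

# The entire steering signal is the suffix:
steering = styled_prompt[len(base_prompt):].strip()
print(steering)  # "with Linear style"
```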

The “Skill” Approach to Prompting

Anthropic recently released a blog post titled “Improving frontend design through Skills,” introducing a method to elevate models like Claude Sonnet 3.5 to produce designs rivaling Gemini 3. The core finding is that significant improvements are driven purely by well-crafted prompts that balance conciseness with useful details.

[1:35.000] [Anthropic Blog Post] Note: Placeholder for the visual at 1:35.000 showing the ‘Improving frontend design through Skills’ article.

Anthropic uncovered a systematic process for getting the most out of Claude models through context engineering. This method applies across various models, including Gemini. To understand it, we can break their approach down into a three-step process.

[2:13.200] [Three Step Process] Note: Placeholder for the visual at 2:13.200 listing 1. Identify convergent defaults, 2. Provide concrete alternatives, 3. Structure guidance at the right altitude.

This methodology isn’t just for design; it applies to sales, marketing, and business operations. For example, HubSpot has utilized this “Claude Skill” concept to build a library of highly tested prompts that leverage real CRM context rather than generic inputs.

[2:40.800] [HubSpot Prompt Library] Note: Placeholder for the visual at 2:40.800 showing the HubSpot ‘Smarter Claude Prompts’ library.

Understanding Distributional Convergence

The most interesting concept behind model behavior is distributional convergence. During the sampling process, models predict tokens based on statistical patterns found in their training data.

[3:28.000] [Distributional Convergence Text] Note: Placeholder for the visual at 3:28.000 highlighting the definition of Distributional Convergence.

“Safe design choices—those that work universally and offend no one—dominate web training data.”

By default, models revert to these “safe” choices. To break this cycle, you must identify these convergent defaults (the boring behavior you don’t like) and provide concrete alternatives. Crucially, you must structure this guidance at the right altitude.

[4:35.000] [Too Specific vs Just Right] Note: Placeholder for the visual at 4:35.000 showing a chart comparing ‘Too specific’ vs ‘Just right’ prompting levels.

If a prompt is too specific, listing every single step (1, 2, 3, 4, 5), the system overfits to a narrow scenario and fails in real-world “long-tail” use cases. The key is to find the “Goldilocks” zone—guidance that directs behavior without micromanaging every token.

A Practical Example: Excalidraw Wireframes

Let’s apply this three-step process to a real task: teaching a small model to generate high-quality Excalidraw JSON wireframes.

[4:54.000] [Identify, Root Cause, Structure] Note: Placeholder for the visual at 4:54.000 showing the refined process steps.

Step 1: Identify Convergent Defaults

First, test the model with a bare-minimum prompt to see where it fails.

[7:53.000] [Basic Prompt Test] Note: Placeholder for the visual at 7:53.000 showing the basic prompt: ‘You are a professional UX engineer…’.

In this test case, the model fails to output correct JSON for Excalidraw. It hallucinates element types that don’t exist (like “circle” instead of “ellipse”) and messes up layout alignment.
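A cheap way to surface these defaults is to validate the model's output against the schema before it ever reaches Excalidraw. The sketch below catches hallucinated element types; the `VALID_TYPES` list is a best-effort snapshot of Excalidraw's supported primitives (note "ellipse", not "circle"), so treat it as an assumption to verify against a real exported file.

```python
import json

# Element types Excalidraw actually supports -- there is no "circle";
# circles are "ellipse" elements with equal width and height. Treat this
# set as a best-effort snapshot of the schema, not an authoritative list.
VALID_TYPES = {"rectangle", "ellipse", "diamond", "text", "line",
               "arrow", "freedraw", "image", "frame"}

def find_invalid_elements(raw_json: str) -> list[str]:
    """Return the unsupported `type` values found in a generated scene."""
    elements = json.loads(raw_json).get("elements", [])
    return [el["type"] for el in elements if el["type"] not in VALID_TYPES]

# A hallucinated scene like the one in the failing test case:
bad_scene = '{"elements": [{"type": "circle", "x": 0, "y": 0}]}'
print(find_invalid_elements(bad_scene))  # ['circle']
```

Running this after every bare-minimum prompt gives you a repeatable list of convergent defaults to attack in step 2.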

Step 2: Find the Root Cause

Instead of guessing, we can ask the model to debug itself. By feeding the erroneous JSON into a stronger model (or asking the current model) to identify issues, we uncover specific technical misunderstandings.

[8:42.000] [Debugging Analysis] Note: Placeholder for the visual at 8:42.000 showing ChatGPT analyzing the JSON errors.

A major discovery here was the intrinsic width issue. The model was setting width: 0 for text elements, assuming they would auto-expand based on content. However, Excalidraw requires explicit bounds or specific alignment hacks.

[9:35.000] [Root Cause Analysis] Note: Placeholder for the visual at 9:35.000 showing the analysis: ‘The Album Art text will likely not render properly’.

To drill deeper, you can use a Debug Prompt:

DEBUG MODE: Don't generate again, just help me understand - Why did you set width to be 0 for type: text?
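Since you will run this interrogation on every failure, it is worth templating. The helper below is illustrative, not from the source; only the debug-mode wording comes from the article.

```python
# A reusable version of the article's debug prompt. The wording follows
# the original; the helper itself is just an illustration.
DEBUG_TEMPLATE = (
    "DEBUG MODE: Don't generate again, just help me understand - "
    "Why did you {question}?\n\nHere is the output in question:\n{output}"
)

def make_debug_prompt(question: str, bad_output: str) -> str:
    return DEBUG_TEMPLATE.format(question=question, output=bad_output)

prompt = make_debug_prompt(
    "set width to be 0 for type: text",
    '{"type": "text", "width": 0}',
)
```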

Step 3: Structure Guidance at the Right Altitude

The common mistake is to simply list forbidden properties. A better approach is to explain the reasoning or the logic the model should follow.

[11:11.000] [Structured Guidance] Note: Placeholder for the visual at 11:11.000 showing specific rules for text, rectangle, and line types.

Instead of vague instructions, we provide architectural rules:

  • For Text: Width/Height must match the container size, and use textAlign to control position.
  • For Lines: Use points arrays, not generic width/height boxes.
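These two rules can also be encoded as small constructors, so that every generated element satisfies them by construction. The field names (`textAlign`, `points`) follow Excalidraw's scene format as best understood; verify them against a real exported .excalidraw file before relying on them.

```python
# Constructors that bake in the two rules above. Field names (textAlign,
# points) are assumptions based on Excalidraw's scene format.

def text_element(text, container, align="center"):
    """Rule 1: text takes its container's bounds; textAlign positions it."""
    return {
        "type": "text",
        "x": container["x"], "y": container["y"],
        "width": container["width"],   # never 0 -- match the container
        "height": container["height"],
        "text": text,
        "textAlign": align,
    }

def line_element(points, x=0, y=0):
    """Rule 2: lines are defined by a points array, not a width/height box."""
    return {"type": "line", "x": x, "y": y,
            "points": points}  # e.g. [[0, 0], [200, 0]]

box = {"type": "rectangle", "x": 10, "y": 10, "width": 200, "height": 60}
label = text_element("Album Art", box)
divider = line_element([[0, 0], [200, 0]], x=10, y=90)
```

The point of the "right altitude" is visible here: the rules constrain the shape of every element without scripting the layout itself.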

We also tell the model what not to include to keep the payload clean:

"ONLY output properties that impact styling; NEVER output things like seed, version, versionNonce, isDeleted, boundElementIds, etc."
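This rule can also be enforced mechanically as a post-processing pass. The denylist below is taken directly from the quoted instruction; extend it for your own payloads.

```python
# Drop the bookkeeping properties the prompt forbids. The denylist comes
# straight from the quoted instruction above.
NON_STYLING = {"seed", "version", "versionNonce", "isDeleted",
               "boundElementIds"}

def strip_non_styling(element: dict) -> dict:
    return {k: v for k, v in element.items() if k not in NON_STYLING}

el = {"type": "rectangle", "width": 100, "seed": 12345,
      "version": 3, "isDeleted": False}
print(strip_non_styling(el))  # {'type': 'rectangle', 'width': 100}
```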

Conclusion

By iterating through this loop—identifying defaults, finding root causes, and structuring guidance at the conceptual level—you can create System Prompts that force even smaller models to punch way above their weight class.

[11:45.000] [Google AI Studio Prompt] Note: Placeholder for the visual at 11:45.000 showing the final system instructions in Google AI Studio.

This is the exact methodology used to power tools like Superdesign, creating AI agents capable of generating fashion landing pages, complex music recording UIs, and fully functional wireframes with Gemini 3 levels of creativity and precision.