Blog of Tomasz Tunguz

I’ve been a venture capitalist since 2008. Before that, I was a PM on the Ads team at Google and worked at Appian.

Adding Complexity Reduced My AI Cost by 41%

2025-09-30 08:00:00

I discovered I was designing my AI tools backwards.

Here’s an example. This was my newsletter processing chain: reading emails, calling a newsletter processor, extracting companies, & then adding them to the CRM. The workflow involved four separate steps & cost $3.69 for every thousand newsletters processed.

Before: Newsletter Processing Chain

# Step 1: Find newsletters (separate tool)
ruby read_email.rb --from "[email protected]" --limit 5
# Output: 340 tokens of detailed email data

# Step 2: Process each newsletter (separate tool)
ruby enhanced_newsletter_processor.rb
# Output: 420 tokens per newsletter summary

# Step 3: Extract companies (separate tool)
ruby enhanced_company_extractor.rb --input newsletter_summary.txt
# Output: 280 tokens of company data

# Step 4: Add to CRM (separate tool)
ruby validate_and_add_company.rb startup.com
# Output: 190 tokens of validation results

# Total: 1,230 tokens, 4 separate tool calls, no safety checks
# Cost: $3.69 per 1,000 newsletter processing workflows

Then I created a unified newsletter tool that combined everything using the Google Agent Development Kit, Google’s framework for building production-grade AI agent tools:

# Single consolidated operation
ruby unified_newsletter_tool.rb --action process \
  --source "techcrunch" --format concise \
  --auto-extract-companies
# Output: 85 tokens with all operations completed

# 93% token reduction, built-in safety, cached results
# Cost: $0.26 per 1,000 newsletter processing workflows
# Savings: $3.43 per 1,000 workflows (93% cost reduction)

Why is the unified newsletter tool more complicated?

It includes multiple actions in a single interface (process, search, extract, validate), implements state management that tracks usage patterns & caches results, has rate limiting built in, & produces structured JSON outputs with metadata instead of plain text.
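
To make the architecture concrete, here’s a minimal Ruby sketch of a consolidated tool along these lines. The class, method names, caching, & rate-limiting behavior are simplified illustrations of the idea, not the actual unified_newsletter_tool.rb:

# unified_tool_sketch.rb - illustrative only; names & behavior are simplified
require "digest"

class UnifiedNewsletterTool
  RATE_LIMIT = 30 # max calls per minute (assumed)

  def initialize
    @cache = {}           # caches results keyed by action + arguments
    @call_timestamps = [] # sliding window for rate limiting
  end

  # Single entry point for every action the LLM can request.
  def run(action:, **args)
    enforce_rate_limit!
    key = Digest::SHA256.hexdigest([action, args.sort].to_s)
    return @cache[key].merge("cached" => true) if @cache.key?(key)

    result = case action
             when :process  then process_newsletters(args)
             when :extract  then extract_companies(args)
             when :validate then validate_company(args)
             else { "status" => "error", "message" => "unknown action #{action}" }
             end

    @cache[key] = result
  end

  private

  def enforce_rate_limit!
    now = Time.now
    @call_timestamps.reject! { |t| now - t > 60 }
    raise "rate limit exceeded" if @call_timestamps.size >= RATE_LIMIT
    @call_timestamps << now
  end

  # Every action returns the same structured, JSON-ready shape with metadata,
  # so the LLM always parses a consistent format.
  def process_newsletters(args)
    { "status" => "ok", "action" => "process", "source" => args[:source], "companies" => [] }
  end

  def extract_companies(args)
    { "status" => "ok", "action" => "extract", "companies" => [] }
  end

  def validate_company(args)
    { "status" => "ok", "action" => "validate", "domain" => args[:domain], "valid" => true }
  end
end

# Example: UnifiedNewsletterTool.new.run(action: :process, source: "techcrunch")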

But here’s the counterintuitive part: despite being more complex internally, the unified tool is simpler for the LLM to use because it provides consistent, structured outputs that are easier to parse, even though those outputs are longer.

To understand the impact, we ran 30 iterations per test scenario. The results show the effect of the new architecture:

Metric              | Before  | After     | Improvement
LLM Tokens per Op   | 112.4   | 66.1      | 41.2% reduction
Cost per 1K Ops     | $1.642  | $0.957    | 41.7% savings
Success Rate        | 87%     | 94%       | 8% improvement
Tools per Workflow  | 3-5     | 1         | 70% reduction
Cache Hit Rate      | 0%      | 30%       | Performance boost
Error Recovery      | Manual  | Automatic | Better UX

We reduced tokens by 41% (p=0.01, statistically significant), which translated linearly into cost savings. The success rate improved by 8% (p=0.03), & the cache served 30% of requests, a further cost savings.

While individual tools produced shorter, “cleaner” responses, they forced the LLM to work harder parsing inconsistent formats. Structured, comprehensive outputs from unified tools enabled more efficient LLM processing, despite being longer.

My workflow relied on dozens of specialized Ruby tools for email, research, & task management. Each tool had its own interface, error handling, & output format. Rolling them up into meta tools improved the ultimate performance & produced tremendous cost savings. You can find the complete architecture on GitHub.

From A to B Without Inventing Letters

2025-09-29 08:00:00

“The way to do a piece of writing is three or four times over, never once.”

Writing is hard. John McPhee, whose literary nonfiction reads like a novel, developed a four-draft writing method that transforms chaotic ideas into compelling narratives.

McPhee pioneered creative nonfiction at The New Yorker, writing books like Oranges & Coming into the Country that made complex subjects fascinating through storytelling. His approach differs from traditional journalism by incorporating fiction techniques while maintaining factual accuracy. His prose combines vivid imagery with economy:

“The doctor listens in with a stethoscope and hears sounds of a warpath Indian drum.”

He favored directness:

“He liked to go from A to B without inventing letters between.”

About his genre, McPhee said:

“Nonfiction—what the hell, that just says, this is nongrapefruit we’re having this morning.”

McPhee later codified his approach in Draft No. 4: On the Writing Process, sharing decades of writing wisdom.

His organizational philosophy shapes everything:

“You can build a structure in such a way that it causes people to want to keep turning pages. A compelling structure in nonfiction can have an attracting effect analogous to a story line in fiction. Readers are not supposed to notice the structure. It is meant to be about as visible as someone’s bones.”

McPhee’s Four-Draft Framework:

  1. Brain dump draft - Capture every possible idea, fact, & angle without editing or judgment
  2. Structure draft - Organize ideas into logical sequences & identify the core narrative thread
  3. Ruthless cut draft - Remove everything that doesn’t serve the primary message or that confuses the reader
  4. Polish draft - Refine prose, fix grammar, & ensure each sentence drives toward your goal

This is one of the best techniques I’ve found for writing. The method works because it separates creative thinking from critical evaluation. When you try to write perfect prose while generating ideas, it’s easy to fall into creative block.

Each draft becomes the foundation for the next, creating a recursive process that transforms chaotic thoughts into structured narratives. Like peeling back the layers of an orange to reveal the fruit within, each draft strips away what doesn’t belong, revealing the essential story that was always there waiting to be discovered.

The Efficient Frontier of AI

2025-09-26 08:00:00

Every portfolio manager knows the efficient frontier - the set of optimal portfolios offering maximum returns for given risk levels. What if AI prompts had their own efficient frontier?

As we all start to use AI, prompt optimization will be a consistent challenge. GEPA (Genetic-Pareto) is a technique to discover the equivalent efficient frontier for AI prompts.

Reading the paper, I noticed the initial results were promising, with a 10-point improvement on certain benchmarks & prompts up to 9.2 times shorter. Shorter prompts matter because input tokens are the biggest driver of cost (see The Hungry, Hungry AI Model). So, I implemented GEPA in EvoBlog.

To use GEPA, we must identify the scoring axes that an LLM uses to score a post. Here are mine:

Evaluation Axis  | Weight | Description
Style Match      | 25%    | How well the post matches Tom Tunguz's distinctive writing style
Argument Quality | 20%    | Strength and logic of the arguments presented
Data Usage       | 15%    | Effective use of statistics, examples, and quantified metrics
Readability      | 15%    | Clarity, sentence structure, and ease of reading
Originality      | 15%    | Fresh perspectives, novel connections, avoiding clichés
Engagement       | 10%    | Hooks, emotional language, reader involvement

Now that we have this framework, we can enter a prompt to generate a blog post & have the EvoBlog system iterate through different prompts to meet the efficient frontier for each dimension, weighted across all variables—not just one.
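
To make the weighting concrete, here’s a rough Ruby sketch of the scoring step. The axis names & weights come from the table above, the candidate scores are made-up numbers, & the real GEPA loop also tracks a Pareto frontier per axis rather than only a weighted sum:

# weighted_score_sketch.rb - illustrative; candidate scores are hypothetical
WEIGHTS = {
  style_match:      0.25,
  argument_quality: 0.20,
  data_usage:       0.15,
  readability:      0.15,
  originality:      0.15,
  engagement:       0.10
}.freeze

# Each candidate post is scored 0-10 on every axis by an LLM judge (stubbed here).
def weighted_score(axis_scores)
  WEIGHTS.sum { |axis, weight| weight * axis_scores.fetch(axis) }
end

style_heavy = { style_match: 9, argument_quality: 6, data_usage: 5,
                readability: 7, originality: 6, engagement: 7 }
data_heavy  = { style_match: 7, argument_quality: 8, data_usage: 9,
                readability: 7, originality: 6, engagement: 6 }

puts weighted_score(style_heavy) # => 6.85
puts weighted_score(data_heavy)  # => 7.25, the better all-around post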

[Figure: GEPA Pareto frontier]

Here are the scores for two hypothetical blog posts. You can see one spikes more on style, while the other focuses on data usage. Using GEPA, we can determine which is the better all-around post. In this case, it is the data-focused post.

All of this to say, dear reader, that I’ve only ever published one blog post fully generated by AI.

My goal with these automated systems is to learn how they work, how to tune them, & generate initial drafts that approximate my first & second drafts. I will always be completing drafts three & four.

The efficient frontier is no substitute for insight & an authentic voice.

The Math of Hypergrowth: Two Paths to the Same Goal

2025-09-25 08:00:00

How long & how quickly can a business compound?

This is a question every investor asks of every business, public or private.

In the 2010s, Slack & Atlassian became titans. On the day Salesforce announced its intent to acquire Slack, the two were equally valuable at ~$27b.

[Figure: Slack vs. Atlassian market cap]

The revenue curves look similar in the out years, with similar growth rates. Atlassian continues to compound at massive scale.

[Figure: Slack vs. Atlassian revenue]

But the time to reach $1b in revenue from founding differs by a decade: 17 vs. 7 years.

[Figure: years to $1b revenue]

To create value, a startup must either grow quickly at scale or grow consistently over a long period of time. AI companies today are growing very quickly. The T3D2 companies can grow at a slower rate over a longer period of time to achieve the same market cap.

[Figure: growth rate at $1b revenue]

Compare OpenAI’s 400% growth at $1b revenue to Atlassian’s 30%, or Snowflake’s 124%. Snowflake’s market cap is $75b today; Atlassian’s, $42b. The advantage of a head of steam is clear.

[Figure: current market cap comparison]

While both paths, steady compounding & hypergrowth, can lead to the same destination, the latter creates more value because of the time value of money. The sooner a startup reaches $1b in revenue, the more valuable it is.
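
As a rough illustration of the time-value point, assume a 15% annual discount rate (an assumption for illustration, not a figure from either company): hitting the $1b milestone in year 7 is worth roughly four times as much in present-value terms as hitting it in year 17.

# present_value_sketch.rb - the 15% discount rate is an assumption for illustration
def present_value(amount, years, rate: 0.15)
  amount / ((1 + rate)**years)
end

milestone = 1_000_000_000 # the $1b revenue milestone
puts present_value(milestone, 7).round  # => ~$376m
puts present_value(milestone, 17).round # => ~$93m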

Of course, a hypergrowth company with significant churn isn’t worth very much at all. The CAP theorem equivalent in business is some combination of growth, margin, & retention. Most businesses can’t optimize for all three.

Is T3D2 Dead? Has AI Killed It?

2025-09-23 08:00:00

OpenAI hit $12 billion ARR within five years of ChatGPT’s launch [1]. Anthropic reached $200 million in revenue in January 2024 [2]. Meanwhile, Salesforce took ten years to reach $1 billion ARR [3].

Does this mean the T3D2 framework (triple-triple-triple-double-double ARR growth), originally outlined by Neeraj Agrawal as a clear path to IPO-scale revenue, is dead?

[Figure: T3D2 growth chart]
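
As a refresher on what the framework implies, here’s the arithmetic from a hypothetical $2m ARR starting point: tripling three times & then doubling twice reaches IPO-scale revenue in five years.

# t3d2_sketch.rb - the $2m starting ARR is an assumption for illustration
multipliers = [3, 3, 3, 2, 2] # triple, triple, triple, double, double
arr = 2_000_000
multipliers.each_with_index do |m, year|
  arr *= m
  puts "Year #{year + 1}: $#{arr / 1_000_000}m ARR"
end
# Year 1: $6m ... Year 5: $216m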

There’s no doubt that AI companies have grown at unprecedented rates. Understanding the fundamental drivers behind that growth helps us assess how sustainable it is:

  1. Management teams & boards are insisting on AI transformation. ROI is still early. This urgency is captured by Larry Page’s recent quote: “I am willing to go bankrupt rather than lose this race.” The same mentality drives aggressive experimentation across multiple vendors & rapid buying decisions, from hyperscalers to mid-market businesses. Will the end of this era lead to churn?

  2. AI can automate labor. The overall cost savings can be significantly greater than workflow optimization tools of the previous era. As a result, the contract sizes for many AI products are significantly larger to start. If this continues, the overall bookings model & AE productivity model also need to change. If AI can deliver on the promise, larger contract sizes may be the norm.

  3. Incumbents are defensive. The prizes in AI are huge. The initial curiosity around AI & the massive growth rates of some companies have led some incumbents to turn defensive, blocking access to their data. More defensiveness might also curtail on-platform growth.

The more sustainable AI’s growth rates & customer retention prove to be, the more challenging the T3D2 benchmark becomes, because the market’s expectations of growth will rise. But for now, it’s too soon to tell whether the sizes of the contracts and durability are long-term characteristics of this market.

For businesses currently on the T3D2 plan, the fundraising market may be a bit more challenging because of the comparison to AI growth rates. Expectations around size at IPO have also increased.

$100m in trailing revenue growing 50% used to be the target. But now the expectation is closer to $300m growing at 50%. Attaining those numbers requires sustaining high growth rates for longer.
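
The gap is bigger than it sounds. Assuming a constant 50% growth rate, moving the bar from $100m to $300m means sustaining that growth for roughly 2.7 additional years:

# Years of 50% compounding needed to go from $100m to $300m (simplified)
years = Math.log(300.0 / 100.0) / Math.log(1.5)
puts years.round(1) # => 2.7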

References

  1. The Information. "OpenAI Hits $12 Billion in Annualized Revenue." July 2025. https://www.theinformation.com/articles/openai-hits-12-billion-annualized-revenue-breaks-700-million-chatgpt-weekly-active-users
  2. Latka. "How Anthropic hit $200M revenue and 100K customers in 2024." 2024. https://getlatka.com/companies/anthropic
  3. Nira. "How Salesforce Built a $13 Billion Empire from a CRM." 2024. https://nira.com/salesforce-history/

The Post-AI Org Chart

2025-09-19 08:00:00

We build teams in pyramids today. One leader, several managers, many individual contributors.

In the AI world, what team configuration makes the most sense? Here are some alternatives:

[Org chart diagram]

First, the short pyramid. Managers become agent managers. The work performed by individual contributors of yore becomes the workloads of agents. Everyone moves up a level of abstraction in work.

[Diagram: the short pyramid]

This configuration reduces headcount by 85% (1:7:49 -> 1:7). The manager-to-individual-contributor ratio goes from 1:7 to 1:1, & the manager-to-agent ratio remains 1:7.

Second, the rocket ship 🚀!

One director, seven managers, 21 employees. Everyone in the organization is managing agents, but these agents reflect their seniority. The director manages an AI chief-of-staff; the managers are player-coaches, both executing goals themselves & training/coaching others on how to manipulate AI successfully, which cuts the span of control by half.

[Diagram: the rocket ship]

This configuration reduces headcount (1:7:49 -> 1:7:14) by 53%.

The future is not one-size-fits-all.

Here’s the twist: not every department in a company will adopt the same organizational structure. AI’s impact varies dramatically by function, creating a world where the shape of a company becomes more nuanced than ever.

Sales teams will likely maintain traditional pyramids or rocket ships. Relationships drive revenue, & human empathy, creativity, & negotiation skills remain irreplaceable. The classic span of control models still apply when trust & rapport are paramount.

R&D teams present the greatest opportunity for the short pyramid transformation. Code generation is AI’s first true product-market fit, generating 50-80% of code for leading companies.

Customer success & support might evolve into hybrid models: AI handles routine inquiries while humans manage complex escalations & strategic accounts. The traditional middle management layer transforms into something entirely new.

This evolution challenges everything we know about scaling teams effectively. The old wisdom of 6-7 direct reports breaks down when managers oversee both human reports & AI agents.

The recruiting burden that historically justified management hierarchies transforms too. Instead of finding & developing human talent, managers increasingly focus on configuring AI capabilities & optimizing human-AI collaboration.

If the company ships its org chart, what org chart do you envision for your team?