Xuanwo | 漩涡

ASF member, Apache OpenDAL PMC Chair, Rust contributor, advocate for data freedom.

How I Vibe Coding?

2025-06-26 09:00:00

Hello everyone, long time no see. I've been evaluating various AI copilots extensively lately and have developed a fairly stable workflow that suits my context and background. I'm now writing it down to hopefully inspire you and to receive some feedback as well.

Background

I'm Xuanwo, an open source Rust engineer.

Open source means I primarily work in open source environments. I can freely allow LLMs to access my code as context, without needing to set up a local LLM to prevent code leaks or meet company regulations. It also means my work is publicly available, so LLMs can easily search for and retrieve my code and API documentation.

Rust means I spend most of my time writing Rust. It's a nice language with great documentation, a friendly compiler with useful error messages, and top-notch tooling. Rust has a strong ecosystem for developer tools and is highly accessible. Most LLMs already know how to use cargo check, cargo clippy, and cargo test. Writing Rust also means that both the AI and I only need to work with code in text form. We don't need complex workflows like those often seen in frontend development: coding, screen capturing, image diffing, and so on.

Engineer means I'm an engineer by profession. I earn my living through my coding work. I'm not a content producer or advertiser. As an engineer, I choose the most practical tools for myself. I want these tools to be fast, stable, and useful. I don't need them to be flashy, and I don't care whether they can build a website in one shot or write a flappy bird with correct collision detection.

Toolset

My current toolset consists of Zed and Claude Code. More specifically, I run claude in a Zed terminal tab, which allows me to access both the code and its changes alongside the LLM.

To give Claude Code its full capabilities, I actually run it inside a container I built myself. Whenever I need to run claude, I use docker run instead. I also have an alias, claudex, for this purpose:

# claudex
alias claudex='docker run -it --rm \
 -v $(pwd):/workspace \
 -v ~/.claude:/home/user/.claude \
 -v ~/.claude.json:/home/user/.claude.json \
 -v ~/.config/gh:/home/user/.config/gh \
 -v ~/Notes:/home/user/Notes \
 xuanwo-dev'

Mindset

Before introducing my workflow, I want to share my current mindset on LLMs. At the time of writing, I see LLMs as similar to recent graduates at a junior level.

As juniors, they have several strengths: They possess a solid understanding of widely used existing techniques. They can quickly learn new tools or patterns. They are friendly and eager to tackle any task you assign. They never complain about your requests. They excel at repetitive or highly structured tasks, as long as you pay them.

However, as juniors, they also have some shortcomings. They lack knowledge of your specific project or tasks. They don't have a clear goal or vision and require your guidance for direction. At times, they can be overly confident, inventing nonexistent APIs or using APIs incorrectly. Occasionally, they may get stuck and fail to find a way out.

As a mentor, leader, or boss, my job is to provide the right context, set a clear direction, and always be prepared to step in when needed. Currently, my approach is to have AI write code that I can personally review and take responsibility for.

For example, I find that LLMs are most effective when refactoring projects that have a clear API and nice test coverage. I will refactor the aws service first, and then have the LLMs follow the same patterns to refactor the azure and gcs services. I rarely allow LLMs to initiate entirely new projects or create completely new components. Most of the time, I define the API myself and ask the LLMs to follow the same design and handle the implementation details.
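As a purely hypothetical illustration of what "define the API myself" means in practice, I might hand the LLM a contract like the trait below (the names are invented for this sketch) and only ask it to implement each backend, following the reference implementation I refactored by hand first:

use std::io;

/// A hypothetical, simplified storage service API. I write this contract
/// myself; the LLM only fills in the implementation for each backend
/// (aws, azure, gcs), mirroring the reference backend I refactored first.
trait Service {
    /// Read the whole object stored at `path`.
    async fn read(&self, path: &str) -> io::Result<Vec<u8>>;

    /// Write `data` to the object at `path`, overwriting any existing content.
    async fn write(&self, path: &str, data: &[u8]) -> io::Result<()>;
}

Because the contract is fixed, reviewing the LLM's output mostly reduces to checking that each backend honors the same semantics and passes the same tests.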

Workflow

My workflow is quite simple: I arrange my day in 5-hour chunks, which aligns with Claude's usage limits. I map those two chunks to each day's morning and afternoon.

In the morning, I collect, read, think, and plan. I write my thinking down in my Notes, powered by Obsidian. All my notes are in Markdown format, so LLMs like Claude Opus 4 can understand them without any extra tools. I feed my notes to Claude Code directly and ask it to read them when needed.

In the afternoon, I run claudex inside my projects, as mentioned earlier. I monitor its progress from time to time and prepare myself to step in when necessary. Sometimes, I use git worktree to spawn additional Claude instances so they can collaborate on the same project.

Claude works very quickly, so I spend most of my time reviewing code. To reduce the burden of code review, I also design robust test frameworks for my projects to ensure correct behavior. Rust's excellent developer experience allows me to instruct the LLMs to run cargo check, cargo clippy, and cargo test on the code independently. They may need to repeat this process a few times to get everything right, but most of the time, they figure it out on their own.
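As a rough illustration of what such a test looks like, here is a hedged sketch using OpenDAL's in-memory service; the exact helpers and assertions depend on the project:

use opendal::services::Memory;
use opendal::Operator;

/// A behavior-focused test: whatever the LLM changes inside the
/// implementation, writing a key and reading it back must round-trip.
#[tokio::test]
async fn write_then_read_roundtrips() -> opendal::Result<()> {
    let op = Operator::new(Memory::default())?.finish();
    op.write("hello.txt", "hello, world").await?;
    let content = op.read("hello.txt").await?;
    assert_eq!(content.to_vec(), b"hello, world".to_vec());
    Ok(())
}

Once tests like this are in place, the cargo check, cargo clippy, and cargo test loop gives the LLM a feedback signal it can iterate on without me watching every step.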

While reviewing code, I pay close attention to the public API and any tricky parts within the codebase. LLMs are like junior developers. Sometimes, they might overemphasize certain aspects of a task and lose sight of the overall context. For example, they can focus too much on minor details of API design without realizing that the entire approach could be improved with a better overall design. This also reinforces my belief that you should only allow LLMs to write code you can control. Otherwise, you can't be sure the LLMs are doing things correctly. It's very dangerous if the LLMs are working in a direction you don't understand.

In my workflow, I only need claude and zed. claude excels at using tools and understanding context, while zed is fast and responsive. As a Rust developer, I don't have a strong need for various extensions, so the main drawback of zed, its limited extension support, isn't a major issue for me.

Tips

Here are some tips I've learned from my recent exploration of AI agents and LLMs.

Claude 4 is the best vibe coding model (for now)

Claude 4 Sonnet and Opus are the best coding models available so far.

Many people have different opinions on this and might argue: hey, o3, gemini-2.5-pro, or deepseek-r1 are better than Claude 4, they can build a working website in one shot! Unfortunately, I disagree, at least for my needs right now. As a Rust developer, I don't care if a model can build a website or demonstrate strong reasoning. What matters to me is whether it can use tools intelligently and efficiently. LLMs used for vibe coding should have a strong sense of planning and be skilled at coding. A smart model that doesn't know how to edit files can't truly serve as your coding copilot.

I'm not a content creator; I'm an engineer. I need a reliable tool that can help me complete my work. I'm not building demos or marketing materials. This isn't a game or a show that can be restarted repeatedly. I'm working on a project with downstream users, and I have to take responsibility for whatever the LLMs do. I need to collaborate with LLMs to achieve both my goals and my company's goals.

Claude 4 is the right tool.

MCP is a lie

MCP is useless for vibe coding.

Claude 4 is good at using tools. As long as you let it know that a tool is installed locally, it can use the tool effectively. It can even use --help to learn how to use it correctly. I've never encountered a scenario where I needed to use an MCP server. I tried the GitHub MCP server before, but it performed much worse than simply letting LLMs use the gh CLI locally.

Use tools instead of configuring MCP servers.

Integrate AI into workflow

Integrate AI into your existing workflow instead of adapting yourself to AI.

AI workflows are constantly evolving. Stay calm and add the best tools to your toolkit. Don't change yourself just to fit a particular AI workflow. If a tool can't be integrated into your existing workflow, that's the tool's problem.

I've had some unsuccessful attempts at using Cursor or Windsurf. My progress began when I started incorporating Claude Code into portions of my daily workflow, rather than completely switching to a new IDE.

Recommended Readings

Thank you for reading my post. I also recommend the following posts if you want to try vibe coding:

Hope you're enjoying the coding vibes: create more, hype less.

Why S3 ListObjects Taking 120s to Respond?

2025-05-13 09:00:00

Everyone knows that AWS S3 ListObjects is slow, but can you imagine it being so slow that you have to wait 120 seconds for a response? I've actually seen this happen in the wild.

TL;DR

Delete markers really hurt list performance. Make sure to enable lifecycle management to remove them.

Background

Databend is a cloud-native data warehouse that supports S3 as its storage backend. It includes built-in vacuum functions to delete orphaned objects. Essentially, it loads table snapshots to determine all objects that are still referenced and deletes any objects not referenced by any snapshot. As an optimization, Databend writes data blobs to paths containing time-sortable UUIDs (specifically, UUIDv7). This enables Databend to take advantage of the ListObjects behavior, where all keys are sorted in lexicographical order. So, Databend can simply compute a delete_until key and remove all objects with keys less than delete_until.
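To make that concrete, here is a minimal sketch of the vacuum pattern, simplified from what Databend actually does and assuming OpenDAL's streaming lister API; because keys embed UUIDv7, "older than the retention boundary" reduces to "lexicographically smaller than delete_until":

use futures::TryStreamExt;
use opendal::Operator;

/// Delete every object under `prefix` whose key sorts before `delete_until`.
/// A simplified sketch of the vacuum pattern described above, not Databend's
/// actual implementation.
async fn vacuum_until(op: &Operator, prefix: &str, delete_until: &str) -> opendal::Result<()> {
    let mut lister = op.lister(prefix).await?;
    while let Some(entry) = lister.try_next().await? {
        if entry.path() < delete_until {
            op.delete(entry.path()).await?;
        } else {
            // Listing returns keys in lexicographical order, so once we pass
            // the boundary there is nothing older left to delete.
            break;
        }
    }
    Ok(())
}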

One day, users reported that the vacuum operation failed due to an opendal list timeout.

6e8a1700-f629-4df4-9596-f9a6508c5f4b: http query has change state to Stopped, reason Err(StorageOther. Code: 4000, Text = Unexpected (persistent) at List::next => io operation timeout reached

Context:
 timeout: 60

OpenDAL is a Rust library that offers a unified interface for accessing various storage backends, including S3. It features a built-in TimeoutLayer that helps prevent problematic requests from hanging forever. The default timeout is set to 60 seconds, and the vacuum operation failed because it exceeded this limit.
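For reference, the timeout is wired in roughly like this; a hedged sketch based on OpenDAL's public TimeoutLayer API rather than Databend's exact code:

use std::time::Duration;

use opendal::layers::TimeoutLayer;
use opendal::services::S3;
use opendal::{Operator, Result};

/// Build an S3-backed operator that fails any I/O operation, including
/// fetching the next ListObjects page, once it exceeds 60 seconds.
fn build_operator(bucket: &str) -> Result<Operator> {
    let builder = S3::default().bucket(bucket).region("us-east-2");
    let op = Operator::new(builder)?
        .layer(TimeoutLayer::new().with_timeout(Duration::from_secs(60)))
        .finish();
    Ok(op)
}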

Databend has become a complex system, and we've encountered many SQL hang issues in the past. So, my initial thought when addressing this ticket was whether there might be areas where we haven't handled things properly, potentially causing problems for tokio.

Debugging

After a quick review of the codebase, I didn’t spot anything obviously wrong. With no clear culprit in sight, I decided to try and reproduce the issue myself. The affected table had been in use for over a year and had accumulated a significant amount of data. Of course, I couldn’t hope to replicate the user’s dataset exactly, but my aim was to capture the general pattern. If I could demonstrate a significant slowdown in ListObjects under certain conditions, the precise scale would just be a matter of degree.

I directed the AI to generate the code for me using OpenDAL.

By the way, I'm using Zed and Claude 3.7 Sonnet through GitHub Copilot.

The code is mostly like this:

// S3 configuration
let mut builder = S3::default();
builder = builder.bucket("s3-invalid-xml-test");
builder = builder.region("us-east-2");

let op = Operator::new(builder)?
    .layer(RetryLayer::new().with_jitter())
    .finish();

...

// Generate a time-ordered UUIDv7
let uuid = Uuid::now_v7();
let key = format!("{}{}", PREFIX, uuid);

// Create a small file with the generated key
match op.write(&key, "hello, world").await {
    Ok(_) => {
        written.fetch_add(1, Ordering::Relaxed);
        Ok(())
    }
    Err(e) => {
        Err(anyhow::anyhow!("Failed to create key {}: {}", key, e))
    }
}

I've tried various patterns, so let's save time by not repeating them and go straight to the problematic one:

  • Have a bucket with versioning enabled.
  • Generate a large number of files (millions or even billions).
  • Delete all of them.

At this point, the bucket is empty. Listing the bucket with the prefix / or /z should, in theory, produce the same result.

What I found was that listing the entire bucket is much slower than listing with the /z prefix. For example, in a bucket where 10 million objects had been deleted, a full list operation would take over 500ms, while listing with /z took only about 8ms. In some cold-start cases, the initial list could take more than 30 seconds to return.
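The comparison itself boils down to timing how long the first page of each list call takes. Here is a minimal sketch of the measurement, again assuming OpenDAL's streaming lister API rather than the exact benchmark code I ran:

use std::time::{Duration, Instant};

use futures::TryStreamExt;
use opendal::Operator;

/// Time how long it takes to fetch the first page (up to 1000 entries)
/// of a list call under `prefix`.
async fn time_list(op: &Operator, prefix: &str) -> opendal::Result<Duration> {
    let start = Instant::now();
    let mut lister = op.lister(prefix).await?;
    let mut listed = 0usize;
    while let Some(_entry) = lister.try_next().await? {
        listed += 1;
        if listed >= 1000 {
            break;
        }
    }
    // For the "empty" bucket in this test, the stream ends only after S3 has
    // scanned past all the delete markers, which is exactly the cost we want
    // to measure.
    Ok(start.elapsed())
}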

Here is a typical run:

Starting comparison of latency differences in listing operations with different prefixes
Test parameters: Up to 1000 objects listed per operation, 5 test rounds (including 2 warm-up rounds)

Testing the entire bucket...
Performing 2 warm-up rounds...
 Warm-up #1: Listed 0 objects, time taken: 542.618835ms
 Warm-up #2: Listed 0 objects, time taken: 525.818171ms
Starting 5 official test rounds...
 Test #1: Listed 0 objects, time taken: 536.598969ms
 Test #2: Listed 0 objects, time taken: 539.10924ms
 Test #3: Listed 0 objects, time taken: 531.185516ms
 Test #4: Listed 0 objects, time taken: 536.617262ms
 Test #5: Listed 0 objects, time taken: 537.548909ms
 Average latency for entire bucket: 536.211979ms
 Median latency for entire bucket: 536.617262ms

Testing prefix 'z'...
Performing 2 warm-up rounds...
 Warm-up #1: Listed 0 objects, time taken: 9.004738ms
 Warm-up #2: Listed 0 objects, time taken: 7.567935ms
Starting 5 official test rounds...
 Test #1: Listed 0 objects, time taken: 7.752857ms
 Test #2: Listed 0 objects, time taken: 10.301437ms
 Test #3: Listed 0 objects, time taken: 8.822386ms
 Test #4: Listed 0 objects, time taken: 8.266962ms
 Test #5: Listed 0 objects, time taken: 8.190696ms
 Average latency for prefix 'z': 8.666867ms
 Median latency for prefix 'z': 8.266962ms

====== Latency Comparison Results ======
Entire bucket: Average 536.211979ms
Prefix 'z': Average 8.666867ms
Listing the entire bucket is 61.87 times slower than listing with prefix 'z'
=========================

Also, the cold start of the same bucket can be quite slow. In some cases, the warm-up takes over 30 seconds:

Testing the entire bucket...
Performing 2 rounds of warm-up...
 Warm-up #1: Listed 0 objects, took 31.881571371s
 Warm-up #2: Listed 0 objects, took 1.243807263s
Starting 5 rounds of formal testing...
 Test #1: Listed 0 objects, took 4.264687095s
 Test #2: Listed 0 objects, took 542.109058ms
 Test #3: Listed 0 objects, took 537.914204ms
 Test #4: Listed 0 objects, took 529.365008ms
 Test #5: Listed 0 objects, took 528.04485ms
 Average latency for the entire bucket: 1.280424043s
 Median latency for the entire bucket: 537.914204ms

Why? Why can listing objects be so slow?

Analysis

Let's go back to how S3 versioning works. After enabling versioning, S3 creates a delete marker for each object you delete. When calling ListObjects, S3 filters out all delete markers and returns only the current versions of the objects.

For example, here is a simple bucket containing only two objects: x and y. x has only one version, while y has two versions.

Actual Storage        ListObjects Results
--------------        -------------------
x (v1)                x (v1)
y (v1)
y (v2)                y (v2)

Obviously, ListObjects will only return x (v1) and y (v2). If we delete y, the result will be:

Actual Storage          ListObjects Results
--------------          -------------------
x (v1)                  x (v1)
y (v1)
y (v2)
y (v3: delete marker)

S3 will add a delete marker as v3 for y and exclude it from the results. The key point is that the delete marker still exists in the bucket, so S3 still needs to check for it when listing objects. I believe the AWS S3 team has explored various optimization methods, but this can still be an issue if your bucket contains a large number of delete markers.

In the most severe cases, such as the following:

Actual Storage            ListObjects Results
-----------------         -------------------
t1 (delete marker)
t2 (delete marker)
t2 (delete marker)
...
t9999999 (v1)             t9999999 (v1)

S3 needs to scan a large number of delete markers before it can return the results. This is why listing objects can be very slow, and may even appear as if the HTTP connection is hanging.

AWS mentions this in their documentation on performance degradation after enabling bucket versioning, but they don't provide a detailed explanation of how the degradation happens.

Conclusion

Based on this analysis, we asked users to run aws s3 ls on the same prefix, and they reported that it took 120 seconds to receive the first response. We are aware that AWS S3 ListObjects can be slow, but in certain cases, it can be so slow that it triggers our timeout controls.

My takeaway from this lesson:

  • S3 versioning is not free; only enable it when necessary.
  • Enable lifecycle to remove delete markers and old non-current versions.

BackON v1.5.0 Released

2025-04-09 09:00:00

I am happy to announce the release of BackON v1.5.0.

BackON is a Rust library that makes retries feel like a built-in feature of Rust.

use anyhow::Result;
use backon::ExponentialBuilder;
use backon::Retryable;

async fn fetch() -> Result<String> {
    Ok("hello, world!".to_string())
}

#[tokio::main]
async fn main() -> Result<()> {
    let content = fetch.retry(ExponentialBuilder::default()).await?;
    println!("{content}");
    Ok(())
}

This release adds a new API called adjust(), which allows you to modify the backoff time for the next retry. This is useful when you want to adjust the backoff duration based on the result of the previous attempt or implement a dynamic backoff strategy based on an HTTP Retry-After header.

For example:

use core::time::Duration;
use std::error::Error;
use std::fmt::Display;
use std::fmt::Formatter;

use anyhow::Result;
use backon::ExponentialBuilder;
use backon::Retryable;
use reqwest::header::HeaderMap;
use reqwest::StatusCode;

#[derive(Debug)]
struct HttpError {
    headers: HeaderMap,
}

impl Display for HttpError {
    fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
        write!(f, "http error")
    }
}

impl Error for HttpError {}

async fn fetch() -> Result<String> {
    let resp = reqwest::get("https://www.rust-lang.org").await?;
    if resp.status() != StatusCode::OK {
        let source = HttpError {
            headers: resp.headers().clone(),
        };
        return Err(anyhow::Error::new(source));
    }
    Ok(resp.text().await?)
}

#[tokio::main(flavor = "current_thread")]
async fn main() -> Result<()> {
    let content = fetch
        .retry(ExponentialBuilder::default())
        .adjust(|err, dur| {
            match err.downcast_ref::<HttpError>() {
                Some(v) => {
                    if let Some(retry_after) = v.headers.get("Retry-After") {
                        // Parse the Retry-After header and adjust the backoff duration
                        let retry_after = retry_after.to_str().unwrap_or("0");
                        let retry_after = retry_after.parse::<u64>().unwrap_or(0);
                        Some(Duration::from_secs(retry_after))
                    } else {
                        dur
                    }
                }
                None => dur,
            }
        })
        .await?;
    println!("fetch succeeded: {}", content);

    Ok(())
}

Hope you enjoy this feature. Thank you, everyone!


As of the v1.5.0 release, BackON is now:

  • Used by 1.5k projects on GitHub
  • Has 50 reverse dependencies on crates.io
  • Downloaded approximately 6.3 million times, averaging 60k downloads per day

Thank you all for your trust—let's make retries feel like a built-in feature in Rust!

Where do you belong, system researchers?

2025-03-10 09:00:00

@xiangpeng published a nice post called Where are we now, system researchers? (via archive.is). In this post, he questions the position of system researchers. Xiangpeng is an outstanding system researcher, and the post is written from the viewpoint of someone in that field. As for me, although I have never been a system researcher, I would like to share some comments here and offer complementary ideas.


There are two main areas in the field of computer science: academia and industry. Traditionally, research is conducted in academia, and its results are applied in industry. However, the boundary between these two domains is becoming increasingly blurred. Sometimes, industry develops something relatively new that ultimately brings significant changes to academia.

Yet, whenever I discuss these developments with friends in academia, they simply laugh at me and say, "That's not new. A paper published in the 1990s already explored this idea."

Ah, the idea. But where is the implementation?

This post said:

We waste too much time babbling about knowledge we learn from papers – how to schedule a million machines, how to train a billion parameters, how to design infinitely scalable systems. Just thinking about these problems makes us feel important as researchers, although most of us have never deployed a service in the cloud, never used the techniques we proposed, and never worked with the filesystems, kernels, compilers, networks, or databases we studied. We waste time on these theoretical discussions because we don’t know how to code and are unwilling to practice. As Feynman said, “What I cannot create, I do not understand.” Simply knowing how a system works from 1000 feet doesn’t mean we can build it. The nuances of real systems often explain why they’re built in particular ways. Without diving into these details, we’re merely scratching the surface.

I think this is a very good point. I've seen many papers that present interesting ideas but are never implemented. Some develop great abstractions but lack practicality. Others propose excellent concepts without discussing how they could actually work. Sometimes, I feel that friends in academia don't really care about real users.

(Writing code does not make you a good researcher, but not writing code makes you a bad one.)

As I stated above, I'm not a systems researcher. I'm curious whether it's possible for a good researcher to be unable to write good code. Put differently, can someone conduct excellent research without producing any good code? Are there any examples of this?

The system research community does not need more novel solutions – novel solutions are essentially combinations of existing techniques. When we need to solve a problem, most of us would figure out a similar solution, and what matters is the execution of the ideas.

Instead, we need more people willing to sit down and code, build real systems, and talk to real users. Be a solid practitioner, don’t be a feel-good researcher.

I believe that's a valid point. I'm looking forward to collaborating with more system researchers to push the boundaries of system research forward.

Paper publishing takes too much time. We spend too much effort arguing what’s new and what’s hard, instead of focusing on doing the actual research. Writing a paper already takes too much time, and then we need to anonymize artifacts, register abstracts, wait for reviews, write rebuttals, revise the paper, and can still be rejected for arbitrary reasons. The turnaround time for a single submission can be up to 6 months.

Ah, writing papers is increasingly becoming a specialized skill. I have failed to master it.

In today's world, arXiv is becoming an increasingly important platform for publishing papers and initiating discussions.

The real difference between papers often lies in numerous small details that sound trivial but are actually essential for relevance. In most cases, figuring out these details takes much more time and demonstrates more novelty than coming up with the initial idea itself.

Referring back to my previous comments: Papers are primarily about ideas. I also agree with Xiangpeng that the real difference between papers often lies in numerous small details that may seem trivial but are actually crucial for relevance.

Conclusion

So, back to the title—where do you belong, system researchers? My answer is: open source.

Try integrating your work with open-source projects or publishing it as open source. More and more researchers are doing this, and I believe it's a great trend. One great example is S3-FIFO.

Open source is a great way to share your work with the world and receive feedback from real users. It's also an excellent opportunity to practice coding and build real systems.

MCP Server OpenDAL

2025-03-05 09:00:00

I'm excited to introduce MCP Server OpenDAL, a model context protocol server for Apache OpenDAL™.

Model Context Protocol

Before discussing MCP, we should first establish some background on model context. At its most basic level, a model can be viewed as a pure function that operates like f(input) -> output, meaning it has no side effects or dependencies on external states. To make meaningful use of an AI model, we must provide all relevant information needed for the task as input.

For example, when building a chatbot, we need to supply the conversation history each time we invoke the model. Otherwise, it would be unable to understand the context of the conversation. However, different AI models have different ways of handling context, making it difficult to scale and migrate between them. Following the same approach as the Language Server Protocol, we can define a standardized interface for model context so that developers can easily integrate with various AI models without testing each one individually. That's the Model Context Protocol.
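A toy sketch of what this means in practice, with invented types rather than any real SDK: the model call itself is stateless, and the caller has to carry the full history on every invocation.

/// A single chat message. Invented types, for illustration only.
struct Message {
    role: String,    // "user" or "assistant"
    content: String,
}

/// The model behaves like a pure function of its input: every call receives
/// the whole conversation so far and returns the next message.
fn chat_model(history: &[Message]) -> Message {
    // A real model would generate a reply from the full context; here we
    // just echo the last user message to keep the sketch runnable.
    let last = history.last().map(|m| m.content.as_str()).unwrap_or("");
    Message {
        role: "assistant".to_string(),
        content: format!("You said: {last}"),
    }
}

fn main() {
    // The caller, not the model, is responsible for carrying the context.
    let mut history = vec![Message {
        role: "user".to_string(),
        content: "Hello!".to_string(),
    }];
    let reply = chat_model(&history);
    println!("{}: {}", reply.role, reply.content);
    history.push(reply);
}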

Its general architecture can be described as follows:

AI tools act as MCP clients and connect to various MCP servers. Each server specifies the resources and tools it offers and provides a schema describing the required input. The model can then use the tools provided by the MCP server to manage context.

MCP Server OpenDAL

Apache OpenDAL (/ˈoʊ.pən.dæl/, pronounced "OH-puhn-dal") is an Open Data Access Layer that enables seamless interaction with diverse storage services. Its development is guided by its vision of One Layer, All Storage and its core principles: Open Community, Solid Foundation, Fast Access, Object Storage First, and Extensible Architecture.

MCP Server OpenDAL can therefore be used as an MCP server that provides storage services for model context. It supports various storage services such as the local file system, AWS S3, Google Cloud Storage, and more. Developers can easily integrate with OpenDAL to manage model context.

This project is still in its early stages, and I'm continuing to learn more about AI and Python. It should be exciting to see how it evolves.

ArchLinux removed deprecated repo community

2025-03-03 09:00:00

Arch Linux announced that they would remove the deprecated repositories in Cleaning up old repositories (via archive.is).

Now it's happened.

If you have seen errors like:

:) paru
:: Synchronizing package databases...
 core 115.5 KiB 700 KiB/s 00:00 [####################################################] 100%
 extra 7.7 MiB 8.90 MiB/s 00:01 [####################################################] 100%
 community.db failed to download
 archlinuxcn 1404.2 KiB 4.06 MiB/s 00:00 [####################################################] 100%
error: failed retrieving file 'community.db' from mirrors.tuna.tsinghua.edu.cn : The requested URL returned error: 404
error: failed retrieving file 'community.db' from mirrors.ustc.edu.cn : The requested URL returned error: 404
error: failed retrieving file 'community.db' from mirrors.xjtu.edu.cn : The requested URL returned error: 404
error: failed retrieving file 'community.db' from mirrors.nju.edu.cn : The requested URL returned error: 404
error: failed retrieving file 'community.db' from mirrors.jlu.edu.cn : The requested URL returned error: 404

Please check your /etc/pacman.conf and remove the deprecated section:

[community]
Include = /etc/pacman.d/mirrorlist

All deprecated repositories include:

  • [community]
  • [community-testing]
  • [testing]
  • [testing-debug]
  • [staging]
  • [staging-debug]