It is Monday afternoon, January 19, 2026.
I woke up very late today because I was quite excited last night, tinkering with CZONE and OpenCode. Although the results were not satisfactory, if I hadn’t tinkered, yesterday’s outcome might have been even worse.
Staying up late is just refusing to admit the failure of the day.
— from PH
Last night, I used OpenCode + MiniMax M2.1 to build CZONE (the online version of CZON) from scratch, as documented in this log.
The AI started by asking a series of questions—from technology selection to scaffolding setup, then to feature design, and finally to the CI/CD process. The whole thing went very fast.
Honestly, it was a bit too fast—I felt a bit dizzy (laughs).
But, and this is the crucial but: problems quickly emerged.
I noticed that its understanding of the details regarding GitHub REST API permissions was inadequate.
Well-read and knowledgeable? Not really.
For example, after initializing the repository, we needed to modify the .github/workflows/pages.yml file to add the CZON build steps. Writing to that path requires a token with the workflow scope, but the code OpenCode produced never requested it. A quick glance at the GitHub API documentation would have revealed this, yet it overlooked the detail again and again. GitHub is dumb here too: the error message was just a bare 404, with no hint of insufficient permissions, so the model never suspected a permission problem at all.
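For reference, the write itself goes through GitHub's "create or update file contents" endpoint. A minimal sketch of building that request (owner and repo names below are placeholders, not the real project): the request is completely ordinary, which is exactly why the failure is so opaque; the only special thing about the path is the scope the token must carry.

```python
import base64


def build_contents_put(owner: str, repo: str, path: str, text: str,
                       message: str, branch: str = "main"):
    """Build the URL and JSON body for GitHub's Contents endpoint:
    PUT /repos/{owner}/{repo}/contents/{path}.
    Writing under .github/workflows/ additionally requires the token
    to carry the workflow scope; in our case a token without it got
    a bare 404 back, not a permissions error."""
    url = f"https://api.github.com/repos/{owner}/{repo}/contents/{path}"
    body = {
        "message": message,
        "branch": branch,
        # The Contents API expects the file body base64-encoded.
        "content": base64.b64encode(text.encode()).decode(),
    }
    return url, body


url, body = build_contents_put("someuser", "czone",
                               ".github/workflows/pages.yml",
                               "name: pages\n", "add CZON build step")
```

Actually sending it (with urllib or requests, plus an Authorization header) is routine; the point is that without already knowing about the scope, the failing request looks identical to a typo'd path.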
During this process, we demonstrated that writing to index.md succeeded, writing to .github/index.md succeeded, and writing to github/workflows/pages.yml (no leading dot) also succeeded; only .github/workflows/pages.yml failed. The conversation dragged on for many rounds because it kept tweaking the code each time, yet faced with such an obvious pattern it never inferred that .github/workflows/ might be a directory requiring special permissions. That shows how scattered its attention is and how weak its reasoning becomes in debugging mode.
I strongly suggest that LLMs themselves or external control frameworks/agents need a Lab Mode. In this mode, the agent should repeatedly design controlled experiments, verify results, and uncover the truth. Sometimes I feel that an LLM is like an unconscious brain—you point somewhere, and it lights up there. Whatever the prompt says, it focuses on that.
Sometimes we want it to be well-read and knowledgeable, and other times we want it to be ignorant yet clear. In a sense, the energy consumed by an LLM is fixed. We hope it can allocate that energy to where it’s most needed for different tasks, rather than distributing it evenly. Recent advancements in the LLM field often adopt this approach.
A Brain in a Vat, Limited in Action
Another important, and annoying, reason is OpenCode's lack of self-debugging capability. It cannot open or control a browser, so all it can do is guess, add log output, and ask me to check the logs. Sometimes I play along, but other times watching it is like watching my mentee: I have no idea what it's thinking, and it's frustrating. I can accept having a clumsy mentee, but I probably can't accept a "brain in a vat without hands"; we still need to find a way to close the loop of its thinking. Google's Antigravity does a good job in this regard, probably because of the Chrome family connection.
In terms of community solutions, using end-to-end testing frameworks (like Cypress or Playwright) to control the browser should be a good choice. After all, many operations nowadays require browser-side interaction; relying solely on APIs is not enough.
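A sketch of closing that loop with Playwright's Python API (the URL would be whatever the agent just deployed; the helper names are my own, and the browser part is kept separate so the log-triage logic stays testable on its own):

```python
def triage_console(messages):
    """Pure helper: keep only error-level console messages so the agent
    sees failures, not noise. `messages` is a list of (type, text) pairs."""
    return [text for kind, text in messages if kind == "error"]


def capture_console(url: str):
    """Open the page headlessly and return everything it printed to the
    console, so the agent can read its own logs instead of asking me to.
    The import is local because Playwright is an optional dependency here."""
    from playwright.sync_api import sync_playwright
    captured = []
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.on("console", lambda msg: captured.append((msg.type, msg.text)))
        page.goto(url)
        page.wait_for_load_state("networkidle")
        browser.close()
    return captured
```

Feeding triage_console(capture_console(...)) back into the agent's context is the whole trick: the model can finally observe what its code actually did, instead of guessing.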
Progress Too Fast, Foundation Unstable
The last point is my own attribution. This time, the AI wrote dozens of files from scratch in under ten minutes. Watching it was like watching a printer; it never paused to rest. However, any complex system requires architecture, layering, and guaranteed quality in each foundational module. Only after the lower layers are complete and thoroughly tested can you confidently build on top. The AI currently lacks this sense of rhythm; it just prints code. Even if it had built-in debugging capabilities, it might nimbly patch things up and down, but true reliability fundamentally depends on correct concepts, correct abstractions, and correct implementations. It rests on logical coherence, on things actually making sense. As for how much time the AI spends on this, I think it's still far from enough. Perhaps this is something only a coordination layer can solve; LLMs alone can't achieve it. LLMs just pave the way, like floodwater pouring toward the lowest point of potential energy.

But many humans are like this too. There's a classic joke about a plumber who fixes a leak here, only for a pipe to burst open there. Treating the head when the head hurts, treating the foot when the foot hurts: in the end nothing gets solved at the root, and you're left playing endless whack-a-mole.