AI Coding

Starting a general thread on AI Coding. (this post is a wiki, so feel free to update it)

AI Coding tools

Editor

Terminal

Other/Not sure

Local models

  • Modular Max
    • runs models locally and exposes an API you can use from Cursor, etc.

AWS Labs Vibe Coding Tips and Tricks

thanks @khem

After 6 months of using AI coding tools (Cursor and Claude) almost every day, and not just for “toy apps” but actually making things that I have subsequently deployed, I can say a few things:

  1. It’s not just hype. If you are unable to generate a competent full-stack application or microservice with AI coding tools, you’re almost certainly using the tools incorrectly.
  2. Coding LLMs are working with small context sizes compared to the overall challenge space of keeping tens of thousands of lines of code working from commit to commit. You MUST have the AI write tests for everything it generates. You MUST tag every working checkpoint so that it (the AI) can compare working with non-working.
  3. When you tell an AI tool to “plan”, you are working a different part of the model - or even potentially a completely different model - than when you tell it to “execute”. “Deep thinking” models are better at analyzing code and planning than they are at executing. Sometimes even “dumb” models are better at executing to a plan than the smart ones, and they are certainly CHEAPER.
  4. You MUST break planning from execution, just like you would do if you were writing the code yourself, and you must have the AI write planning files that it can follow. If you or your devs are executing plans with the most expensive models, you’re almost certainly just wasting money.
  5. You must AUDIT the code for every feature cycle.
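Point 2’s checkpoint tagging needs nothing beyond plain git. A minimal sketch, demonstrated in a throwaway repo (the tag name and messages are placeholders; in a real project you would only run the `git tag` line after tests go green):

```shell
# Checkpoint tagging, demonstrated in a throwaway repo. In a real
# project you would only run the `git tag` line after tests go green.
cd "$(mktemp -d)"
git init -q
git config user.email you@example.com
git config user.name you
git commit -q --allow-empty -m "feature X implemented, tests passing"
# Tag the known-good commit so the AI can compare working vs. broken:
git tag -a checkpoint-feature-x -m "all tests green"
git tag -l 'checkpoint-*'
```

Later, `git diff checkpoint-feature-x` shows the agent exactly what changed since the last known-good state.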

My “flow” is basically:

  • Plan (write this as a .md file, save in plans/ directory)
  • Execute (write the code)
  • Write tests
  • Run tests
  • Re-execute as necessary until tests pass
  • Audit - check the code against Plan.md
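For the first step of the flow above, the plan file can be a simple markdown scaffold. The file name and section headings below are my own placeholders, not a prescribed format:

```shell
# Create a minimal plan scaffold in plans/ (name and headings are
# placeholders -- adapt to your own conventions).
mkdir -p plans
cat > plans/2024-06-01_example-feature.md <<'EOF'
# Plan: example feature

## Goal
What the feature should do, in one or two sentences.

## Steps
1. ...

## Tests
- ...

## Audit checklist
- [ ] Code matches this plan
- [ ] All tests pass
EOF
```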

You also don’t need a lot of fancy prompts to do any of this. You can literally write out your high-level goals and then have the tool write the plan, then read the plan back to you, correct its assumptions, then proceed with steps 2-6.

The results will surprise you. GIGO applies to AI coding just as much as everything else. If you are sloppy and undisciplined in how you do it, you’ll get predictably bad results.

I thought this was good. Again, the interesting thing here is creating markdown docs that drive the AI. I would add one additional step:

  1. Update README.md and other docs with proposed feature.
  2. Generate the plans/<YYYY-MM-DD_feature>.md from the doc changes.
  3. Continue as above.
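Step 2 can even be scripted. A hedged sketch, assuming the `claude` CLI’s non-interactive print mode (`-p`); substitute whatever agent CLI you use. The invocation is guarded so the snippet is a no-op when no agent is installed:

```shell
# Sketch: generate plans/<YYYY-MM-DD>_<feature>.md from doc changes.
# The `claude -p` (non-interactive print) call is an assumption --
# substitute your agent's CLI. Guarded so it is a no-op without one.
feature="example-feature"
plan="plans/$(date +%F)_${feature}.md"
mkdir -p plans
if command -v claude >/dev/null 2>&1; then
  git diff -- README.md docs/ |
    claude -p "Write a step-by-step implementation plan for the feature described by this doc diff." > "$plan"
fi
echo "$plan"
```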

Code like a Surgeon

I like this perspective.

Sorry, long one: I’m almost 4 weeks into trying Claude. I’ve had some super efficient workflows and I’ve had some horrible failure workflows using agentic coding. I think all of my workflows would be improved if I could treat Claude more like a pair-programming partner and less like someone that I boss around. But this will require a new way of using an IDE to enable it.

What I want to see starts with something like Claude Code’s existing terminal interface or VS Code IDE plugin, so I can write out longer ideas and give Claude a bigger prompt that’s an overview of what we’re going to do. I might also use this to provide some initial info about a plan for changes I’d like to make. This already exists and it works well.

Then I want Claude to generate its code/docs/tests/whatever and make a git commit on a wip-branch and show me that commit effectively as tabs for each file. The git commit message will include my original request, any back-and-forth conversation we had, and then explain in detail how it interpreted my request and why it made the changes it made. Then I can review the changes.

Within the tabs representing each file in the git commit, I want to be able to interactively select text and start a conversation tree which branches off from either the root of our conversation or another branch of our conversation, so I can request changes or ask more detailed questions. The idea of having it be a conversation tree with branches is so that I can preserve as much context as possible when I’m exploring the agent’s proposed changes. During this, I can also directly edit the changes in the git commit (but this won’t modify the existing commit).

I want this style of work because I’ve found that with the current CLI tools I end up having a variety of feedback on each change the agent proposes, but many times each of my feedback tidbits has nothing to do with the others. Sadly, since it’s a linear conversation, this all fills Claude’s context and hurts agent performance.

Based on my tree of feedback, conversations, and edits, another git commit is then made with all of the new conversation trees embedded in the commit message and any of my changes as actual file changes. Possibly each conversation branch becomes its own git commit, so things are more granular, but my main idea is to capture the conversation that led to the change.

From there, Claude replies by making its changes based on my feedback, and again makes a git commit explaining why. We iterate like this as long as needed, making a HUGE number of git commits. This becomes its own line of context which we can both review.

Once the full set of changes is acceptable, we make a final marker commit marking this line of work as done, and we transition to figuring out how to structure the actual git commits so we can share them with others, like for making a pull request. The full stream-of-consciousness set of commits exists just to preserve context and let us easily see how we’ve iterated; it’s not meant as the final work. We’ll clean up the commits in a similarly iterative way, having conversations about how to structure them and possibly iterating on this process a few times (how to record this iteration isn’t yet clear to me).
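Everything in this flow up to the marker commit is expressible with plain git today; only the conversation-tree UI is missing. A sketch in a throwaway repo (branch name and commit messages are placeholders):

```shell
# The wip-branch flow, sketched in a throwaway repo.
cd "$(mktemp -d)"
git init -q
git config user.email you@example.com
git config user.name you
git commit -q --allow-empty -m "base"
git checkout -q -b wip-refactor   # stream-of-consciousness branch
echo "draft" > notes.txt
git add notes.txt
git commit -q -m "agent: first draft

Request: refactor the parser.
Interpretation: split lexing from parsing.
(Conversation branches would be embedded here.)"
# ...iterate, one commit per round of feedback, then drop a marker:
git commit -q --allow-empty -m "MARKER: wip-refactor accepted"
# Finally, restructure into shareable commits for the PR, e.g.:
#   git rebase -i <base-branch>
git rev-list --count HEAD
```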

If I’m sitting next to a human, pair-programming, this is basically the flow. We have a conversation about what we’re going to do, one person writes some code/docs/tests/whatever, and the other then looks at it, points at it, asks questions, makes edits, and the iteration begins.

The current CLI agentic coding tools work great if the agent can implement what you’re asking on the first or second try. But if you have to iterate tens of times on a non-trivial quantity of code or documentation, it becomes a burden on the human to verify that each set of changes the agent proposes hasn’t screwed something else up compared to the last proposal, and as the conversation grows the agent’s context gets polluted. By having the conversation tree and restricting each conversation branch to a subset of the code, with each change being its own commit, it becomes easier to follow what’s changing in each iteration and for the agent to preserve as much context as possible.


Interesting ideas! I like the pair idea as well.

I wondered if there are any tools out there that do anything like this:

https://www.perplexity.ai/search/are-there-any-ai-tools-that-im-b54A7CIzSjmPhmDLw7H9Rg#0

One of my associates likes writing pseudo code and comments in the actual code to drive AI. This is a slightly more precise way to give guidance and gets you closer to interacting at the code level. (Jack, you care to comment on this?)

I should add a command to my plugin to support pseudo/comment driven changes.

Perhaps the flow could be:

  1. Document
  2. Plan
  3. Implement tests (TDD)
  4. Pseudo (modify/improve tests by inserting comments/pseudo code)
  5. Implement
  6. Pseudo (modify actual code by inserting comments/pseudo code)
  7. Get tests passing
  8. Update docs/plan (to reflect actual implementation)

I find the more I can avoid prompting, the better the experience.

I am a big fan of state machines, especially as a way of visualizing how my inputs should map to my outputs. Typically, I break down a task as follows (note, this is me, not Claude):

  • What is the problem I am trying to solve

  • What are my inputs and outputs (or a subset of them to visualize)

  • What is the smallest, repeatable unit I can create

    • i.e. is there a function that I can create that will be used 10 times, versus rewriting the same block of code over and over again
  • What are these small, repeatable units’ inputs and outputs

Now, I have the ability to write pseudocode as to how this will work! Note that I still documented the problem I am solving in step 1 (bullet point 1? oops). This can go into a documentation markdown file if it is important enough, or it can get passed along to Claude as context, where we ask it to update the README as step 1.

Next, I write my Pseudocode. I do full function definitions, explanations of what that function should accomplish, what the input and output is, and some example cases. The advantage of laying this out within the file is you can put // START FUNCTION X and // END FUNCTION X, and Claude will keep all code related to that function it is adding between those comments.

Finally, I go to Claude, and typically I give it something along the lines of:

I am trying to implement X feature, the required inputs and outputs are described in @Y.md, and the function definitions are outlined in @Z.c . Please implement tests for function W (I ran out of letters) to ensure that we catch all edge cases. Then, please implement function W and confirm it passes the tests you wrote. When you are finished outlining all of the edge cases, please create a table for me in temp.md so I can ensure all cases are covered.

By breaking it down for Claude so that I am not requiring it to one-shot an entire feature, but rather a single function, I have found:

  1. Claude produces less redundant, repeat code

  2. Claude can easily implement 1 function VERY well, when it struggles to implement 20 at once

  3. I, as the coder, wrote zero actual code, but I understand exactly how the functions work

  4. If I need to review logic and jump in to help, I am not attempting to understand ALL of this code that Claude just generated, but rather just that one function.

Just my workflow, but I feel like it works pretty well, and as someone who reached professional age when these tools were extremely prevalent, it is how I rationalize not memorizing any one language, but rather memorizing how their architectures allow us to simplify things (like state tables in C, or components in React).

Jack

Hey @jsnapoli1, welcome to TMPDIR!

I was wondering about tags in the Pseudocode so Claude knows where to find it. I’m thinking of something similar in the upcoming pseudo command in my Claude plugin.

You might be able to wrap this up in a Claude command – tell it to use git diff to find the doc changes and apply the pseudocode automatically, without having to name out the files.

I wonder if you could give it order too. Tell it to implement this function first, then this one, and so on and so forth.

You could probably number the tags in the pseudo code sections, and put instructions in the pseudo command to do them sequentially and then stop after every function (or step) is implemented for inspection. I like “step” as not everything fits into a function.

And the flow might look like:

  • /pseudo
  • inspect and fixup code, write more pseudocode
  • /pseudo
  • inspect and fixup code, write more pseudocode
  • … repeat as long as necessary to implement the feature

It seems vibe coding is having a reality check


This is a concept I’ve had in mind, but haven’t been able to verbalize very well.

The irony is that we throw out programming languages, and then have to re-create them to express and organize complex concepts …

A “pragmatic” view on AI …

Steve claims that AI Coding works very well if you know how to keep it on the rails. He also advocates the terminal apps. I agree.

And as an experienced developer, like, it’s amazing.

And now I understand why Kent Beck is saying in 52 years he’s never felt this good about or this excited about writing code.

A lot of your listeners listening to this right now have no idea what you’re talking about because they don’t.

They haven’t actually tried the terminal app versions of these things like Sourcegraph Amp and Claude Code and Codex from OpenAI or Cline, right?

You know, and by the way, Cline is going to start taking on real, real, real importance, being able to run local models, as soon as local models reach where Claude Sonnet is today.

Because Claude Sonnet is very viable if you keep it on the rails.

Because look, let’s face it, the reason people are screwing this up and saying this doesn’t work and I don’t understand why AI works and all these stories are BS.

Gergely (01:07:10): “These AI agents can write a lot of code. And I’m wondering — is it good code? Is it the code that you actually want?”

Steve: Nobody’s born knowing how to do it. It’s completely new to humanity to have this sort of human but non-human, distinctly different helpers. And the best advice that I can possibly give you is to give them the tiniest task, the most molecularly tiny segmented task you can give them, one at a time. And if you can find a way to make it smaller, do that. Keep it real careful, track with them on what they’re working on at all times, and then own every line of code that they’ll ultimately commit.

You cannot trust anything. And that means multiple safeguards and guardrails and sentries and security and practices. And you have to train yourself to say the right things, do the right things, and look for the right things. And it is not easy.

It has reinforced my belief that people who are really good developers are going to thrive in this new world because it takes all of your skills to keep these things on the rails.

This article includes numerous quotes from industry; it appears there was an inflection point in late 2025 with AI coding.