AI-Optimised codebases

Of late, I've been going full throttle on AI coding. You could say I've been pushing the limits of what's possible with the current models, and running into their limitations.

These are things you can only figure out through intuition and experience; benchmarks don't tell the whole story.

As my codebases have grown beyond a certain size, I've noticed that even the best models (Claude 3.5, Claude 3.7, OpenAI o1-pro) start to struggle.

And this made me think about AI-optimised codebases.

How can we structure a codebase to make it easy for an LLM to work with? Code architecture and developer experience are something I've worked on and advocated for in the past.

I feel like now we're reaching a point where crafting codebases that are easy for AIs to work with will become a priority for many individuals and organisations. Because sure, you can brute force the AI to get the desired results, and models will keep getting better, but you still want to optimise for costs and speed.

A few thoughts here on what an AI-optimised codebase could look like:

  1. Small files. Seriously, this is very important. If you're using ChatGPT or Claude via their apps, you do not want to be manually applying the changes they suggest, and if your files are too large, the models simply cannot output the full file, so it's going to be a struggle. Even in Cline, which uses diff-based edits by default, all AI models seem to struggle more the larger a file gets.
  2. Clear separation of concerns. You want unrelated code to live in unrelated places. You don't want inadvertent changes to unrelated things; you do not want collateral damage. When an AI is focused on one task and ends up changing something unrelated, you might not notice it, you might not test for it, and you'll only realise when it's too late and you're deep in a hole.
  3. Comments. Write comments about the WHY, not the WHAT. Just like a developer, an AI can look at the code and understand WHAT it is doing. What it can't see is the intent, so write JSDoc-style comments about what a function or file is for and what its purpose is, so that the AI has that important context when it's editing the file. It's like a small inline hint for the AI, and it can guide the AI to edit the file within the necessary constraints.
  4. Clean code. AI-generated codebases can become a mess if you're not careful. That's not a problem, until it is. You want to stay proactive about keeping a clean architecture, and you might want to use the Plan / Architect modes within the IDEs to plan out the changes at a high level before you implement them.
  5. Start new sessions often. As a chat goes on for longer, model performance seems to deteriorate: models start taking more liberties with your instructions, and become more "creative" and forgetful. Keep sessions focused. Sometimes you might need to break a single feature update down into multiple sessions, especially if the complexity is high.
  6. One thing at a time. Don't ask the AI to fix multiple things in one go. Yes, it'd be cheaper, but the results would be worse. Today's AI models are not that great at handling multiple edit commands in parallel, especially with complex or convoluted codebases.
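
To make point 3 a bit more concrete, here's a rough sketch of what a WHY comment could look like in TypeScript. The function name `roundToCents` and the constraint described in the comment are made up purely for illustration; the point is the shape of the comment, not the function itself.

```typescript
// A WHAT comment only restates the code, so it adds nothing:
// "Multiplies by 100, rounds, divides by 100."

/**
 * Rounds a monetary amount to whole cents.
 *
 * WHY (hypothetical constraint, for illustration): totals are compared for
 * equality elsewhere, so floating-point noise such as 0.1 + 0.2 must not
 * leak into stored values. Keep this function pure and synchronous.
 */
function roundToCents(amount: number): number {
  return Math.round(amount * 100) / 100;
}

console.log(roundToCents(0.1 + 0.2)); // 0.3
```

The second comment tells the AI what it must not break when it edits the function, which is something it can't infer from the body alone.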

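For point 1, one way to keep yourself honest is a small check that flags files once they grow past some line count. This is only a sketch: `SourceFile`, `findOversizedFiles` and the 300-line default are all assumptions of mine, not a recommended limit, and you'd need to feed it real file contents from your own tooling.

```typescript
// Hypothetical threshold; tune it to your models and tooling.
const MAX_LINES = 300;

interface SourceFile {
  path: string;
  content: string;
}

// Returns the paths of files whose line count exceeds maxLines.
function findOversizedFiles(
  files: SourceFile[],
  maxLines: number = MAX_LINES
): string[] {
  return files
    .filter((file) => file.content.split("\n").length > maxLines)
    .map((file) => file.path);
}
```

Anything it flags is a candidate for splitting up before you hand it to a model.
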
These are my thoughts on the topic so far, but I'd love to hear your thoughts and tips. If you have any, hit me up over email (it's on my about page) 😄