My AI Workflow: Skills, Agents, and a Second Brain
How I structure my daily work with Claude Code: skills that become a pipeline, scientific TDD with agents, and an Obsidian vault that remembers what the AI forgets.
Let me guess how you use AI day to day. You open a chat, paste a chunk of code, ask for a function and copy the result. It works, right? Gets you unstuck and you move on.
The thing is, that is not a workflow. It is a trick. And I was stuck in that trick for a good while until something clicked: the real gain is not in the clever prompt. It is in the structure around it.
This is the workflow I put together at TryTech after more than a year of trial and error, and I did not invent any of it from scratch. I kept pulling in ideas from workflows I saw colleagues using, keeping what worked for me and dropping the rest, until it turned into this setup. Three pieces: a written contract, skills that become a pipeline, and an external memory that does not die when the session closes.
The contract: CLAUDE.md
It starts with a text file, simple as that. Claude Code reads a global CLAUDE.md every time it boots, and mine is basically an engineering contract: ruthless minimalism (the best code is the code you don’t write), small commits written in the imperative mood, validation only at the system boundary, and the golden rule of scientific TDD, which I’ll get to in a second.
You know what this fixes? I stop repeating instructions. The AI arrives already knowing that I’d rather delete than add, that I tolerate duplication until the third time, and that questions get asked all at once, not drip by drip.
Think about it: if you use AI every day and still type “don’t write obvious comments” every single week, you are paying a pointless toll. Write it once, in a file the tool always reads, and you’re done.
Skills: workflows with a name
The heart of the setup is the skills. In Claude Code they work like commands that carry a whole workflow. Instead of explaining the process again every time, I call the process by name. The ones I use the most form a closed development cycle:
/po puts on the Product Owner hat: it explores the codebase, asks the right questions and writes a PRD focused on product requirements, not technical solutions.
/dev takes that PRD (or an issue, or a loose prompt) and implements it in strict TDD, in three modes: a pair of agents (one drives, the other navigates and keeps it honest), solo, or pairing with me at the keyboard.
/review spins up reviewers in parallel for security, performance and quality, and on top of that throws in a red team auditor that reviews the reviewers, hunting for false positives and inflated severity.
/qa acts like a second dev: it boots the app, exercises what was built in the browser and validates real behavior, not just green tests.
/commit wraps it all up with small, conventional commits, written the way a person writes them.
And look, there is no magic here. It is the old discipline of process, just made executable. Each skill is a markdown file describing how I would work if I had infinite time. The difference is that now the AI has that time.
Scientific TDD, or: how not to fall for your own AI
Yes, I know what you are thinking: AI writing the test and the implementation at the same time is the fox guarding the henhouse. I agree with you. That is exactly why the rules exist.
The biggest danger of coding with AI is not that it gets things wrong. It is that it gets them wrong with confidence, handing you something pretty and broken with a straight face. My defense is a TDD that smells like the scientific method: the failing test is the hypothesis, the implementation is the experiment, and there is a step almost nobody does, which is reverting the fix after green just to confirm the test fails again. If it does not go red again, it never proved anything. Ever.
Two rules close the game. Each step changes the test or the production code, never both at once, because that way one side stays honest. And if the bug cannot be reproduced, the session stops and I take over. There is no guessing here.
Want a concrete example? In a recent session on our C web server, this turned into a recipe: start with a stub that returns -1, so the test compiles, runs and fails on behavior. That is a real RED. Why bother? Because a test that fails only because a header is missing and nothing compiles is too weak a RED to trust.
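To make that recipe less abstract, here is a minimal sketch of the red, green, red-again cycle in C. It is not the code from that session: the function `parse_content_length`, its stub, the test and the `red_green_red` helper are all hypothetical names I made up for illustration. In real life the stub and the implementation are the same function at two different commits, and the "revert" is an actual revert; here both versions sit side by side so the whole cycle fits in one file.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical function under test: parse a Content-Length value.
   Contract (assumed): returns the length, or -1 on malformed input. */
typedef long (*parse_fn)(const char *value);

/* Step 1: the stub. It compiles and links, so the test can run
   and fail on behavior instead of failing at build time. */
static long parse_content_length_stub(const char *value) {
    (void)value;
    return -1;
}

/* Step 2: the implementation, written in its own step,
   without touching the test. */
static long parse_content_length(const char *value) {
    char *end = NULL;
    long n;

    if (value == NULL || *value == '\0') {
        return -1;
    }
    errno = 0;
    n = strtol(value, &end, 10);
    if (errno != 0 || *end != '\0' || n < 0) {
        return -1; /* overflow, trailing garbage or negative length */
    }
    return n;
}

/* The hypothesis: returns 1 when every expectation holds, 0 otherwise. */
static int test_content_length(parse_fn parse) {
    return parse("42") == 42
        && parse("0") == 0
        && parse("abc") == -1
        && parse("-5") == -1;
}

/* Runs the full cycle and returns 1 only if the test was red with the
   stub, green with the implementation, and red again after the
   simulated revert. */
int red_green_red(void) {
    int red = !test_content_length(parse_content_length_stub);
    int green = test_content_length(parse_content_length);
    int red_again = !test_content_length(parse_content_length_stub);

    printf("red=%d green=%d red_again=%d\n", red, green, red_again);
    return red && green && red_again;
}
```

Calling `red_green_red()` from a `main` gives you the whole experiment in one run. The part that matters is the third check: if the test stays green with the stub back in place, it was never testing the behavior in the first place.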
The second brain: an Obsidian vault
AI has the memory of a goldfish between one session and the next, and pretending otherwise gets expensive. My way out was to glue the workflow to an Obsidian vault, with three skills looking after it. /note captures an idea or a TIL the moment it shows up, /recap closes each session by recording what I learned, what I decided and what stayed open, and /vault brings all of it back when I need it.
The recap is the silent hero of the story. Every session that matters ends in a structured file: context, what we learned with the technical details done properly, what was decided and why, and the loose threads. Then two weeks go by, I come back to the same subject, and the AI does not start from scratch. It reads the recap and picks up the conversation right where we left off.
You know what? That is how this very post was born. I asked for ideas, and the first thing the AI did was dig through the recent recaps in the vault.
What changes in the end
Let’s be honest: none of this retires the engineer, and I distrust anyone who promises you otherwise. What changes is where my time goes. Writing PRDs, building test suites, reviewing PRs on three fronts, documenting sessions, all of that became executable process. And what is left for me is what was always the real job: deciding what to build, weighing trade-offs and saying no to complexity nobody asked for.
If you find yourself using AI every day with that feeling that you are always starting over, the advice is just one. Stop collecting prompts and start writing your workflows. One file at a time.
Love to you all.