Marek Rosa – Dev Blog: Marek’s Dev Diary: June 5, 2025

What is this

Every Thursday, I will share a dev diary about what we have been doing over the past few weeks. I will focus on the interesting challenges and solutions I have encountered. I won’t be able to cover everything, but I will share what caught my interest.

Why do I do it

I want to bring our community along on this journey, and I enjoy writing about the things I’m passionate about! This is my unfiltered dev journal, so please keep in mind that what I write here are my own thoughts, and that it may be outdated by the time you read it, because many things change quickly. Any plans I mention are not set in stone, and everything is subject to change. Also, if you don’t like spoilers, don’t read this.

A few months ago, I restructured my schedule to alternate weekly between Space Engineers 2 and AI People. This approach lets me focus fully on each project. I’m the kind of person who needs to dig deep into something rather than switch context every few hours.

AI People

This week was dedicated to AI People, specifically to exploring AI-assisted programming (what we call “vibe coding”) as a way to accelerate the development of our AI NPCs. However, instead of diving into the planned features, we spent the week improving our methodology.

Our experience with Cursor + Opus 4 and Gemini Pro 2.5 revealed a frustrating pattern. The initial progress seems promising, but then you hit a wall: you request a change, the AI modifies the code, you skip the review and test directly, you find new bugs or no fix at all, and you repeat. Hours later, you realize you are going nowhere.

The main issue? Current AI agents approach software engineering in a fundamentally different way than experienced developers do.

How AI agents work today: you describe a feature → the AI reasons briefly and finds some files → implements the change → presents the result.

How expert developers actually work:

  1. Fully understand the requirements and context
  2. Study the relevant code thoroughly (nothing more, nothing less)
  3. Break complex changes into testable parts
  4. Implement with constant awareness of the ripple effects
  5. Review for edge cases and unintended consequences
  6. Update all affected elements – comments, references, documentation, architecture diagrams
  7. Write comprehensive tests and iterate based on the results
  8. Run the system for real-time debugging and inspection

Current tools cannot replicate this workflow. They also lack access to the game runtime, cannot read diagnostic traces, and miss the holistic view that produces great code.
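To make the contrast concrete, here is a minimal sketch of the expert workflow as an ordered loop that iterates when a step fails. It is an illustration only, not the agent described in this post: the step names, the `StepResult` type, and the stubbed behavior (tests failing on the first attempt) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class StepResult:
    name: str
    ok: bool


# A step receives the shared context and reports whether it succeeded.
Step = Callable[[Dict], StepResult]


def run_workflow(steps: List[Step], context: Dict,
                 max_iterations: int = 3) -> Tuple[bool, List[StepResult]]:
    """Run the steps in order; if one fails, start another pass.

    Returns (succeeded, history of every step result).
    """
    history: List[StepResult] = []
    for _ in range(max_iterations):
        for step in steps:
            result = step(context)
            history.append(result)
            if not result.ok:
                break  # abandon this pass and iterate again
        else:
            return True, history  # every step passed in this pass
    return False, history


def understand(context: Dict) -> StepResult:
    return StepResult("understand requirements", True)


def implement(context: Dict) -> StepResult:
    context["attempts"] = context.get("attempts", 0) + 1
    return StepResult("implement", True)


def run_tests(context: Dict) -> StepResult:
    # Hypothetical stub: the first attempt fails its tests, the second passes.
    return StepResult("run tests", context["attempts"] >= 2)


def review(context: Dict) -> StepResult:
    return StepResult("review", True)


ok, history = run_workflow([understand, implement, run_tests, review], {})
print(ok, [r.name for r in history if not r.ok])
```

The point of the sketch is that testing and review are steps inside the loop rather than something the human does after the agent declares itself finished.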

This realization led us to explore creating our own SWE agent. We are studying Claude Code, which implements some of these concepts plus extra capabilities such as sub-agents.

Major insights from this research:

LLMs already excel within their domain. Yes, they still have gaps and can only handle minutes-long tasks rather than multi-day projects, but within their scope? Opus writes a perfect Tetris game in 4 seconds, a task that would take me days. Intelligence is not the bottleneck; the scaffolding is.

When AI programming fails, it is rarely the model’s fault. It is the inadequate tooling around it. I am sure that 2025 will bring revolutionary upgrades: specialized agents for each part of the SWE loop (research, coding, testing, reviewing, evaluating, validation, etc.); sophisticated code search and indexing; intelligent test automation; multimodal feedback loops. Imagine Gemini analyzing gameplay video to fix bugs autonomously.

We are validating our approach on small codebases and design documents. Design docs are especially revealing: text changes are easier to track than code changes, so defective agent behavior shows up immediately.

Case in point: I asked Cursor to reformat the log specifications in our design doc. It updated one section, missed another, left duplicates, and never reviewed its work. In a text document, these errors jump out immediately. In code, they would hide among thousands of lines. Classic junior developer behavior: making changes without checking their effect.
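The kind of self-review that was skipped here is cheap to automate for text. The sketch below assumes a hypothetical old log format marked with `LOG:` and a new one marked with `[log]` (neither comes from the actual design doc) and checks for the two failure modes mentioned: leftover old-format lines and duplicated paragraphs.

```python
from collections import Counter
from typing import List


def find_duplicate_paragraphs(text: str) -> List[str]:
    """Return paragraphs that appear more than once, ignoring case and spacing."""
    paragraphs = [" ".join(p.split()).lower() for p in text.split("\n\n")]
    counts = Counter(p for p in paragraphs if p)
    return [p for p, n in counts.items() if n > 1]


def find_leftovers(text: str, old_marker: str) -> List[int]:
    """Return 1-based line numbers that still contain the old format marker."""
    return [i for i, line in enumerate(text.splitlines(), start=1)
            if old_marker in line]


# Hypothetical document after a partial reformat: one entry was missed
# and one entry was duplicated.
doc = "[log] player joined\n\nLOG: player left\n\n[log] player joined\n"

duplicates = find_duplicate_paragraphs(doc)
leftovers = find_leftovers(doc, "LOG:")
print(duplicates, leftovers)
```

An agent that ran checks like these on its own output before reporting back would have caught both problems without a human reading the document.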

What about cost? Sure, spending $100 a day on tokens seems expensive. But what if the AI delivers in one day what would take me two weeks? Then $100 is cheap. Plus you iterate in hours instead of weeks. A clear win.
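The cost argument can be turned into a quick break-even calculation. This is a rough sketch under stated assumptions: two weeks is taken as 10 working days, the developer still spends the AI-assisted day supervising, and the function name and parameters are made up for illustration.

```python
def break_even_daily_rate(token_cost_per_day: float,
                          manual_days: float,
                          ai_days: float) -> float:
    """Developer daily cost above which the AI-assisted route is cheaper.

    Manual cost:   manual_days * rate
    Assisted cost: ai_days * (rate + token_cost_per_day)
    Setting the two equal and solving for rate gives the break-even point.
    """
    if manual_days <= ai_days:
        raise ValueError("the AI-assisted route must be faster for a break-even to exist")
    return ai_days * token_cost_per_day / (manual_days - ai_days)


# $100 of tokens for one day versus an assumed 10 working days of manual work.
rate = break_even_daily_rate(token_cost_per_day=100, manual_days=10, ai_days=1)
print(round(rate, 2))
```

Under these assumptions the token spend pays for itself as soon as a developer day costs more than about $11, which is why the $100 figure looks small next to two weeks of work.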

We are not there yet, but I am confident that this year will bring breakthroughs. Once we crack this, we will dramatically accelerate the development of AI People, run parallel experiments, and iterate at an unprecedented pace.

Space Engineers 2

Given my AI agent focus this week, my SE2 time went into writing a vision document: a comprehensive guide that defines requirements, constraints, and KPIs for the team.

Our North Star: make SE2 mainstream while delivering 10x improvements in every dimension: art, code, design, quality, performance.

Key insight: SE2 will match or exceed the complexity of SE1, but we wrap it in an accessibility layer. New players start with intuitive, manageable systems. The complexity reveals itself progressively as their skills develop.

We are also prioritizing engaging gameplay loops and meaningful progression. The complexity remains, but it becomes something fun to master rather than something you are forced to confront.

https://www.youtube.com/watch?v=9fcbcqn-vbw
