Software Development Agents: What Works and What Doesn't - Robert Brennan, AllHands/OpenHands
Introduction & Changing Role of Software Engineers 00:01
Coding agents can be very effective but also have notable limitations and challenges.
The job of software engineers is undergoing rapid change; routine code writing is diminishing as AI takes on more of the development process.
The core value of engineers is increasingly about critical thinking, solving high-level problems, and architectural planning, rather than just writing code.
AI agents are strong at the iterative code-run loop but weaker at empathizing with users or aligning with big-picture business goals.
The term "agent" implies autonomy: taking actions in the real world.
Agents simulate the core tools of a software engineer—code editor, terminal, and web browser—to complete development cycles.
Tooling has evolved from contextual autocomplete (e.g., GitHub Copilot) to more autonomous agents like OpenHands, which perform complex, multi-step tasks from brief instructions.
This shift offers greater efficiency and allows developers to delegate multiple tasks.
Agents are structured as loops between a large language model (LLM) and the external environment.
The LLM interprets tasks, takes actions (e.g., editing code, running commands, browsing web pages), receives feedback, and iterates until the goal is met.
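As a rough illustration, here is a minimal sketch of that loop in Python; the `llm_complete` helper and the tool names are hypothetical placeholders, not the actual OpenHands API:

```python
import json

def llm_complete(messages: list[dict]) -> dict:
    """Hypothetical LLM call; returns {'done': True} or {'action': ..., 'args': ...}."""
    raise NotImplementedError

def run_agent(task: str, tools: dict, max_steps: int = 50) -> None:
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        step = llm_complete(messages)   # LLM interprets the task and picks an action
        if step.get("done"):            # stop once the goal is met
            return
        # e.g. edit_file, run_command, browse
        observation = tools[step["action"]](**step["args"])
        # Feed the environment's response back so the model can iterate
        messages.append({"role": "assistant", "content": json.dumps(step)})
        messages.append({"role": "user", "content": f"Observation: {observation}"})
```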
Efficient code editing uses diff-based or "find and replace" approaches rather than regenerating entire files.
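A toy version of the "find and replace" style of edit; the helper and its uniqueness check are assumptions for illustration, not OpenHands's actual editor:

```python
def apply_edit(path: str, old: str, new: str) -> None:
    """Swap one exact snippet instead of regenerating the whole file."""
    with open(path, encoding="utf-8") as f:
        text = f.read()
    # Require exactly one match so an ambiguous snippet can't corrupt the file
    if text.count(old) != 1:
        raise ValueError(f"expected exactly one occurrence of the snippet in {path}")
    with open(path, "w", encoding="utf-8") as f:
        f.write(text.replace(old, new))
```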
Terminal interaction brings challenges like handling long-running commands or parallel tasks.
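One common mitigation is to bound each command's runtime and output before the result reaches the model; a simplified sketch (real agents typically keep a persistent shell session rather than making one-shot calls):

```python
import subprocess

def run_command(cmd: str, timeout_s: float = 30.0) -> str:
    try:
        result = subprocess.run(cmd, shell=True, capture_output=True,
                                text=True, timeout=timeout_s)
        output = result.stdout + result.stderr
    except subprocess.TimeoutExpired:
        # Surface the timeout to the LLM instead of blocking the loop forever
        output = f"[no exit after {timeout_s}s; consider running it in the background]"
    return output[-4000:]  # keep only the tail so the context window stays small
```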
For web browsing, performance improves by parsing only essential content (e.g., accessibility trees, markdown) or interpreting labeled screenshots for interaction.
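For example, a page can be stripped to plain text before it enters the context window; a rough sketch using requests and BeautifulSoup (production systems often rely on the browser's accessibility tree instead):

```python
import requests
from bs4 import BeautifulSoup  # pip install requests beautifulsoup4

def page_to_text(url: str, max_chars: int = 4000) -> str:
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    for tag in soup(["script", "style", "nav", "footer"]):  # drop non-content markup
        tag.decompose()
    text = " ".join(soup.get_text(separator=" ").split())
    return text[:max_chars]  # bound what the LLM has to read
```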
Sandboxing (using Docker containers) is essential to prevent agents from causing harm or accessing unintended resources.
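A minimal sketch of that kind of isolation, assuming Docker is installed; the specific resource limits are illustrative, and real sandboxes layer on more protection:

```python
import subprocess

def run_in_sandbox(cmd: str, image: str = "python:3.12-slim") -> str:
    """Execute an agent command inside a throwaway, resource-capped container."""
    docker_cmd = [
        "docker", "run", "--rm",
        "--network", "none",   # no outbound network access
        "--memory", "512m",    # cap memory
        "--cpus", "1",         # cap CPU
        image, "sh", "-c", cmd,
    ]
    result = subprocess.run(docker_cmd, capture_output=True, text=True, timeout=120)
    return result.stdout + result.stderr
```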
As agents interact with third-party APIs, it's crucial to tightly scope credentials and follow the principle of least privilege.
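In practice that can mean allow-listing exactly which secrets cross into the sandbox; a hypothetical sketch in which the variable names and image are placeholders:

```python
import os
import subprocess

def scoped_env(allowed: set[str]) -> dict[str, str]:
    """Pass only explicitly allow-listed secrets into the agent's environment."""
    return {k: v for k, v in os.environ.items() if k in allowed}

# e.g. a fine-grained GitHub token limited to one repo, rather than a broad PAT
env = scoped_env({"GITHUB_FINE_GRAINED_TOKEN"})
subprocess.run(
    ["docker", "run", "--rm",
     "-e", f"GITHUB_TOKEN={env.get('GITHUB_FINE_GRAINED_TOKEN', '')}",
     "agent-image"],  # hypothetical image name
    check=False,
)
```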
Over time, as familiarity grows, agents can be trusted with larger and more complex tasks.
Giving clear, detailed instructions—including desired methods, frameworks, files, or functions—improves agent performance and efficiency.
Embrace a "code is cheap" mindset: rapidly prototype, discard failed attempts, and iterate freely.
If agent output is off-target, it's best to restart with a fresh prompt rather than tweaking unsuccessful results.
Always review agent-generated code to avoid unchecked technical debt and codebase degradation.
Maintain a human in the loop for accountability—ensure individuals take ownership of AI-generated pull requests and are responsible for merging and resolving issues.