← AI Coding Guides β€Ί Deep Dives

Build Your Own AI File Editor

A practical guide with code examples. Implement fallback cascades, matching algorithms, and the verification loop that makes AI file editing robust.

The Golden Rule: Never Fail on First Try

LLMs are non-deterministic. They will hallucinate whitespace, forget context, and misquote code. Your file editing system must handle these variations gracefully.

πŸ”‘

The Fallback Ladder

Implement your editing logic as a waterfall. Each tier is a fallback for the previous one:

  1. Exact Match β€” Fast, safe, preferred
  2. Whitespace Flexible β€” Handle indentation differences
  3. Anchor Matching β€” Match first/last lines, fuzzy middle
  4. Diff/Patch β€” Use diff-match-patch libraries
  5. Full Overwrite β€” Nuclear option, always works

Alternative approach β€” Hash-Backed Anchors: To eliminate the fallback cascade entirely, move the model onto an anchor language. Dirac does this with persistent word anchors; Oh My Pi does it with compact line-plus-hash markers. In both cases the model references stable anchors directly instead of describing code to find.

Step 1: Define Your Tools

Provide your AI with two distinct tools: one for surgical edits, one for full rewrites.

Tool A: replace_in_file (Preferred)

XML Format
<replace_in_file>
    <path>src/app.py</path>
    <diff>
        <<<< SEARCH
        def calculate_total(items):
            return sum(item.price for item in items)
        ====
        def calculate_total(items):
            subtotal = sum(item.price for item in items)
            tax = subtotal * 0.1
            return subtotal + tax
        >>>> REPLACE
    </diff>
</replace_in_file>

Tool B: write_to_file (Fallback / Creation)

XML Format
<write_to_file>
    <path>src/utils.py</path>
    <content>
def helper():
    """Helper function."""
    return True
    </content>
</write_to_file>
πŸ’‘

Why XML Over JSON?

Code inside JSON strings requires escaping (\", \n, \\). This is error-prone for LLMs and wastes tokens. XML lets you embed code content directly without escaping.

Step 2: Implement Matching Algorithms

Tier 1: Exact Match

The happy path. Fast and reliable when it works.

Python
def exact_match(file_content: str, search: str, replace: str) -> str | None:
    """Try exact string replacement."""
    if search in file_content:
        return file_content.replace(search, replace, 1)  # Replace first occurrence
    return None  # Failed, try next tier

Tier 2: Whitespace Flexible

Normalize whitespace before matching, but preserve original indentation in output.

Python
def whitespace_flexible_match(file_content: str, search: str, replace: str) -> str | None:
    """Match after normalizing whitespace."""
    file_lines = file_content.split('\n')
    search_lines = search.split('\n')
    
    # Trim each line for comparison
    search_trimmed = [line.strip() for line in search_lines]
    
    for i in range(len(file_lines) - len(search_lines) + 1):
        file_chunk = file_lines[i:i + len(search_lines)]
        file_trimmed = [line.strip() for line in file_chunk]
        
        if file_trimmed == search_trimmed:
            # Match found! Calculate indentation from original
            base_indent = len(file_lines[i]) - len(file_lines[i].lstrip())
            indent = ' ' * base_indent
            
            # Apply indentation to replacement
            replace_lines = replace.split('\n')
            indented_replace = [indent + line if line.strip() else line 
                               for line in replace_lines]
            
            # Reconstruct file
            return '\n'.join(
                file_lines[:i] + 
                indented_replace + 
                file_lines[i + len(search_lines):]
            )
    
    return None  # Failed, try next tier

Tier 3: Anchor Matching

Match first and last lines as "anchors", fuzzy-match the middle. Based on OpenCode/Cline.

Python
def anchor_match(file_content: str, search: str, replace: str, 
                 similarity_threshold: float = 0.5) -> str | None:
    """Match using first/last lines as anchors."""
    file_lines = file_content.split('\n')
    search_lines = search.split('\n')
    
    if len(search_lines) < 3:
        return None  # Need at least 3 lines for anchor matching
    
    start_anchor = search_lines[0].strip()
    end_anchor = search_lines[-1].strip()
    expected_length = len(search_lines)
    
    # Find start anchor
    for i, line in enumerate(file_lines):
        if line.strip() != start_anchor:
            continue
            
        # Look for end anchor within reasonable range
        for j in range(i + 2, min(i + expected_length * 2, len(file_lines))):
            if file_lines[j].strip() != end_anchor:
                continue
            
            # Found potential match. Verify middle similarity.
            file_middle = file_lines[i+1:j]
            search_middle = search_lines[1:-1]
            
            if calculate_similarity(file_middle, search_middle) >= similarity_threshold:
                # Match confirmed! Replace the block.
                replace_lines = replace.split('\n')
                return '\n'.join(
                    file_lines[:i] + 
                    replace_lines + 
                    file_lines[j+1:]
                )
    
    return None  # Failed, try next tier


def calculate_similarity(lines_a: list, lines_b: list) -> float:
    """Calculate similarity between two line lists (0.0 to 1.0)."""
    if not lines_a or not lines_b:
        return 0.0
    
    # Simple token-based Jaccard similarity
    tokens_a = set(' '.join(lines_a).split())
    tokens_b = set(' '.join(lines_b).split())
    
    intersection = len(tokens_a & tokens_b)
    union = len(tokens_a | tokens_b)
    
    return intersection / union if union > 0 else 0.0

Tier 4: Diff-Match-Patch

Use Google's diff-match-patch library for fuzzy patching.

Python
from diff_match_patch import diff_match_patch

def dmp_match(file_content: str, search: str, replace: str) -> str | None:
    """Use diff-match-patch for fuzzy matching."""
    dmp = diff_match_patch()
    
    # Create a patch from search β†’ replace
    patches = dmp.patch_make(search, replace)
    
    # Try to apply it to the file
    result, success = dmp.patch_apply(patches, file_content)
    
    if all(success):
        return result
    return None  # Some patches failed

The Master Function

Chain all tiers together:

Python
def apply_edit(file_content: str, search: str, replace: str) -> tuple[str, str]:
    """Apply edit with fallback cascade. Returns (new_content, method_used)."""
    
    # Tier 1: Exact match
    result = exact_match(file_content, search, replace)
    if result is not None:
        return result, "exact_match"
    
    # Tier 2: Whitespace flexible
    result = whitespace_flexible_match(file_content, search, replace)
    if result is not None:
        return result, "whitespace_flexible"
    
    # Tier 3: Anchor matching
    result = anchor_match(file_content, search, replace)
    if result is not None:
        return result, "anchor_match"
    
    # Tier 4: Diff-match-patch
    result = dmp_match(file_content, search, replace)
    if result is not None:
        return result, "diff_match_patch"
    
    # All tiers failed
    raise EditFailedError(
        f"Could not match search block. "
        f"Tried: exact, whitespace, anchor, dmp. "
        f"Search block:\n{search[:200]}..."
    )

Alternative Path: Strict Exact-Match Contracts

You do not always need a fallback ladder. ADK-Rust shows the opposite design: keep the editor contract sharp, fail immediately on ambiguous matches, and let the model retry with better context.

Python
def strict_replace_once(file_content: str, old: str, new: str) -> str:
    matches = file_content.count(old)

    if matches == 0:
        raise EditFailedError("old_str not found in file")

    if matches > 1:
        raise EditFailedError(
            "old_str appears multiple times; provide a more specific match"
        )

    return file_content.replace(old, new, 1)
🧭

When this works well

Frameworks that wrap provider-native tools often prefer this contract. It reduces local complexity, avoids risky guesses, and gives the model a crisp error to recover from.

Alternative Path: Hashline Anchors

If you want something more drift-resistant than exact old/new text, but lighter than a persistent anchor database, Oh My Pi's hashline pattern is a strong middle path. Read the file in an anchored format, let the model target those anchors, and allow a small local rebase window before you declare failure.

Python (illustrative)
BIGRAMS = load_anchor_bigrams()  # 647 compact tokens

def hashline_for(line_no: int, text: str) -> str:
    normalized = significant_content(text)
    short_hash = xxhash32(normalized) % len(BIGRAMS)
    return f"{line_no}~{BIGRAMS[short_hash]}|{text}"


def apply_hashline_edit(lines: list[str], anchor: str, new_text: str) -> list[str]:
    line_no, short_hash, _original = parse_hashline(anchor)
    candidate_indexes = nearby_indexes(line_no - 1, radius=5)

    for idx in candidate_indexes:
        if hashline_for(idx + 1, lines[idx]).split("|", 1)[0] == f"{idx + 1}~{short_hash}":
            updated = lines[:]
            updated[idx] = new_text
            return updated

    raise EditFailedError("hashline anchor no longer matches nearby lines")
πŸ§ͺ

When this works well

Use hashline anchors when your model often misquotes code or your files are moving under active edits, but you still want a compact text protocol instead of a full AST or persistent anchor store.

Step 3: The Verification Loop

Don't assume an edit works. Validate it immediately using LSP or a linter.

Python
import subprocess
import json

def verify_edit(file_path: str) -> list[dict]:
    """Run linter and return any errors."""
    
    # Example: Use pylint for Python files
    if file_path.endswith('.py'):
        result = subprocess.run(
            ['pylint', '--output-format=json', file_path],
            capture_output=True, text=True
        )
        if result.stdout:
            return json.loads(result.stdout)
    
    # Example: Use eslint for JS/TS
    elif file_path.endswith(('.js', '.ts', '.jsx', '.tsx')):
        result = subprocess.run(
            ['eslint', '--format=json', file_path],
            capture_output=True, text=True
        )
        if result.stdout:
            data = json.loads(result.stdout)
            return data[0].get('messages', []) if data else []
    
    return []


def apply_and_verify(file_path: str, search: str, replace: str) -> str:
    """Apply edit and return result with diagnostics."""
    
    # Read current file
    with open(file_path, 'r') as f:
        content = f.read()
    
    # Apply edit
    new_content, method = apply_edit(content, search, replace)
    
    # Write file
    with open(file_path, 'w') as f:
        f.write(new_content)
    
    # Verify
    errors = verify_edit(file_path)
    
    # Build response
    response = f"Edit applied successfully using {method}."
    
    if errors:
        response += "\n\n<file_diagnostics>\n"
        for err in errors:
            line = err.get('line', '?')
            msg = err.get('message', str(err))
            response += f"Line {line}: {msg}\n"
        response += "</file_diagnostics>"
    
    return response

When the AI sees syntax errors in the response, it can self-correct without waiting for the user to run the code.

Step 4: Design Good Error Messages

When an edit fails, give the AI enough context to self-correct.

Python
def format_edit_error(file_path: str, search: str, actual_content: str) -> str:
    """Format a helpful error message for the AI."""
    
    # Find similar content in the file
    search_first_line = search.split('\n')[0].strip()
    
    similar_lines = []
    for i, line in enumerate(actual_content.split('\n'), 1):
        if search_first_line[:20] in line:
            # Found potential match location
            context_start = max(0, i - 3)
            context_end = min(len(actual_content.split('\n')), i + 5)
            similar_lines.append((i, actual_content.split('\n')[context_start:context_end]))
    
    error_msg = f"""
# SEARCH block failed to match!

The following SEARCH block was not found in {file_path}:

```
{search[:500]}{'...' if len(search) > 500 else ''}
```

"""
    
    if similar_lines:
        error_msg += "## Did you mean one of these sections?\n\n"
        for line_num, context in similar_lines[:3]:
            error_msg += f"### Near line {line_num}:\n```\n"
            error_msg += '\n'.join(context)
            error_msg += "\n```\n\n"
    else:
        error_msg += """
## No similar content found.

Consider:
1. Use `read_file` to get the current file contents
2. Check if the file path is correct
3. If the file has changed, your context may be stale
"""
    
    error_msg += """
## Remember:
- SEARCH must match EXACTLY (character-for-character)
- Include all whitespace, comments, and indentation
- Use 2-3 lines of context before and after for uniqueness
"""
    
    return error_msg

Step 5: Write Your System Prompt

Here's a template based on the patterns we've seen in production agents:

System Prompt Template
You are an AI coding assistant with file editing capabilities.

## Available Tools

### replace_in_file
Make targeted edits to existing files using SEARCH/REPLACE blocks.

**Format:**
```xml
<replace_in_file>
<path>relative/path/to/file</path>
<diff>
<<<< SEARCH
[exact content to find]
====
[new content to replace with]
>>>> REPLACE
</diff>
</replace_in_file>
```

**Critical Rules:**
1. SEARCH must match EXACTLY (character-for-character, including whitespace)
2. Include 2-3 lines of context before/after for unique matching
3. Multiple SEARCH/REPLACE blocks must appear in file order
4. To delete code: Leave REPLACE section empty
5. To add code: Include surrounding context in SEARCH, add new lines in REPLACE

### write_to_file
Create new files or completely rewrite existing files.

**Format:**
```xml
<write_to_file>
<path>relative/path/to/file</path>
<content>
[entire file content]
</content>
</write_to_file>
```

**When to use:**
- Creating new files
- File is small (< 50 lines) and changing most of it
- replace_in_file has failed 3+ times

## Workflow

1. **Before editing:** Use read_file to get current content
2. **Prefer replace_in_file:** It's safer and more precise
3. **After editing:** Check the response for <file_diagnostics>
4. **If diagnostics show errors:** Fix them immediately
5. **If edit fails:** Re-read the file and try again with exact content
6. **After 3 failures:** Fall back to write_to_file

## Anti-Patterns

- NEVER use write_to_file on existing files without reading them first
- NEVER guess at file contentβ€”read it first
- NEVER use placeholder comments like "... rest of code ..."

Architecture Overview

1
Parse Tool Call

Extract path, search/replace content from AI output

↓
2
Read Current File

Get actual content for matching

↓
3
Apply Fallback Cascade

Try exact β†’ whitespace β†’ anchor β†’ dmp

↓
4
Write Updated File

Save changes to disk

↓
5
Verify with LSP/Linter

Check for syntax errors immediately

↓
6
Return Result + Diagnostics

Success with any errors, or failure with suggestions

Best Practices Checklist

βœ… Do

  • Implement multiple fallback tiers
  • Verify edits with LSP/linter
  • Return helpful error messages
  • Use 1-indexed line numbers for AI
  • Support both surgical and full-file edits
  • Handle Unicode normalization (smart quotes, em-dashes)
  • Show diff previews before writing
  • Track edit history for undo

❌ Don't

  • Pass code inside JSON strings (escaping nightmare)
  • Trust AI whitespace to be perfect
  • Fail silentlyβ€”always explain what went wrong
  • Skip verification after edits
  • Allow overwrites without reading first
  • Use 0-indexed lines in AI prompts
  • Ignore model-specific quirks

Final Summary: What We Learned

Concept Recommendation Source Inspiration
Primary edit method Search/Replace with fallbacks Cline, Aider, OpenCode
Fallback depth 4-5 tiers minimum OpenCode (9), Aider (5)
Output format XML or custom markers Cline (XML), Codex (custom)
Error handling Suggest similar content + recovery path Aider's error templates
Verification LSP/linter check after every edit OpenCode's diagnostics
User approval Diff preview with configurable auto-approve Grok CLI's confirmation system
Multi-file edits Patch format or sequential operations Codex patch syntax
Unicode handling Normalize smart quotes, dashes, whitespace Codex's seek_sequence

Ready to Build?

You now have the patterns, code examples, and best practices from the top AI coding agents. Go build something amazing.