agent - multicastbits.com

Stop Fighting Your LLM Coding Assistant

Posted on December 11, 2025 (updated December 15, 2025) by Malinda Rathnayake

You’ve probably noticed: coding models are eager to please. Too eager. Ask for something questionable and you’ll get it, wrapped in enthusiasm. Ask for feedback and you’ll get praise followed by gentle suggestions. Ask them to build something and they’ll start coding before understanding what you actually need.

This isn’t a bug. It’s trained behavior. And it’s costing you time, tokens, and code quality.

The Sycophancy Problem

Modern LLMs go through reinforcement learning from human feedback (RLHF) that optimizes for user satisfaction. Users rate responses higher when the AI agrees with them, validates their ideas, and delivers quickly. So that’s what the models learn to do. Anthropic’s work on sycophancy in RLHF-tuned assistants makes this pretty explicit: models learn to match user beliefs, even when they’re wrong.

The result: an assistant that says “Great idea!” before pointing out your approach won’t scale. One that starts writing code before asking what systems it needs to integrate with. One that hedges every opinion with “but it depends on your use case.”

For consumer use cases (travel planning, recipe suggestions, general Q&A), this is fine. For engineering work, it’s a liability.

When the model won’t push back, you lose the value of a second perspective. When it starts implementing before scoping, you burn tokens on code you’ll throw away. When it leaves library choices ambiguous, you get whatever the model defaults to, which may not be what production needs.

Here’s a concrete example. I asked Claude for a “simple Prometheus exporter app” and gave it a minimal spec with scope and data flows, but didn’t spell out anything about testability or structure. It happily produced:

A script with sys.exit() sprinkled everywhere
Logic glued directly into if __name__ == "__main__":
Debugging via print() calls instead of real logging

It technically “worked,” but it was painful to test and nearly impossible to reuse or extend.
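
For contrast, here is a minimal sketch of the shape a tighter spec would have asked for. It is not the code from that session: the function names, the `CollectionError` class, and the callable `source` are hypothetical, and it uses only the standard library rather than a real Prometheus client. Errors become exceptions, logic lives in importable functions, and `logging` replaces `print()`.

```python
import logging
import sys

logger = logging.getLogger("exporter")


class CollectionError(Exception):
    """Raised when metrics cannot be collected, instead of sys.exit() deep in the code."""


def collect_metrics(source):
    """Return a dict of metric name -> float from a callable source (hypothetical)."""
    try:
        raw = source()
    except Exception as exc:
        raise CollectionError(f"source failed: {exc}") from exc
    return {name: float(value) for name, value in raw.items()}


def render_metrics(metrics):
    """Render 'name value' lines, sorted by name, in the Prometheus text format."""
    lines = [f"{name} {value}" for name, value in sorted(metrics.items())]
    return "\n".join(lines) + "\n"


def main(source, out=None):
    """Wire the pieces together and return an exit code."""
    out = out or sys.stdout
    logging.basicConfig(level=logging.INFO)
    try:
        metrics = collect_metrics(source)
    except CollectionError:
        logger.exception("collection failed")
        return 1
    out.write(render_metrics(metrics))
    logger.info("exported %d metrics", len(metrics))
    return 0

# The entry point stays thin: under the __main__ guard, call
# sys.exit(main(real_source)) and nothing else.
```

Because `main()` returns an exit code instead of exiting, a test can call it with a fake source and an in-memory buffer.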

The Fix: Specs Before Code

Instead of handing over a set of requirements and asking for code, start with specifications. Move the expensive iteration (the “that’s not what I meant” cycles) to the design phase, where changes are cheap. Then hand a tight spec to your coding tool, where implementation becomes mechanical.

The workflow:

Describe what you want (rough is fine)
Scope through pointed questions (5–8, not 20)
Spec the solution with explicit implementation decisions
Implement by handing the spec to Cursor/Cline/Copilot

This isn’t a brand-new methodology. It’s the same spec-driven development (SDD) approach that tools like GitHub Spec Kit promote: write the spec first, then let a cheaper model implement against it.

By the time code gets written, the ambiguity is gone and the assistant is just a fast pair of hands that follows a tight spec with guard rails built in.

When This Workflow Pays Off

To be clear: this isn’t for everything. If you need a quick one-off script to parse a CSV or rename some files, writing a spec is overkill. Just ask for the code and move on with your life.

This workflow shines when:

The task spans multiple files or components
External integrations exist (databases, APIs, message queues, cloud services)
It will run in production and needs monitoring and observability
Infra is involved (Kubernetes, Terraform, CI/CD, exporters, operators)
Someone else might maintain it later
You’ve been burned before on similar scope

Rule of thumb: if it touches more than one system or more than one file, treat it as spec-worthy. If you can genuinely explain it in two sentences and keep it in a single file, skip straight to code.

Implementation Directives — Not “add a scheduler” but “use APScheduler with BackgroundScheduler, register an atexit handler for graceful shutdown.” Not “handle timeouts” but “use cx_Oracle call_timeout, not post-execution checks.”

Error Handling Matrix — List the important failure modes, how to detect them, what to log, and how to recover (retry, backoff, fail-fast, alert, etc.). No room for “the assistant will figure it out.”

Concurrency Decisions — What state is shared, what synchronization primitive to use, and lock ordering if multiple locks exist. Don’t let the assistant improvise concurrency.

Out of Scope — Explicit boundaries: “No auth changes,” “No schema migrations,” “Do not add retries at the HTTP client level.” This prevents the assistant from “helpfully” adding features you didn’t ask for.

Anticipate Guesses: Anywhere the model might guess, make the decision yourself, or require it to confirm with you before taking action.
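
To make the Error Handling Matrix idea concrete, here is a minimal sketch of a matrix written as data, with a small runner that obeys it. The rows, retry counts, and delays are placeholders rather than recommendations, and `FailurePolicy`, `ERROR_MATRIX`, and `run_with_policy` are hypothetical names. The matrix decides what gets retried, and the runner only follows it.

```python
import logging
import time
from dataclasses import dataclass

logger = logging.getLogger("spec")


@dataclass(frozen=True)
class FailurePolicy:
    detect: type            # exception class that signals this failure mode
    log_level: int          # level to log the failure at
    max_retries: int        # 0 means fail fast
    backoff_seconds: float  # base delay, doubled on each retry


# Hypothetical matrix: replace the rows with the failure modes from your spec.
ERROR_MATRIX = [
    FailurePolicy(TimeoutError, logging.WARNING, max_retries=3, backoff_seconds=0.5),
    FailurePolicy(ConnectionError, logging.ERROR, max_retries=2, backoff_seconds=1.0),
    FailurePolicy(ValueError, logging.ERROR, max_retries=0, backoff_seconds=0.0),
]


def policy_for(exc):
    """Return the first matrix row that matches the exception, or None."""
    for policy in ERROR_MATRIX:
        if isinstance(exc, policy.detect):
            return policy
    return None


def run_with_policy(operation, sleep=time.sleep):
    """Run operation, retrying only as the matrix allows; unknown errors propagate."""
    attempt = 0
    while True:
        try:
            return operation()
        except Exception as exc:
            policy = policy_for(exc)
            if policy is None:
                raise
            logger.log(policy.log_level, "attempt %d failed: %s", attempt + 1, exc)
            if attempt >= policy.max_retries:
                raise
            sleep(policy.backoff_seconds * (2 ** attempt))
            attempt += 1
```

Passing `sleep` as a parameter keeps the backoff testable without real delays.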

The Handoff

When you hand off to your coding agent, make self-review part of the process:

Rules:
- Stop after each file for review
- Self-Review: Before presenting each file, verify against
  engineering-standards.md. Fix violations (logging, error
  handling, concurrency, resource cleanup) before stopping.
- Do not add features beyond this spec
- Use environment variables for all credentials
- Follow Implementation Directives exactly

Pair this with a rules.md that encodes your engineering standards—error propagation patterns, lock discipline, resource cleanup. The agent internalizes the baseline, self-reviews against it, and you’re left checking logic rather than hunting for missing using statements, context managers, or retries.
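
Lock discipline is the kind of standard that is easier to review when the order is written down once in code. A minimal sketch, assuming two hypothetical locks (`cache_lock`, `registry_lock`) whose acquisition order comes from the spec’s concurrency decisions. The names and the shared state are illustrative and are not taken from the templates.

```python
import threading
from contextlib import ExitStack

# Hypothetical shared state and locks; the names are illustrative only.
cache_lock = threading.Lock()
registry_lock = threading.Lock()
cache = {}
registry = set()

# The spec fixes one global order. Every code path acquires in this order,
# so two threads can never hold the locks in opposite orders and deadlock.
LOCK_ORDER = [cache_lock, registry_lock]


def acquire_in_order(*locks):
    """Return a context manager that takes the given locks in LOCK_ORDER."""
    stack = ExitStack()
    for lock in sorted(locks, key=LOCK_ORDER.index):
        stack.enter_context(lock)
    return stack


def register(name, value):
    # Callers may list the locks in any order; acquisition is still ordered.
    with acquire_in_order(registry_lock, cache_lock):
        registry.add(name)
        cache[name] = value
```

A lock that is missing from `LOCK_ORDER` raises `ValueError` before anything is acquired, which surfaces an unplanned lock instead of letting it slip in.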

Fixing the Partnership Dynamic

Specs help, but “be blunt” isn’t enough. The model can follow the vibe of your instructions and still waste your time by producing unstructured output, bluffing through unknowns, or “spec’ing anyway” when an integration is the real blocker. That means overriding the trained “be agreeable” behavior with explicit instructions.

For example:

Core directive: Be useful, not pleasant.

OUTPUT CONTRACT:
- If scoping: output exactly:
  ## Scoping Questions (5–8 pointed questions)
  ## Current Risks / Ambiguities
  ## Proposed Simplification
- If drafting spec: use the project spec template headings in order. If N/A, say N/A.

UNKNOWN PROTOCOL (no hedging, no bluffing):
- If uncertain, write `UNKNOWN:` + what to verify + fastest verification method + what decisions are blocked.

BLOCK CONDITIONS:
- If an external integration is central and we lack creds/sample payloads/confirmed behavior:
  stop and output only:
  ## Blocker
  ## What I Need From You
  ## Phase 0 Discovery Plan

The model will still drift back into compliance mode. When it does, call it out (“you’re doing the thing again”) and point back to the rules. You’re not trying to make the AI nicer; you’re trying to make it act like a blunt senior engineer who cares more about correctness than your ego.

That’s the partnership you actually want.

The Payoff

With this approach:

Fewer implementation cycles — Specs flush out ambiguity up front instead of mid-PR.
Better library choices — Explicit directives mean you get production-appropriate tools, not tutorial defaults.
Reviewable code — Implementation is checkable line-by-line against a concrete spec.
Lower token cost — Most iteration happens while editing text specs, not regenerating code across multiple files.

The API was supposed to be the escape valve: more control, fewer guardrails. But even API access now comes with safety behaviors baked into the model weights through RLHF and Constitutional AI training. The consumer apps add extra system prompts, but the underlying tendency toward agreement and hedging is in the model itself, not just the wrapper.

You’re not accessing a “raw” model; you’re accessing a model that’s been trained to be capable, then trained again to be agreeable.

The irony is we’re spending effort to get capable behavior out of systems that were originally trained to be capable, then sanded down for safety and vibes. Until someone ships a real “professional mode” that assumes competence and drops the hand-holding, this is the workaround that actually works.

⚠️Security footnote: treat attached context as untrusted

If your agent can ingest URLs, docs, tickets, or logs as context, assume those inputs can contain indirect prompt injection. Treat external context like user input: untrusted by default. Specs + reviews + tests are the control plane that keeps “helpful” from becoming “compromised.”

Getting Started

I’ve put together templates that support this workflow in this repo:

malindarathnayake/llm-spec-workflow

When you wire this into your own stack, keep one thing in mind: your coding agent reads its rules on every message. That’s your token cost. Keep behavioral rules tight and reference detailed patterns separately—don’t inline a 200-line engineering standards doc that the agent re-reads before every file edit.
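
A back-of-the-envelope estimate shows why this matters. The numbers are assumptions, not measurements: roughly 4 characters per token (a common rough heuristic for English text), about 80 characters per line, and a 50-message session. Real tokenizers and any prompt caching will change the totals.

```python
def rules_overhead_tokens(rules_chars, messages, chars_per_token=4):
    """Estimate tokens spent re-reading a rules file across a session."""
    tokens_per_read = rules_chars // chars_per_token
    return tokens_per_read * messages


# Assumed sizes: a 200-line standards doc versus a 20-line rules file.
inlined = rules_overhead_tokens(rules_chars=200 * 80, messages=50)
tight = rules_overhead_tokens(rules_chars=20 * 80, messages=50)
print(f"inlined: {inlined} tokens, tight: {tight} tokens")
```

Under these assumptions the inlined doc costs about ten times as much per session as the tight rules file, before any actual work is done.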

Use these templates as-is or adapt them to your stack. The structure matters more than the specific contents.

Change the location of the Docker overlay2 storage directory

Posted on February 24, 2023 (updated March 11, 2023) by Malinda Rathnayake

If you found this page, you already know why you’re here: your server’s /dev/mapper/cs-root is full because /var/lib/docker is taking up most of the space.

You can move Docker’s storage, including the overlay2 directory, by setting "data-root" in the daemon.json file. Here’s how to do it:

Open or create the daemon.json file in a text editor:

sudo nano /etc/docker/daemon.json

Add the "data-root" setting:

{
    "data-root": "/path/to/new/location/docker"
}

Replace “/path/to/new/location/docker” with the new location. Note that "data-root" relocates Docker’s entire data directory; the overlay2 directory is created inside it.

If the file already contains other configuration settings, add "data-root" alongside them, for example after the "storage-driver" setting:

{
    "storage-driver": "overlay2",
    "data-root": "/path/to/new/location/docker"
}
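
If you would rather not hand-edit JSON on a server, the same change can be scripted. A minimal sketch, assuming the file is plain JSON and the script runs with permission to write it; `set_data_root` is a hypothetical helper and the paths are placeholders. It adds or updates "data-root" and leaves every other setting untouched.

```python
import json
from pathlib import Path


def set_data_root(config_path, new_root):
    """Add or update "data-root" in a daemon.json file, preserving other keys."""
    path = Path(config_path)
    text = path.read_text() if path.exists() else ""
    # An empty or missing file is treated as an empty configuration.
    config = json.loads(text) if text.strip() else {}
    config["data-root"] = new_root
    path.write_text(json.dumps(config, indent=4) + "\n")
    return config
```

For example, `set_data_root("/etc/docker/daemon.json", "/path/to/new/location/docker")` run as root produces the same file as the manual edit above.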

Save the file and restart Docker:

sudo systemctl restart docker

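Docker starts with an empty data directory at the new location; existing images and containers are not moved automatically. If you need them, stop Docker and copy the contents of /var/lib/docker to the new location before restarting. Once you no longer need the old data, remove it to free up the space:

sudo rm -rf /var/lib/docker/overlay2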

Upgrading VMware ESXi Hosts using vCenter Update Manager Baseline (6.5 to 6.7 Update 2)

Posted on July 3, 2019 (updated May 18, 2020) by Malinda Rathnayake

Update Manager has been bundled with the vCenter Server Appliance since version 6.5. It’s a plug-in that runs in the vSphere Web Client. We can use it to:

Patch and upgrade hosts
Deploy .vib files from within vCenter
Scan your vCenter environment and report on any out-of-compliance hosts

Hardcore/experienced VMware operators will scoff at this article, but I have seen many organizations still using iLO/iDRAC to mount an ISO to update hosts, with no idea this function even exists.

Now that that’s out of the way, let’s get to the how-to part.

In vCenter, click “Menu” and drill down to “Update Manager”.

This blade shows all the nerd knobs and an overview of your current updates and compliance levels.

Click on the “Baselines” tab.

You will have two predefined baselines for security patches created by vCenter; let’s set those aside for now.

Navigate to the “ESXi Images” tab and click “Import”.

Once the upload is complete, click “New Baseline”.

Fill in a name and description that will make sense to anyone who logs in, and click Next.

On the next screen, select the image you just uploaded, then continue through the wizard and complete it.

Note: if you have other third-party software for ESXi, you can create separate baselines for it and use baseline groups to push out upgrades and .vib files at the same time.

Now click “Menu” and navigate back to “Hosts and Clusters”.

You can now apply the baseline at various levels within the vCenter hierarchy:

vCenter | Datacenter | Cluster | Host

Pick the right level for your use case.

Excerpt from the KB

For ESXi hosts in a cluster, the remediation process is sequential by default. With Update Manager, you can select to run host remediation in parallel.

When you remediate a cluster of hosts sequentially and one of the hosts fails to enter maintenance mode, Update Manager reports an error, and the process stops and fails. The hosts in the cluster that are remediated stay at the updated level. The ones that are not remediated after the failed host remediation are not updated. If a host in a DRS enabled cluster runs a virtual machine on which Update Manager or vCenter Server are installed, DRS first attempts to migrate the virtual machine running vCenter Server or Update Manager to another host so that the remediation succeeds. In case the virtual machine cannot be migrated to another host, the remediation fails for the host, but the process does not stop. Update Manager proceeds to remediate the next host in the cluster.

The host upgrade remediation of ESXi hosts in a cluster proceeds only if all hosts in the cluster can be upgraded.

Remediation of hosts in a cluster requires that you temporarily disable cluster features such as VMware DPM and HA admission control. Also, turn off FT if it is enabled on any of the virtual machines on a host, and disconnect the removable devices connected to the virtual machines on a host, so that they can be migrated with vMotion. Before you start a remediation process, you can generate a report that shows which cluster, host, or virtual machine has the cluster features enabled.

Link to KB on Remediation

Moving on: for this example, since I have only two hosts, we are going to attach the baseline at the cluster level but run the remediation at the host level:

Host 1 > Enter Maintenance > Remediation > Update complete and online

Host 2 > Enter Maintenance > Remediation > Update complete and online
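
The order above can be sketched as a plain loop. This is only a simulation of the sequencing, assuming a hypothetical `remediate` callback that stands in for maintenance mode, remediation, and reboot. It does not talk to vCenter or use any VMware API. It mirrors the sequential behavior quoted from the KB: hosts are handled one at a time, and the run stops at the first host that fails.

```python
def rolling_remediate(hosts, remediate):
    """Remediate hosts one at a time; stop at the first failure.

    Returns (hosts completed, host that failed or None).
    """
    done = []
    for host in hosts:
        if not remediate(host):
            # Hosts after the failed one are left untouched.
            return done, host
        done.append(host)
    return done, None
```

With two hosts, a failure on Host 1 leaves Host 2 on its current version and still serving workloads, which is the point of doing them one at a time.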

Select the cluster, click the “Updates” tab, and click “Attach” in the Attached Baselines section.

Select and attach the baseline we created before.

Click “Check Compliance” to scan and get a report.

Select a host in the cluster and put it into maintenance mode.

Click “REMEDIATE” to start the upgrade. (If you do this at the cluster level and have DRS enabled, Update Manager will update each node in turn.)

This will reboot the host and run through the update process.

Footnotes

You might run into the following issue

“vCenter cannot deploy Host upgrade agent to host”

Cause 1

The scratch partition is full. Use vCenter to change the scratch folder location.

VMware KB

Creating a persistent scratch location for ESXi – https://kb.vmware.com/s/article/1033696

Cause 2

The hardware is not compatible.

I had this issue because 6.7 dropped support for an LSI RAID card on older firmware. You need to do some footwork and check the log files to figure out why it’s failing.

VMware HCL: Link

ESXi and vCenter log file locations: Link