Here’s something dumb you can do with Tofu + HTTP state

terraform {
  backend "http" {
    # Variables in a backend block rely on OpenTofu's early evaluation (1.8+).
    username       = var.tfstate_username
    password       = var.tfstate_secret
    address        = "${var.tfstate_host}/client/${var.tfstate_team}/${var.tfstate_project}/${var.tfstate_environment}/state"
    lock_address   = "${var.tfstate_host}/client/${var.tfstate_team}/${var.tfstate_project}/${var.tfstate_environment}/lock"
    unlock_address = "${var.tfstate_host}/client/${var.tfstate_team}/${var.tfstate_project}/${var.tfstate_environment}/unlock"
    lock_method    = "POST"
    unlock_method  = "POST"
  }
}

Then point tfstate_host at the very server you’re managing with this Tofu config.

Now create a plan that takes down the state server on that box, or the proxy in front of it.


What happens, you might ask?

Tofu cheerfully builds the plan. You don’t think twice. You type:

tofu apply bad-idea.tfplan

It starts applying… and then suddenly throws a big red error:

🚨 “Failed to update state”

Well duh. The HTTP state server is gone. So now you’re in a half-broken world — the proxy’s down, random containers might be gone, and your infra is in limbo.


What do you do to fix it?

Here’s what I ended up doing:

  1. Spun up a new Lynx instance manually:

    docker run -d -p 4000:4000 --name tmp-lynx clivern/lynx:latest
    

    (There is more to this command, but basically you can convert your .tf container definition into a docker run command and add a published port 4000.)

  2. Connected it to the right Docker network so it could talk to my database:

    docker network connect --alias lynx network-lynx tmp-lynx
    
  3. Updated my tfstate_host variable to point to http://localhost:4000.

  4. Unlocked the state manually:

    curl -u username:password -X POST http://localhost:4000/client/team/docker/prod/unlock
    
  5. Reconfigured Tofu:

    tofu init -reconfigure
    
  6. Ran a fresh plan and crossed my fingers that it would recreate the missing containers.

  7. Applied the plan and verified everything came back, especially the Lynx container.

  8. Cleaned up the temp Lynx:

    docker stop tmp-lynx && docker rm tmp-lynx
    

I also added a lifecycle rule to my Lynx container in Tofu, so any plan that would destroy it now fails with an error instead of going through:

lifecycle {
  prevent_destroy = true
}
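
A rough sketch of an extra guard along the same lines: check a saved plan file before applying it, and bail out if it would destroy or replace the state server. Everything named here is an assumption for illustration: the resource address docker_container.lynx, the PROTECTED_ADDR variable, and the two function names are hypothetical. It matches the "will be destroyed" and "must be replaced" lines from the human-readable plan output, which is quick but brittle; parsing tofu show -json would be sturdier.

```shell
#!/usr/bin/env bash
# Sketch: refuse to apply a saved plan that would destroy or replace the
# container serving the HTTP state backend.
set -euo pipefail

# Hypothetical resource address; change it to match your own config.
PROTECTED_ADDR="${PROTECTED_ADDR:-docker_container.lynx}"

# Reads human-readable plan text on stdin. Succeeds (exit 0) when the
# protected resource is marked for destruction or replacement.
plan_touches_state_server() {
  grep -F \
    -e "# ${PROTECTED_ADDR} will be destroyed" \
    -e "# ${PROTECTED_ADDR} must be replaced" >/dev/null
}

# Wrapper: inspect the plan file first, apply only if the check passes.
safe_apply() {
  local plan_file="$1"
  if tofu show -no-color "$plan_file" | plan_touches_state_server; then
    echo "Refusing: ${plan_file} would take down ${PROTECTED_ADDR}" >&2
    return 1
  fi
  tofu apply "$plan_file"
}
```

With these functions sourced into your shell, running safe_apply bad-idea.tfplan in place of tofu apply bad-idea.tfplan would stop before anything gets torn down.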

And I wrote a little shell script that handles spinning up a temp Lynx container in case I ever do this again.
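
Something along these lines would do it. This is a minimal sketch, not the actual script: the LYNX_* variable names, the function names, and the dry-run default are assumptions, and the docker run line still needs whatever environment variables and volumes your real container definition has. By default it only prints the commands; set LYNX_RUN=1 to execute them.

```shell
#!/usr/bin/env bash
# Sketch of a rescue script for a temporary Lynx container.
# Prints the docker commands by default; set LYNX_RUN=1 to execute them.
set -euo pipefail

LYNX_IMAGE="${LYNX_IMAGE:-clivern/lynx:latest}"
LYNX_NAME="${LYNX_NAME:-tmp-lynx}"
LYNX_NETWORK="${LYNX_NETWORK:-network-lynx}"
LYNX_PORT="${LYNX_PORT:-4000}"

# Execute the command when LYNX_RUN=1, otherwise just echo it.
run() {
  if [ "${LYNX_RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Start the temporary container and attach it to the database network.
start_tmp_lynx() {
  # Add the env vars and volumes from your .tf container definition here.
  run docker run -d -p "${LYNX_PORT}:${LYNX_PORT}" --name "$LYNX_NAME" "$LYNX_IMAGE"
  run docker network connect --alias lynx "$LYNX_NETWORK" "$LYNX_NAME"
}

# Remove the temporary container once the real one is back.
stop_tmp_lynx() {
  run docker stop "$LYNX_NAME"
  run docker rm "$LYNX_NAME"
}

case "${1:-}" in
  up)   start_tmp_lynx ;;
  down) stop_tmp_lynx ;;
  *)    echo "usage: LYNX_RUN=1 $0 up|down" >&2 ;;
esac
```

Saved under a hypothetical name like tmp-lynx.sh, LYNX_RUN=1 ./tmp-lynx.sh up covers steps 1 and 2 above, and LYNX_RUN=1 ./tmp-lynx.sh down covers step 8.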

The “right” answer is probably to move Lynx out of Tofu entirely… but that’s a problem for another day.
