3 Comments
Pawel Jozefiak:

I've hit this exact productivity paradox running my own autonomous agent. The agent can generate outputs 10x faster, but the oversight, verification, and integration work actually increased my cognitive load. The infrastructure to make agents truly autonomous isn't there yet - we're in the 'Ferrari with a broken engine' phase. The gap between what demos promise and what production delivers is massive. What I've found works better than progressive autonomy is clear escalation rules. 'You can modify dev, but prod needs approval' is clearer than 'use your judgment.' Has anyone figured out how to measure the oversight burden versus the productivity gain?
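The escalation-rule idea above ("you can modify dev, but prod needs approval") can be expressed as an explicit policy table rather than free-form judgment. A minimal sketch, assuming hypothetical environment names, action labels, and rules chosen for illustration only:

```python
from enum import Enum

class Decision(Enum):
    ALLOW = "allow"                # agent may act autonomously
    REQUIRE_APPROVAL = "approve"   # a human must sign off first

# Illustrative rule table: (environment, action) -> decision.
# "Modify dev is fine, prod needs approval" made explicit.
RULES = {
    ("dev", "modify"): Decision.ALLOW,
    ("dev", "deploy"): Decision.ALLOW,
    ("prod", "read"): Decision.ALLOW,
    ("prod", "modify"): Decision.REQUIRE_APPROVAL,
    ("prod", "deploy"): Decision.REQUIRE_APPROVAL,
}

def escalation_decision(env: str, action: str) -> Decision:
    # Unknown (env, action) pairs default to requiring approval,
    # so gaps in the rule table fail safe.
    return RULES.get((env, action), Decision.REQUIRE_APPROVAL)
```

The point of the table form is that it is auditable: you can enumerate exactly which actions the agent may take unsupervised, instead of debugging "use your judgment" after the fact.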

Adam Pryor:

There are actually some early efforts to measure that; I'll see if I can find the references. I'm not sure any of them are definitive, though. If I'm remembering correctly, I had pretty strong critiques of the sample sizes and methods in almost everything I've seen so far.

[Comment removed, Jan 12]
Adam Pryor:

The exposure problem is going to become more and more of an issue. As the cost of compute goes up and the desire to hold on to data increases, I think this will be a really key element of moving forward.