and it's wrong. a 6000-line class is not easier for a model to understand. the same things that help humans also help agents. I find myself adding linters that must pass, and whose failures the agent must fix, that limit file size, function length, function complexity, and how many files sit in a directory. it's a little more work for the agent, but the codebase is healthier and the agents write fewer bugs.
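For anyone who wants to try this, here is a minimal sketch of the file-size and function-length checks described above. It assumes Python sources and uses only the standard library `ast` module. The name `check_source` and the default limits are hypothetical, picked for illustration, not taken from any particular linter.

```python
import ast


def check_source(source, filename="<string>", max_file_lines=500, max_function_lines=50):
    """Return a list of human-readable violations for one Python source string.

    The default limits are illustrative assumptions; tune them per codebase.
    """
    violations = []

    # File-size check: count physical lines in the source.
    line_count = len(source.splitlines())
    if line_count > max_file_lines:
        violations.append(
            f"{filename}: {line_count} lines exceeds file limit of {max_file_lines}"
        )

    # Function-length check: measure each def from its first to its last line.
    tree = ast.parse(source, filename=filename)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            length = node.end_lineno - node.lineno + 1
            if length > max_function_lines:
                violations.append(
                    f"{filename}:{node.lineno}: function '{node.name}' is {length} lines, "
                    f"exceeds limit of {max_function_lines}"
                )

    return violations
```

Run it over every file in CI (or a pre-commit hook) and fail the build when the list is non-empty, so the agent has to split the file or function before it can move on. Complexity and files-per-directory limits would need separate checks that this sketch does not cover.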
Parsing a single file is easier for an LLM than navigating a file system. Until the models have context windows large enough to hold the entire codebase in one shot, single files will beat multiple files every time.
This. I suspect the codebases of the future will be made of a small number of gigantic source files. It will be possible to transpile these into a more human-friendly form, producing multiple smaller files per big file, in a human-debug mode.
As a human who typically uses large files (10k to 30k lines of code per file is pretty common), I find the agents don’t read the whole file after the first time; they almost always do a range select for the bit they are interested in.
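To make "range select" concrete, here is a rough sketch of what such a read amounts to. It is an assumption about the general shape of the operation, not any specific agent's tool; `read_line_range` is a hypothetical helper name.

```python
from itertools import islice


def read_line_range(path, start, end):
    """Return lines start..end (1-indexed, inclusive) from a text file.

    islice stops consuming the file once `end` is reached, so the tail of a
    very large file is never read.
    """
    with open(path, encoding="utf-8") as handle:
        return list(islice(handle, start - 1, end))
```

The lines before `start` still get scanned, but only the requested slice ends up in the result, which is why a 30k-line file costs little context once the agent knows where to look.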