I think you misunderstand what problem Kamal is trying to solve.
It seems like you have a server you’ve set up by hand, you deploy to that one server using Cap (basically SSH), and you habitually hack around in production.
Needless to say, a build and deployment pipeline is not for you and will seem rigid and slow.
Kamal is a build and deployment pipeline, and like any such pipeline it enforces a particular process. If you buy into that process, you get more seamless (reproducible) deployments across a federation of servers. You just can’t do that reliably with Cap.
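For a sense of what buying into that process looks like, here’s a minimal sketch of a Kamal config/deploy.yml; the service name, image, hosts, and registry user are placeholders, not details from this thread:

```yaml
service: my-app                 # placeholder service name
image: my-user/my-app           # placeholder image name

# The "server federation": every host listed here receives the same built image.
servers:
  web:
    - 192.168.0.1
    - 192.168.0.2

registry:
  username: my-user
  password:
    - KAMAL_REGISTRY_PASSWORD   # read from the environment at deploy time
```

With that in place, `kamal deploy` builds the image once, pushes it to the registry, and rolls it out to every listed host; that one-build, many-hosts flow is the reproducibility Cap can’t give you.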
I personally use Nix with Cachix (and a remote build machine), and while I think my solution is better than Kamal, you would find it just as rigid.
• • •
The TLDR is that this prompt does not lift Claude 3.5 Sonnet to o1 levels in reasoning, but it does tangibly improve its performance on reasoning-focused benchmarks.
However, this does come at the expense of 'knowledge'-focused benchmarks, where the model is more directly generating text it has been trained on.
The 'formal logic' and 'college mathematics' benchmarks have a significant reasoning focus. OpenAI's o1 excels at these, and using this prompt with Sonnet also tangibly improves them.
The 'global facts' benchmark, like many other subject-matter benchmarks, is much less reasoning-focused. It's more about what the model does and doesn't know. A complex prompt can 'confuse' a model so that, even though the model can typically provide the correct answer, it underperforms because of the prompt.
This is what is happening here with this prompt applied.
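For concreteness, here is a minimal sketch of how a system prompt like this would be toggled on and off across benchmark runs, assuming the Anthropic Python SDK. The prompt text is a hypothetical stand-in (the actual prompt isn't reproduced in this thread) and the model id is only illustrative.

```python
import anthropic

# Hypothetical stand-in for the reasoning prompt discussed above;
# the real prompt text is not reproduced in this thread.
REASONING_PROMPT = (
    "Before answering, break the problem into steps, reason through each "
    "step explicitly, and only then state your final answer."
)

client = anthropic.Anthropic()  # expects ANTHROPIC_API_KEY in the environment

def ask(question: str, use_reasoning_prompt: bool) -> str:
    """Send one benchmark question to Sonnet, with or without the prompt."""
    kwargs = {"system": REASONING_PROMPT} if use_reasoning_prompt else {}
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",  # illustrative model id
        max_tokens=1024,
        messages=[{"role": "user", "content": question}],
        **kwargs,
    )
    return response.content[0].text

# Running the same question in both conditions is how you surface the
# tradeoff described above: gains on multi-step reasoning questions,
# losses on straight recall questions.
baseline = ask("What is the capital of Burkina Faso?", use_reasoning_prompt=False)
prompted = ask("What is the capital of Burkina Faso?", use_reasoning_prompt=True)
```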