The TL;DR is that this prompt does not bring Claude 3.5 Sonnet up to o1-level reasoning, but it does tangibly improve its performance on reasoning-focused benchmarks.
However, this comes at the expense of 'knowledge'-focused benchmarks, where the model is more directly generating text it has been trained on.