Hey jononor, we've updated the post to split the training and test sets by fold. Good catch, and thanks again for reporting this. Some of the experiments in the project still use the old code, but the blog post now reflects the new train/test split.