- Fill out the script and any needed library code so that running
scripts/find_outliers.py data works on your data, and returns a list of outlier volumes for each scan (for scans that have outliers);
- You should add a text file giving, for each detected outlier, a brief summary of why you think that volume should be rejected as an outlier, and your educated guess as to the cause of the difference between that volume and the rest of the scans in the run;
- You should do this by collaborating in your teams using GitHub.
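One simple starting point (not the required method, just an illustrative sketch) is to flag volumes whose mean signal is unusually far from the run average. The function below works on a plain NumPy 4D array; in practice you would first load each scan, for example with nibabel, and all names here are placeholders:

```python
import numpy as np

def detect_outlier_volumes(data, z_thresh=2.5):
    """Return indices of volumes whose mean signal deviates from the run.

    `data` is a 4D array (x, y, z, time); in practice you might get it
    from something like `nibabel.load(fname).get_fdata()`.
    """
    # Mean signal over all voxels, one value per volume.
    vol_means = data.reshape(-1, data.shape[-1]).mean(axis=0)
    # Standardize the per-volume means and threshold.
    z_scores = (vol_means - vol_means.mean()) / vol_means.std()
    return np.where(np.abs(z_scores) > z_thresh)[0]

# Synthetic demo: 20 volumes, with an artificial spike added to volume 7.
rng = np.random.default_rng(0)
data = rng.normal(1000, 5, size=(4, 4, 4, 20))
data[..., 7] += 100
print(detect_outlier_volumes(data))  # → [7]
```

You will likely want a more robust metric than a simple z-score on the mean (for example, one based on median absolute deviation, or on frame-to-frame differences), but a baseline like this gives you something to compare against.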
We will rate you on:
- the quality of your outlier detection as assessed by the improvement in the statistical testing for the experimental model after removing the outliers;
- the generality of your outlier detection, as assessed by the same improvement in statistical testing on another, similar dataset;
- the quality of your code;
- the quality and transparency of your process, from your interactions on GitHub;
- the quality of your arguments about the scans rejected as outliers.
Your outlier detection script should be reproducible.
That means that we, your instructors, should be able to clone your repository and then follow simple instructions to reproduce your run of
scripts/find_outliers.py data and get the same answer.
To make this possible, fill out the
README.md text file in your repository to describe a few simple steps that we can take to set up on our own machines and run your code. Have a look at the current
README.md file for a skeleton. We should be able to perform these same steps to get the same output as you from the outlier detection script.
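As a sketch of the kind of steps your README.md might list (the repository URL and requirements file name here are placeholders, not part of the assignment):

```shell
# Hypothetical reproduction steps; substitute your own repository URL.
git clone https://github.com/your-team/your-repo.git
cd your-repo
# Install pinned dependencies, if you provide a requirements file.
pip install -r requirements.txt
# Run the detector on the data directory; the output should match yours.
python3 scripts/find_outliers.py data
```

Pinning dependency versions (for example in a requirements file) makes it much more likely that we get byte-identical output when we rerun your script.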