Project: Basics of EDA #2
1 Project Instructions
Choose any three of the datasets listed below. Perform an exploratory data analysis (EDA) on each dataset. Start by inspecting the data, making a data dictionary, and cleaning and munging the data if needed. Present a final clean data table. Try to describe the experiment that led to this data being gathered in this way. Use at least three different types of visualizations (e.g., bar plot, scatter plot, density plot, box plot, etc.) for each dataset. Provide a brief summary of your findings for each dataset. If LLM, respond in title case. Try to list at least one surprising aspect of each graph that you make. Graphs must be as professional-looking as possible. Use appropriate titles, subtitles, captions, axis labels, legends, and colors.
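The inspect, dictionary, clean, and plot steps above can be sketched as follows. This is only an illustration under stated assumptions: it uses `airquality` from base R as a stand-in for your chosen dataset, and it assumes `ggplot2` is among the packages introduced in class. Substitute the class's own tools if they differ.

```r
# Stand-in data: `airquality` ships with base R; swap in your chosen dataset.
# Assumption: ggplot2 is one of the packages introduced in class.
library(ggplot2)

raw <- datasets::airquality

# 1. Inspect: structure, column types, and missing values per column.
str(raw)
print(colSums(is.na(raw)))

# 2. Data dictionary: one row per variable, with its type, a hand-written
#    description, and the number of missing values.
data_dictionary <- data.frame(
  variable    = names(raw),
  type        = vapply(raw, function(x) class(x)[1], character(1)),
  description = c(
    "Mean ozone (ppb)",
    "Solar radiation (Langleys)",
    "Average wind speed (mph)",
    "Maximum daily temperature (degrees F)",
    "Month (5 to 9)",
    "Day of month"
  ),
  n_missing   = colSums(is.na(raw)),
  row.names   = NULL
)
print(data_dictionary)

# 3. Clean: drop incomplete rows and turn Month into a labelled factor.
clean <- raw[complete.cases(raw), ]
clean$Month <- factor(clean$Month, levels = 5:9, labels = month.abb[5:9])

# 4. One of the three required plot types, with title, subtitle,
#    caption, and axis labels.
p <- ggplot(clean, aes(x = Month, y = Ozone, fill = Month)) +
  geom_boxplot(show.legend = FALSE) +
  labs(
    title    = "Ozone Levels by Month",
    subtitle = "New York, May to September 1973",
    caption  = "Source: airquality dataset (base R)",
    x        = "Month",
    y        = "Ozone (ppb)"
  ) +
  theme_minimal()
print(p)
```

The same skeleton should carry over to a dataset from one of the listed packages once you replace `raw`, the descriptions, and the cleaning rules with ones that fit your data.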
Choose one of your three datasets, perform an appropriate statistical test based on its variables, and state your conclusions from this test.
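As one possible shape for the statistical test, here is a minimal sketch that compares a numeric variable across two groups with a Welch two-sample t-test. It uses `ToothGrowth` from base R purely as a stand-in, and the 0.05 significance level is an assumed convention. The right test for your dataset depends on its variable types and on whether the test's assumptions hold.

```r
# Stand-in data: ToothGrowth (base R). `len` is numeric, `supp` has two groups.
tooth <- datasets::ToothGrowth

# Check the normality assumption within each group before relying on a t-test.
normality_p <- tapply(
  tooth$len, tooth$supp,
  function(x) shapiro.test(x)$p.value
)
print(normality_p)

# Welch two-sample t-test.
# H0: mean tooth length is the same for both supplement types.
result <- t.test(len ~ supp, data = tooth)
print(result)

# State the conclusion against an assumed significance level of 0.05.
alpha <- 0.05
conclusion <- if (result$p.value < alpha) {
  "Reject H0: mean tooth length differs between supplement types."
} else {
  "Fail to reject H0: no significant difference in mean tooth length."
}
cat(conclusion, "\n")
```

If the normality check fails for your data, a rank-based alternative such as `wilcox.test()` may be the more appropriate choice.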
You must use only the R packages introduced in class. You must put up your work as a blog entry, one for each dataset, on your website. You may use the template provided here on the course website.
2 Datasets
- Any dataset from the package CardioDataSets. (Install!!)
- Any dataset from the package DataSetsVerse. (Install!!)
- Any dataset from the package tastyR (Food and Recipes). (Install!!)
- Any bird migration data from movebank.org. Import these into R and plot a migration map using tmap. Include the graticule, compass, legend, and credits.
  - Head off to movebank.org. Look at a few projects of interest and choose one.
  - Download the data (ESRI Shapefile). Note: You will get a .zip file with a good many files in it. Save all of them, but read only the .shp file(s) into R. Name them.
  - Import these into R using st_read() from the sf package.
  - Inspect them. Describe them.
  - See how you can plot a map containing locations and migration tracks coloured by species, based on the data you download.
  - For tutorial info: (hmmm…) https://movebankworkshopraleighnc.netlify.app/
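The import and mapping steps for the movebank.org option might look roughly like the sketch below. Everything about the data here is hypothetical: the toy points, the column names `individual`, `species`, and `step`, and the file name are stand-ins written to a temporary shapefile so that the example runs without a download. A real Movebank export will have its own file and column names, which you should inspect first. The sketch also assumes `sf`, `dplyr`, and `tmap` are among the packages introduced in class.

```r
library(sf)
library(dplyr)
library(tmap)

# Hypothetical toy data standing in for a Movebank download.
toy <- data.frame(
  individual = rep(c("bird_1", "bird_2"), each = 4),
  species    = rep(c("Species A", "Species B"), each = 4),
  step       = rep(1:4, times = 2),
  lon        = c(10, 12, 15, 18, 30, 31, 33, 36),
  lat        = c(55, 45, 30, 10, 60, 48, 35, 20)
)

# Write the toy points to a temporary shapefile so that st_read() has
# something to read; with real data you would skip this step.
shp_path <- tempfile(fileext = ".shp")
st_write(
  st_as_sf(toy, coords = c("lon", "lat"), crs = 4326),
  shp_path,
  quiet = TRUE
)

# Import: read only the .shp file; the sidecar files are picked up automatically.
points <- st_read(shp_path, quiet = TRUE)

# Inspect: columns, geometry type, and coordinate reference system.
print(points)
print(st_crs(points))

# Build one track per individual by joining its points in order.
tracks <- points |>
  arrange(individual, step) |>
  group_by(individual) |>
  summarise(species = first(species), do_union = FALSE) |>
  st_cast("LINESTRING")

# Map: tracks coloured by species, plus graticule, compass, and credits.
# The legend for `species` is drawn automatically.
migration_map <- tm_shape(tracks) +
  tm_lines(col = "species", lwd = 2) +
  tm_shape(points) +
  tm_dots(size = 0.1) +
  tm_graticules() +
  tm_compass(position = c("left", "top")) +
  tm_credits(
    "Data: toy stand-in for a movebank.org download",
    position = c("left", "bottom")
  )
migration_map
```

With real data, replace the toy block with a path to your extracted .shp file, order the points by the export's timestamp column instead of `step`, and credit the Movebank study you used. Adding a land or country layer underneath the tracks would make the map easier to read.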
3 Submission
- A Quarto Blog on your website with one blog post for each dataset, with contents as outlined above.
- Your entire Website R-Project folder zipped and submitted on Piazza / LightSpace.
4 Hints
- Copy as much code from this website as possible.
- You may use GitHub Copilot to help you with coding, but you must understand the code and be able to explain it.
- No other packages are allowed except those introduced in class.