Dataset Resources
The Data Science Institute is compiling a catalog of data science-related datasets created by UD researchers. The catalog will allow us to highlight these areas of research in grant applications and on the DSI website, and it will help researchers at UD and elsewhere discover data they can use in their own work. The goal is not an exhaustive list of datasets, but a high-level view in a catalog that is expected to grow over time. You can fill out this survey to help us get started.
External Datasets
The following public dataset collections may be useful:
- US Government’s open data (https://www.data.gov/)
- UC Irvine Machine Learning Repository (https://archive.ics.uci.edu/ml)
- Kaggle (https://www.kaggle.com/datasets)
- Amazon’s AWS Datasets (https://registry.opendata.aws/)
- Collections – high-quality data and datasets organized by topic (https://datahub.io/collections)
- Public datasets collected by Curran Kelleher for testing visualization methods (https://github.com/curran/data)
- Article from Towards AI about Best Public Datasets for Machine Learning and Data Science
- See also COVID-19 related datasets