
Data Science: Datasets

Resources & Guides on Data Science

Tools for Data Analysis



A dataset (also spelled ‘data set’) is a collection of raw statistics and information generated by a research study. Datasets produced by government agencies or non-profit organizations can usually be downloaded free of charge. However, datasets developed by for-profit companies may be available for a fee.

An “open data” philosophy, the belief that data should be freely accessible, is becoming more common among governments and business organizations around the world. More effort is needed to spur data-driven innovation in support of the Fourth Industrial Revolution, which is already a reality and will bring advances in Artificial Intelligence (AI), Big Data Analytics, the Internet of Things, Biotechnology and Nanotechnology. The Malaysian Open Data User Group, a programme organised by the Malaysian Administrative Modernisation & Management Planning Unit (MAMPU) in collaboration with the World Bank in 2017, has emphasized the importance of harnessing open data. The growing trend of analyzing extremely large amounts of data for new and interesting perspectives, together with data visualization, is helping to drive the availability and accessibility of datasets and statistics.

Below are a few datasets available for learning purposes:


Databases for learning purposes