Google Open Data

Search for specific data using Google

Google launched a dataset search engine in 2018, as a companion to Google Scholar, to help scientists find the datasets they need.

+1 for open data!

Google Dataset Search provides access to critical datasets, including data from federally funded research, per White House Office of Science and Technology Policy (OSTP) guidance to make “the results of taxpayer-supported research immediately available to the American public at no cost.” (OSTP)

This means that we can run a single Google search for data from cancer research, NASA, NOAA, Harvard’s Dataverse, ProPublica, Kaggle, etc. All in one spot!

What is the goal of Google Open Data?

According to a Google blog post from January 2023, “The aim is to unify the tens of thousands of different repositories for datasets online. ‘We want to make that data discoverable, but keep it where it is.’”

The approach is based on an open data standard for describing a dataset, meaning anyone who publishes data can describe their data using the standard and make it discoverable, according to a 2018 blog post by Natasha Noy.
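
To make that concrete, here is a rough sketch of what describing a dataset can look like. It assumes the standard in question is schema.org's Dataset vocabulary written as JSON-LD, which is the markup Google's Dataset Search documentation points publishers to. The helper names, the dataset, and every example.org URL below are made up for illustration.

```python
import json


def build_dataset_jsonld(name, description, url, license_url, creator_name):
    """Return a schema.org Dataset description as a JSON-LD dictionary.

    Hypothetical helper: the field selection here is a small subset of the
    schema.org Dataset vocabulary, not a complete or official template.
    """
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "license": license_url,
        "creator": {"@type": "Organization", "name": creator_name},
    }


def to_script_tag(jsonld):
    """Wrap the JSON-LD in the script tag that would sit in the dataset's web page."""
    body = json.dumps(jsonld, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Placeholder values only: this dataset and these URLs do not exist.
example = build_dataset_jsonld(
    name="Example city air quality readings",
    description="Hourly particulate readings from a hypothetical sensor network.",
    url="https://example.org/datasets/air-quality",
    license_url="https://creativecommons.org/publicdomain/zero/1.0/",
    creator_name="Example Research Lab",
)

print(to_script_tag(example))
```

The printed snippet would be pasted into the HTML of the page that hosts the dataset. The data itself stays where it is, and only the description is exposed to crawlers.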

Why is open data important?

Open data is extremely important for the reproducibility of scientific results. There is a well-documented reproducibility crisis in science, in which many published studies are hard or impossible to reproduce.

Having access to the data underlying the results helps scientists work together and independently check and confirm published results. This builds a stronger foundation of scientific understanding.

Open data is also important when the data is derived from public sources or from federal, state, or local funding. In that case the data really belongs to the public, and it’s important that any individual be able to access it.

Tags

#opendata