This page is under construction for the new release.

The SDB serves six datasets:


MEDLINE

PubMed, a service of the National Library of Medicine, includes over 19 million biomedical articles. The MEDLINE wiki page gives more detailed information about the dataset, including the table schema and data coverage.

U.S. Patent and Trademark Office Patents

For over 200 years, the United States Patent and Trademark Office (USPTO) has been processing and disseminating patent and trademark applications and information to promote an understanding of intellectual property protection and to facilitate the development and sharing of new technologies worldwide. SDB patent data prior to 1996 was generously made available by Steven A. Morris, Electrical and Computer Engineering, Oklahoma State University. Patent data from 1996 to the present was obtained by download. The USPTO wiki page gives more detailed information about the dataset, including the table schema and data coverage.

National Science Foundation Awards

The National Science Foundation (NSF) funds research and education in science and engineering. It does so through grants, contracts, and cooperative agreements with more than 2,000 colleges, universities, and other research and education institutions throughout the United States. The NSF wiki page gives more detailed information about the dataset, including the table schema and data coverage.

National Institutes of Health Awards

RePORT (Research Portfolio Online Reporting Tools) is a searchable database of federally funded biomedical research projects conducted at universities, hospitals, and other research institutions. The database, maintained by the Office of Extramural Research at the National Institutes of Health, includes projects funded by the National Institutes of Health (NIH), Substance Abuse and Mental Health Services Administration (SAMHSA), Health Resources and Services Administration (HRSA), Food and Drug Administration (FDA), Centers for Disease Control and Prevention (CDC), Agency for Healthcare Research and Quality (AHRQ), and Office of the Assistant Secretary for Health (OASH). The NIH wiki page gives more detailed information about the dataset, including the table schema and data coverage.

Clinical Trials

The National Institutes of Health runs a central registry of clinical trials. The registry includes both publicly and privately funded studies around the world. It was established by law in 1997 and made available to the public in 2000. The database includes data on trials, related diseases or conditions, interventions, eligibility criteria, locations, and contacts. The Clinical Trials wiki page gives more detailed information about the dataset, including the table schema and data coverage.

National Endowment for the Humanities Awards

The National Endowment for the Humanities was created in 1965 to award grants to promote excellence in the humanities and awareness of the lessons of history.  It is one of the largest funders of humanities programs in the United States.  The NEH wiki page gives more detailed information about the dataset, including the table schema and data coverage.

Table 1: Number of records per data set and years covered


Dataset            # Current Records    Years Covered       Regular Update
USPTO Patents
NIH Awards*
Clinical Trials    288,280              1900-10/26/2018     Yes
NSF Awards
NEH Awards         47,197               1970-2012           No

*The number of NIH awards is not aggregated by base project; it includes subprojects. Some projects have up to 3,000 subprojects.
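
The difference between counting every award record and counting base projects can be illustrated with a minimal Python sketch. The record layout (a base project identifier paired with an optional subproject identifier) and the sample values are hypothetical and are not taken from the SDB schema; see the NIH wiki page for the actual table structure.

```python
from collections import Counter

# Hypothetical award records: (base_project, subproject_id).
# subproject_id is None for the parent record itself.
# These identifiers are made up for illustration only.
awards = [
    ("BASE-A", None),
    ("BASE-A", "0001"),
    ("BASE-A", "0002"),
    ("BASE-B", None),
]

def count_awards(records):
    """Return (total records, distinct base projects, records per base project)."""
    per_project = Counter(base for base, _ in records)
    return len(records), len(per_project), per_project

total, distinct, per_project = count_awards(awards)
print("records including subprojects:", total)
print("distinct base projects:", distinct)
```

A count that includes subprojects, like the one reported in Table 1, corresponds to `total` here; aggregating by base project would report `distinct` instead.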

The number of papers, patents, and grants per publication year or grant award year is given in the chart below.