Code · BILL · 119th Congress · S. 3952 (Introduced in Senate) — To establish artificial intelligence standards, metrics, and evaluation tools, to support artificial intelligence res... · Sec. 201

Sec. 201. Public data for artificial intelligence systems

A research copy — for the controlling text, always check the official state or federal source. Not legal advice.

Title LI of the National Artificial Intelligence Initiative Act of 2020 (15 U.S.C. 9411 et seq.) is amended by adding at the end the following new section:
To expedite the development of artificial intelligence systems in the United States, the Director of the Office of Science and Technology Policy (in this section referred to as the Director) shall, acting through the National Science and Technology Council and the Interagency Committee and in consultation with the Advisory Committee on Data for Evidence Building established under section 315 of title 5, United States Code, develop a list of priorities for Federal investment in creating or improving curated, publicly available Federal Government data for training and evaluating artificial intelligence systems and identify an appropriate location to host curated datasets.
The list developed pursuant to paragraph (1) shall— prioritize data that will advance novel artificial intelligence systems in the public interest; prioritize datasets that are the result of scientific research that was funded by the Federal Government; and prioritize datasets unlikely to independently receive sufficient private sector support to enable their creation, absent Federal funding. In carrying out subparagraph (A)(ii), the Director shall identify 20 datasets to be prioritized.
In developing the list under paragraph (1), the Director shall consider the following:
Applicability to the initial list of societal, national, and geostrategic challenges set forth by subsection (b) of section 10387 of the Research and Development, Competition, and Innovation Act (42 U.S.C. 19107), or any successor list.
Applicability to the initial list of key technology focus areas set forth by subsection (c) of such section, or any successor list.
Applicability to other major United States economic sectors, such as agriculture, health care, transportation, manufacturing, biotechnology, communications, weather services, and positive utility to small- and medium-sized United States businesses.
Opportunities to improve datasets in effect before the date of the enactment of the Future of Artificial Intelligence Innovation Act of 2026.
Inclusion of data representative of the entire population of the United States.
Potential national security threats to releasing datasets, consistent with the United States Government approach to data flows.
Requirements of laws in effect.
Applicability to the priorities listed in the National Artificial Intelligence Research and Development Strategic Plan of the National Science and Technology Council, dated October 2016, and subsequent updates, and the priorities listed in Winning the Race, America’s AI Action Plan, dated July 2025.
Ability to use data already made available to the National Artificial Intelligence Research Resource Pilot program or any successor program.
Coordination with other Federal open data efforts, as applicable.
Requirements for researchers funded by the Federal Government to disclose nonproprietary, nonsensitive datasets that are used by artificial intelligence models during the course of research and development.
Opportunities for the National Science Foundation to maintain integrated, interoperable, and multimodal datasets readily providing access to scientific and engineering demonstration projects.
Before finalizing the list required by paragraph (1), the Director shall implement public comment procedures for receiving input and comment from private industry, academia, civil society, and other relevant stakeholders.
In carrying out this section, the Interagency Committee— may establish or leverage existing initiatives, including through public-private partnerships, for the creation or improvement of curated datasets identified in the list developed pursuant to subsection (a)(1), including methods for addressing data scarcity; may apply the priorities set forth in the list developed pursuant to subsection (a)(1) to the enactment of Federal public access and open government data policies; shall ensure consistency with Federal provisions of law relating to privacy, including the technology and privacy standards applied to the National Secure Data Service under section 10375(f) of the Research and Development, Competition, and Innovation Act (42 U.S.C. 19085(f)); and shall ensure that no data sharing is permitted with any country that the Secretary of Commerce, in consultation with the Secretary of Defense, the Secretary of State, the Secretary of Energy, and the Director of National Intelligence, determines to be engaged in conduct that is detrimental to the national security or foreign policy of the United States.
Datasets that are created or improved pursuant to this section— shall, in the case of a dataset created or improved by a Federal agency, be made available to the comprehensive data inventory developed and maintained by the Federal agency pursuant to section 3511(a) of title 44, United States Code, in accordance with all applicable regulations; and may be made available to the National Artificial Intelligence Research Resource pilot program established by the Director of the National Science Foundation, and the applicable programs established by the Department of Energy, in accordance with Executive Order 14110 (88 Fed. Reg. 75191; relating to safe, secure, and trustworthy development and use of artificial intelligence), or any successor program.
Not later than 1 year after the date of the enactment of the Future of Artificial Intelligence Innovation Act of 2026, the Director shall, acting through the National Science and Technology Council and the Interagency Committee, submit to the Committee on Commerce, Science, and Transportation of the Senate and the Committee on Science, Space, and Technology of the House of Representatives a report that includes— best practices in developing publicly curated artificial intelligence datasets; lessons learned and challenges encountered in developing the curated artificial intelligence datasets; principles used for artificial intelligence-ready data; recommendations relating to artificial intelligence-ready data standards and potential processes for development of such standards; recommendations for maintaining and expanding the availability of high-quality data sets; recommendations for methods to increase incentives for researchers support by the Federal Government to release high-quality publicly available datasets, that protects against risks to disclosure of personally identifiable information and national and economic security risks; and recommendations for establishing secure compute environments at the National Science Foundation to enable secure artificial intelligence use cases for controlled access to restricted Federal data.
Nothing in this section shall be construed to require the Federal Government or other contributors to disclose any information— relating to a trade secret or other protected intellectual property right; that is confidential business information; or that is privileged. Except as specifically provided for in this section, nothing in this section shall be construed to prohibit the head of a Federal agency from withholding information from a public dataset.
The table of contents at the beginning of section 2 of the William M. (Mac) Thornberry National Defense Authorization Act for Fiscal Year 2021 and the table of contents at the beginning of title LI of such Act are both amended by inserting after the items relating to section 5103 the following new item: 5103A. Public data for artificial intelligence systems.