The U.S. National Science Foundation and Nvidia have pledged $152 million for the Open Multimodal AI Infrastructure to Accelerate Science (OMAI) project. Led by the Allen Institute for AI (Ai2), this five-year effort aims to close the resource gap for academic researchers by creating powerful, open-source AI models trained specifically on scientific data and literature. Ai2 will release the models with their complete weights and training data, promoting transparency and reproducibility. The project also aims to accelerate scientific discovery and support workforce development.

The U.S. National Science Foundation (NSF) and Nvidia have jointly pledged $152 million to support an initiative focused on transforming American scientific research. This public-private partnership will fund the Open Multimodal AI Infrastructure to Accelerate Science (OMAI) project, a five-year effort led by the Seattle-based Allen Institute for AI (Ai2). The NSF will contribute $75 million, while Nvidia will provide $77 million in funding and infrastructure. The collaboration aims to develop a comprehensive suite of advanced artificial intelligence models specifically designed to support the U.S. scientific community, thereby promoting discovery and strengthening the nation’s leadership in AI-driven innovation.
This investment directly tackles a widening gap in AI development. The financial and computational costs of building and studying advanced AI models have risen sharply, putting them out of reach for many university labs and federally funded researchers. This resource gap limits the scope of academic inquiry, even though academic researchers have historically driven many key breakthroughs in AI. The OMAI project seeks to close the gap by giving the academic community access to powerful, transparent, and specialized AI tools. The initiative supports priorities outlined in the White House AI Action Plan, which emphasizes accelerating AI-enabled science and promoting open models that benefit both business and academic research.
Under Ai2’s leadership, the OMAI project will develop and release a family of open-source, multimodal large language models. Unlike general-purpose AI, these models will be trained specifically on large collections of scientific data and literature. The term “multimodal” refers to the ability to process and combine information from different formats, including text, images, and complex datasets. Noah Smith, a senior director at Ai2 and a computer science professor at the University of Washington, will oversee the project. He sees opportunity in AI’s shift from research tool to a catalyst that transforms how discoveries are made across fields.
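To make “multimodal” concrete, here is a minimal sketch of asking Ai2’s existing Molmo vision-language model a question about an image. It follows the usage pattern published on the model’s Hugging Face card; the `process` and `generate_from_batch` helpers come from the model’s own remote code, so details may vary between releases, and the image URL is only a placeholder.

```python
import requests
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor, GenerationConfig

# Load Ai2's open vision-language model. trust_remote_code pulls in the
# model-specific processing and generation helpers from the model repo.
model_id = "allenai/Molmo-7B-D-0924"
processor = AutoProcessor.from_pretrained(
    model_id, trust_remote_code=True, torch_dtype="auto", device_map="auto"
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, torch_dtype="auto", device_map="auto"
)

# Two modalities in one query: an image plus a text prompt.
image = Image.open(requests.get("https://picsum.photos/536/354", stream=True).raw)
inputs = processor.process(images=[image], text="Describe this image.")
inputs = {k: v.to(model.device).unsqueeze(0) for k, v in inputs.items()}  # batch of 1

output = model.generate_from_batch(
    inputs,
    GenerationConfig(max_new_tokens=200, stop_strings="<|endoftext|>"),
    tokenizer=processor.tokenizer,
)

# Decode only the newly generated tokens, skipping the prompt.
generated = output[0, inputs["input_ids"].size(1):]
print(processor.tokenizer.decode(generated, skip_special_tokens=True))
```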
The fully open design of these models is a central part of the initiative. Ai2 plans to release the models along with their complete weights, training data, code, and evaluation tools. This level of openness stands in stark contrast to many leading AI systems that remain proprietary or only partially open. With full access, scientists can examine, modify, and retrain the models to suit their specific needs, and the transparency fosters trust and lets researchers replicate results, which is vital for rigorous scientific work. Building on its experience developing high-performance open models such as its OLMo open language model and Molmo open vision-language model, Ai2 will use the new funding primarily for the computing resources required to train larger, more advanced models on these open foundations.
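That openness is already tangible in Ai2’s current releases. As a minimal sketch, an existing OLMo 2 checkpoint can be pulled from the Hugging Face Hub with the standard Transformers API and inspected or fine-tuned locally; the repo ID below is one of Ai2’s published checkpoints, and the prompt is purely illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# A fully open checkpoint: weights, config, and tokenizer are all public.
model_id = "allenai/OLMo-2-1124-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Examine: the architecture and parameter count are fully inspectable.
print(model.config)
print(f"{sum(p.numel() for p in model.parameters()) / 1e9:.1f}B parameters")

# Use (or fine-tune) it like any local model.
inputs = tokenizer(
    "Language models trained on scientific literature can",
    return_tensors="pt",
).to(model.device)
out = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```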
Nvidia’s contribution goes beyond financial support to include essential technological infrastructure. The company will provide its HGX B300 systems, built on the new Blackwell Ultra architecture, along with its AI Enterprise software. This hardware is designed to process massive datasets efficiently, significantly speeding up the training and inference of new models. This technological foundation will support the entire cycle of scientific discovery, from data analysis to hypothesis development.

The project’s applications will cover many scientific fields. Researchers will use these tools to process and analyze data faster, generate code for complex simulations, and create detailed visualizations. The systems will help scientists detect subtle patterns they might otherwise overlook and connect new insights to past discoveries, even highlighting relevant findings from outside a researcher’s immediate specialty. Initial uses will focus on speeding up the discovery of new materials, enhancing protein function prediction for biomedical progress, and fixing known issues in current large language models.
OMAI initiative
The OMAI initiative also features a key focus on workforce development. The project will support training efforts to expand AI participation and expertise beyond traditional technology hubs. This educational outreach enhances American competitiveness in AI and other vital technologies by developing a national, AI-ready workforce. Research teams from the University of Washington, the University of Hawaii at Hilo, the University of New Hampshire, and the University of New Mexico will collaborate on the project, further spreading expertise nationwide. The project will proceed in phases, with datasets, code, and other resources released gradually. Ai2 expects the first major model will be available roughly 18 months into the five-year program, representing a concrete step toward a new era of AI-driven scientific discovery for the benefit of all Americans.
FlexOlmo
The Allen Institute has also developed the FlexOlmo platform, which enables data owners to contribute to a shared model without directly sharing their raw data. Contributors retain control over how their data is used: they can see when their contribution is active in the model and can deactivate it at any time.
FlexOlmo is related to cross-silo federated learning and suits the same applications. It differs fundamentally, however, in that each data owner trains locally, in complete isolation and asynchronously, with the flexibility to opt in or out at any time, which leads to significant technical and logistical differences. In practice, this gives organizations a way to work securely with public models and their own private data.
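The opt-in/opt-out mechanism can be pictured as a mixture-of-experts in which every data owner contributes one expert module and keeps a switch for it. The toy PyTorch layer below is a hypothetical illustration of that idea only, not Ai2’s implementation; every name in it (`OptOutMoE`, `set_active`, the owner labels) is invented for the sketch.

```python
import torch
import torch.nn as nn

class OptOutMoE(nn.Module):
    """Toy mixture-of-experts layer: each data owner contributes one expert
    and keeps an opt-in/opt-out switch for it (illustrative sketch only)."""

    def __init__(self, d_model: int, owners: list[str]):
        super().__init__()
        # Shared expert trained on public data; always available.
        self.public_expert = nn.Linear(d_model, d_model)
        # One expert per data owner, each trained locally and in isolation.
        self.owner_experts = nn.ModuleDict(
            {name: nn.Linear(d_model, d_model) for name in owners}
        )
        self.active = {name: True for name in owners}       # opt-in/out switches
        self.router = nn.Linear(d_model, 1 + len(owners))   # per-token expert scores

    def set_active(self, owner: str, active: bool) -> None:
        """A data owner flips this to withdraw (or restore) their module."""
        self.active[owner] = active

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        names = ["public"] + list(self.owner_experts.keys())
        experts = [self.public_expert] + list(self.owner_experts.values())
        scores = self.router(x)                              # (batch, seq, n_experts)
        # Mask out opted-out experts before the softmax so they get zero weight.
        keep = torch.tensor(
            [n == "public" or self.active[n] for n in names], device=x.device
        )
        weights = scores.masked_fill(~keep, float("-inf")).softmax(dim=-1)
        outs = torch.stack([e(x) for e in experts], dim=-1)  # (batch, seq, d, n)
        return (outs * weights.unsqueeze(-2)).sum(dim=-1)

# Usage: "hospital_b" withdraws; its expert stops influencing every forward pass.
layer = OptOutMoE(d_model=64, owners=["lab_a", "hospital_b"])
layer.set_active("hospital_b", False)
y = layer(torch.randn(2, 10, 64))
```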
No free lunch
However, a study by Nous Research found that open-weight models use roughly 1.5 to 4 times more tokens, the basic units of text an AI model processes, than closed systems from organizations like OpenAI and Anthropic. On simple knowledge questions the gap widened further, with some open models using up to 10 times more tokens to produce comparable responses.
The analysis indicates that open-weight systems can require substantially more computation than proprietary alternatives for equivalent tasks. That inefficiency could erode the economic advantage of the open approach and weigh on enterprise AI adoption decisions.
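A back-of-envelope calculation shows why token overhead matters economically. In the sketch below, the per-token prices and the 400-token baseline are invented assumptions; only the 1.5x to 10x overhead range comes from the reported study.

```python
# Invented price and usage assumptions; only the 1.5x-10x token-overhead
# range comes from the reported Nous Research comparison.
CLOSED_PRICE = 10.00   # assumed $ per 1M output tokens, hosted closed model
OPEN_PRICE = 2.00      # assumed $ per 1M output tokens, self-hosted open model
BASE_TOKENS = 400      # assumed tokens the closed model spends per query

closed_cost = BASE_TOKENS / 1e6 * CLOSED_PRICE
for overhead in (1.5, 4.0, 10.0):
    open_cost = BASE_TOKENS * overhead / 1e6 * OPEN_PRICE
    print(f"{overhead:>4}x tokens: open costs {open_cost / closed_cost:.1f}x "
          f"the closed model per query")

# Even at one fifth the per-token price, a 10x token overhead makes the
# open model twice as expensive per query under these assumptions.
```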