**New Research** is an open-source AI accelerator initiative that aims to bring AI to everyone, not just as a product but as a technology that anyone can touch and contribute to. By promoting open-source innovation, they hope to empower individuals to learn about, experiment with, and build upon the transformative technology of AI.
Their focus lies in fundamental research that pushes the boundaries of AI with minimal compute resources. Unlike traditional academic fields, where contributions are often incremental, AI is currently at a stage where groundbreaking research can come from exploring diverse alternatives and bringing together people from varied backgrounds.
Their initial focus was the Hermes series of AI models, designed to be neutrally aligned and allow users to instruct the model to adopt any persona, unlike closed providers that enforce guardrails. This individualistic approach empowers users to express themselves and be creative without moralizing constraints. Their team also developed a method called Yarn, which extends the context window of AI models, enabling them to handle larger amounts of text. This research has been widely adopted by other open source AI tools.
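Yarn itself works by rescaling the rotary position embeddings (RoPE) a model was trained with so that longer sequences fit inside the positional range the model already understands. The sketch below shows only the simplest form of that idea, plain linear position interpolation; the real Yarn method is more refined (frequency-aware scaling), and the names here are illustrative rather than taken from any Yarn implementation.

```python
import numpy as np

def rope_angles(positions, head_dim, base=10000.0, scale=1.0):
    """Rotation angles for rotary position embeddings (RoPE).

    scale > 1 compresses positions (linear interpolation) so that a model
    trained on a short context can accept longer sequences without its
    positional signal going out of range. This is a deliberately simplified
    illustration; Yarn itself applies a more careful, frequency-aware scheme.
    """
    inv_freq = 1.0 / (base ** (np.arange(0, head_dim, 2) / head_dim))  # one frequency per dim pair
    scaled_pos = np.asarray(positions, dtype=np.float64) / scale       # interpolate positions
    return np.outer(scaled_pos, inv_freq)                              # (seq_len, head_dim // 2)

# Example: stretch a model trained with a 4k context window to a 16k window.
train_ctx, target_ctx = 4096, 16384
angles = rope_angles(np.arange(target_ctx), head_dim=128, scale=target_ctx / train_ctx)
print(angles.shape)  # (16384, 64)
```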
Distro is a groundbreaking project that enables the training of highly capable AI models using only a standard internet connection, decoupling performance scaling from interconnect scaling. This innovation addresses a fundamental problem in AI, where training models requires all GPUs to be in the same room due to bandwidth limitations. By overcoming this constraint, Distro aims to democratize AI by allowing anyone to contribute to training state-of-the-art models, regardless of access to expensive, co-located data centers.
Traditionally, only a handful of organizations have been able to train large models, because doing so requires both massive compute (clusters of 40,000+ H100 GPUs) and the high-speed interconnects to link them. Distro reduces that communication requirement to what an average computer can handle over its internet connection.
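To make that gap concrete, here is a rough back-of-the-envelope calculation. The model size, gradient precision, and link speed are illustrative assumptions, not figures from the team; the roughly 1,000x reduction is the one they describe below.

```python
# Illustrative bandwidth comparison (assumed numbers, not figures from the Distro team).
params = 70e9                          # assume a 70B-parameter model
bytes_per_param = 2                    # fp16 gradients
full_sync = params * bytes_per_param   # bytes exchanged per step with naive full synchronization
reduced_sync = full_sync / 1000        # roughly the ~1,000x reduction described for Distro

home_link = 100e6 / 8                  # a 100 Mbit/s home connection, in bytes per second
print(f"full sync:    {full_sync / 1e9:.0f} GB/step  (~{full_sync / home_link / 3600:.1f} hours on home broadband)")
print(f"reduced sync: {reduced_sync / 1e6:.0f} MB/step (~{reduced_sync / home_link:.0f} seconds on home broadband)")
```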
The New Research team's criterion for research is that it should be fundamental and mathematically grounded, allowing for smaller experiments and iterative development. They seek "10x power-ups" that remove blockers and enable multipliers for the open-source community. This philosophy guided the development of Hermes, which leveraged synthetic data to overcome the cost and limitations of human data collection.
The core idea behind Distro is to create a system where the entire world can collaborate to create AI that is representative of everyone's contributions. After initially focusing on data collection and Hermes, the team began asking what would happen if open source never got a Llama 4. They realized the real blocker was a technical one: internet bandwidth.
While the initial response to Distro was disbelief, the team has since replicated their results, including with the Olmo framework from Allen AI. This involved re-implementing Distro from scratch and testing it against Olmo's baseline, confirming the initial findings.
Distro works by letting each GPU train independently on its own data, communicating only the most important information it has learned rather than synchronizing the entire model. This creates a cloud of models that move together through parameter space and eventually reach similar performance. The result is almost a 1,000x reduction in bandwidth requirements.
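The exact mechanics of Distro have not been published yet, so the sketch below should not be read as the actual algorithm. It only illustrates the general pattern described above: every worker computes an update on its own data and transmits a small, high-importance summary of it (here, a naive top-k selection) instead of the full gradient, so the communication cost is a tiny fraction of the model size. For brevity the sketch keeps all workers exactly in sync, whereas the description above implies each GPU keeps its own loosely aligned copy.

```python
import numpy as np

rng = np.random.default_rng(0)

def top_k(vec, k):
    """Keep the k largest-magnitude entries and zero the rest: a naive stand-in
    for whatever compact summary Distro actually transmits."""
    keep = np.argpartition(np.abs(vec), -k)[-k:]
    out = np.zeros_like(vec)
    out[keep] = vec[keep]
    return out

dim, n_workers, k, lr = 10_000, 4, 100, 0.5   # each worker transmits 1% of its update
weights = rng.normal(size=dim)                # shared starting point for every worker
shards = [rng.normal(loc=1.0, scale=0.05, size=dim) for _ in range(n_workers)]  # private data per worker

for step in range(200):
    # Each worker computes a gradient on its own shard (toy quadratic loss),
    # then shares only a sparse summary instead of the full gradient.
    messages = [top_k(weights - shard, k) for shard in shards]
    aggregated = sum(messages) / n_workers    # communication: ~k values per worker, not dim
    weights -= lr * aggregated                # every worker applies the same aggregated update

print(f"distance from the consensus solution: {np.linalg.norm(weights - 1.0):.1f}")
```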
Their findings suggest that communicating only these key pieces of information can be an effective substitute for full synchronization, and that this in turn unlocks a better understanding of how these models learn.
They plan to release a paper detailing Distro. To productize it, they plan to focus on building full-stack tooling and opening it up to community development.
They said they hope for an ideal system built on high-performance, general-purpose computing, and that they are taking an engineering-first approach that frees them from preconceived notions and current norms.
While centralized actors can still benefit, the ultimate goal is to empower individual contributors and democratize the AI landscape.