We are proud to announce EarthFrame, our Public Benefit Corporation with a mission to bring transparency, sustainability, ease of use, efficiency, and resilience to Sovereign AI and the data economy. EarthFrame supplies the building blocks for stewarding Earth’s most precious data, such as data of scientific, linguistic, cultural, or historical importance.
EarthFrame
EarthFrame began from a question asked of us by a community that would become one of our first customers:
“How do we maintain sovereignty over our data for generations and ensure that doing so does not drain our natural and economic resources?”
We recognized the immense challenge and responsibility of tackling such an endeavour. Even for life scientists used to working with some of the largest, most valuable, and most sensitive datasets on the planet, simultaneously co-optimizing for performance, longevity, and efficiency to build what this community needed was a level above our previous work.
To ensure we were improving upon the state of the art, we would need to benchmark our system against existing local and cloud options. That presents its own challenges, as there is alarmingly little transparency in the AI economy: even when we know a data center’s capacity in megawatts or gigawatts, we have next to no knowledge of the real-time energy (or water) usage of these systems as we interact with them.
Thankfully, our friends, cofounders, and EarthFrame family are a scrappy bunch, especially when propelled by our values and our life experience. We started at both ends of the stack: at the top level, we built new, superpowered data management tools and easy-to-use software for benchmarking and measuring the energy usage of compute hardware. At the bottom, we went all the way back to the power source and built our hardware to be quantized to match common solar panel deployments while still providing great performance.
We designed everything to be repairable, portable, and durable, so it serves the diverse needs of our potential customers. We even built one of our prototypes outside in a shopping cart and pushed it around while installing Ubuntu on solar power. The kids in the local park loved it. How’s that for “building in public”?
So, where are we today?
We are open for business, offering three compute systems available to preorder, an open-source resource transparency toolkit, and a new archive format for storing and interacting with your data.
EarthFrame Computing Systems
EarthFrame offers three fast, efficient hardware systems at different scales, capable of running state-of-the-art local models for inference and fine-tuning, blazing-fast genome alignment and variant calling, data archival, retrieval-augmented generation, search, transcription, text-to-speech, and much more.
- The Ember: Fits in a backpack and can run on a single 200-to-300-watt solar panel. Engineered for researchers, developers, and teams who need serious performance without the desk footprint.
- The EarthFrame-1: A powerful workstation or rackable single node, bringing up to 384 GB of VRAM with a design power envelope that should work with most standard household outlets.
- The Mesa: A complete 5-node cluster with integrated networking and scheduling, serving as a cluster-in-a-box offering that can scale to meet the needs of almost any community that wants to own, manage, govern, and monetize its own data.
Warpt: Transparency for AI Hardware
To bring transparency to AI energy utilization, we also built Warpt, our workload and resource power toolkit. Warpt is a Python package that identifies the hardware and software in your system, whether it’s an EarthFrame system or another manufacturer’s laptop, workstation, or cloud VM. You can use its output to optimize benchmark or workload performance and to identify bottlenecks in your system. For real-time power monitoring, Warpt abstracts the power monitoring utilities of various manufacturers into a single unified, structured report, just as it does for system configuration. The reports it generates make it easy to share your exact configuration with others programmatically, which helps with reproducibility. Warpt is like being able to crack open your laptop and peek at what’s inside and how much power it is using, and it’s available today for Linux and macOS: just pip install warpt. Need a feature, or found something that needs fixing? Warpt is open source under the MIT License and available on GitHub.
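To make the idea of a structured, shareable configuration report concrete, here is a minimal sketch that uses only the Python standard library. It does not use Warpt and does not reflect Warpt's API or report format; the function names and report fields are hypothetical, and it covers system identification only, not power monitoring.

```python
import json
import os
import platform


def collect_system_report() -> dict:
    """Gather basic hardware and software facts using only the standard library.

    Hypothetical field names chosen for illustration; not Warpt's schema.
    """
    return {
        "os": platform.system(),
        "os_release": platform.release(),
        "machine": platform.machine(),
        "python_version": platform.python_version(),
        "logical_cpus": os.cpu_count(),  # may be None if it cannot be determined
    }


def to_shareable_json(report: dict) -> str:
    """Serialize with sorted keys so two reports can be diffed line by line."""
    return json.dumps(report, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(to_shareable_json(collect_system_report()))
```

Sorting the keys keeps the output stable across runs, which is what makes a report like this easy to compare between two machines.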
Mar: An Archive Format for the 21st Century
Another brick in our data sovereignty foundation is Mar, an archive format for the 21st century. Mar improves upon existing archive formats by adding efficient random access, fast built-in checksums for data verification, and support for modern compression standards like Zstandard (zstd). Mar aims to be easy to use: it offers simple presets while remaining highly tunable by the end user. It also supports native similarity search, semantic search of archives, easy and fast checksum comparison, and data redaction. We hope Mar becomes essential infrastructure for the coming decades of computing. It is also available as open source on GitHub.
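For readers curious how random access and built-in checksums can fit together in an archive, the sketch below shows one common approach: compress each file independently and keep an index of offsets and hashes. This is not Mar's on-disk format or API. The function names and index layout are hypothetical, and zlib stands in for zstd only because it ships with Python.

```python
import hashlib
import zlib


def pack(files: dict[str, bytes]) -> tuple[bytes, dict[str, dict]]:
    """Compress each file on its own and record where it lands in the blob."""
    blob = bytearray()
    index = {}
    for name, data in files.items():
        compressed = zlib.compress(data)
        index[name] = {
            "offset": len(blob),            # where this entry starts
            "length": len(compressed),      # how many compressed bytes it spans
            "sha256": hashlib.sha256(data).hexdigest(),  # hash of the original data
        }
        blob += compressed
    return bytes(blob), index


def read_entry(blob: bytes, index: dict[str, dict], name: str) -> bytes:
    """Fetch one file by offset and verify it, leaving other entries untouched."""
    entry = index[name]
    start = entry["offset"]
    data = zlib.decompress(blob[start:start + entry["length"]])
    if hashlib.sha256(data).hexdigest() != entry["sha256"]:
        raise ValueError(f"checksum mismatch for {name!r}")
    return data


if __name__ == "__main__":
    blob, index = pack({"a.txt": b"hello " * 100, "b.txt": b"world " * 100})
    # Only b.txt is decompressed here; a.txt is never read.
    print(read_entry(blob, index, "b.txt")[:12])
```

Because each entry is compressed separately, reading one file costs time proportional to that file rather than to the whole archive, at the price of a slightly worse compression ratio than compressing everything as one stream.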
What's Next
As a company, we are very early in our journey. We have more exciting releases planned through the end of this year. We also know we can’t and shouldn’t build everything ourselves (we are just one tiny company building computers in our garage like it’s 1976, after all). We look forward to working together and hope you find us an eager partner not only in your success but also in the success of the ecosystem as a whole. To learn more about our values, check out the About section on our site over the coming weeks.
If you want to help, get in touch – we would appreciate your support.
If you want to get started with infrastructure building blocks for managing your own data, or that of your community, we think EarthFrame is the best place to start. We’ll provide honest guidance on the best solution for your needs, whether that’s us or another partner. We have immense respect for existing manufacturers (including those who share our hometown of Austin, Texas) and other values-driven computing companies.
You can catch us in person this spring at:
- NVIDIA GTC in San Jose, March 16-20,
- the Indigenous Data Sovereignty and Governance Summit in Tucson, April 15, and
- the Sustainability Conference for Responsible Research Computing, May 4-8.
To keep up with our journey, you can sign up for our newsletter on our home page. Make sure to also follow our LinkedIn page, which functions as our primary social media outlet. We have so much more coming down the pipeline in the next few months, and we hope it’s the start not just of our journey together, but of a more efficient, transparent, sovereign, and inclusive data economy.
Be well,
The EarthFrame Team - Eric, Keolu, Yousuf, and Juvenson