Mar: The Modern Archive Format
Core data infrastructure for data of all sizes.
Mar is EarthFrame's archive format and data interaction toolkit. Just as data is the new oil, you can think of Mar as the new tar. Mar improves on traditional archive formats like TAR and ZIP by adding efficient random access, modern compression algorithms, built-in checksums, and sensible defaults that you can customize.
Archiving Your Data
Mar, much like TAR and ZIP, is both an archive format and a command-line tool for working with that format. Mar uses a well-defined binary format that includes a header and one or more data blocks. Mar is often faster than TAR, and Mar archives are often the same size as or smaller than tarballs compressed with GZIP or LZ4.
Create archives of your files with a simple command:
mar create <archive name> <one or more files or directories>
Compression and Space Savings
Mar supports modern compression algorithms, including ZSTD and the highly optimized libdeflate library for GZIP compression. Compared with compressed tarballs, Mar archives are usually the same size or smaller and much faster to create.
You can list the files in a Mar archive without unpacking any data blocks, so you never need to decompress an archive to disk just to see what's inside it.
mar list <archive>
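To see why a header index makes listing and random access cheap, here is a minimal Python sketch of a toy archive with a header followed by compressed data blocks. The layout (a 4-byte length, a JSON index, zlib blocks) and every field name are assumptions made for illustration; this is not Mar's actual binary format.

```python
import json
import struct
import zlib


def write_toy_archive(files: dict[str, bytes]) -> bytes:
    """Pack files as: 4-byte header length, JSON index, then compressed blocks."""
    index = []
    blocks = []
    offset = 0
    for name, data in files.items():
        block = zlib.compress(data)
        index.append({
            "name": name,
            "codec": "zlib",            # decompression info travels with the entry
            "offset": offset,           # relative to the start of the data area
            "length": len(block),
            "crc32": zlib.crc32(data),  # checksum of the uncompressed bytes
        })
        blocks.append(block)
        offset += len(block)
    header = json.dumps(index).encode("utf-8")
    return struct.pack("<I", len(header)) + header + b"".join(blocks)


def read_index(archive: bytes) -> tuple[list[dict], int]:
    """Parse only the header; return the index and where the data area starts."""
    (header_len,) = struct.unpack_from("<I", archive, 0)
    index = json.loads(archive[4:4 + header_len])
    return index, 4 + header_len


def list_names(archive: bytes) -> list[str]:
    """List member names without touching any data block."""
    index, _ = read_index(archive)
    return [entry["name"] for entry in index]


def read_member(archive: bytes, name: str) -> bytes:
    """Decompress exactly one member by seeking to its block."""
    index, data_start = read_index(archive)
    for entry in index:
        if entry["name"] == name:
            start = data_start + entry["offset"]
            data = zlib.decompress(archive[start:start + entry["length"]])
            if zlib.crc32(data) != entry["crc32"]:
                raise ValueError(f"checksum mismatch for {name}")
            return data
    raise KeyError(name)
```

In this toy layout, listing reads only the header, and fetching one file decompresses only that file's block, which is the property the `mar list` and selective `mar extract` commands rely on conceptually.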
Sharing Your Data
Mar makes it easy to share large collections of files as a single archive. Recipients can extract all files, get individual files by their path, or even pipe data directly from the archive into other processes.
mar create my_archive.mar big_data/ little_data/
Extract everything:
mar extract my_archive.mar
Or extract specific files:
mar extract my_archive.mar big_data/big_data_1.pq big_data/big_data_104.pq
Data Redaction
Mar supports data redaction within the archive, enabling data removal without having to fully extract and re-archive. Perfect for compliance and privacy workflows.
mar redact -o redacted_archive.mar my_archive.mar big_data/big_data_4.pq
Universal Accessibility
Mar archives store decompression information right in the archive itself, so recipients don't have to guess or trust your file extensions. By creating a strict specification, Mar makes it easy to reliably share your data.
- Cross-platform – Supports Linux and macOS
- Multi-architecture – Tested on x86 and ARM systems
- Easy installation – Coming to package managers
Reducing Cloud Storage Costs
Mar compresses your data and makes random access efficient. Storing compressed archives takes less space than storing uncompressed data, often by a factor of 2 to 4.
With the mar-s3 package (currently in private beta), you can selectively list files in your remote archive and download just the ones you need, significantly reducing egress costs. mar-s3 even includes a caching layer for efficient repeated downloads.
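The savings described above come down to simple arithmetic, sketched here in Python. All sizes and the `little_data/notes.txt` member are hypothetical numbers chosen for illustration, and the helper functions are not part of mar-s3.

```python
def stored_size_gb(raw_gb: float, compression_ratio: float) -> float:
    """Size on disk after compression, given the raw size and a ratio such as 2.0 or 4.0."""
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return raw_gb / compression_ratio


def egress_gb(member_sizes_gb: dict[str, float], wanted: list[str]) -> float:
    """Data transferred (in GB) when only the wanted members are downloaded."""
    return sum(member_sizes_gb[name] for name in wanted)


# Hypothetical dataset: 1000 GB raw, compressed at the 2x and 4x ends of the range.
raw_gb = 1000.0
size_at_2x = stored_size_gb(raw_gb, 2.0)
size_at_4x = stored_size_gb(raw_gb, 4.0)

# Hypothetical compressed member sizes inside one remote archive.
members = {
    "big_data/big_data_1.pq": 40.0,
    "big_data/big_data_104.pq": 10.0,
    "little_data/notes.txt": 0.5,
}
whole_archive = egress_gb(members, list(members))
one_file = egress_gb(members, ["big_data/big_data_104.pq"])
```

With these made-up numbers, storage drops from 1000 GB to between 250 GB and 500 GB, and fetching a single member transfers 10 GB instead of the full 50.5 GB archive.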
Indexing and Searching
Mar's indexing system includes support for sidecar files that associate new indices with the filename index in the header. This enables implementing new index types without needing to update the core specification.
Today, Mar supports lexical similarity search using the MinHash sidecar index, enabling retrieval of similar texts from an archive based on a query. Full semantic search using a Mar vector sidecar index and the mar-embed package is expected in the April release.
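For readers unfamiliar with MinHash, the following Python sketch shows the general technique: each text is reduced to a short signature, and the fraction of matching signature slots estimates the Jaccard similarity of the texts' word shingles. The shingle size, hash function, and signature length are assumptions for illustration and do not describe Mar's sidecar index format.

```python
import hashlib


def shingles(text: str, k: int = 3) -> set[str]:
    """Split text into overlapping k-word shingles (lowercased)."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def minhash_signature(items: set[str], num_hashes: int = 64) -> list[int]:
    """For each of num_hashes salted hash functions, keep the minimum hash value."""
    if not items:
        raise ValueError("cannot build a signature from an empty set")
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "little")  # a different salt gives a different hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(item.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for item in items
        ))
    return signature


def estimate_jaccard(sig_a: list[int], sig_b: list[int]) -> float:
    """Fraction of signature slots that agree; approximates Jaccard similarity."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)
```

A query is answered by computing its signature and ranking stored signatures by `estimate_jaccard`. Because signatures are small and fixed-size, they can plausibly live in a separate index file next to the archive, which fits the sidecar approach described above.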
Built for Agents
Mar's self-describing CLI makes it easy for agents and LLMs to figure out how to use it without needing a specialized MCP server or fine-tuning. Mar was designed from day one to be agent-driven and works great with modern tool-calling LLMs right out of the box, and it's only getting better.
The Future of Data Sovereignty
At EarthFrame, Mar already powers our internal data sovereignty toolkit, stores versioned releases of our documentation, and allows us to easily share files and data with each other. We're building an ecosystem of tools around Mar to make storing, archiving, and sharing your data easier, faster, and cheaper.
Mar's development is just getting started, and we expect a 1.0 release in late 2026.
Get Started with Mar
Mar is available as open source on GitHub under the Apache-2.0 license. Explore the full source code and documentation, and contribute to the project.
Installation and detailed usage examples are available in the Mar GitHub repository.