A Case for SQLite as a File (Exchange) Format

Created 2026/06 – Alfred Reibenschuh – some parts have been rewritten or translated into English by Copilot and ChatGPT.

SQLite remains one of the most successful software projects ever created, largely because it solved a problem that many people did not even realize they had: most applications do not actually need a database server.

The traditional database model assumes a separate database process that must be installed, configured, secured, monitored, backed up, and maintained. SQLite challenged that assumption by embedding the database engine directly into the application. No daemon, no network protocol, no service discovery, no authentication layer, and no operational overhead. The database is simply a file.
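As a minimal sketch of what "the database is simply a file" means in practice, the following uses Python's standard sqlite3 module. The table name note and the helpers write_notes and read_notes are hypothetical names chosen for illustration; the point is only that no server, port, or credentials appear anywhere.

```python
import os
import sqlite3
import tempfile


def write_notes(path, notes):
    """Create (or reopen) the database file at `path` and append rows."""
    conn = sqlite3.connect(path)  # opens a local file; no server is contacted
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS note ("
                "id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
            )
            conn.executemany(
                "INSERT INTO note (body) VALUES (?)", [(n,) for n in notes]
            )
    finally:
        conn.close()


def read_notes(path):
    """Reopen the same file later and read the rows back in insert order."""
    conn = sqlite3.connect(path)
    try:
        return [row[0] for row in conn.execute("SELECT body FROM note ORDER BY id")]
    finally:
        conn.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "notes.db")  # hypothetical file name
        write_notes(db_path, ["first", "second"])
        print(os.path.isfile(db_path))  # the whole database is this one file
        print(read_notes(db_path))
```

Copying notes.db to another machine would move the complete database; nothing else needs to be installed or configured beyond a SQLite library.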

This "serverless" architecture provides several advantages. Deployment becomes trivial, operational complexity approaches zero, and entire classes of failure disappear. There is no database server to crash, no connection pool to exhaust, no network latency between application and storage, and no administrator required to keep the system healthy. For desktop applications, mobile apps, embedded devices, edge computing, and countless utility programs, SQLite is often the most rational database choice.

Its reliability is equally important. SQLite has spent decades being battle-tested in environments ranging from smartphones and web browsers to industrial equipment, aircraft systems, and spacecraft. The project's famously conservative development philosophy prioritizes correctness and stability over fashionable features. As a result, SQLite has earned a reputation for being remarkably dependable despite its apparent simplicity.

However, SQLite is also beginning to show its age.

The database was designed in an era when a single application writing to a local file represented the dominant use case. While SQLite has evolved significantly over time, its architectural assumptions remain rooted in that model. Its concurrency mechanisms are excellent for local applications but become increasingly strained when developers attempt to use SQLite as a distributed database, a cloud-native backend, or a multi-region service.

This is where newer systems such as Turso and DuckDB enter the picture.

Turso can be viewed as an attempt to modernize SQLite's operational model without abandoning its core strengths. By building distributed replication, edge deployment, and cloud-native capabilities around SQLite-compatible technology, Turso addresses many of the scenarios where traditional SQLite struggles. Developers gain global replication, low-latency edge access, and managed infrastructure while retaining much of the SQLite ecosystem and programming model.

DuckDB approaches the problem from a different direction. Rather than focusing on transactional workloads, it is optimized for analytical processing. DuckDB is often described as "SQLite for analytics," but the comparison only goes so far. Its columnar execution engine, vectorized processing model, and analytical optimizations allow it to process large datasets far more efficiently than SQLite. For data science, reporting, ETL workloads, and ad hoc analysis, DuckDB increasingly occupies territory where SQLite was previously used simply because it was the easiest available option.

As a result, SQLite now faces competition from systems that preserve its most attractive characteristic—the elimination of operational complexity—while specializing in areas where SQLite was never originally designed to excel.

Yet despite these challenges, SQLite continues to thrive because it has quietly evolved into something larger than a database engine.

In many contexts, SQLite functions as a universal data container.

A CSV file is simple, but simplicity comes with severe limitations. CSV lacks data types, constraints, indexes, relationships, metadata, transaction guarantees, and any meaningful schema enforcement. Every CSV consumer must independently infer structure and correctness, leading to endless compatibility issues and data quality problems.

A SQLite database file solves many of these issues while retaining portability. It remains a single file that can be copied, emailed, archived, versioned, downloaded, and exchanged between systems. Unlike CSV, it can preserve rich schemas, enforce constraints, store multiple related tables, and support efficient querying without requiring any external infrastructure.
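A hedged sketch of the producer side, again with Python's sqlite3 module: the schema below (tables customer and invoice, the index, and the helper names) is invented for illustration. It assumes the producer wants types, constraints, and relationships to travel with the data. Two SQLite details matter here: columns use type affinity rather than strict typing by default, so the sketch adds a typeof() check, and foreign key enforcement must be switched on per connection.

```python
import sqlite3

# Hypothetical exchange schema: two related tables, constraints, one index.
SCHEMA = """
CREATE TABLE customer (
    id    INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE
);
CREATE TABLE invoice (
    id           INTEGER PRIMARY KEY,
    customer_id  INTEGER NOT NULL REFERENCES customer(id),
    issued_on    TEXT NOT NULL
        CHECK (issued_on GLOB '[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]'),
    amount_cents INTEGER NOT NULL
        CHECK (typeof(amount_cents) = 'integer' AND amount_cents >= 0)
);
CREATE INDEX invoice_by_customer ON invoice(customer_id);
PRAGMA user_version = 1;  -- application-defined schema version in the file header
"""

INSERT_INVOICE = (
    "INSERT INTO invoice (customer_id, issued_on, amount_cents) VALUES (?, ?, ?)"
)


def build_exchange_db(path):
    """Create the exchange file (or an in-memory database) with the schema."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # off by default, set per connection
    conn.executescript(SCHEMA)
    return conn


def try_insert(conn, sql, params):
    """Return None if the row was accepted, else the constraint error text."""
    try:
        with conn:
            conn.execute(sql, params)
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)


if __name__ == "__main__":
    conn = build_exchange_db(":memory:")
    try_insert(conn, "INSERT INTO customer (id, email) VALUES (?, ?)",
               (1, "a@example.com"))
    print(try_insert(conn, INSERT_INVOICE, (1, "2026-06-01", 1999)))   # accepted
    print(try_insert(conn, INSERT_INVOICE, (99, "2026-06-01", 100)))   # unknown customer
    print(try_insert(conn, INSERT_INVOICE, (1, "01.06.2026", 100)))    # wrong date shape
    print(try_insert(conn, INSERT_INVOICE, (1, "2026-06-01", "abc")))  # not an integer
    conn.close()
```

The malformed rows are rejected when they are written, so the recipient never sees them. A CSV export has no equivalent place to state or enforce such rules.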

This makes SQLite an attractive interchange format for scientific datasets, GIS data, software package indexes, offline-first applications, mobile synchronization systems, browser storage, edge computing deployments, and countless internal enterprise workflows. In these scenarios, the database file itself becomes the artifact being exchanged rather than merely the storage layer behind an application.

Viewed through this lens, SQLite's future may not primarily be as a competitor to cloud databases. Instead, it increasingly resembles a standardized, self-describing, queryable data format. Just as PDF became more than a document format and evolved into a universal mechanism for document exchange, SQLite has become more than an embedded database. It is increasingly a portable container for structured data.

That role is difficult to displace.

Turso may surpass SQLite for distributed applications. DuckDB may surpass it for analytical workloads. Future systems will undoubtedly continue to improve on SQLite's architecture in specialized domains. But as a robust, universally readable, transaction-safe, self-contained data package, SQLite occupies a niche that remains extraordinarily valuable.

In that sense, SQLite's greatest achievement may not be that it eliminated the database server. It may be that it transformed the database itself into a file format.

It is true that more sophisticated data formats exist. Apache Arrow provides an efficient in-memory representation for analytical workloads. Apache Parquet offers highly compressed, columnar storage optimized for large-scale data processing. Apache Iceberg adds transactional semantics, schema evolution, and table management capabilities for data lakes operating at petabyte scale.

From a purely technical perspective, these formats are often superior within their respective domains. However, they solve a different problem.

A Parquet file is not directly queryable by itself. An Iceberg table typically assumes a surrounding ecosystem of catalogs, metadata layers, and analytical engines. Arrow is primarily an in-memory interchange format rather than a persistent storage mechanism. In practice, these technologies derive much of their value from the platforms and tooling that surround them.

SQLite occupies a unique position because the format is well documented and widely understood. A SQLite database file is not merely a storage format; it is a complete, self-contained data system. The same file can be copied from one machine to another and immediately queried, modified, indexed, validated, and analyzed using a runtime that is available almost everywhere.

This ubiquity is difficult to overstate. SQLite support exists for virtually every operating system, programming language, CPU architecture, and embedded platform in common use today. More importantly, SQLite is not merely available—it is often already present. It ships as a foundational component of Android, iOS, macOS, Linux distributions, web browsers, countless embedded devices, and an enormous number of software stacks. In many environments, the runtime is effectively infrastructure that developers can assume exists.

That changes the economics of data exchange. A SQLite file can be handed to another team, attached to a ticket, archived for decades, shipped with an application, or downloaded from a public dataset repository with a high degree of confidence that the recipient will be able to inspect and query it immediately. No specialized data platform, metadata catalog, or processing framework is required.
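One way a recipient might inspect such a file without risking changes to it is to open it read-only through SQLite's URI filename syntax. This is a sketch, not a recommendation for every situation; the helper name open_read_only, the file name received.db, and the table measurement are assumptions made for the example.

```python
import os
import sqlite3
import tempfile
from pathlib import Path


def open_read_only(path):
    """Open an existing SQLite file so that this connection cannot modify it."""
    # Path.as_uri() yields a file: URI; mode=ro asks SQLite for read-only access.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "received.db")  # stands in for a file someone sent

        # Simulate the sender producing the file.
        sender = sqlite3.connect(path)
        sender.execute("CREATE TABLE measurement (sensor TEXT, value REAL)")
        sender.execute("INSERT INTO measurement VALUES ('t1', 21.5)")
        sender.commit()
        sender.close()

        # The recipient queries it immediately, with writes refused.
        conn = open_read_only(path)
        print(conn.execute("SELECT sensor, value FROM measurement").fetchall())
        try:
            conn.execute("DELETE FROM measurement")
        except sqlite3.OperationalError as exc:
            print("write refused:", exc)
        conn.close()
```

Opening read-only is a small courtesy toward archived or shared artifacts: a stray UPDATE in an ad hoc session cannot alter the file that was handed over.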

This is where SQLite begins to resemble formats such as PDF, ZIP, or JPEG. Its value is no longer derived solely from technical superiority in any single workload. Rather, its strength comes from the combination of portability, standardization, durability, and near-universal tooling support.

For many real-world data exchange scenarios, that ubiquity is more valuable than raw performance. A format that is slightly less efficient but can be opened everywhere often outcompetes a technically superior format that requires an ecosystem to interpret it.

And don't get me started on CSV and XLSX.

One of the strongest arguments in favor of SQLite as a data exchange format is that it exists in a world where an astonishing amount of critical business data is still exchanged through malformed CSV exports and fragile Excel workbooks.

CSV's greatest strength is also its greatest weakness: there is no real standard. RFC 4180 describes a common form, but it is only informational and many producers ignore it. Every organization claims to use "CSV," yet in practice there are dozens of incompatible interpretations. Delimiters may be commas, semicolons, tabs, or something entirely custom. Character encodings range from UTF-8 to legacy Windows code pages. Date formats are often locale-dependent. Decimal separators vary by region. Quoting rules are inconsistently implemented. Null values may be represented as empty strings, the literal text "NULL," zeroes, whitespace, or not represented at all.

Anyone who has worked in enterprise data integration has eventually encountered the absurd situation where a file called "customer_export.csv" can only be successfully imported by a very specific version of a very specific application configured for a very specific regional locale.

The result is that a format intended to be universal often requires an entire layer of ETL logic dedicated solely to figuring out what the producer actually meant.
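The ambiguity is easy to reproduce. In the sketch below, the sample text EXPORT is an invented export of the kind a tool set to a European locale might write: semicolons as delimiters and commas as decimal separators. Python's csv module parses it without complaint under either delimiter, and only one of the two results is what the producer meant.

```python
import csv
import io

# Hypothetical export: ';' separates fields, ',' is the decimal separator.
EXPORT = "id;amount\n1;1,50\n2;2,75\n"


def parse(text, delimiter):
    """Parse CSV text with an explicitly chosen delimiter."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


if __name__ == "__main__":
    # Consumer assumes commas: no error, but the structure is wrong.
    print(parse(EXPORT, ","))
    # → [['id;amount'], ['1;1', '50'], ['2;2', '75']]

    # Consumer assumes semicolons: the structure is right, but the amounts
    # are still strings in a locale-specific notation.
    rows = parse(EXPORT, ";")
    print(rows)
    # → [['id', 'amount'], ['1', '1,50'], ['2', '2,75']]

    try:
        float(rows[1][1])
    except ValueError:
        print("'1,50' is not a number until someone decides what ',' means")
```

Neither parse raises an error, which is the real problem: the file itself carries nothing that tells the consumer which reading is correct.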

Excel introduces a different category of problems.

XLSX is technically a well-defined and highly capable format, but in enterprise environments spreadsheets frequently evolve into accidental applications. Business logic migrates into formulas. Hidden sheets become dependencies. Lookup tables are embedded throughout workbooks. Macros implement critical workflows. Data validation rules exist in some tabs but not others. Multiple versions circulate simultaneously through email chains, SharePoint folders, Teams channels, and network shares.

At some point the spreadsheet stops being a document and quietly becomes an undocumented production system.

Many organizations run surprisingly important business processes on workbooks whose original authors left the company years ago. Nobody fully understands how they work, but everyone knows that if a certain worksheet is modified incorrectly, some quarterly report, financial process, or operational dashboard will break.

The irony is that both CSV and Excel persist because they are ubiquitous. Every tool can produce them, and every user can open them.

SQLite inherits much of that same ubiquity while avoiding many of the associated pathologies.

A SQLite database can preserve data types instead of forcing consumers to guess them. It can enforce constraints instead of relying on tribal knowledge. It can contain multiple related tables instead of flattening everything into a single denormalized export. It can provide indexes for efficient access. Most importantly, it can package both the data and its structural definition together in a single artifact.

When someone hands you a SQLite database, you are not receiving just rows of data. You are receiving schema definitions, relationships, constraints, metadata, and query capabilities alongside the data itself.
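To make "self-describing" concrete, here is a sketch of how a recipient could ask an unfamiliar SQLite database what it contains, using the sqlite_master catalog table and the table_info and foreign_key_list pragmas. The helper name describe and the sample tables author and book are hypothetical.

```python
import sqlite3


def describe(conn):
    """Return {table: {"columns": [...], "foreign_keys": [...]}} for user tables.

    Columns are (name, declared_type, not_null) tuples; foreign keys are
    (from_column, referenced_table, referenced_column) tuples.
    """
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
        if not row[0].startswith("sqlite_")  # skip SQLite's internal tables
    ]
    result = {}
    for name in names:
        quoted = '"' + name.replace('"', '""') + '"'  # quote the identifier
        columns = [
            (col_name, col_type, bool(not_null))
            for _cid, col_name, col_type, not_null, _default, _pk in conn.execute(
                f"PRAGMA table_info({quoted})"
            )
        ]
        # foreign_key_list rows: id, seq, table, from, to, on_update, ...
        foreign_keys = [
            (row[3], row[2], row[4])
            for row in conn.execute(f"PRAGMA foreign_key_list({quoted})")
        ]
        result[name] = {"columns": columns, "foreign_keys": foreign_keys}
    return result


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stands in for a received file
    conn.executescript(
        """
        CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE book (
            id INTEGER PRIMARY KEY,
            author_id INTEGER NOT NULL REFERENCES author(id),
            title TEXT
        );
        """
    )
    for table, info in describe(conn).items():
        print(table, info)
    conn.close()
```

Nothing in describe depends on prior knowledge of the schema, so the same function works on any SQLite file the recipient is given.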

From an enterprise architecture perspective, that distinction is significant. CSV transfers values. SQLite transfers information.

That is why SQLite occupies such an interesting middle ground. It is nowhere near as sophisticated as modern analytical formats such as Parquet or Iceberg, nor as scalable as distributed cloud databases. Yet it provides substantially more structure, integrity, and self-description than the formats that still underpin a surprising percentage of enterprise data exchange today.

In many situations, SQLite is not competing against Parquet, Iceberg, Snowflake, or BigQuery.

It is competing against "Final_v7_ACTUAL_FINAL_2026_FIXED.xlsx."

And in that comparison, SQLite often looks remarkably modern.
