When you need a flexible schema for custom fields, user preferences, or dynamic attributes, PostgreSQL gives you two powerful types: json and jsonb.
At first glance, they seem similar: both store JSON data. But under the hood, they're fundamentally different in how they store, process, and query that data. Understanding these differences is crucial for building performant applications.
JSON: Text Storage with Validation
The json type is the simpler one. It stores your input exactly as text, pretty much like a string that PostgreSQL knows is valid JSON.
How JSON Storage Works
CREATE TABLE events (
id SERIAL PRIMARY KEY,
data JSON
);
INSERT INTO events (data) VALUES
('{"user": "alice", "action": "login", "timestamp": 1699564800}');
When you insert this data, PostgreSQL:
- Validates that it's valid JSON syntax
- Stores it as plain text exactly as you provided it
- Preserves formatting including whitespace and key order
-- Both of the following are stored byte-for-byte, exactly as written
{"user": "alice", "action": "login", "timestamp": 1699564800}
{
"user" : "alice",
"action": "login"
}
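One way to see this preservation in psql is to cast a literal with redundant keys. This sketch uses an illustrative literal, not data from the example table:

```sql
-- json keeps the text verbatim: key order, whitespace,
-- and even duplicate keys all survive
SELECT '{"b": 1, "a": 2, "a": 3}'::json;
-- returns the literal exactly as typed: {"b": 1, "a": 2, "a": 3}
```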
The Performance Cost
Every time you query or extract something from a json column, PostgreSQL has to re-parse the stored text. Consider:
SELECT data->>'user' FROM events WHERE id = 1;
What happens internally:
- Read the text from disk
- Parse the entire JSON string
- Navigate to the requested key
- Extract the value
- Return the result
This parsing happens every single time you access the data. For a single query, it's negligible. But for thousands of queries or complex filters, it adds up quickly.
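As a rough illustration of how the cost compounds (assuming the events table above): each json operator in a query parses the stored text independently, so a query touching several keys pays the parse multiple times per row.

```sql
-- With the json type, each of these three ->> operators
-- re-parses the entire stored text for every matching row
SELECT data->>'user',
       data->>'timestamp'
FROM events
WHERE data->>'action' = 'login';
```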
JSONB: Binary Format for Performance
Then came jsonb, the binary version of JSON. Instead of keeping raw text, PostgreSQL parses your JSON once, converts it into a structured binary format, and stores it that way.
How JSONB Storage Works
CREATE TABLE events (
id SERIAL PRIMARY KEY,
data JSONB
);
INSERT INTO events (data) VALUES
('{"user": "alice", "action": "login", "timestamp": 1699564800}');
When you insert JSONB data, PostgreSQL:
- Parses the JSON text
- Normalizes the structure (removes whitespace, sorts keys)
- Converts to binary format
- Stores the pre-processed binary tree
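The normalization step is easy to observe in psql with an illustrative literal:

```sql
-- jsonb normalizes on input: whitespace is dropped, keys are
-- sorted, and for duplicate keys only the last value is kept
SELECT '{"b": 1,  "a": 2, "a": 3}'::jsonb;
-- → {"a": 3, "b": 1}
```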
TOAST: Handling Large JSON Documents
Now, there's a catch when storing large JSON or JSONB documents. PostgreSQL stores each row in fixed-size pages, typically 8KB. What happens when your JSON document doesn't fit?
The Problem
-- Large JSON document (50KB)
INSERT INTO events (data) VALUES ('{
"user": "alice",
"history": [/* 10,000 events */],
"metadata": {/* large nested object */}
}');
This document far exceeds the 8KB page size, so it can't be stored inline. PostgreSQL first tries to compress the value; if the row still doesn't fit, the value is moved out of line.
TOAST: The Oversized Attribute Storage Technique
TOAST is a system PostgreSQL uses to store large column values that exceed a page's capacity.
How TOAST works:
Small document (<2KB):
┌─────────────────┐
│ Main Table │
│ [id] [data] │ ← Data stored inline
└─────────────────┘
Large document (>2KB):
┌─────────────────┐ ┌──────────────────┐
│ Main Table │ │ TOAST Table │
│ [id] [pointer] │────────→│ [compressed │
└─────────────────┘ │ chunks] │
└──────────────────┘
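You can peek at this from SQL (assuming the events table above): pg_column_size reports the on-disk size of a value after compression, and pg_class exposes the TOAST table backing a relation.

```sql
-- On-disk size of each stored document (post-compression)
SELECT id, pg_column_size(data) AS stored_bytes FROM events;

-- Name of the TOAST table backing events, if one exists
SELECT reltoastrelid::regclass
FROM pg_class
WHERE relname = 'events';
```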
GIN Indexes: Making JSONB Searchable at Scale
Once your JSONB data is stored efficiently, the next big question is: how do you search inside it efficiently?
The Problem Without Indexes
-- Find all users named "alice"
SELECT * FROM events WHERE data->>'user' = 'alice';
-- Without index: Sequential scan
-- Scans every row, parses JSONB, checks condition
Enter GIN Indexes
GIN (Generalized Inverted Index) is what makes JSONB fast for containment and existence queries at scale.
CREATE INDEX idx_events_data ON events USING GIN(data);
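A caveat worth knowing: a plain GIN index serves jsonb's containment and existence operators (@>, ?, ?|, ?&), not ->> extraction, so the earlier equality filter is usually rewritten with @> to benefit from it. A sketch against the events table above (idx_events_user is an illustrative name):

```sql
-- Containment: rows whose data contains this sub-document (uses GIN)
SELECT * FROM events WHERE data @> '{"user": "alice"}';

-- Key existence: rows with a top-level "user" key (uses GIN)
SELECT * FROM events WHERE data ? 'user';

-- ->> equality is not served by GIN; for that,
-- create a B-tree expression index instead
CREATE INDEX idx_events_user ON events ((data->>'user'));
```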
How GIN Indexes Work
Instead of indexing entire documents, GIN breaks each JSONB value into its smallest searchable components keys, values, and even nested paths.
Example document:
{
"user": "alice",
"age": 25,
"tags": ["vip", "premium"]
}
GIN index entries:
Key-Value Pairs:
"user" → "alice" → [row_id_1, row_id_5, row_id_12]
"age" → 25 → [row_id_1, row_id_8]
"tags" → "vip" → [row_id_1, row_id_3]
"tags" → "premium"→ [row_id_1, row_id_7]
It's similar to how Elasticsearch indexes documents: every searchable element gets its own index entry pointing to the rows that contain it.
When to Use GIN Indexes
Use GIN when:
- Read-heavy workloads (analytics, reporting)
- Complex JSONB queries with filters
- Containment checks are common
- Query performance matters more than write speed
Skip GIN when:
- Write-heavy workloads (high-frequency inserts)
- Simple key extraction only
- Storage space is constrained
- You rarely query inside JSONB
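If index size is the concern but you still need containment queries, the jsonb_path_ops operator class is a middle ground: a smaller, faster GIN variant that supports only the @> operator. The index name here is illustrative:

```sql
-- Smaller GIN index: stores hashes of paths and values,
-- supports only the @> containment operator
CREATE INDEX idx_events_data_path
    ON events USING GIN (data jsonb_path_ops);
```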
Conclusion
The choice between JSON and JSONB comes down to intent and usage patterns.
JSON is ideal when you only store and retrieve data: fast inserts, minimal processing, exact format preservation, and no need for indexing.
JSONB is built for when you query inside your JSON: filters, searches, indexes, and performance at scale.
The key insights:
- JSONB's binary format eliminates repeated parsing overhead
- TOAST handles large documents transparently with compression
- GIN indexes make JSONB searchable like a document database
- Write overhead is small compared to read performance gains
- Choose JSONB by default unless you have specific reasons not to
If your JSON is just passive data that you store and retrieve whole, go with JSON. But if you want PostgreSQL to understand and search that data like a structured document, JSONB is the way to go.
PostgreSQL's JSONB turns your relational database into a hybrid system: structured where you need it, flexible where you want it, and performant throughout.