Bad data models are the most expensive form of technical debt. These are the mistakes we keep seeing in client codebases.
We've inherited a lot of codebases over the years. The one thing that's hardest to fix is always the data model. Bad code can be refactored. Bad architecture can be migrated. But a bad data model touches every query, every feature, and every report - and changing it usually means migrating production data without downtime.
Storing structured data as JSON blobs
JSON columns are useful for genuinely unstructured data - user preferences, third-party API responses, metadata that varies by record. But when we see an "attributes" JSON column that every query has to parse, filter, or join against, that's a schema design that someone was avoiding thinking through.
The rule is simple: if you query by a field, it should be a column. If you filter, sort, or aggregate by it, it should be a column with an index. JSON columns are for data you store and return as-is, not data you query against.
Missing indexes on foreign keys
This is the most common performance issue we see. Every foreign key should have an index. Every column you filter or sort by regularly should have an index. PostgreSQL does NOT automatically create indexes on foreign key columns (MySQL InnoDB does). Adding indexes after the fact is easy, but most teams don't think about it until the queries are already slow and the table has 10 million rows.
Run EXPLAIN ANALYZE on your slowest queries. If you see sequential scans on large tables, you're probably missing an index. But don't index everything either - each index costs disk space and slows down writes. Index the columns you query by, not every column in the table.
Soft deletes everywhere
Adding a deleted_at column to every table sounds safe, but it means every query needs a WHERE deleted_at IS NULL clause. Miss one and you're showing deleted data to users. Unique constraints get complicated because you might want to allow a new record with the same email as a deleted one. Counts are wrong unless you remember to filter.
Use soft deletes where you genuinely need them: audit trails, compliance requirements, or data that users need to "undo delete." Use real deletes everywhere else. If you need to archive data for analytics, copy it to an archive table before deleting.
Normalization: the right amount
Under-normalized databases have the same data in multiple places. Update one copy but miss the other, and your data is inconsistent. Over-normalized databases require 8-table joins for basic queries and make your application code painful to write.
Third normal form (3NF) is a good default for most applications. Denormalize intentionally when you need read performance - add a computed column, a materialized view, or a cache table. But do it as an optimization after measuring, not as a design choice upfront.
Migrations
If your schema changes aren't version-controlled in migration files, you're gambling. Every change to the database structure should be a migration that can be run forward and rolled back. Use Knex, Prisma, Sequelize, or your framework's built-in migration tool.
- Never modify a migration that's already been run in production. Write a new one
- Test migrations on a copy of production data before running them for real
- Make migrations backwards-compatible where possible. Add the new column, deploy the code that uses it, then remove the old column
- Large data migrations (backfilling millions of rows) should run in batches, not in a single transaction
Other patterns to avoid
- Storing timestamps in the application's local timezone. Store everything in UTC, convert on display
- Integer IDs exposed in URLs. Use UUIDs for anything public-facing. Sequential IDs leak information about your total record count and make it easy to enumerate
- Storing monetary values as floats. Floats have rounding errors. Use integers (cents) or the DECIMAL/NUMERIC type
- Tables with 50+ columns. If your users table has columns for address, preferences, billing, and profile info, those should probably be separate tables
- No created_at or updated_at columns. Add these to every table. You'll need them for debugging, sorting, and audit trails

Ben Arledge
CEO & CTO, CloudOwlHave a project in mind?
No sales pitch, just an honest conversation about what you're building.
