
JSONoid Discovery: Simplify Schema Discovery

submitted by
Style Pass
2024-04-24 13:30:04

On a recent client project, my software development team inherited a mismanaged legacy NoSQL database. It held hundreds of thousands of records spread across roughly two dozen containers, and each container existed in four environments. JSONoid Discovery got us out of trouble.

The biggest issue was that no schema had been enforced or documented at any point in the life of the database. As a result, every query result had to be painstakingly validated at runtime.
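To give a feel for what that runtime validation looks like, here is a minimal Python sketch. The function `parse_record` and the fields `id` and `email` are hypothetical, chosen only for illustration; they are not taken from the client's data.

```python
def parse_record(record):
    """Defensively read one record whose shape is not guaranteed.

    The field names here are hypothetical examples.
    """
    if not isinstance(record, dict):
        raise ValueError(f"expected an object, got {type(record).__name__}")

    raw_id = record.get("id")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(raw_id, bool) or not isinstance(raw_id, (int, str)):
        raise ValueError(f"unusable id: {raw_id!r}")

    # The email may be missing or null, but if present it must be a string.
    email = record.get("email")
    if email is not None and not isinstance(email, str):
        raise ValueError(f"unusable email: {email!r}")

    # Normalize the id to a string so callers see a single type.
    return {"id": str(raw_id), "email": email}
```

Every field needs its own checks like these, and every undocumented variation in the data means another branch.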

Step one of migrating to a more reasonable data storage solution was understanding the current state of the data. For the first container that we tackled, we tried a manual process. We researched data access patterns, interviewed developers, and sampled the existing data. This was a difficult process that didn’t yield satisfactory results, and we still regularly found ourselves chasing edge cases.

If we were going to solve this problem in a timely way, we needed an automated solution that removed the guesswork. Enter: JSONoid Discovery. JSONoid is a tool created by Rochester Institute of Technology’s Data Unity Lab, directed by Dr. Michael Mior. It uses some clever monoids (summaries that can be merged associatively, starting from an empty value) to perform schema discovery on a collection of JSON documents. The code is open source under an MIT license and helpfully distributed as a Docker container.
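The monoid idea can be sketched in a few lines of Python. This is a toy illustration and not JSONoid's actual implementation or API: the names `EMPTY`, `schema_of`, `merge`, `discover`, and `required_keys` are all made up for this example. It tracks only types and property counts, and it does not look inside arrays.

```python
from functools import reduce

# Identity element: the summary of "no documents seen yet".
EMPTY = {"count": 0, "types": frozenset(), "props": {}}


def type_name(value):
    """Map a decoded JSON value to its JSON type name."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool before int/float
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "object"


def schema_of(value):
    """Lift a single JSON value into a one-document summary."""
    props = {}
    if isinstance(value, dict):
        props = {key: schema_of(child) for key, child in value.items()}
    return {"count": 1, "types": frozenset([type_name(value)]), "props": props}


def merge(a, b):
    """Associative combine: add counts, union types, merge properties."""
    keys = set(a["props"]) | set(b["props"])
    return {
        "count": a["count"] + b["count"],
        "types": a["types"] | b["types"],
        "props": {
            key: merge(a["props"].get(key, EMPTY), b["props"].get(key, EMPTY))
            for key in keys
        },
    }


def discover(documents):
    """Fold any number of documents into one summary."""
    return reduce(merge, map(schema_of, documents), EMPTY)


def required_keys(schema):
    """Keys seen in every value at this level (assumes all are objects)."""
    return sorted(
        key for key, child in schema["props"].items()
        if child["count"] == schema["count"]
    )
```

Calling `discover` on a list of decoded documents returns a summary whose `props` entry lists every key seen, with the set of types observed for each. Because `merge` is associative and `EMPTY` is its identity, the documents can be split into batches, summarized independently, and merged in any grouping, which is what makes this approach practical for large collections.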
