I was playing around with the Ollama API to explore its capabilities and noticed that the HTTP response was streaming JSON, which prompted me to look into the response headers.
The content type is application/x-ndjson, and a quick search suggested it is newline-delimited JSON, a format that suits streaming protocols. The Transfer-Encoding is also chunked, which fits well for sending LLM responses over the wire.
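Chunk boundaries do not have to line up with line boundaries, so a client needs to buffer bytes until it sees a newline. Here is a minimal sketch of that idea in Python, assuming the stream arrives as an iterable of byte chunks. The function name iter_ndjson and the sample payload with its "response" and "done" fields are my own illustration, not taken from any client library.

```python
import json
from typing import Any, Iterable, Iterator


def iter_ndjson(chunks: Iterable[bytes]) -> Iterator[Any]:
    """Yield one parsed JSON value per newline-terminated line."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # A chunk may hold several lines, or only part of one.
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            if line.strip():
                yield json.loads(line)
    # Parse a final record that arrived without a trailing newline.
    if buffer.strip():
        yield json.loads(buffer)


# Hypothetical chunks, deliberately split in the middle of a record.
chunks = [
    b'{"response":"Hel',
    b'lo","done":false}\n{"response":"!","do',
    b'ne":true}\n',
]
tokens = [obj["response"] for obj in iter_ndjson(chunks)]
print("".join(tokens))  # Hello!
```

In a real client the chunks would come from the HTTP library's streaming interface instead of a hard-coded list.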
While researching JSON streaming further, I found several other approaches to streaming JSON objects. Notable ones are ndjson, jsonl, and json-seq. All of these formats are useful for processing and parallelising large collections of JSON objects without loading the entire dataset into memory.
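To make the memory point concrete, here is a small Python sketch that aggregates a field while holding only one record at a time. The read_records helper and the sample records are hypothetical, and io.StringIO stands in for a large file on disk.

```python
import io
import json
from typing import Any, Iterator, TextIO


def read_records(stream: TextIO) -> Iterator[Any]:
    """Lazily parse one JSON value per line from a text stream."""
    for line in stream:  # text streams yield one line at a time
        if line.strip():
            yield json.loads(line)


# Made-up sample data; a real run would use open(path, encoding="utf-8").
data = io.StringIO(
    '{"user":"a","bytes":120}\n'
    '{"user":"b","bytes":80}\n'
    '{"user":"a","bytes":40}\n'
)
total = sum(record["bytes"] for record in read_records(data))
print(total)  # 240
```

Because every line is an independent JSON text, the same loop could hand lines to separate workers instead of summing them in place.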
ndjson: Uses a single newline character (\n) to separate JSON objects. Each object must sit on one line with no raw newline inside it, so a newline within a string value has to be escaped. Example: {"some":"thing\n"}
jsonl (JSON Lines): Similar to ndjson, but allows optional whitespace around the \n separator and also accepts \r\n line endings, as commonly produced on Windows. Example: {"name": "John", "age": 30}\r\n
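Both examples above can be checked with Python's standard json module. This is a rough sketch and not a spec-conformance test. The second jsonl record ("Jane") is made up for illustration.

```python
import json

records = [{"some": "thing\n"}, {"name": "John", "age": 30}]

# json.dumps escapes the newline inside the string, so each record stays on one line.
# Compact separators drop the optional spaces after "," and ":".
ndjson_text = "".join(json.dumps(r, separators=(",", ":")) + "\n" for r in records)
print(repr(ndjson_text))

# Windows-style \r\n endings still parse, because json.loads ignores
# whitespace (including \r) around a JSON value.
jsonl_text = '{"name": "John", "age": 30}\r\n{"name": "Jane", "age": 25}\r\n'
parsed = [json.loads(line) for line in jsonl_text.split("\n") if line.strip()]
print(parsed)
```

The only real newline characters in ndjson_text are the two record terminators. The one inside "thing\n" is written as a backslash followed by the letter n.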