Serialization
Skir defines a standard for serializing and deserializing data types to JSON and binary. The generated data classes implement this standard to ensure that data structures defined in your schema can be encoded and decoded consistently across all languages.
Serialization formats
When serializing a data structure, you can choose one of three formats:
| Format | Persistable | Space efficiency | Readability |
|---|---|---|---|
| Dense JSON | Yes (safe) | High | Low |
| Readable JSON | No (unsafe) | Low | High |
| Binary | Yes (safe) | Very High | None |
Dense JSON
This is the format you should choose in most cases. It is compact and safe for persistence — you can freely rename fields in your schema without breaking compatibility with existing data.
Structs are serialized as JSON arrays, where the field numbers in the schema definition match the indexes in the array. Constant variants of enums are serialized as numbers; wrapper variants are serialized as [number, value] arrays.
struct User {
  user_id: int32;
  removed;
  name: string;
  rest_day: Weekday;
  subscription_status: SubscriptionStatus;
  pets: [Pet];
  nickname: string;
}
const JOHN_DOE: User = {
  user_id: 400,
  name: "John Doe",
  rest_day: "SUNDAY",
  subscription_status: {
    kind: "premium_since",
    value: "2027-01-01T00:00:00Z",
  },
  pets: [
    { name: "Fluffy" },
    { name: "Fido" },
  ],
  nickname: "",
}
The dense JSON representation of JOHN_DOE is:
[400,0,"John Doe",7,[2,1798761600000],[["Fluffy"],["Fido"]]]
Removed fields are replaced with zeros. Trailing fields with default values (nickname in this example) are omitted.
The output is compact but not human-friendly: if you SELECT a column that stores dense JSON, what comes back is a terse array of numbers and values with no field names in sight. When you need to inspect a value during debugging, the Converter web app can translate any dense JSON value into readable JSON instantly.
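The layout rules above can be sketched in a few lines of Python. This is a hand-written illustration, not Skir's generated code: `encode_dense_struct`, `REMOVED`, and the shape of `slots` are hypothetical, each slot is assumed to arrive already encoded next to its encoded default, and the defaults listed for the two enum fields are placeholders. Dropping trailing removed slots along with trailing defaults is also an assumption.

```python
import json

# Hypothetical marker for a slot whose field was removed from the schema.
REMOVED = object()


def encode_dense_struct(slots):
    """Encode a struct as a dense JSON array.

    `slots` is ordered by field number. Each entry is either REMOVED or an
    (encoded_value, encoded_default) pair.
    """
    # Removed fields are written as 0; other fields contribute their value.
    out = [0 if slot is REMOVED else slot[0] for slot in slots]
    # Walk backwards and drop trailing slots that carry no information.
    for slot in reversed(slots):
        if slot is REMOVED or slot[0] == slot[1]:
            out.pop()
        else:
            break
    return out


john_doe = [
    (400, 0),                      # 0: user_id
    REMOVED,                       # 1: removed
    ("John Doe", ""),              # 2: name
    (7, 0),                        # 3: rest_day (SUNDAY)
    ([2, 1798761600000], 0),       # 4: subscription_status
    ([["Fluffy"], ["Fido"]], []),  # 5: pets
    ("", ""),                      # 6: nickname (default, so omitted)
]

dense = json.dumps(encode_dense_struct(john_doe), separators=(",", ":"))
print(dense)
```

Because positions come from field numbers rather than names, nothing in the output changes when a field is renamed.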
Encoding rules
| Type | Encoded as | Examples |
|---|---|---|
| bool | 1 for true, 0 for false | 1 |
| int32 | A JSON number | 1234 |
| int64, hash64 | A JSON number or a JSON string | 1234 "9007199254740992" |
| float32, float64 | A JSON number or a JSON string | 1.23 "Infinity" |
| timestamp | A JSON number representing milliseconds since the Unix epoch | 1672531200000 |
| string | A JSON string | "Hello" |
| bytes | A Base64 string | "SGVsbG8=" |
| T? | null if the value is missing, otherwise the serialized value. | null 123 |
| [T] | A JSON array | [1, 2, 3] |
| struct | A JSON array. The array index corresponds to the field number. Removed fields are represented as 0. Trailing default values are omitted. | [400, 0, "John"] |
| enum | A number for constant variants; a [number, value] array for wrapper variants | 1 [2, "value"] |
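A few rows of the dense table can be reproduced with the Python standard library. The helper names below are hypothetical and this is only a sketch of the documented rules for bool, bytes, timestamp, and optional values, not the code Skir generates.

```python
import base64
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def dense_bool(value: bool) -> int:
    # bool -> 1 for true, 0 for false
    return 1 if value else 0


def dense_bytes(value: bytes) -> str:
    # bytes -> Base64 string
    return base64.b64encode(value).decode("ascii")


def dense_timestamp(value: datetime) -> int:
    # timestamp -> milliseconds since the Unix epoch
    return (value - EPOCH) // timedelta(milliseconds=1)


def dense_optional(value, encode):
    # T? -> null when missing, otherwise the encoded value
    return None if value is None else encode(value)


print(dense_bool(True))
print(dense_bytes(b"Hello"))
print(dense_timestamp(datetime(2023, 1, 1, tzinfo=timezone.utc)))
print(dense_optional(None, dense_bytes))
```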
Readable JSON
This format is intended for debugging and human inspection. Structs are serialized as JSON objects and enum constants as strings, making the output easy to read. However, it is not safe for persistence: because Skir allows fields to be renamed, schema evolution will silently break compatibility with old readable JSON data.
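To see why a rename is harmless for dense JSON but not for readable JSON, consider this small Python sketch. The rename from `name` to `full_name` and the two reader functions are hypothetical; they only mimic lookup by field number versus lookup by field name.

```python
# Data written while field number 2 was still called "name".
old_dense = [400, 0, "John Doe"]
old_readable = {"user_id": 400, "name": "John Doe"}

# Hypothetical later schema: field 2 has been renamed to "full_name".
FIELD_NUMBERS = {"user_id": 0, "full_name": 2}


def read_dense(data, field):
    # Dense JSON is positional, so the field name never appears in the data.
    index = FIELD_NUMBERS[field]
    return data[index] if index < len(data) else None


def read_readable(data, field):
    # Readable JSON is keyed by name, so old data no longer matches.
    return data.get(field)


print(read_dense(old_dense, "full_name"))        # John Doe
print(read_readable(old_readable, "full_name"))  # None
```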
The readable JSON representation of JOHN_DOE is:
{
  "user_id": 400,
  "name": "John Doe",
  "rest_day": "SUNDAY",
  "subscription_status": {
    "kind": "premium_since",
    "value": {
      "unix_millis": 1798761600000,
      "formatted": "2027-01-01T00:00:00Z"
    }
  },
  "pets": [
    {
      "name": "Fluffy"
    },
    {
      "name": "Fido"
    }
  ]
}
Encoding rules
| Type | Encoded as | Examples |
|---|---|---|
| bool | true or false | true |
| int32 | A JSON number | 1234 |
| int64, hash64 | A JSON number or a JSON string | 1234 "9007199254740992" |
| float32, float64 | A JSON number or a JSON string | 1.23 "Infinity" |
| timestamp | An object with unix_millis and formatted fields | { "unix_millis": 1672531200000, "formatted": "2023-01-01T00:00:00Z" } |
| string | A JSON string | "Hello" |
| bytes | The string "hex:" followed by the hexadecimal representation | "hex:48656c6c6f" |
| T? | null if the value is missing, otherwise the serialized value. | null 123 |
| [T] | A JSON array | [1, 2, 3] |
| struct | A JSON object containing field names and values. Default values are omitted. | { "name": "John", "age": 30 } |
| enum | A string for constant variants; an object with kind and value fields for wrapper variants | "RED" { "kind": "rgb", "value": "ff0000" } |
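The readable encodings of bytes and timestamps differ the most from their dense counterparts, so here is a short Python sketch of those two rows. The helper names are hypothetical, and formatting the timestamp without fractional seconds is an assumption based on the example in the table.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def readable_bytes(value: bytes) -> str:
    # bytes -> "hex:" followed by the hexadecimal representation
    return "hex:" + value.hex()


def readable_timestamp(unix_millis: int) -> dict:
    # timestamp -> object with unix_millis and formatted fields
    moment = EPOCH + timedelta(milliseconds=unix_millis)
    return {
        "unix_millis": unix_millis,
        # Assumption: second precision, UTC, trailing "Z"
        "formatted": moment.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


print(readable_bytes(b"Hello"))
print(readable_timestamp(1672531200000))
```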
Binary format
This format is a bit more compact than JSON, and serialization/deserialization can be faster in languages like C++. Only prefer this format over JSON when the small performance gain is likely to matter, which should be rare.
Encoding rules
All numeric values are encoded using little-endian byte order.
| Type | Encoded as | Examples |
|---|---|---|
| bool | 1 for true, 0 for false | true -> 0x01; false -> 0x00 |
| int32 | | 10 -> 0x0a; 255 -> 0xe8 0xff 0x00; -1 -> 0xeb 0xff |
| int64 | | |
| hash64 | | |
| float32 | | 0.0 -> 0x00; 1.5 -> 0xf0 0x00 0x00 0xc0 0x3f |
| float64 | | 0.0 -> 0x00 |
| timestamp | | |
| string | | "Hi" -> 0xf3 0x02 0x48 0x69 |
| bytes | | |
| T? | | null -> 0xff; val -> val_bytes |
| [T] | | [1, 2] -> 0xf8 ... ... |
| struct | Same encoding as an array. The array index corresponds to the field number. Removed fields are represented as 0. Trailing default values are omitted. | |
| enum | | |
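The little-endian rule can be checked against two of the examples with Python's `struct` module. This sketch reproduces only the payload bytes. The leading bytes in the examples (0xe8 and 0xf0) are left out, and reading the 255 example as one leading byte followed by a 16-bit value is an assumption drawn from the example itself.

```python
import struct

# "<" selects little-endian byte order in struct format strings.
payload_255 = struct.pack("<H", 255)  # 16-bit unsigned integer
payload_1_5 = struct.pack("<f", 1.5)  # 32-bit IEEE 754 float

print(payload_255.hex(" "))  # ff 00
print(payload_1_5.hex(" "))  # 00 00 c0 3f
```

In both cases the least significant byte comes first, which matches the bytes that follow the leading byte in the table's examples.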
Deserialization
JSON flavors
When Skir deserializes JSON, it handles both the dense and the readable flavors. You do not need to specify which flavor is being used.
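One way to picture flavor-agnostic decoding is a reader that checks the shape of each JSON value as it goes. The sketch below is an assumption about how such a reader could work, not a description of Skir's implementation. `USER_FIELDS`, `USER_DEFAULTS`, and `decode_user` are hypothetical and cover only two fields of User.

```python
import json

# Hypothetical field table for a cut-down User: field name -> field number.
USER_FIELDS = {"user_id": 0, "name": 2}
USER_DEFAULTS = {"user_id": 0, "name": ""}


def decode_user(value):
    """Decode a User from either JSON flavor without being told which."""
    result = {}
    for name, number in USER_FIELDS.items():
        if isinstance(value, list):    # dense: look up by field number
            raw = value[number] if number < len(value) else None
        elif isinstance(value, dict):  # readable: look up by field name
            raw = value.get(name)
        else:
            raise TypeError("expected a JSON array or object")
        # Fields absent from the input fall back to their default value.
        result[name] = USER_DEFAULTS[name] if raw is None else raw
    return result


from_dense = decode_user(json.loads('[400,0,"John Doe"]'))
from_readable = decode_user(json.loads('{"user_id": 400, "name": "John Doe"}'))
print(from_dense == from_readable)  # True
```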
Handling of zeros
Both the dense JSON and binary formats use zeros to represent removed fields to save space. To preserve forward compatibility, zero is treated as a valid input for any type, even non-numerical ones.
Every non-optional type decodes the integer 0 as its default value. For example, a string decodes 0 as "", and an array decodes 0 as []. Optional types (T?) are the one special case: 0 is decoded as the default value of the underlying type rather than as null (e.g. string? decodes 0 as "", not null).
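The zero rule can be sketched for three cases: a string, an array, and an optional string. The function names are hypothetical and the inputs are assumed to be values already parsed from JSON.

```python
def is_zero(value):
    # True only for the JSON number 0 (Python booleans are excluded).
    return type(value) is int and value == 0


def decode_string(value):
    return "" if is_zero(value) else value


def decode_array(value):
    return [] if is_zero(value) else list(value)


def decode_optional_string(value):
    if value is None:
        return None              # JSON null stays null
    return decode_string(value)  # 0 becomes "", the underlying default


print(repr(decode_string(0)))
print(decode_array(0))
print(repr(decode_optional_string(0)))
print(decode_optional_string(None))
```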
Converter web app
Skir provides a hosted converter at skir.build/converter to convert values across dense JSON, readable JSON, and binary. You can also reach it at any time by clicking the button in the header of this website. All processing happens locally in your browser — no data ever leaves your machine.
Provide a schema
The converter needs a schema, which you can provide in two ways:
- A type descriptor JSON from generated code. In Python, for example, you can get it from User.serializer.type_descriptor.as_json_code(). The syntax is similar in other languages. A common pattern is to store the type descriptor JSON as metadata next to your serialized data, so it is always at hand when you need to inspect a value.
- A GitHub URL pointing to a specific line where a record is defined in a .skir file, for example https://github.com/gepheum/skir-fantasy-game-example/blob/v1.0.0/skir-src/fantasy_game.skir#L123.
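The pattern of keeping the type descriptor next to the data might look like the sketch below. The envelope layout (`type_descriptor` and `value` keys) is hypothetical, and the descriptor string is a placeholder standing in for whatever the generated code returns.

```python
import json

# Placeholder: in real code this string would come from generated code, for
# example User.serializer.type_descriptor.as_json_code() in Python.
type_descriptor_json = '{"placeholder": "type descriptor goes here"}'

# A value already serialized to dense JSON.
dense_value = [400, 0, "John Doe"]

# Hypothetical envelope that stores the schema alongside the data.
envelope = {
    "type_descriptor": json.loads(type_descriptor_json),
    "value": dense_value,
}
record = json.dumps(envelope)
print(record)
```

Whoever later finds this record can paste both parts into the converter without hunting for the schema version that produced it.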
Paste a value
Once the schema is loaded, paste the value you want to inspect. The converter accepts dense JSON, readable JSON, and binary (base16 or base64) — it detects the format automatically. It then shows the value converted to all three formats.
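If a binary value lives in a file or a database blob, it first has to be turned into text before it can be pasted. A minimal Python sketch, using the bytes from the "Hi" example in the binary encoding table:

```python
import base64

# Bytes of the string "Hi" as shown in the binary encoding table.
binary_value = bytes([0xF3, 0x02, 0x48, 0x69])

as_base16 = binary_value.hex()
as_base64 = base64.b64encode(binary_value).decode("ascii")

print(as_base16)  # f3024869
print(as_base64)
```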