ASAsense Binary Table (ABT) format
From ASAsense Documentation
Revision as of 5 Oct 2024, 07:31
The ASAsense binary table format is an alternative to the CSV format, intended for binary data with fixed-length rows. A file is structured as follows:
| name | length (bytes) | type | allowed values | explanation |
|---|---|---|---|---|
| file_type | 1 | u8 | 1 | The internal number of the file type and version |
| n_cols | 4 | u32 | 1-u32::max | The number of columns |
| col_widths | 4 x n_cols | u32[n_cols] | 1-u32::max per column | n_cols u32 integers describing the byte-width of the columns |
| metadata_length | 4 | u32 | 0-u32::max | The length in bytes of the metadata field (0 if the metadata is omitted) |
| metadata | metadata_length | char[] | string of fixed length (so no '\0' character to mark the end of the string) | JSON-encoded metadata |
| data | variable, until EOF | byte[] | any | The actual data |
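The header layout in the table above could be read along the following lines. This is a minimal sketch, not a reference implementation. It makes two assumptions that the format description does not state: integers are little-endian, and each row is the concatenation of its columns in order, so the row length is the sum of `col_widths`. The function names `read_exact`, `read_abt_header` and `iter_rows` are hypothetical.

```python
import io
import json
import struct

# Assumption: the format description does not state a byte order;
# little-endian ("<") is used here purely for illustration.
BYTE_ORDER = "<"


def read_exact(stream, size):
    """Read exactly `size` bytes or raise if the stream ends early."""
    data = stream.read(size)
    if len(data) != size:
        raise ValueError(f"expected {size} bytes, got {len(data)}")
    return data


def read_abt_header(stream):
    """Parse the header fields in table order; the stream is left at the start of `data`."""
    (file_type,) = struct.unpack(BYTE_ORDER + "B", read_exact(stream, 1))
    (n_cols,) = struct.unpack(BYTE_ORDER + "I", read_exact(stream, 4))
    col_widths = struct.unpack(f"{BYTE_ORDER}{n_cols}I", read_exact(stream, 4 * n_cols))
    (metadata_length,) = struct.unpack(BYTE_ORDER + "I", read_exact(stream, 4))
    raw = read_exact(stream, metadata_length)
    # metadata_length may be 0, in which case there is no JSON to decode.
    metadata = json.loads(raw.decode("utf-8")) if metadata_length else None
    return file_type, list(col_widths), metadata


def iter_rows(stream, col_widths):
    """Yield each row as a list of raw byte fields until EOF.

    Assumption: a row is its columns back to back, so it is sum(col_widths) bytes long.
    """
    row_length = sum(col_widths)
    while True:
        row = stream.read(row_length)
        if not row:
            return
        if len(row) != row_length:
            raise ValueError("truncated row at end of file")
        fields, offset = [], 0
        for width in col_widths:
            fields.append(row[offset:offset + width])
            offset += width
        yield fields


# Usage: build a tiny two-column file in memory and read it back.
sample_meta = json.dumps({"columns": [{"datatype": "uint"}, {"datatype": "utf8"}]}).encode("utf-8")
sample_file = (
    struct.pack("<BI2II", 1, 2, 2, 3, len(sample_meta))  # file_type, n_cols, widths, metadata_length
    + sample_meta
    + b"\x01\x00abc"
    + b"\x02\x00def"
)
sample_stream = io.BytesIO(sample_file)
sample_type, sample_widths, sample_metadata = read_abt_header(sample_stream)
sample_rows = list(iter_rows(sample_stream, sample_widths))
```

The rows are returned as raw bytes on purpose: how each field is interpreted depends on the column's `datatype` in the metadata.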
The JSON metadata should be an object with the following structure:
| field | required | type | explanation |
|---|---|---|---|
| n_rows | false | number | The number of rows in the file, if known upfront (can be omitted for streaming data) |
| comment | false | string | a free string |
| columns | true | a list of column definition objects | a list of objects describing the columns in more detail (should have n_cols entries) |
| extra | false | JSON object | a JSON object that can freely be specified |
A "column definition object" should have the following format:
| field | required | type | explanation |
|---|---|---|---|
| name | false | string | an optional name of the column |
| comment | false | string | a free string |
| datatype | true | string | An identifier for the datatype of the column |
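To make the two tables above concrete, here is a hypothetical metadata object together with a small check of the required fields. It is only a sketch: the column names, comment text and `extra` content are invented for illustration, and `validate_metadata` is not part of any ASAsense library.

```python
import json


def validate_metadata(metadata, n_cols):
    """Return a list of problems; an empty list means the required structure is present."""
    if not isinstance(metadata, dict):
        return ["metadata must be a JSON object"]
    problems = []
    columns = metadata.get("columns")
    if not isinstance(columns, list):
        problems.append("'columns' is required and must be a list")
        columns = []
    elif len(columns) != n_cols:
        problems.append(f"'columns' has {len(columns)} entries, expected {n_cols}")
    for index, column in enumerate(columns):
        if not isinstance(column, dict):
            problems.append(f"column {index}: must be an object")
        elif not isinstance(column.get("datatype"), str):
            problems.append(f"column {index}: 'datatype' is required and must be a string")
    # Optional fields are only checked when they are present.
    n_rows = metadata.get("n_rows")
    if n_rows is not None and (isinstance(n_rows, bool) or not isinstance(n_rows, (int, float))):
        problems.append("'n_rows' must be a number")
    if "extra" in metadata and not isinstance(metadata["extra"], dict):
        problems.append("'extra' must be a JSON object")
    return problems


# Hypothetical metadata for a file with two columns.
example = json.loads("""
{
  "n_rows": 2,
  "comment": "hypothetical example file",
  "columns": [
    {"name": "sensor_id", "datatype": "uint"},
    {"name": "label", "comment": "fixed-width text", "datatype": "utf8"}
  ],
  "extra": {"site": "demo"}
}
""")
example_problems = validate_metadata(example, n_cols=2)
```

Returning a list of problems instead of raising on the first one makes it easier to report everything wrong with a file at once.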
The datatype identifiers can be anything, but the following should at least be supported by the reader:
| identifier | description |
|---|---|
| utf8 | interpret the data as a UTF-8 string (with fixed length, does not need to be terminated with '\0') |
| int | interpret the data as signed integer |
| uint | interpret the data as unsigned integer |
| float | interpret as floating point data (only possible for 4 or 8 byte fields) |
| bool | boolean value (0 = false, anything else = true) |
Any other (unknown) datatype identifier should be treated by the reader as opaque binary data (bytes).
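A reader might map the identifiers above onto decoding logic roughly as follows. This is a sketch under a stated assumption: the byte order is not given in the format description, so little-endian is assumed here. The name `decode_field` is hypothetical.

```python
import struct

# Assumption: little-endian; the format description does not state a byte order.
INT_BYTE_ORDER = "little"
FLOAT_FORMATS = {4: "<f", 8: "<d"}


def decode_field(raw, datatype):
    """Decode one fixed-width field according to its datatype identifier."""
    if datatype == "utf8":
        # Fixed-length string; any padding is left in place because the
        # format description does not define a padding convention.
        return raw.decode("utf-8")
    if datatype == "int":
        return int.from_bytes(raw, INT_BYTE_ORDER, signed=True)
    if datatype == "uint":
        return int.from_bytes(raw, INT_BYTE_ORDER, signed=False)
    if datatype == "float":
        if len(raw) not in FLOAT_FORMATS:
            raise ValueError("float columns must be 4 or 8 bytes wide")
        return struct.unpack(FLOAT_FORMATS[len(raw)], raw)[0]
    if datatype == "bool":
        # 0 = false, anything else = true: true if any byte is non-zero.
        return any(raw)
    # Unknown identifiers are passed through as opaque bytes.
    return raw
```

For example, `decode_field(b"\xff\xff", "int")` gives `-1` while the same bytes with `"uint"` give `65535`, and a field with a custom identifier is returned unchanged.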