
Package co.cask.cdap.api.data.schema

This package contains beta classes for defining schemas in CDAP. These APIs are experimental and subject to change in future releases.

Schema definition

The schema definition is adapted from Avro Schema, with the following modifications:
  1. Any type is supported as a map key, not just string.
  2. No "name" property for the enum type.
  3. No support for "doc" and "aliases" in record and enum types.
  4. No support for "doc" and "default" in record fields.
  5. The "fixed" type is dropped.

Primitive types

| Schema | Description |
| --- | --- |
| "null" or {"type":"null"} | No value |
| "boolean" or {"type":"boolean"} | Boolean (true or false) value |
| "int" or {"type":"int"} | Signed 32-bit integer |
| "long" or {"type":"long"} | Signed 64-bit integer |
| "float" or {"type":"float"} | Single precision IEEE-754 floating point number |
| "double" or {"type":"double"} | Double precision IEEE-754 floating point number |
| "string" or {"type":"string"} | Unicode character sequence |
| "bytes" or {"type":"bytes"} | Sequence of octets |

Complex types

| Type | Schema | Description |
| --- | --- | --- |
| Enum | {"type":"enum","symbols":["SUCCESS","FAILURE"]} | List of string symbols |
| Array | {"type":"array","items":<schema of array item type>} | List of items of the same type, with the item schema defined in the "items" property |
| Map | {"type":"map","keys":<schema of key type>,"values":<schema of value type>} | Map in which all keys share one type and all values share one type |
| Record | {"type":"record","name":<record name>,"fields":[{"name":<field name>,"type":<schema of the record field>},...]} or "<record name>" | Record containing a list of fields. The "name" property names the record and can be used to define recursive data structures (such as a linked list). The "fields" property defines the name and schema of each field in the record. |
| Union | [<schema of first type>,<schema of second type>,...] | Union of the listed schemas |
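As an illustration of the JSON forms listed above, the following minimal sketch assembles schema strings by plain concatenation. The SchemaJson class and its methods are hypothetical helpers written for this example; they are not part of the CDAP API, and they assume names need no JSON escaping. The example builds a map with non-string keys and a recursive linked-list record whose "next" field refers back to the record by name.

```java
import java.util.StringJoiner;

/** Hypothetical helper for illustration only; not part of the CDAP API. */
final class SchemaJson {

  /** Quoted name: a primitive type such as "int", or a reference to a named record. */
  static String ref(String name) {
    return "\"" + name + "\"";
  }

  static String array(String items) {
    return "{\"type\":\"array\",\"items\":" + items + "}";
  }

  /** Unlike Avro, the key schema is explicit and may be any type. */
  static String map(String keys, String values) {
    return "{\"type\":\"map\",\"keys\":" + keys + ",\"values\":" + values + "}";
  }

  static String union(String... schemas) {
    StringJoiner joiner = new StringJoiner(",", "[", "]");
    for (String schema : schemas) {
      joiner.add(schema);
    }
    return joiner.toString();
  }

  static String field(String name, String schema) {
    return "{\"name\":\"" + name + "\",\"type\":" + schema + "}";
  }

  static String record(String name, String... fields) {
    StringJoiner joiner = new StringJoiner(",", "[", "]");
    for (String f : fields) {
      joiner.add(f);
    }
    return "{\"type\":\"record\",\"name\":\"" + name + "\",\"fields\":" + joiner + "}";
  }

  public static void main(String[] args) {
    // Map keyed by int rather than string.
    System.out.println(map(ref("int"), ref("string")));

    // Linked-list node: "next" is either another Node (by name) or null.
    System.out.println(record("Node",
        field("value", ref("int")),
        field("next", union(ref("Node"), ref("null")))));
  }
}
```

Referring to the record by its quoted name inside its own field list is what makes the structure recursive; the union with "null" marks the end of the list.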

Schema Compatibility

| Writer type | Compatible reader types |
| --- | --- |
| null | null |
| boolean | boolean, string |
| int | int, long, float, double, string |
| long | long, float, double, string |
| float | float, double, string |
| double | double, string |
| string | string |
| bytes | bytes |
| array | array with a compatible items schema |
| map | map with compatible keys and values schemas |
| record | record with compatible schemas over common fields, matched by field name. Fields in the writer record may be missing from the reader schema, and vice versa. |
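The primitive and record rows of the table above can be sketched as a lookup. This is a minimal illustration, not the CDAP implementation: the PrimitiveCompatibility class, its Type enum, and both methods are hypothetical names, and records are simplified to flat maps from field name to primitive type.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the compatibility table; not part of the CDAP API. */
final class PrimitiveCompatibility {

  /** Primitive type names from the table. */
  enum Type { NULL, BOOLEAN, INT, LONG, FLOAT, DOUBLE, STRING, BYTES }

  // Writer type -> reader types that can read it, copied row by row from the table.
  private static final Map<Type, Set<Type>> READERS = new EnumMap<>(Type.class);

  static {
    READERS.put(Type.NULL, EnumSet.of(Type.NULL));
    READERS.put(Type.BOOLEAN, EnumSet.of(Type.BOOLEAN, Type.STRING));
    READERS.put(Type.INT, EnumSet.of(Type.INT, Type.LONG, Type.FLOAT, Type.DOUBLE, Type.STRING));
    READERS.put(Type.LONG, EnumSet.of(Type.LONG, Type.FLOAT, Type.DOUBLE, Type.STRING));
    READERS.put(Type.FLOAT, EnumSet.of(Type.FLOAT, Type.DOUBLE, Type.STRING));
    READERS.put(Type.DOUBLE, EnumSet.of(Type.DOUBLE, Type.STRING));
    READERS.put(Type.STRING, EnumSet.of(Type.STRING));
    READERS.put(Type.BYTES, EnumSet.of(Type.BYTES));
  }

  /** True if data written as {@code writer} can be read as {@code reader}. */
  static boolean isCompatible(Type writer, Type reader) {
    return READERS.get(writer).contains(reader);
  }

  /**
   * Record rule, simplified to primitive fields: only fields present in both
   * records are checked; fields present on one side only are ignored.
   */
  static boolean recordsCompatible(Map<String, Type> writerFields, Map<String, Type> readerFields) {
    for (Map.Entry<String, Type> w : writerFields.entrySet()) {
      Type readerType = readerFields.get(w.getKey());
      if (readerType != null && !isCompatible(w.getValue(), readerType)) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(isCompatible(Type.INT, Type.DOUBLE)); // true
    System.out.println(isCompatible(Type.LONG, Type.INT));   // false

    Map<String, Type> writer = Map.of("id", Type.INT, "name", Type.STRING);
    Map<String, Type> reader = Map.of("id", Type.LONG, "email", Type.STRING);
    System.out.println(recordsCompatible(writer, reader));   // true: only "id" is common
  }
}
```

Note that the relation is directional: an int writer is compatible with a long reader, but a long writer is not compatible with an int reader.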

For union types

| Is writer union? | Is reader union? | Compatibility requirement |
| --- | --- | --- |
| yes | yes | At least one pair of schemas, one from the writer union and one from the reader union, must be compatible. |
| yes | no | At least one schema in the writer union must be compatible with the reader schema. |
| no | yes | The writer schema must be compatible with at least one schema in the reader union. |
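If a non-union schema is treated as a union with a single member, the three rows above reduce to one rule: some writer member must be compatible with some reader member. The sketch below illustrates that reading. The UnionCompatibility class is hypothetical and not part of the CDAP API, and DEMO_BASE is a stand-in for the real non-union compatibility check, covering only identical type names and int to long.

```java
import java.util.List;
import java.util.function.BiPredicate;

/** Hypothetical sketch of the union rules; not part of the CDAP API. */
final class UnionCompatibility {

  /** Demo stand-in for the non-union check: identical names, or int written and long read. */
  static final BiPredicate<String, String> DEMO_BASE =
      (w, r) -> w.equals(r) || (w.equals("int") && r.equals("long"));

  /**
   * Each side is given as the list of its member schemas; a non-union schema is
   * passed as a single-element list. Returns true if at least one
   * (writer member, reader member) pair satisfies the base check.
   */
  static <S> boolean isCompatible(List<S> writer, List<S> reader, BiPredicate<S, S> base) {
    for (S w : writer) {
      for (S r : reader) {
        if (base.test(w, r)) {
          return true;
        }
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // Writer is a union, reader is not: "int" can be read as "long".
    System.out.println(isCompatible(List.of("int", "string"), List.of("long"), DEMO_BASE)); // true

    // Writer is not a union, reader is: "bytes" matches neither member.
    System.out.println(isCompatible(List.of("bytes"), List.of("int", "string"), DEMO_BASE)); // false
  }
}
```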

Copyright © 2018 Cask Data, Inc. All rights reserved.