My data is a rich text format, stored as nested JSON arrays. Text tokens store the plaintext of the string and annotations describing the formatting. I want to map the specific structure of these nested JSON arrays to a rich Kotlin class hierarchy at decode time.

Here's the TypeScript type describing this text encoding:

// Text string is an array of tokens
type Text = Array<TextToken>
// Each token is a 2-element tuple. The first element is the plaintext.
// The second element is an array of annotations that format the text.
type TextToken = [string, Array<Annotation>]
// My question is about how to serialize/deserialize the Annotation type
// to a sealed class hierarchy.
//
// Annotations are an array where the first element is always a type discriminator string
// Each annotation type may have more elements, depending on the annotation type.
type Annotation =
 | ["b"] // Text with this annotation is bold
 | ["i"] // Text with this annotation is italic
 | ["@", number] // User mention
 | ["r", { timestamp: string, reminder: string }] // Reminder

I have defined some Kotlin classes to represent the same thing using a sealed class. This is the output format I want to have after deserializing the JSON:

// As JSON example: [["hello ", []], ["Jake", [["b"], ["@", 1245]]]]
data class TextValue(val tokens: List<TextToken>)

// As JSON example: ["hello ", []]
// As JSON example: ["Jake", [["b"], ["@", 1245]]]
data class TextToken(val plaintext: String, val annotations: List<Annotation>)

sealed class Annotation {
  // As JSON example: ["b"]
  @SerialName("b")
  object Bold : Annotation()

  // As JSON example: ["i"]
  @SerialName("i")
  object Italic : Annotation()

  // As JSON example: ["@", 452534]
  @SerialName("@")
  data class Mention(val userId: Int) : Annotation()

  // As JSON example: ["r", { "timestamp": "12:45pm", "reminder": "Walk dog" }]
  @SerialName("r")
  data class Reminder(val value: ReminderValue) : Annotation()
}

How do I define my serializers? I tried to define a serializer using JsonTransformingSerializer, but I get a null pointer exception when I attempt to wrap the default serializer for one of my classes:

@Serializable(with = TextValueSerializer::class)
data class TextValue(val tokens: List<TextToken>)

object TextValueSerializer : JsonTransformingSerializer<TextValue>(TextValue.serializer()) {
    override fun transformDeserialize(element: JsonElement): JsonElement {
        return JsonObject(mapOf("tokens" to element))
    }

    override fun transformSerialize(element: JsonElement): JsonElement {
        return (element as JsonObject)["tokens"]!!
    }
}
Caused by: java.lang.NullPointerException: Parameter specified as non-null is null: method kotlinx.serialization.json.JsonTransformingSerializer.<init>, parameter tSerializer
    at kotlinx.serialization.json.JsonTransformingSerializer.<init>(JsonTransformingSerializer.kt)
    at example.TextValueSerializer.<init>(TextValue.kt:17)

1 Answer

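The error seems to occur because TextValueSerializer refers to itself. TextValue is annotated with @Serializable(with = TextValueSerializer::class), so TextValue.serializer() returns TextValueSerializer, which is still being constructed at that point and is therefore null when it is passed to the JsonTransformingSerializer constructor.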
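
A minimal sketch of one way to avoid the self-reference, assuming TextValue can keep its plugin-generated serializer: annotate the class with a plain @Serializable and pass the transforming serializer explicitly at the call site. The name TextValueArraySerializer is hypothetical, and tokens is typed as List<JsonElement> only to keep the sketch self-contained.

```kotlin
import kotlinx.serialization.Serializable
import kotlinx.serialization.json.Json
import kotlinx.serialization.json.JsonElement
import kotlinx.serialization.json.JsonObject
import kotlinx.serialization.json.JsonTransformingSerializer
import kotlinx.serialization.json.jsonObject

// Plain @Serializable: TextValue.serializer() is the generated serializer,
// not the transforming serializer below, so there is no cycle.
// Assumption: tokens are left as raw JSON elements to keep the sketch short.
@Serializable
data class TextValue(val tokens: List<JsonElement>)

// Hypothetical name. Wraps the generated serializer and converts between
// the bare array form and the {"tokens": [...]} form it expects.
object TextValueArraySerializer :
    JsonTransformingSerializer<TextValue>(TextValue.serializer()) {

    override fun transformDeserialize(element: JsonElement): JsonElement =
        JsonObject(mapOf("tokens" to element))

    override fun transformSerialize(element: JsonElement): JsonElement =
        element.jsonObject.getValue("tokens")
}

fun main() {
    val json = """[["hello ", []], ["Jake", [["b"], ["@", 1245]]]]"""

    // The serializer is passed explicitly instead of via @Serializable(with = ...).
    val value = Json.decodeFromString(TextValueArraySerializer, json)
    println(value.tokens.size)
    println(Json.encodeToString(TextValueArraySerializer, value))
}
```

The trade-off is that every call site has to name the serializer, which is one reason the hand-written KSerializer approach may be more convenient.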

Because these nested arrays don't match the key/value object structure that the generated serializers expect, it's harder to get this mapping automatically.

For your current implementation, here's what you'll need, starting from the bottom up:

  1. Annotation

    Create a custom serializer that converts the JsonArray representation to its Annotation representation. This is done by simply mapping the indices of the JsonArray to the corresponding sealed class representation. Since the first element is always the discriminator, we can use it to decide which type to map to.

    Where possible, we can defer to the auto-generated serializers.

    []          -> Annotation.None
    ["b"]       -> Annotation.Bold
    ["@", 1245] -> Annotation.Mention
    ...
    

    To do this you can create a new serializer and attach it to the Annotation class (@Serializable(with = AnnotationSerializer::class)).

    import kotlinx.serialization.*
    import kotlinx.serialization.descriptors.*
    import kotlinx.serialization.encoding.*
    import kotlinx.serialization.json.*

    object AnnotationSerializer : KSerializer<Annotation> {
        override val descriptor: SerialDescriptor = buildClassSerialDescriptor("Annotation") {}

        override fun serialize(encoder: Encoder, value: Annotation) {
            // This serializer only supports the JSON format
            val jsonEncoder = encoder as JsonEncoder

            // Encode the Annotation as a JSON array: discriminator first,
            // followed by the payload (if any)
            jsonEncoder.encodeJsonElement(buildJsonArray {
                when (value) {
                    is Annotation.None -> {}
                    is Annotation.Bold -> { add("b") }
                    is Annotation.Italic -> { add("i") }
                    is Annotation.Mention -> {
                        add("@")
                        add(value.userId)
                    }
                    is Annotation.Reminder -> {
                        add("r")
                        add(jsonEncoder.json.encodeToJsonElement(ReminderValue.serializer(), value.value))
                    }
                }
            })
        }

        override fun deserialize(decoder: Decoder): Annotation {
            val jsonDecoder = decoder as JsonDecoder
            val list = jsonDecoder.decodeJsonElement().jsonArray

            if (list.isEmpty()) {
                return Annotation.None
            }

            // The first element is the discriminator
            return when (list[0].jsonPrimitive.content) {
                "b" -> Annotation.Bold
                "i" -> Annotation.Italic
                "@" -> Annotation.Mention(list[1].jsonPrimitive.int)
                "r" -> Annotation.Reminder(jsonDecoder.json.decodeFromJsonElement(ReminderValue.serializer(), list[1].jsonObject))
                else -> throw SerializationException("Invalid annotation discriminator")
            }
        }
    }

    @Serializable(with = AnnotationSerializer::class)
    sealed class Annotation {
        // ... None, Bold, Italic, Mention and Reminder are declared here
    }
    
  2. TextToken

    TextToken follows the same strategy: read the plaintext from index 0, then build the annotations from index 1. As above, annotate the TextToken class so that it uses the following serializer (@Serializable(with = TextTokenSerializer::class)):

    object TextTokenSerializer : KSerializer<TextToken> {
        override val descriptor: SerialDescriptor = buildClassSerialDescriptor("TextToken") {}

        override fun serialize(encoder: Encoder, value: TextToken) {
            val jsonEncoder = encoder as JsonEncoder
            jsonEncoder.encodeJsonElement(buildJsonArray {
                // Index 0: the plaintext
                add(value.plaintext)
                // Index 1: the annotations, each encoded by AnnotationSerializer
                add(buildJsonArray {
                    value.annotations.forEach {
                        add(jsonEncoder.json.encodeToJsonElement(AnnotationSerializer, it))
                    }
                })
            })
        }

        override fun deserialize(decoder: Decoder): TextToken {
            val jsonDecoder = decoder as JsonDecoder
            val element = jsonDecoder.decodeJsonElement().jsonArray

            // Index 0: the plaintext
            val plaintext = element[0].jsonPrimitive.content

            // Index 1: the annotations
            val annotations = element[1].jsonArray.map {
                jsonDecoder.json.decodeFromJsonElement(AnnotationSerializer, it)
            }

            return TextToken(plaintext, annotations)
        }
    }
    

    If you control the JSON format, it might be better to emit each token as an object instead:

    { "plaintext": "Jake", "annotations": [["b"], ["@", 1245]] }

    That shape maps directly onto the TextToken data class and removes the need for this custom serializer.

  3. TextValue

    The final piece of the puzzle is the TextValue object, which effectively wraps the list of TextTokens. It might be simpler to replace it with a type alias:

    typealias TextValue = List<TextToken>
    

    In the current model, you can use a serializer that parses the JsonArray into the List<TextToken> and then wraps that list in the TextValue object.

    import kotlinx.serialization.builtins.ListSerializer

    object TextValueSerializer : KSerializer<TextValue> {
        override val descriptor: SerialDescriptor = buildClassSerialDescriptor("TextValue") {}

        // The tokens are written as a bare JSON array, so delegate to the list serializer
        private val tokensSerializer = ListSerializer(TextToken.serializer())

        override fun serialize(encoder: Encoder, value: TextValue) {
            encoder.encodeSerializableValue(tokensSerializer, value.tokens)
        }

        override fun deserialize(decoder: Decoder): TextValue {
            // Decode the array into a List<TextToken>, then wrap it
            return TextValue(decoder.decodeSerializableValue(tokensSerializer))
        }
    }
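
To illustrate the type alias option from step 3, here is a minimal, self-contained sketch. It assumes a simplified TextToken that keeps its annotations as a raw JsonArray (a shortcut for brevity, not part of the model above). With the serializer attached to TextToken, the built-in list serializer should handle TextValue without a separate TextValueSerializer.

```kotlin
import kotlinx.serialization.KSerializer
import kotlinx.serialization.Serializable
import kotlinx.serialization.decodeFromString
import kotlinx.serialization.descriptors.SerialDescriptor
import kotlinx.serialization.descriptors.buildClassSerialDescriptor
import kotlinx.serialization.encodeToString
import kotlinx.serialization.encoding.Decoder
import kotlinx.serialization.encoding.Encoder
import kotlinx.serialization.json.*

// Assumption: annotations stay as raw JSON here to keep the sketch short.
@Serializable(with = TextTokenSerializer::class)
data class TextToken(val plaintext: String, val annotations: JsonArray)

object TextTokenSerializer : KSerializer<TextToken> {
    override val descriptor: SerialDescriptor = buildClassSerialDescriptor("TextToken")

    override fun serialize(encoder: Encoder, value: TextToken) {
        // Write the token as [plaintext, annotations]
        (encoder as JsonEncoder).encodeJsonElement(buildJsonArray {
            add(value.plaintext)
            add(value.annotations)
        })
    }

    override fun deserialize(decoder: Decoder): TextToken {
        // Read the token back from [plaintext, annotations]
        val array = (decoder as JsonDecoder).decodeJsonElement().jsonArray
        return TextToken(array[0].jsonPrimitive.content, array[1].jsonArray)
    }
}

// No wrapper class: a text value is just a list of tokens.
typealias TextValue = List<TextToken>

fun main() {
    val json = """[["hello ", []], ["Jake", [["b"], ["@", 1245]]]]"""

    // The serializer for List<TextToken> is resolved from the reified type.
    val text: TextValue = Json.decodeFromString(json)
    println(text.map { it.plaintext })
    println(Json.encodeToString(text))
}
```

The cost of the alias is that TextValue is no longer a distinct type, so it cannot carry its own methods or be told apart from any other List<TextToken>.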
    
1 Comment

Thanks, very detailed! I think I got blocked by val descriptor -- I couldn't figure out how to make a perfect one. Is it just used for type uniqueness in this case?
