Media generation input format for a large vision model.
image
object (Image)
The image bytes or Cloud Storage URI to make the prediction on. Required for editing; not needed for generation. The presence of this field determines whether the call is treated as editing or generation.
prompt
string
The text prompt for generating the images. This is required for both editing and generation.
mask
object (Mask)
The masked region is edited based on the text content provided. The mask can be either an image or a polygon. It should not be provided without an image. Optional; used only when editing images.
referenceImages[]
object (ReferenceImage)
The reference images to be used for editing and customization capabilities. Imagen 3 Capability adds support for multiple reference images, each of which can be a mask, control, style, or subject image. Depending on the reference type, the reference_config field will be populated with the corresponding config.
| JSON representation |
|---|
{ "image": { object ( |
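The rules stated in the field descriptions above (prompt always required, mask only alongside an image, image presence deciding editing versus generation) can be sketched as a small helper. This is a minimal illustration, not the service's own validation: `build_instance` and `is_editing` are hypothetical names, and only the JSON keys `prompt`, `image`, and `mask` come from this page.

```python
from typing import Any, Dict, Optional


def build_instance(prompt: str,
                   image: Optional[Dict[str, Any]] = None,
                   mask: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Assemble an instance dict from the documented top-level fields.

    Hypothetical helper: the checks mirror the prose above, not real
    server-side behavior.
    """
    if not prompt:
        raise ValueError("prompt is required for both editing and generation")
    if mask is not None and image is None:
        raise ValueError("mask should not be provided without an image")

    instance: Dict[str, Any] = {"prompt": prompt}
    if image is not None:
        instance["image"] = image
    if mask is not None:
        instance["mask"] = mask
    return instance


def is_editing(instance: Dict[str, Any]) -> bool:
    """Treat the call as editing when an image is present, else generation."""
    return "image" in instance
```

For instance, `build_instance("a red bicycle")` yields a generation-style instance, while passing an `image` dict as well yields an editing-style one.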
Image
mimeType
string
The MIME type of the content of the image. Only the following MIME types are supported: image/jpeg and image/png.
data
Union type
data can be only one of the following:
bytesBase64Encoded
string
Base64 encoded bytes string representing the image.
gcsUri
string
| JSON representation |
|---|
{ "mimeType": string, // data "bytesBase64Encoded": string, "gcsUri": string // Union type } |
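Because `data` is a union, an Image carries either `bytesBase64Encoded` or `gcsUri`, never both. The sketch below shows one way to build such an object in Python. `build_image` is a hypothetical helper, and requiring `mimeType` on every call is an assumption of this sketch rather than something this page states.

```python
import base64
from typing import Dict, Optional

# MIME types listed as supported in the Image description.
SUPPORTED_MIME_TYPES = {"image/jpeg", "image/png"}


def build_image(mime_type: str,
                raw_bytes: Optional[bytes] = None,
                gcs_uri: Optional[str] = None) -> Dict[str, str]:
    """Build an Image dict with exactly one member of the `data` union set."""
    if mime_type not in SUPPORTED_MIME_TYPES:
        raise ValueError(f"unsupported MIME type: {mime_type}")
    # Exactly one of the two union members must be supplied.
    if (raw_bytes is None) == (gcs_uri is None):
        raise ValueError("set exactly one of raw_bytes or gcs_uri")

    image = {"mimeType": mime_type}
    if raw_bytes is not None:
        # The field holds a Base64 string, so encode and decode to text.
        image["bytesBase64Encoded"] = base64.b64encode(raw_bytes).decode("ascii")
    else:
        image["gcsUri"] = gcs_uri
    return image
```

The bucket path in any `gcs_uri` you pass is your own; no particular URI layout is implied here.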
Mask
data
Union type
| JSON representation |
|---|
{ // data "image": { object ( |
BoundingPolyList
polygons[]
object (BoundingPoly)
| JSON representation |
|---|
{
"polygons": [
{
object ( |
ReferenceImage
A ReferenceImage is an image that is used to provide additional context for the image generation or editing.
referenceImage
object (Image)
The actual image data of the reference image.
referenceId
integer
The ID of the reference image. It must be unique within the request.
referenceType
enum (ReferenceType)
The type of the reference image.
reference_config
Union type
reference_config can be only one of the following:
maskImageConfig
object (MaskImageConfig)
A config for a mask image.
controlImageConfig
object (ControlImageConfig)
A config for a control image.
styleImageConfig
object (StyleImageConfig)
A config for a style image.
subjectImageConfig
object (SubjectImageConfig)
A config for a subject image.
| JSON representation |
|---|
{ "referenceImage": { object ( |
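As a rough illustration of the two constraints described for ReferenceImage (one `reference_config` member per image, and `referenceId` unique within the request), the following sketch assembles reference images as plain dicts. The helper names are hypothetical, and `referenceType` is deliberately left out because its enum values are not listed in this section.

```python
from typing import Any, Dict, List

# The four members of the reference_config union named above.
CONFIG_FIELDS = (
    "maskImageConfig",
    "controlImageConfig",
    "styleImageConfig",
    "subjectImageConfig",
)


def build_reference_image(reference_id: int,
                          image: Dict[str, Any],
                          config_field: str,
                          config: Dict[str, Any]) -> Dict[str, Any]:
    """Build a ReferenceImage dict carrying exactly one config member."""
    if config_field not in CONFIG_FIELDS:
        raise ValueError(f"unknown reference_config member: {config_field}")
    return {
        "referenceImage": image,
        "referenceId": reference_id,
        config_field: config,
    }


def check_unique_ids(reference_images: List[Dict[str, Any]]) -> None:
    """Raise if two reference images in one request share a referenceId."""
    ids = [ref["referenceId"] for ref in reference_images]
    if len(ids) != len(set(ids)):
        raise ValueError("referenceId must be unique within the request")
```

Taking the config member name as a single argument makes it impossible to set two union members on the same reference image by accident.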
MaskImageConfig
Config for masked image editing using Imagen 3 Capability.
maskMode
enum (MaskMode)
Mode used to generate the mask if a mask is not provided.
dilation
number
Dilation to be used with this Mask. This value is used to dilate the mask before applying the edit mode.
maskClasses[]
integer
The segmentation classes which are used in the MASK_MODE_SEMANTIC mode.
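A MaskImageConfig for semantic masking might be assembled as sketched below. `build_mask_image_config` is a hypothetical helper, and the class IDs in the usage line are placeholders rather than real segmentation classes. Rejecting `maskClasses` outside `MASK_MODE_SEMANTIC` is a stricter reading than this page states, so treat that check as an assumption.

```python
from typing import Any, Dict, List, Optional


def build_mask_image_config(mask_mode: str,
                            dilation: Optional[float] = None,
                            mask_classes: Optional[List[int]] = None) -> Dict[str, Any]:
    """Build a MaskImageConfig dict, including only the fields that were set."""
    # Assumption: maskClasses is only meaningful in the semantic mode.
    if mask_classes and mask_mode != "MASK_MODE_SEMANTIC":
        raise ValueError("maskClasses is used only with MASK_MODE_SEMANTIC")

    config: Dict[str, Any] = {"maskMode": mask_mode}
    if dilation is not None:
        config["dilation"] = dilation
    if mask_classes:
        config["maskClasses"] = list(mask_classes)
    return config


# Placeholder class IDs and dilation value, for illustration only.
semantic_config = build_mask_image_config("MASK_MODE_SEMANTIC",
                                          dilation=0.5,
                                          mask_classes=[1, 2])
```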
| JSON representation |
|---|
{
"maskMode": enum ( |
ControlImageConfig
Config for control image used for editing.
controlType
enum (ControlType)
Type of control image.
enableControlImageComputation
boolean
Whether to compute the control image for the request.
superpixelRegionSize
integer
Region size of the superpixel control image.
superpixelRuler
number
Ruler of the superpixel control image.
| JSON representation |
|---|
{
"controlType": enum ( |
StyleImageConfig
Config for style image used for editing.
styleDescription
string
Description of the style image.
| JSON representation |
|---|
{ "styleDescription": string } |
SubjectImageConfig
Config for subject image used for editing.
subjectDescription
string
Description of the subject image.
subjectType
enum (SubjectType)
Type of subject image.
| JSON representation |
|---|
{
"subjectDescription": string,
"subjectType": enum ( |