pix2pixhd

Pix2Pix HD Implementation.

See: “High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs” [1]

Global Generator + Local Enhancer

[1] High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs: https://arxiv.org/abs/1711.11585

Classes

GlobalGenerator Global Generator from pix2pixHD paper.
LocalEnhancer Local Enhancer module of the Pix2PixHD architecture.
ResNetBlock ResNet Blocks.
class ashpy.models.convolutional.pix2pixhd.GlobalGenerator(input_res=512, min_res=64, initial_filters=64, filters_cap=512, channels=3, normalization_layer=<class 'ashpy.layers.instance_normalization.InstanceNormalization'>, non_linearity=<class 'tensorflow.python.keras.layers.advanced_activations.ReLU'>, num_resnet_blocks=9, kernel_size_resnet=3, kernel_size_front_back=7, num_internal_resnet_blocks=2)

Bases: ashpy.models.convolutional.interfaces.Conv2DInterface

Global Generator from pix2pixHD paper.

  • G1^F: Convolutional frontend (downsampling)
  • G1^R: ResNet Block
  • G1^B: Convolutional backend (upsampling)
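The three stages can be traced at the level of feature-map shapes. The following is a minimal sketch in plain Python, not the ashpy implementation. It assumes that the frontend halves the resolution at each step until min_res is reached, that the filter count doubles at each step up to filters_cap, and that the backend mirrors the frontend. The helper name trace_global_generator is hypothetical.

```python
def trace_global_generator(input_res=512, min_res=64, initial_filters=64,
                           filters_cap=512, channels=3, num_resnet_blocks=9):
    """Return a list of (stage, resolution, filters) tuples.

    Assumption: each downsampling step halves the resolution and doubles
    the filters (capped at filters_cap); the backend mirrors the frontend.
    """
    trace = []
    res, filters = input_res, initial_filters
    trace.append(("G1^F", res, filters))  # first convolution, no downsampling

    # G1^F: downsample until min_res is reached.
    frontend = [(res, filters)]
    while res > min_res:
        res //= 2
        filters = min(filters * 2, filters_cap)
        frontend.append((res, filters))
        trace.append(("G1^F", res, filters))

    # G1^R: residual blocks keep resolution and filters unchanged.
    for _ in range(num_resnet_blocks):
        trace.append(("G1^R", res, filters))

    # G1^B: upsample back to the input resolution, mirroring the frontend.
    for up_res, up_filters in reversed(frontend[:-1]):
        trace.append(("G1^B", up_res, up_filters))

    # Final convolution maps to the requested number of output channels.
    trace.append(("G1^B", input_res, channels))
    return trace


for stage, res, filters in trace_global_generator():
    print(f"{stage}: {res}x{res}x{filters}")
```

Under these assumptions the default arguments give three downsampling steps (512 to 64), nine residual stages at 64x64 with 512 filters, and three upsampling steps back to 512x512.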
__init__(input_res=512, min_res=64, initial_filters=64, filters_cap=512, channels=3, normalization_layer=<class 'ashpy.layers.instance_normalization.InstanceNormalization'>, non_linearity=<class 'tensorflow.python.keras.layers.advanced_activations.ReLU'>, num_resnet_blocks=9, kernel_size_resnet=3, kernel_size_front_back=7, num_internal_resnet_blocks=2)

Global Generator from Pix2PixHD.

Parameters:
  • input_res (int) – input resolution.
  • min_res (int) – Minimum resolution reached by the downsampling.
  • initial_filters (int) – number of initial filters.
  • filters_cap (int) – maximum number of filters.
  • channels (int) – output channels.
  • normalization_layer (tf.keras.layers.Layer) – normalization layer used by the global generator; can be Instance Norm, Layer Norm, or Batch Norm.
  • non_linearity (tf.keras.layers.Layer) – non-linearity used in the global generator.
  • num_resnet_blocks (int) – number of ResNet blocks.
  • kernel_size_resnet (int) – kernel size used in the convolutional layers of the ResNet blocks.
  • kernel_size_front_back (int) – kernel size used by the convolutional frontend and backend.
  • num_internal_resnet_blocks (int) – number of blocks used by internal resnet.
call(inputs, training=True)

Call the Pix2Pix HD model.

Parameters:
  • inputs – input tensor(s).
  • training – if True, run in training phase.
Returns:

Tuple – Generated images.

class ashpy.models.convolutional.pix2pixhd.LocalEnhancer(input_res=512, min_res=64, initial_filters=64, filters_cap=512, channels=3, normalization_layer=<class 'ashpy.layers.instance_normalization.InstanceNormalization'>, non_linearity=<class 'tensorflow.python.keras.layers.advanced_activations.ReLU'>, num_resnet_blocks_global=9, num_resnet_blocks_local=3, kernel_size_resnet=3, kernel_size_front_back=7, num_internal_resnet_blocks=2)

Bases: tensorflow.python.keras.engine.training.Model

Local Enhancer module of the Pix2PixHD architecture.

Example

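import tensorflow as tf
from ashpy.models.convolutional.pix2pixhd import LocalEnhancer

# instantiate the model with its default arguments
model = LocalEnhancer()

# call the model on a batch containing one 512x512 RGB image
inputs = tf.ones((1, 512, 512, 3))
output = model(inputs)

# the output shape is the same as the input shape
print(output.shape)
# (1, 512, 512, 3)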
__init__(input_res=512, min_res=64, initial_filters=64, filters_cap=512, channels=3, normalization_layer=<class 'ashpy.layers.instance_normalization.InstanceNormalization'>, non_linearity=<class 'tensorflow.python.keras.layers.advanced_activations.ReLU'>, num_resnet_blocks_global=9, num_resnet_blocks_local=3, kernel_size_resnet=3, kernel_size_front_back=7, num_internal_resnet_blocks=2)

Build the LocalEnhancer module of the Pix2PixHD architecture.

See High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs [2] for more details.

Parameters:
  • input_res (int) – input resolution.
  • min_res (int) – minimum resolution reached by the global generator.
  • initial_filters (int) – number of initial filters.
  • filters_cap (int) – maximum number of filters.
  • channels (int) – number of channels.
  • normalization_layer (tf.keras.layers.Layer) – normalization layer (e.g. Instance Normalization, BatchNormalization, or LayerNormalization).
  • non_linearity (tf.keras.layers.Layer) – non-linearity used in Pix2Pix HD.
  • num_resnet_blocks_global (int) – number of residual blocks used in the global generator.
  • num_resnet_blocks_local (int) – number of residual blocks used in the local generator.
  • kernel_size_resnet (int) – kernel size used in resnets.
  • kernel_size_front_back (int) – kernel size used for the front and back convolution.
  • num_internal_resnet_blocks (int) – number of internal blocks of the resnet.
[2] High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs: https://arxiv.org/abs/1711.11585
call(inputs, training=False)

Call the LocalEnhancer model.

Parameters:
  • inputs (tf.Tensor) – Input Tensors.
  • training (bool) – Whether it is training phase or not.
Returns:

(tf.Tensor) – Image of size (input_res, input_res, channels), as specified in the init call.
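
The output-size contract is consistent with how the paper combines the two generators: the global generator works on a 2x-downsampled copy of the input, and the element-wise sum of its last backend feature map and the local frontend output feeds the local residual blocks. Below is a minimal shape-level sketch in plain Python. It assumes square inputs with an even input_res and a single stride-2 step in the local frontend. The helper local_enhancer_shapes and its filters argument are hypothetical and not part of ashpy.

```python
def local_enhancer_shapes(input_res=512, in_channels=3, filters=64, channels=3):
    """Trace (height, width, depth) through a LocalEnhancer-style forward pass."""
    full = (input_res, input_res, in_channels)

    # Global branch: runs on a 2x-downsampled copy of the input and exposes
    # its last backend feature map (before the final output convolution).
    half_res = input_res // 2
    global_features = (half_res, half_res, filters)

    # Local frontend: one stride-2 step brings the full-resolution input to
    # the same spatial size and depth as the global features.
    local_features = (half_res, half_res, filters)

    # Element-wise sum: both operands share one shape, which is preserved.
    summed = local_features

    # Local residual blocks preserve the shape; the local backend upsamples
    # by 2 and maps to the requested number of output channels.
    output = (summed[0] * 2, summed[1] * 2, channels)

    return {
        "input": full,
        "global_features": global_features,
        "local_features": local_features,
        "output": output,
    }


shapes = local_enhancer_shapes()
print(shapes["output"])  # (512, 512, 3)
```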

class ashpy.models.convolutional.pix2pixhd.ResNetBlock(filters, normalization_layer=<class 'ashpy.layers.instance_normalization.InstanceNormalization'>, non_linearity=<class 'tensorflow.python.keras.layers.advanced_activations.ReLU'>, kernel_size=3, num_blocks=2)

Bases: tensorflow.python.keras.engine.training.Model

ResNet Blocks.

The number of input filters is the same as the number of output filters.

__init__(filters, normalization_layer=<class 'ashpy.layers.instance_normalization.InstanceNormalization'>, non_linearity=<class 'tensorflow.python.keras.layers.advanced_activations.ReLU'>, kernel_size=3, num_blocks=2)

Build the ResNet block, composed of num_blocks blocks.

Each block is composed of:
  • Conv2D with strides 1 and padding “same”
  • Normalization Layer
  • Non Linearity

The final result is the sum of the output of the stacked blocks and the input.

Parameters:
  • filters (int) – initial filters (same as the output filters).
  • normalization_layer (tf.keras.layers.Layer) – layer of normalization used by the residual block.
  • non_linearity (tf.keras.layers.Layer) – non linearity used in the resnet block.
  • kernel_size (int) – kernel size used in the resnet block.
  • num_blocks (int) – number of blocks; each block is composed of a convolution, a normalization layer, and a non-linearity.
call(inputs, training=False)

Forward pass.

Parameters:
  • inputs – input tensor.
  • training – whether it is training phase or not.
Returns:

A Tensor of the same shape as the inputs: the input passed through num_blocks blocks, with the original input added to the result.
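
The residual computation described for ResNetBlock can be illustrated without TensorFlow. The sketch below is a plain-Python toy, not the ashpy code: scale is a hypothetical stand-in for a shape-preserving convolution, the normalization layer is omitted for brevity, and the inputs are a flat list of floats rather than a tensor.

```python
def resnet_block(inputs, layers):
    """Apply layers in sequence, then add the original inputs (skip connection)."""
    outputs = inputs
    for layer in layers:
        outputs = [layer(value) for value in outputs]
    # Element-wise sum keeps the output the same length as the input.
    return [x + y for x, y in zip(inputs, outputs)]


def relu(value):
    return max(0.0, value)


def scale(value):
    # Hypothetical stand-in for a shape-preserving convolution.
    return 0.5 * value


# Two "blocks", each a shape-preserving transform followed by a non-linearity.
num_blocks = 2
layers = [scale, relu] * num_blocks
result = resnet_block([-2.0, 0.0, 4.0], layers)
print(result)  # [-2.0, 0.0, 5.0]
```

Because every layer preserves the shape of its input, the skip connection can add the input and the block output element by element, which is why the number of input filters must equal the number of output filters.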