
Use GAN to calculate style loss #441

Open · wants to merge 11 commits into master

Conversation

citymonkeymao

Learn from multiple styles with GAN

When using hundreds of pictures as style images, a discriminator can be used to calculate the style loss. The discriminator takes a Gram matrix as input and is trained to tell whether the generated image belongs to the target style.
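The Gram-matrix input described above can be sketched as follows. This is a minimal NumPy illustration, not the actual Torch code in this PR; the normalization constant is an assumption:

```python
import numpy as np

def gram_matrix(features):
    # features: (channels, height, width) activation map from a style layer.
    # G[i, j] is the correlation between channels i and j; this matrix is
    # the statistic the discriminator receives as input.
    c, h, w = features.shape
    f = features.reshape(c, h * w)
    return f @ f.T / (c * h * w)  # normalization is an assumption

# toy feature map with 4 channels
feats = np.random.rand(4, 8, 8)
g = gram_matrix(feats)
print(g.shape)              # (4, 4)
print(np.allclose(g, g.T))  # True: Gram matrices are symmetric
```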

The traditional way of calculating style loss:

The new way of calculating style loss:
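The contrast between the two losses can be sketched in code. This is a hedged NumPy sketch: `toy_disc` is a hypothetical logistic stand-in for the trained discriminator, not the network in this PR.

```python
import numpy as np

def traditional_style_loss(gram_gen, gram_style):
    # Gatys et al.: squared distance to a single target Gram matrix
    return np.sum((gram_gen - gram_style) ** 2)

def gan_style_loss(gram_gen, discriminator):
    # GAN variant: the discriminator estimates p(Gram matrix comes from
    # the target style); the generated image is penalized when p is low
    p = discriminator(gram_gen)
    return -np.log(p + 1e-8)

def toy_disc(gram):
    # hypothetical logistic discriminator on the flattened Gram matrix
    x = gram.ravel()
    w = np.ones_like(x) / x.size
    return 1.0 / (1.0 + np.exp(-x @ w))

g_gen = np.eye(3)
g_style = np.ones((3, 3))
print(traditional_style_loss(g_gen, g_style))  # 6.0
print(gan_style_loss(g_gen, toy_disc) > 0)     # True
```

The key difference: the traditional loss matches one fixed Gram matrix, while the discriminator can accept any Gram matrix drawn from the distribution of hundreds of style images.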

Results

Imitate Shinkai Makoto Style

Transferred with ~160 high-quality style images.

Imitate Monet (compared to CycleGAN)

Imitate Van Gogh (compared to CycleGAN)

Usage

  1. Download the style image set (borrowed from CycleGAN):
    bash ./datasets/download_dataset.sh <dataset name>

    <dataset name> can be monet2photo, vangogh2photo, ukiyoe2photo, or cezanne2photo

  2. Do style transfer

th neural_style.lua -style_image `./list_images.sh <style_image_dir>` -content_image <content_image> -gan -content_weight 2 -style_weight 50000 -image_size 256 -backend cudnn -num_iterations 10000 -d_learning_rate 0.000001

The -gan flag specifies using discriminators to calculate style losses. -d_learning_rate is the learning rate for the discriminators. list_images.sh lists all images in one directory; filenames in that directory should not contain spaces, and <style_image_dir> should not contain ~. You need to play with the parameters for different styles and sizes.
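For reference, list_images.sh presumably does something like the following. This is a guess at its behavior since the script itself is not shown here; the actual implementation may differ:

```shell
#!/bin/sh
# Hypothetical sketch of list_images.sh (assumption: the real script may
# differ): print every file in the given directory as a comma-separated
# list, the format neural_style.lua accepts for -style_image.
dir="${1:-.}"
ls "$dir" | tr '\n' ',' | sed 's/,$//'
```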

Example

Transfer fj.jpg to vangogh style

  1. Download Van Gogh's paintings: bash ./datasets/download_dataset.sh vangogh2photo
  2. Add the style to the image:

th neural_style.lua -style_image `./list_images.sh datasets/vangogh2photo/trainA` -content_image data/fj.jpg -gan -content_weight 1 -style_weight 50000 -image_size 256 -backend cudnn -num_iterations 10000 -d_learning_rate 0.0000001

@Naruto-Sasuke

Hi, this is interesting. What are the papers related to your code? I'd like to take a look.

@citymonkeymao
Author

I haven't seen any paper describing this yet. However, you can look at the paper that proposed GANs. Here I use a GAN to capture the distribution of Gram matrices.

@Naruto-Sasuke

I think it would be better to report the discriminator's correct prediction rate on fakes and reals.

@citymonkeymao
Author

The correct rates of the discriminators fluctuate, except on the 4th style layer, where the discriminator always succeeds. I guess this is because that layer is also used as the content layer.

@ProGamerGov

So what are the limitations of your idea here? Is the image size limited to 256 like it is in CycleGAN? Do we need a thousand-image dataset and a pretrained CycleGAN model for each artist's style?

@citymonkeymao
Author

Like the original style transfer, you can choose the image size with the -image_size option. The only limitation is your memory.

I'm not sure about the lower limit on the number of images needed; the Shinkai style demonstrated here used 170 images.

3 participants