If you are interested in doing a research project (“semester project”) or a master’s project at IVRL, you can do so through the Master’s programs in Communication Systems or in Computer Science. Note that you must be enrolled at EPFL. This page lists the available semester/master’s projects for the Fall 2024 semester.
For any other type of application (research assistantship, internship, etc.), please check this page.
- Tutorial on diffusion models: https://cvpr2022-tutorial-diffusion-models.github.io/
- Hugging Face Diffusion Models Course: https://huggingface.co/learn/diffusion-course/unit0/1
- A dataset of 3D objects (not 3D bricks): https://objaverse.allenai.org/
- A database of 3D brick designs/models: https://www.ldraw.org/article/593.html
- DreamFusion paper, presenting a method (the SDS loss) to leverage existing 2D diffusion models for 3D model generation: https://dreamfusion3d.github.io/
- ControlNet paper, presenting a method to add new conditioning signals (e.g., sketch, depth) to an existing diffusion model: https://arxiv.org/abs/2302.05543
- Some existing 3D brick design software: https://www.melkert.net/LDCad, https://www.bricklink.com/v3/studio/main.page
- How a 2D reference image is currently used in tools: https://studiohelp.bricklink.com/hc/en-us/articles/15341721435799-Reference-Image [note: it is just a visual guide; it does not automatically generate a 3D brick design]
- Automatic mosaic and sculpture tools: https://studiohelp.bricklink.com/hc/en-us/articles/6508264220183-Sculpture, https://studiohelp.bricklink.com/hc/en-us/articles/5625025298327-Mosaic
- An example of a file format for describing 3D brick models/designs: https://en.wikipedia.org/wiki/LDraw#Example_File:_pyramid.ldr,_a_Lego_Model_of_a_Pyramid
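To make the LDraw format linked above concrete, here is a minimal sketch that emits a small pyramid model programmatically. It assumes (hedged, from common LDraw conventions rather than this page) that part `3001.dat` is the 2x4 brick, that one brick is 24 LDU tall with a 20 LDU stud pitch, and that colour code 4 is red:

```python
# Sketch: generating an LDraw (.ldr) model as text.
# Assumptions: part "3001.dat" = 2x4 brick, 24 LDU brick height,
# 20 LDU stud pitch, colour 4 = red. Verify against the LDraw spec.

def ldraw_brick_line(colour, x, y, z, part="3001.dat"):
    """LDraw type-1 line: sub-file reference with identity rotation."""
    rot = "1 0 0 0 1 0 0 0 1"  # identity 3x3 rotation matrix
    return f"1 {colour} {x} {y} {z} {rot} {part}"

def pyramid(levels, colour=4):
    """Stack shrinking rows of 2x4 bricks into a simple pyramid."""
    lines = ["0 Simple pyramid sketch"]  # type-0 line: comment/title
    for level in range(levels):
        y = -24 * level                 # in LDraw, -y points up
        for i in range(levels - level):
            x = 80 * i + 40 * level     # shift each level inward
            lines.append(ldraw_brick_line(colour, x, y, 0))
    return "\n".join(lines)

print(pyramid(3))
```

Such a generator is one possible output target for a model that predicts brick placements, since the `.ldr` text can be opened directly in LDCad or Studio.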
Description:
While film photography was almost completely replaced by digital photography at the beginning of the century, it is nonetheless slowly growing in popularity again because of its distinctive “look”. However, most users are not familiar with how a film camera works, and film stock prices are skyrocketing. For this reason, film emulators are very popular: they take a digital image as input and produce an approximate simulation of what a film photograph of the same scene would look like. However, different film stocks have different physical (or chemical) responses to light, because of differences in sensitivity (ISO), the distribution of the silver halide grains, the color filters used to produce color images, the presence or absence of halation filters, etc. To mimic these properties, simulators use generic sliders that can be tuned to approach a plausible look, but these preset profiles are not based on any physical properties of the film stock itself.
In this project, our goal is to create a physically based simulator for one or more film stocks, grounded in experimental measurements. The project will therefore involve data acquisition and analysis with film cameras and different film stocks, as well as precise modeling of the film response.
Provisional project steps:
1 – Data acquisition: The project’s first step will be to acquire film + digital image pairs for well-defined scenes. We will use the IVRL lab, which offers many possibilities for acquiring scenes under different lighting conditions. The choice of images will depend on the planned analysis. This step will also include developing and scanning the films appropriately, as well as establishing a protocol to acquire the images correctly.
2 – Data analysis: After acquiring the data, we will analyze the results for various properties:
- Grain: grain is one of the most important visual aspects of film photography. The goal will be to analyze its distribution, focusing on multiple aspects:
- the shape of the grains,
- the mean and standard deviation of the grain size,
- the density of the grains,
- the correlation between grain properties and signal intensity.
For some of these, we will need a microscope to correctly identify the grain properties. We will first carry out this study on gray-level film, which is simpler than color film, the latter being a superimposition of three photosensitive layers covered by color filters.
- Tone mapping: tone mapping is a classic step of the image signal processing pipeline that users can adjust. For film photography, however, the contrast response function is directly linked to the physical properties of the film stock itself. We can analyze the tone mapping precisely by using standard image targets.
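The contrast response measurement can be sketched as fitting the characteristic curve of the film, density versus log exposure, on a step target and reading off the slope (gamma) of the linear region. The data below is synthetic, assuming an idealised stock with gamma 0.65 (a placeholder, not a measurement):

```python
# Sketch: estimate film gamma from step-target measurements by a
# linear fit of density D against log10 exposure in the linear region.
import numpy as np

log_H = np.linspace(-2.0, 0.0, 11)   # log10 exposure of the patches
density = 0.65 * log_H + 1.5         # idealised linear response

gamma, base = np.polyfit(log_H, density, 1)
print(f"gamma ~= {gamma:.2f}")
```

Real measurements would first locate the linear region between the toe and shoulder of the curve before fitting.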
- Color profile: similar to tone mapping, the color profile also depends on the film stock. A variety of methods already exist to transfer color responses from one sensor to another, using polynomial models fitted with least squares on RAW pairs [5, 6]; more modern approaches involve deep learning. Since our goal is first to tackle black-and-white film, this is a secondary direction for this project.
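The least-squares colour mapping mentioned above can be sketched, in its simplest linear form, as fitting a 3x3 matrix between paired RAW triplets of two sensors. The paired data below is synthetic; real pairs would come from the acquisition step:

```python
# Sketch of RAW-to-RAW colour mapping via least squares: find M such
# that src @ M ~= dst for paired RAW samples (linear special case of
# the polynomial models in [5, 6]).
import numpy as np

rng = np.random.default_rng(0)
src = rng.random((100, 3))                 # source-sensor RAW triplets
M_true = np.array([[0.9, 0.1, 0.0],
                   [0.05, 0.85, 0.1],
                   [0.0, 0.2, 0.8]])
dst = src @ M_true.T                       # synthetic paired targets

M_fit, *_ = np.linalg.lstsq(src, dst, rcond=None)
# lstsq solves src @ M_fit ~= dst, so here M_fit recovers M_true.T.
```

Higher-order polynomial features of the RAW triplets can be stacked into `src` to capture non-linear sensor differences.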
3 – Modeling:
For grain, different models exist that are more or less physically based. Our goal will be to incorporate the results of the statistical and morphological analysis into an existing model to better mimic grain generation. [1] and [2] propose to model grain rendering using the Boolean model, while [3] approximates it with additive white noise. We could also explore learning-based approaches such as [4].
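A minimal sketch of Boolean-model grain rendering in the spirit of [1]: sample a Poisson number of grain centres per pixel, with a rate that grows with image intensity, and stamp small structuring elements. The rate and grain size here are placeholders, not the calibrated statistics the project would estimate:

```python
# Sketch of Boolean-model grain rendering (placeholder parameters).
import numpy as np

def render_grain(img, grains_per_unit=30.0, radius=1, seed=0):
    """Return a binary grain layer for a [0, 1] grayscale image."""
    rng = np.random.default_rng(seed)
    h, w = img.shape
    out = np.zeros((h, w), dtype=bool)
    counts = rng.poisson(img * grains_per_unit)  # per-pixel grain counts
    ys, xs = np.nonzero(counts)
    for y, x in zip(ys, xs):
        y0, y1 = max(0, y - radius), min(h, y + radius + 1)
        x0, x1 = max(0, x - radius), min(w, x + radius + 1)
        out[y0:y1, x0:x1] = True                 # square "disk" stamp
    return out

# Horizontal gradient: grain coverage should increase left to right.
img = np.linspace(0, 1, 64).reshape(1, -1).repeat(64, axis=0)
grain = render_grain(img)
```

In the project, the measured shape, size, and density statistics would replace the fixed radius and rate, and the stamp would become an actual disk or measured grain shape.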
Supervision:
This project will be supervised by Raphael Achddou, together with Gwilherm Lesné from Télécom Paris for his expertise in grain simulation.
Prerequisites:
Python and PyTorch; basics of image and signal processing.
Type of Work:
MS semester project.
80% research, 20% development
Contact:
[email protected], [email protected]
References:
[1] A. Newson et al., “A Stochastic Film Grain Model for Resolution-Independent Rendering,” Computer Graphics Forum 36 (2017).
[2] B. E. Bayer, “Relation Between Granularity and Density for a Random-Dot Model,” J. Opt. Soc. Am. 54 (1964): 1485–1490.
[3] K. Zhang et al., “Film Grain Rendering and Parameter Estimation,” ACM Transactions on Graphics (TOG) 42 (2023): 1–14.
[4] Z. Ameur et al., “Deep-Based Film Grain Removal and Synthesis,” IEEE Transactions on Image Processing 32 (2022): 5046–5059.
[5] M. Afifi and A. Abuolaim, “Semi-Supervised Raw-to-Raw Mapping,” CoRR abs/2106.13883 (2021), https://arxiv.org/abs/2106.13883
[6] N. H. M. Rang, D. K. Prasad, and M. S. Brown, “Raw-to-Raw: Mapping Between Image Sensor Color Responses,” IEEE CVPR 2014, pp. 3398–3405, https://doi.org/10.1109/CVPR.2014.434
Startup company Innoview Sàrl has developed software to recover a message hidden in patterns. Appropriate parameter settings enable the detection of counterfeits. The goal of the project is to define optimal parameters for different sets of printing conditions (resolution, type of paper, type of printing device, complexity of the hidden watermark, etc.). The project involves tests on a large data set and appropriate statistics.
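The parameter-search task can be sketched as a sweep over candidate settings on a labelled set of scans, keeping the setting with the best detection accuracy. `detect` below is a stand-in for Innoview's proprietary detector, not a real API, and the sample data is invented:

```python
# Sketch: grid search over detector parameters against labelled scans.
# `detect` is a hypothetical placeholder for the actual detector.
from itertools import product

def detect(sample, threshold, window):
    """Placeholder: detection succeeds when the signal clears threshold."""
    return sample["signal"] * window >= threshold

def best_setting(samples, thresholds, windows):
    """Return (accuracy, threshold, window) for the best parameter pair."""
    scored = []
    for t, w in product(thresholds, windows):
        hits = sum(detect(s, t, w) == s["genuine"] for s in samples)
        scored.append((hits / len(samples), t, w))
    return max(scored)

samples = [{"signal": 0.9, "genuine": True},
           {"signal": 0.2, "genuine": False},
           {"signal": 0.7, "genuine": True}]
rate, t, w = best_setting(samples, thresholds=[0.5, 0.8], windows=[1, 2])
```

With a large data set, per-condition accuracies (paper type, resolution, device) would be tabulated instead of a single global score.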
Deliverables: Report and running prototype (Android, Matlab).
Prerequisites:
– knowledge of image processing / computer vision
– basic coding skills in Matlab and/or Java Android
Level: BS or MS semester project
Supervisors:
Dr Romain Rossier, Innoview Sàrl, [email protected], tel 078 664 36 44
Prof. Roger D. Hersch, BC110, [email protected], cell: 077 406 27 09
This project aims to explore whether off-the-shelf diffusion models encode semantic information that helps us, and other deep learning models, understand the content of an image or the relationships between images.
Diffusion models [1] have become the new paradigm for generative modeling in computer vision. Despite their success, they remain a black box during generation. At each step, the model provides a direction, namely the score, towards the data distribution. As shown in recent work [2], the score can be decomposed into different meaningful components. The first research question is: does the score encode any semantic information about the generated image?
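The score readout this question probes can be sketched: for an epsilon-parameterised diffusion model, the score of the noised marginal is -eps_theta(x_t, t) / sqrt(1 - alpha_bar_t). To keep the sketch self-contained, the "noise prediction" below is the analytic optimum for a standard Gaussian data distribution, not a trained network:

```python
# Sketch: converting a DDPM-style noise prediction into a score.
import numpy as np

def score_from_eps(eps, alpha_bar):
    """Score of the noised marginal from the predicted noise."""
    return -eps / np.sqrt(1.0 - alpha_bar)

# For x0 ~ N(0, I), x_t is also N(0, I) and the ideal noise prediction
# is sqrt(1 - alpha_bar) * x_t, so the score comes out as -x_t.
alpha_bar = 0.7
x_t = np.array([0.5, -1.2])
eps_star = np.sqrt(1.0 - alpha_bar) * x_t
print(score_from_eps(eps_star, alpha_bar))  # equals -x_t
```

In the project, `eps` would come from a pretrained network, and the interesting part is decomposing that vector field rather than this conversion itself.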
Moreover, there is evidence that the representations learned by diffusion models are helpful to discriminative models. For example, they can boost classification performance through knowledge distillation [3]. Furthermore, a diffusion model can itself be used as a robust classifier [4]. Discriminative information can therefore be extracted from a diffusion model. The second question is then: what is this information about? Object shape? Location? Texture? Or something else?
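The classification-by-denoising idea behind [4] can be sketched in miniature: score each class by how well a class-conditional denoiser explains the input, then pick the class with the smallest error. The per-class "denoisers" below are toy prototype-pullers, not a diffusion model:

```python
# Sketch: classify by picking the class whose conditional "denoiser"
# reconstructs the input with the smallest error (toy stand-in for the
# diffusion-classifier idea in [4]).
import numpy as np

def classify_by_denoising(x, denoisers):
    """Return the class whose conditional denoiser best explains x."""
    errors = {c: np.mean((d(x) - x) ** 2) for c, d in denoisers.items()}
    return min(errors, key=errors.get)

# Toy setup: each "denoiser" pulls the input toward a class prototype.
prototypes = {"cat": np.array([1.0, 0.0]), "dog": np.array([0.0, 1.0])}
denoisers = {c: (lambda x, p=p: 0.5 * (x + p))
             for c, p in prototypes.items()}

print(classify_by_denoising(np.array([0.9, 0.1]), denoisers))  # -> cat
```

With a real diffusion model, the per-class error would be the class-conditional denoising loss averaged over noise levels.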
This is an exploratory project. We will try to interpret the black box of diffusion models and uncover the semantic information they encode. Together, we will also brainstorm applications of diffusion models beyond image generation. This project can be a good opportunity to develop interest and skills in scientific research.
References:
[1] J. Ho, A. Jain, and P. Abbeel, “Denoising Diffusion Probabilistic Models,” Advances in Neural Information Processing Systems 33 (2020): 6840–6851.
[2] T. Alldieck, N. Kolotouros, and C. Sminchisescu, “Score Distillation Sampling with Learned Manifold Corrective,” arXiv preprint arXiv:2401.05293 (2024).
[3] X. Yang and X. Wang, “Diffusion Model as Representation Learner,” Proceedings of the IEEE/CVF International Conference on Computer Vision (2023): 18938–18949.
[4] H. Chen, Y. Dong, S. Shao, et al., “Your Diffusion Model Is Secretly a Certifiably Robust Classifier,” arXiv preprint arXiv:2402.02316 (2024).
Deliverables: code, well cleaned up and easily reproducible, as well as a written report explaining the models, the steps taken during the project, and the results.
Prerequisites: Python and PyTorch. Basic understanding of diffusion models.
Level: MS research project
Number of students: 1
Contact: Yitao Xu, [email protected]