University of California, Merced '23 - BS Computer Science & Engineering
Over the course of an 8-week
micro-internship
Micro-internships via Parker Dewey are short-term, paid, professional projects that are open to all college students and recent graduates of U.S.-based institutions.
https://www.parkerdewey.com/career-launchers
with
Hypothetic,
Enabling a new generation of 3D creators through the power of generative AI and machine learning.
https://www.hypothetic.art/
facilitated by
OpenAvenues,
OpenAvenues connects international and U.S. students with industry experts in projects that are designed to build experience and portfolios.
https://www.openavenuesfoundation.org/
I had the opportunity to work closely with generative AI models. My task was to leverage the capabilities of
ControlNet,
Adds conditional controlling to text-to-image diffusion models.
https://huggingface.co/blog/controlnet
a cutting-edge generative AI model, to automatically texture 3D objects.
The primary tool I used was
PyTorch3D,
A library for deep learning with 3D data.
https://pytorch3d.org/
a highly versatile library from
Facebook AI Research
Meta's research division, working to understand and develop AI systems by advancing the longer-term academic problems surrounding the field.
https://ai.meta.com/research/
(FAIR) that provides a set of utilities and models for 3D computer vision research. PyTorch3D played a crucial role in handling the 3D data and applying the textures generated by ControlNet.
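As a rough illustration of the role PyTorch3D played, the sketch below renders a single view of a mesh, the kind of image that can then be fed to ControlNet as conditioning. This is a minimal, hypothetical example (the default camera, lighting, and file path are my assumptions, not the project's actual setup); the heavy imports are kept inside the function so the sketch reads standalone.

```python
def render_mesh_view(obj_path: str, image_size: int = 512):
    """Render one RGBA view of an OBJ mesh with PyTorch3D (hypothetical sketch)."""
    # Imports are local so the sketch is readable without PyTorch3D installed.
    import torch
    from pytorch3d.io import load_objs_as_meshes
    from pytorch3d.renderer import (
        FoVPerspectiveCameras,
        MeshRasterizer,
        MeshRenderer,
        PointLights,
        RasterizationSettings,
        SoftPhongShader,
    )

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    mesh = load_objs_as_meshes([obj_path], device=device)

    # Default perspective camera and a simple point light; a real pipeline
    # would sweep several viewpoints around the object.
    cameras = FoVPerspectiveCameras(device=device)
    renderer = MeshRenderer(
        rasterizer=MeshRasterizer(
            cameras=cameras,
            raster_settings=RasterizationSettings(image_size=image_size),
        ),
        shader=SoftPhongShader(
            device=device, cameras=cameras, lights=PointLights(device=device)
        ),
    )
    return renderer(mesh)  # tensor of shape (1, H, W, 4), RGBA
```

Each rendered view like this can be post-processed (e.g. into a depth or edge map) before being handed to the generative model.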
ControlNet itself is a part of the
Diffusers library
Library for state-of-the-art pretrained diffusion models for generating images, audio, and even 3D structures of molecules.
https://huggingface.co/docs/diffusers/index
of generative models that can be found on
HuggingFace,
AI company and community that develops ML tools and hosts thousands of AI models.
https://huggingface.co/
an AI company and community that develops ML tools and hosts thousands of models. These diffusion models learn from large datasets of examples to generate high-quality, realistic images, which can serve as textures. ControlNet stands out for its ability to combine multiple conditioning images and multiple ControlNet models in a single generation pipeline. Additionally, I used
OpenCV,
An open source computer vision and machine learning software library built to provide a common infrastructure for computer vision applications and to accelerate the use of machine perception.
https://opencv.org/
a popular library for computer vision tasks, for extra image processing and manipulation, especially in the creation of masks.
The overall goal of the project was to understand and apply these ML and 3D concepts from bleeding-edge research, building experience and knowledge along the way. In the course of the project, I found it useful to dissect and analyze the
TEXTure paper:
Research paper that presents novel concepts for fully texturing 3D objects using AI, as well as transferring textures from images.
https://texturepaper.github.io/
research that presented novel concepts for texturing 3D objects using AI, which I adapted and incorporated into my work with ControlNet.
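The multi-ControlNet setup described earlier might be assembled roughly as below. This is a hedged sketch: the model IDs, the depth-plus-edge pairing, and the prompt are illustrative assumptions, not the project's actual configuration, though passing a list of ControlNets to a Diffusers pipeline is a documented pattern.

```python
def build_multi_controlnet_pipeline():
    """Build a Diffusers pipeline conditioned on two ControlNets (sketch)."""
    # Imports are local because loading these models downloads large weights.
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

    # Two conditioning models: one for depth maps, one for Canny edge maps.
    # Model IDs here are illustrative, commonly-used community checkpoints.
    controlnets = [
        ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-depth"),
        ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-canny"),
    ]
    return StableDiffusionControlNetPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        controlnet=controlnets,
    )


# Usage (downloads weights; intended for a GPU machine or Colab):
# pipe = build_multi_controlnet_pipeline()
# image = pipe("weathered bronze texture", image=[depth_map, edge_map]).images[0]
```

Supplying one conditioning image per ControlNet is what lets several constraints (shape from depth, detail from edges) steer a single generation.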
The final version of the project is documented in a Python notebook hosted on Google Colab
here.
The finalized project notebook.
https://colab.research.google.com/