Hi there, I’m Junhua Huang!

My name is Junhua, or Harry. I’m currently a master’s student in Electrical and Computer Engineering at UCLA, and I previously completed a dual B.S. in Computer Science and Business at the University of Rochester. My research interests and experience are mainly in 3D generation and reconstruction in computer vision, and in vision-language models (VLMs).

Engineering Experience

Student Researcher, Seizure Lab, ECE Department, University of California, Los Angeles

Los Angeles, CA | Aug 2025 - Oct 2025
During my time at UCLA’s Seizure Lab, I focused on scaling Vision-Language Model (VLM) inference. I deployed three models and boosted throughput by 16.7× (from 0.3 to 5 requests per second) by implementing asynchronous request batching. I also reduced latency by introducing a caching layer, achieving a prefix-cache hit rate of approximately 73%. Beyond the infrastructure work, I built a medical visual dataset of 10,000 records with 8 features on epileptic seizures, using data from the UCLA medical school.
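The asynchronous request batching mentioned above can be sketched roughly as follows. This is a minimal illustration, not the lab's serving code: `MicroBatcher`, `fake_model`, and the `max_batch_size` and `max_wait_s` values are all hypothetical names and settings chosen for the example. Requests are queued, and a worker collects whatever arrives within a short window and runs the whole group in one model call.

```python
import asyncio
import contextlib


class MicroBatcher:
    """Groups concurrent requests into batches (hypothetical helper)."""

    def __init__(self, run_batch, max_batch_size=8, max_wait_s=0.01):
        self._run_batch = run_batch          # async fn: list of requests -> list of outputs
        self._max_batch_size = max_batch_size
        self._max_wait_s = max_wait_s
        self._queue = asyncio.Queue()
        self.batch_sizes = []                # recorded only for inspection

    async def submit(self, request):
        # Each caller waits on its own future until the worker fills it in.
        future = asyncio.get_running_loop().create_future()
        await self._queue.put((request, future))
        return await future

    async def worker(self):
        loop = asyncio.get_running_loop()
        while True:
            batch = [await self._queue.get()]            # block for the first request
            deadline = loop.time() + self._max_wait_s
            while len(batch) < self._max_batch_size:     # then fill up until the deadline
                remaining = deadline - loop.time()
                if remaining <= 0:
                    break
                try:
                    batch.append(await asyncio.wait_for(self._queue.get(), remaining))
                except asyncio.TimeoutError:
                    break
            self.batch_sizes.append(len(batch))
            try:
                outputs = await self._run_batch([req for req, _ in batch])
                for (_, fut), out in zip(batch, outputs):
                    if not fut.done():
                        fut.set_result(out)
            except Exception as exc:                     # propagate failures to all callers
                for _, fut in batch:
                    if not fut.done():
                        fut.set_exception(exc)


async def fake_model(prompts):
    """Stand-in for a batched model call (assumption: one call serves many prompts)."""
    await asyncio.sleep(0.01)
    return [p.upper() for p in prompts]


async def demo():
    batcher = MicroBatcher(fake_model, max_batch_size=4, max_wait_s=0.05)
    worker = asyncio.create_task(batcher.worker())
    results = await asyncio.gather(*(batcher.submit(f"req{i}") for i in range(8)))
    worker.cancel()
    with contextlib.suppress(asyncio.CancelledError):
        await worker
    return results, batcher.batch_sizes
```

The trade-off in a design like this is between the wait window and latency: a longer window yields larger batches and higher throughput, at the cost of a small delay for the first request in each batch.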

Student Researcher, Professor Chenliang Xu’s Lab, University of Rochester

Rochester, NY | May 2025 - Sep 2025
Working in Professor Chenliang Xu’s lab, I conducted research on improving VLM-guided audio remixing, drawing inspiration from traditional movie composition techniques. This research led to achieving State-of-the-Art (SOTA) results with approximately 9% fewer parameters by utilizing a simplified architecture and prompt-variant sweeps. I am the first author of the resulting paper submitted to IEEE ICASSP 2026. On the engineering side, I built an asynchronous, sharded caption-embedding pipeline that boosted throughput by 16× and increased KV-cache hits from 6% to 57% via chunking and prefix-caching. I also prototyped a gated fusion and multi-head attention module.
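As a rough illustration of why chunking helps prefix caching, here is a minimal sketch. It is not the pipeline described above: `PrefixCache`, the chunk size, and the example prompts are hypothetical. Each full chunk gets a key that covers the entire prefix up to that chunk, so two requests that share a leading prompt reuse every chunk they have in common.

```python
import hashlib


class PrefixCache:
    """Tracks which chunk-aligned prefixes have been seen (hypothetical helper)."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self._seen = set()
        self.hits = 0
        self.lookups = 0

    def _prefix_keys(self, tokens):
        # One key per full chunk; the running hash covers the whole prefix so far.
        running = hashlib.sha256()
        full_length = len(tokens) - len(tokens) % self.chunk_size
        for start in range(0, full_length, self.chunk_size):
            chunk = tokens[start:start + self.chunk_size]
            running.update(("\x1f".join(chunk) + "\x1e").encode("utf-8"))
            yield running.hexdigest()

    def process(self, tokens):
        """Return how many leading chunks were already cached, then cache the rest."""
        reused = 0
        for key in self._prefix_keys(tokens):
            self.lookups += 1
            if key in self._seen:
                self.hits += 1
                reused += 1
            else:
                self._seen.add(key)
        return reused

    @property
    def hit_rate(self):
        return self.hits / self.lookups if self.lookups else 0.0


if __name__ == "__main__":
    cache = PrefixCache(chunk_size=2)
    shared = ["you", "are", "a", "captioner"]       # shared instruction prefix
    cache.process(shared + ["describe", "cat"])     # cold: nothing reused
    reused = cache.process(shared + ["describe", "dog"])
    print(reused, cache.hit_rate)
```

Placing the shared instructions first and the per-item content last is what makes the common prefix long; a real KV cache stores attention states per chunk rather than bare keys.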

Student Researcher, Professor Jiebo Luo’s Lab, University of Rochester

Rochester, NY | Aug 2024 - Apr 2025
In Professor Jiebo Luo’s lab, my research focused on reconstructing animal images from skeletons using image-to-image (I2I) Stable Diffusion. I applied partial diffusion to retain segmentation and used sparse-cloud encoders to inject spatial cues, bridging large domain gaps. I also improved facial realism in generated images via Reinforcement Learning from Human Feedback (RLHF) using Direct Preference Optimization (DPO) with LoRA fine-tuning. This involved building preference pairs and tuning prompts and schedules on the University of Rochester’s Linux clusters.
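For readers unfamiliar with DPO, the per-pair objective can be sketched as below. This is the standard DPO loss for a single preference pair, written in plain Python for illustration; the function name, argument names, and the `beta` value are assumptions for this example and do not reflect the project's actual training code.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair.

    Inputs are log-probabilities of the chosen and rejected samples under
    the fine-tuned policy and under the frozen reference model.
    beta=0.1 is an illustrative value, not a project setting.
    """
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(margin)), computed in a numerically stable way.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    # Policy identical to the reference: zero margin.
    print(dpo_loss(-1.0, -1.0, -1.0, -1.0))
    # Policy prefers the chosen sample more than the reference does: lower loss.
    print(dpo_loss(-0.5, -2.0, -1.0, -1.0))
```

The loss falls as the policy raises the chosen sample's likelihood relative to the rejected one, measured against the reference model.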


Machine Learning Engineer Intern, Farsee2 Technology Co., Ltd, Hybrid/Wuhan, China

July 2024 – September 2024
During my tenure at Farsee2 Technology, under the mentorship of Dr. Hailong Pan, I participated in an industry-level digital reconstruction project utilizing Gaussian sampling techniques. My role involved designing and testing a novel method to reduce floating components in digital reconstruction models by restricting point generation based on the angle between camera directions and the object’s surface normal vectors. This approach effectively minimized unwanted artifacts in the models. Additionally, I contributed to modifying the parallel depth generation process, transforming it from linear computation to methods capable of handling different distributions. This enhancement significantly improved computational efficiency across multiple machines.
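The angle-based restriction described above can be illustrated with a small sketch. This is not the company's implementation: the function names, the candidate format, and the 75-degree threshold are hypothetical choices for the example. A candidate point is kept only when the direction from the point to the camera is close enough to the surface normal, so points seen at grazing angles or from behind the surface are rejected.

```python
import math


def angle_deg(u, v):
    """Angle between two 3D vectors, in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("zero-length vector has no direction")
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))  # guard against rounding
    return math.degrees(math.acos(cosine))


def filter_candidates(candidates, camera_pos, max_angle_deg=75.0):
    """Keep (position, normal) pairs whose normal faces the camera closely enough.

    max_angle_deg=75.0 is an illustrative threshold, not a value from the project.
    """
    kept = []
    for position, normal in candidates:
        to_camera = tuple(c - p for c, p in zip(camera_pos, position))
        if angle_deg(to_camera, normal) <= max_angle_deg:
            kept.append((position, normal))
    return kept


if __name__ == "__main__":
    camera = (0.0, 0.0, 5.0)
    candidates = [
        ((0, 0, 0), (0, 0, 1)),    # faces the camera directly
        ((0, 0, 0), (1, 0, 0)),    # grazing angle
        ((0, 0, 0), (0, 0, -1)),   # faces away from the camera
    ]
    print(filter_candidates(candidates, camera))
```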

Student Researcher, Goergen Institute for Data Science, University of Rochester, Rochester, NY

January 2024 – August 2024
As a researcher at the Goergen Institute for Data Science, I focused on generative fairness and bias in large language models (LLMs), particularly in the context of political ideology. My work involved developing methods to calculate and evaluate political ideologies in order to conduct blind testing on LLMs. I automated the data collection process using Python and APIs and helped maintain a database of over 140,000 political ideology sentences.

I am proud to introduce our research findings in the paper “Using Generative AI to Calculate Party Positions: A Comparison of Human Experts and Large Language Models,” which is now available on SSRN.

Research Assistant, Professor Chenliang Xu’s Lab, University of Rochester

Rochester, NY | Apr 2024 - Aug 2024
I contributed to a new computer vision model for the Temporal Action Localization task by replacing the model backbone with a Mamba structure and adjusting the model’s new dataloader. Mentored by PhD candidate Pinxin Liu, I deployed, tested, and improved various machine learning tasks for gesture generation research, running and testing the code of related papers on Linux servers. I also modified the ViT (Transformer++) structure and built, cleaned, and labeled 200 sets of audio-gesture data.

Research Assistant, University of Rochester, Rochester, NY (Discover Grant $5,000)

May 2022 – December 2022
In collaboration with Dr. Zhu, Dr. Jarvis, and other team members, I contributed to the digital reconstruction of the Jamestown Settlement and Elmina Castle using low-cost, image-based 3D Virtual Reality techniques. My responsibilities included converting high-resolution models into models with a low storage cost for use in virtual reality environments. I used Agisoft Metashape and Unity to create detailed 3D models, and I screened, cleaned, and restored raw image data for virtual reality applications. Additionally, I used Adobe Photoshop to repair overlapping virtual camera angles, manually cleaned interfering point clouds, and corrected shadows, lighting, and surface materials.

Projects

OpenJDK 17u-dev

July 2024 – August 2024
For the OpenJDK 17u-dev project, I authored a step-by-step installation guide on GitHub for deploying the JDK in local Linux testing environments. I also contributed to the testing and acceptance of pull request #2756 (8313674) on Ubuntu 22, which helped improve the stability and performance of the JDK.

IMDB Database Project

August 2023 – December 2023
Leading a team of three, I was responsible for database compliance and security in an IMDB Database project. I focused on testing, compatibility assessment, and debugging to prevent injection attacks. I also implemented a login system and custom CRUD functions that depend on user type, using JavaScript, PHP, MySQL, HTML, Git, and CSS. Additionally, I redesigned the website’s UI and incorporated advanced database security features.

Body Signals Analysis Project

September 2023 – December 2023
In the Body Signals Analysis Project, I preprocessed over 900,000 records by selecting relevant attributes through feature engineering and reducing dimensionality using Principal Component Analysis (PCA). I developed NLP machine learning models with 72-81% accuracy for predicting the impact of smoking and drinking on body signals. My work also included visualizing data trends using Seaborn and Matplotlib, providing a clearer understanding of behavioral patterns related to smoking and drinking.

Bob’s Burgers App Project

January 2023 – May 2023
I solo-developed an Android app named Bob’s Burgers, demonstrating proficiency in key technologies such as RecyclerView, API integration, and MVVM architecture. My focus was on effective end-to-end app development, including robust error handling for invalid API requests, ensuring a seamless user experience. Additionally, I designed a user-friendly interface with careful attention to color aesthetics and smooth animations, significantly enhancing user engagement and accessibility.

Skills

Technical Skills: Java, Kotlin, Python, C, R, SQL, HTML, CSS, Linux, JavaScript, PHP, MySQL, Git
Software: Android Studio, IntelliJ IDEA, Google Suite, Microsoft Office, SolidWorks, Metashape

Interests

Cycling, Fishing, Scuba Diving (PADI Advanced Open Water Diver), Fitness, Snowboarding