Muhammad Naeem

Graduate Student in LIESMARS
Wuhan University, Hubei, China
Advisor: Prof. Xiongwu Xiao

Email: m.naeem4288@gmail.com
LinkedIn: linkedin.com/in/muhammadnaeem27
Portfolio: muhammadnaeem27.github.io


Researcher with experience in deep learning, computer vision, and image generation spanning both industry and academia. I am currently pursuing a Master's in Photogrammetry and Remote Sensing at Wuhan University, China, under the supervision of Prof. Xiongwu Xiao at the State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing (LIESMARS).

Previously, I worked as a Senior AI Developer and Team Lead at Octaloop Technologies, leading the Computer Vision Team in projects like Virtual Room Staging, Virtual Try-on, and Video Game Streaming Highlights Generation, ensuring seamless AI integration and mentoring team members.

I earned my B.Sc. in Computer Science (major in AI) from COMSATS University Islamabad, Pakistan, where I worked under Dr. Jamal Hussain Shah.


Education

  • COMSATS University Islamabad, Pakistan
    Bachelor of Science in Computer Science
    September 2019 – June 2023
    Relevant Coursework: Data Structures and Algorithms, Digital Image Processing, Machine Learning, Computer Vision, Pattern Recognition

Experience

  • Octaloop Technologies
    Senior AI Developer / Team Lead, Computer Vision Team
    April 2024 – Aug 2025
    • Led the Computer Vision Team, developing solutions for projects such as Virtual Room Staging, Virtual Try-on, and Video Game Streaming Highlights Generation using scene analysis.
    • Supervised and mentored team members, ensuring smooth integration of AI functionality into client applications and fostering a collaborative environment.
    • Kept up with advances in AI and computer vision, applying recent techniques to improve project performance and client satisfaction.
    • Tools: Python, PyTorch, TensorFlow, YOLO-v5, OpenCV, Keras, FastAPI, AWS, GCP
  • NineSol Technologies
    AI Developer
    April 2023 – April 2024
    • Developed and integrated the latest models into the backend of mobile and web applications, enhancing functionality with features such as Background Removal, Colorization, Document Scanning, Virtual Try-on, and Transcription.
    • Kept up with industry trends and emerging technologies, applying this knowledge to refine and advance features across diverse applications.
    • Collaborated with other team members to smoothly embed advanced functionalities, ensuring seamless user experiences in mobile applications.
    • Tools: Python, PyTorch, TensorFlow, YOLO-v5, OpenCV, Keras, FastAPI, AWS, GCP

Projects

Virtual Staging
Python, PyTorch, Stable Diffusion, ControlNet, OpenCV, NumPy, Scikit-Learn
  • Implemented an inpainting approach using Stable Diffusion with ControlNet to furnish empty room images without altering existing items and color combinations.
  • Applied semantic segmentation on empty rooms to identify areas and items to preserve, generating binary images from segmented masks.
  • Used tailored prompts for bedrooms and living rooms, passing binary masked images, prompts, and empty room images to the Stable Diffusion inpainting model to generate furniture in the specified areas.
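As a minimal sketch of the mask-generation step described above (not the production code), the snippet below turns a segmentation label map into a binary inpainting mask with NumPy. The class ids and the `build_inpaint_mask` helper are hypothetical, and the sketch assumes the common inpainting convention that white (255) marks pixels to repaint while black (0) marks pixels to keep.

```python
from typing import Iterable

import numpy as np


def build_inpaint_mask(label_map: np.ndarray, preserve_ids: Iterable[int]) -> np.ndarray:
    """Return a uint8 mask: 0 where pixels are preserved, 255 where content may be generated."""
    preserve = np.isin(label_map, list(preserve_ids))
    return np.where(preserve, 0, 255).astype(np.uint8)


# Toy 3x4 label map; the ids are hypothetical: 0 = floor, 1 = wall, 2 = window.
label_map = np.array([
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [0, 0, 0, 0],
])

# Preserve walls and windows; leave the floor region open for furniture.
mask = build_inpaint_mask(label_map, preserve_ids={1, 2})
print(mask)
```

In a real pipeline the label map would come from the segmentation model, and the resulting mask would be passed to the inpainting model together with the prompt and the empty-room image.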
Video Game Streaming Highlights Generation
Python, PyTorch, OpenCV, NLTK, Scikit-Learn
  • Developed a system to generate highlights from 6-7 hour video game streams using three main approaches: Sound, Transcription, and Movement.
  • Implemented separate AI techniques for each approach to identify key timestamps, leveraging audio analysis, natural language processing, and computer vision.
  • Combined results from all three approaches in a specific sequence to produce the best highlights and generate a final video.
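One simple way to combine candidate timestamps from several detectors is sketched below. It is an illustration only: the actual combination sequence used in the project is not described here, and the `merge_highlights` helper, the `pad` parameter, and the sample timestamps are all hypothetical. Each timestamp is padded into a short window, and overlapping windows are merged into one clip.

```python
def merge_highlights(timestamps_by_source, pad=5.0):
    """Pad each timestamp into a [t - pad, t + pad] window and merge overlapping windows."""
    windows = sorted(
        (max(0.0, t - pad), t + pad)
        for stamps in timestamps_by_source.values()
        for t in stamps
    )
    merged = []
    for start, end in windows:
        if merged and start <= merged[-1][1]:
            # Overlaps the previous clip: extend it instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical candidate timestamps (in seconds) from the three approaches.
candidates = {
    "sound": [12.0, 300.0],
    "transcription": [15.0],
    "movement": [900.0],
}
segments = merge_highlights(candidates)
print(segments)
```

The merged segments could then be cut from the source video and concatenated, for example with ffmpeg.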
Weed Detection Robotic Car
YOLO-v5, TensorFlow, Flask, Tkinter, VS Code, Colab, Roboflow, LabelImg
  • Compiled a diverse dataset comprising four categories of weed. Implemented YOLO-v5 for real-time object detection, achieving precise identification of weeds in agricultural fields.
  • Designed and built a DC-motor-driven robotic car prototype that wirelessly streams live images from a smartphone camera to the YOLO-v5 model via FastAPI.
  • Integrated an Arduino-controlled LED system with a matrix-based system to precisely locate and eliminate detected weeds.
  • Created a website to showcase project insights, allowing users to engage with the trained YOLO-v5 model and evaluate its effectiveness.
  • Used data visualization to highlight the project's technological innovation and potential for sustainable agricultural practices.
Portrait Background Remover
Python, PyTorch, U-Net, NumPy, Scikit-Learn, OpenCV
  • Used U-Net structures for both rough and detailed portrait segmentation, addressing the problem of subject isolation in images.
  • Trained on images collected from the internet and on publicly available datasets such as COCO.
Colorize Grayscale Image
Python, PyTorch, U-Net, FastAPI, Colab, VS Code
  • Applied the Two Time-Scale Update Rule (TTUR), a self-attention GAN, and training inflection points for effective, low-artifact colorization.
  • Used U-Net architecture with model-specific backbone choices (resnet34/resnet101), spectral normalization, and self-attention.
  • Implemented a VGG16-based perceptual loss during NoGAN training for realistic colorization.
Image to Talking Portrait
Python, Dlib, face-recognition, PyTorch, ffmpeg, OpenCV, FastAPI, Colab, VS Code
  • Created talking videos by combining the lip and head movements of a reference video with a source image, using Dlib for facial landmark identification.
  • Used Wav2Lip-hq for precise lip movement synchronization and ESRGAN for video up-sampling.
  • Integrated synchronized audio and used ffmpeg and OpenCV for efficient video processing.
Document Scanner
Python, OpenCV, PyTorch, FastAPI, Colab, VS Code
  • Created a Document Scanner with a Geometric Unwarping Transformer, using the Doc3D dataset for accurate geometric unwarping.
  • Used the DocProj dataset for an Illumination Correction Transformer to resolve illumination problems in scanned documents.
Sky Changer
Python, PyTorch, U-Net, FastAPI, Colab, VS Code
  • Developed a model with three modules: low-res processing (HRNet-OCR), high-res processing (U-Net), and sky changing (multiband blender).
  • Enhanced segmentation in the low-res module using ASPP and improved the high-res module with learnable parameters.
  • Used the multiband blender to composite the replacement sky into the image.
Text-to-Image Generator
Python, PyTorch, Hugging Face, Stable Diffusion, FastAPI, Colab, VS Code
  • Implemented Stable Diffusion for text-to-image generation, using fine-tuning and DreamBooth for high-quality, customized images.
Face Swap
Python, Dlib, PyTorch, face-recognition, TensorFlow, FastAPI, Colab, VS Code
  • Implemented pose-matching and facial point detection for accurate face swapping using Dlib.
  • Blended faces to ensure natural skin color transitions.

Achievements

  • Awarded a fully funded scholarship for my Master's at LIESMARS, Wuhan University, a lab recognized globally for its research in photogrammetry, remote sensing, and geospatial information science.
  • Ranked 2nd among BS Computer Science Final Year Projects with the ‘Weed Detection System’.
  • Won 2nd place at the 2023 Hackathon with the ‘Weed Detection System’.