Algorithm engineer at a startup focused on computer vision
Developed and implemented deep learning models (LSTM, CNN, etc.) for 3D human pose
estimation from video, resulting in a 25% increase in accuracy compared to the previous model
Implemented machine learning methods for motion recognition, enabling the generation of real-
time 2D feedback for medical and fitness applications
Annotated image data with LabelMe and LabelImg and retrained an object detection model to
analyze specific behaviors of drivers and passengers
Researched and evaluated deep learning models for various prototypes, e.g., applying a Vision
Transformer (ViT) model for object classification and an autoencoder (AE) model for key
frame detection
Researched and implemented a Vision-Language Model (VLM) to predict feedback on a video from
its screenshots and questionnaire answers, improving automation and accuracy
Collaborated with clients to understand their business needs and translated them into technical
requirements for machine learning models