How a High-School Dropout Transformed Into an Amazing Data Scientist

Meet Harpreet, a self-made AI enthusiast who’s journeyed from high school dropout to lead data scientist.

Despite a rocky academic start, he has navigated his way through the world of AI, with a career spanning actuarial science, biostatistics, and data science.

His passion for AI and deep learning has led him to work at notable companies like Comet, Pachyderm, and Deci.

He shares his unique learning approach, his experiences with AI, and his work developing innovative open-source models. Dive into his story to learn more about his AI products, his tech stack, and how his experience with the BuildFast Academy has shaped his AI development journey.

Can you tell us a bit about yourself and your journey so far?

Even though I was an honor student in high school with a good GPA and enrolled in AP classes, I found myself in a lot of trouble during my senior year. Consequently, my top-choice universities rescinded their admission offers because of my actions. After a few wasted years, I got my act together and went back to school, eventually graduating with a bachelor's degree in economics (with a 2.06 GPA) from Cal State Fullerton.

I taught high school math for a while before deciding to pursue graduate school for math and stats, and a career in actuarial science. However, my grades were poor, and in order to get into grad school, I had to start from scratch. I embarked on a journey that took me from a community college, where I studied trigonometry through Calculus 3, to UC Davis, where I took more advanced math, statistics, and probability courses as part of a continuing education program. This entire process took two years and a lot of hard work. I finished about 40 units in math and stats with a 3.8 GPA and scored in the 92nd percentile on the math portion of the GRE.

I was accepted into my graduate school of choice, Illinois State University, where the author of several actuarial prep books taught. After two more years of intense study, I worked as an actuary for a year and a half, then as a biostatistician for about five years, before moving into data science in 2018. I started as a senior data scientist and eventually became a lead data scientist.

My interest in deep learning began in late 2021, around the same time I switched my career to developer relations. I’ve worked at companies like Comet, Pachyderm, and now Deci.

Was there a specific event, project, or person that inspired your interest in AI?

I am mostly self-taught in the basics of AI. I have had some great mentors and tutors via YouTube and books. My favorite books for learning deep learning are “Deep Learning: A Visual Approach” and “Deep Learning Illustrated.” However, I suggest a top-down learning approach: start by using a high-level training library like SuperGradients and a pre-trained model for inference, then train on a custom dataset. Skip all the math at the beginning and just build something. If you’re really into it, you’ll dig deeper on your own.

What was your previous experience with AI?

My experience level with deep learning is somewhere between experienced beginner and mid-level. I was proficient with statistics and classical machine learning, and I had gained some experience with computer vision. Before joining the Build Fast Course, I was already familiar with PyTorch, training computer vision models, and so on. I just needed a structured approach to learning LangChain without having to spend too much time figuring out a roadmap myself.

How do you spend your time outside of work and learning?

I have two small kids: a 3.5-year-old boy and an 8-month-old girl. Outside of work, I spend as much time with them as I can. But other than that, AI is my life. I love learning about it, practicing my skills, and building with it. Currently, I’m part of an LLMOps bootcamp run by the guys at ML Maker Space.

Could you share some details about the AI product you’ve developed?

At Deci AI, we build models. In May 2023, we released a new object detection model called YOLO-NAS. It’s an open-source model, but the pre-trained weights have some restrictions for commercial use. However, you can use the architecture to train your own model from scratch.

In August 2023, we released an open-source code generation model, DeciCoder. This one is released under Apache 2.0, so no restrictions there. We also have an open-source training library called SuperGradients. It’s PyTorch-based and offers a lot of training tricks that are easy to implement. It also simplifies knowledge distillation, quantization (both post-training quantization, PTQ, and quantization-aware training, QAT), and distributed data parallel (DDP) training.
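
For readers new to the term, the core idea behind post-training quantization can be sketched in a few lines of plain Python. This is only an illustration of the general technique, assuming a simple affine int8 scheme; it is not SuperGradients code and does not use its API, and the function names `quantize_int8` and `dequantize` are made up for this example.

```python
def quantize_int8(values):
    """Map a list of floats to int8 codes with an affine (scale + zero-point) scheme.

    Hypothetical helper for illustration only; real libraries calibrate
    ranges per layer or per channel and handle many more details.
    """
    # Widen the observed range so that 0.0 is always exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)

    # One int8 step corresponds to `scale` in float space.
    scale = (hi - lo) / 255.0
    if scale == 0.0:
        scale = 1.0  # all inputs were zero; any positive scale works

    # The int8 code that represents the float value 0.0.
    zero_point = round(-128 - lo / scale)

    codes = []
    for v in values:
        code = round(v / scale) + zero_point
        codes.append(max(-128, min(127, code)))  # clamp to the int8 range
    return codes, scale, zero_point


def dequantize(codes, scale, zero_point):
    """Recover approximate floats from int8 codes."""
    return [(c - zero_point) * scale for c in codes]


if __name__ == "__main__":
    weights = [-1.0, 0.0, 0.5, 2.0]
    codes, scale, zero_point = quantize_int8(weights)
    print("int8 codes:", codes)
    print("restored:  ", dequantize(codes, scale, zero_point))
```

The restored values differ from the originals by at most about one quantization step, which is the accuracy cost traded for a smaller, faster model. Quantization-aware training goes further by simulating this rounding during training so the model can adapt to it.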

What enhancements or new features are you planning to add?

For SuperGradients, I’m in the process of collecting feedback from the community. We want the library to be as user-friendly as possible, while still allowing for flexibility and customization. We recently made it easy to export a model to ONNX and we’re expanding our model zoo. We recently added our version of DEKR to the model zoo, and soon we plan to add YOLO-NAS for segmentation and pose.

Can you talk about your tech stack?

My tech stack includes PyTorch, SuperGradients, Hugging Face, Streamlit, and Google Colab. I work a lot out of Colab since my role involves a lot of teaching and instruction. I find Colab to be very flexible and use it so frequently that I subscribe to Colab Pro+.

How has your experience with the BuildFast Academy contributed to the development of your AI product?

The course has been an excellent introduction to LangChain and its common patterns. I enjoyed Peter’s use of mind maps, as they helped organize my own thinking. I plan on using some of the template projects from the course to connect to DeciCoder and build a personal coding assistant.