Tools & Skills we are looking for:

● Data Preparation and Data Analysis
            Exploratory Data Analysis
            Distributions
            Visualizations

● Kaggle – Model training
● Computer Vision – OpenCV
● Machine Learning – scikit-learn (sklearn)
            Cross validation schemes
            Loss functions
            Overfitting & Underfitting (bias variance tradeoff)
            Experimentation (you will run hundreds of experiments and maintain logs for study and future reference; see the sketch below)
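
To give a flavour of the cross-validation and experiment-logging work above, here is a minimal sketch using scikit-learn; the dataset, model, and metric are illustrative choices for the example, not part of our stack:

    # Minimal sketch: stratified k-fold cross-validation plus a simple
    # experiment-log entry. Dataset and model here are illustrative only.
    import json
    from sklearn.datasets import load_digits
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import StratifiedKFold, cross_val_score

    X, y = load_digits(return_X_y=True)
    model = LogisticRegression(max_iter=1000)
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")

    # One log entry per experiment, kept for study and future reference.
    log_entry = {
        "model": "LogisticRegression",
        "cv": "StratifiedKFold(n_splits=5)",
        "mean_accuracy": float(scores.mean()),
        "std_accuracy": float(scores.std()),
    }
    print(json.dumps(log_entry, indent=2))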
 
● Deep Learning
            Tools
■ Required: PyTorch (see the sketch after this list)
■ Good to have: PyTorch Lightning, fastai, PaddleOCR, MLflow, Weights & Biases (wandb)
            Computer Vision (CV)
■ Image classification
■ Object detection
■ Segmentation
■ Keypoint detection (both regression- and heatmap-based)
            Optical Character Recognition (OCR)
            Tabular Data
■ Feature Engineering
            Natural Language Processing (NLP)
■ TF-IDF, word embeddings, etc.
■ Classification
■ Topic Modelling
■ Word and Sentence similarity
■ Named Entity Recognition (NER) & Part-of-Speech (POS) tagging
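
As an illustration of the PyTorch requirement above, here is a minimal sketch of a single image-classification training step; the tiny CNN and the random tensors are placeholders for the example, not a real pipeline:

    # Minimal sketch of one PyTorch image-classification training step.
    # The network and the random batch are placeholder assumptions.
    import torch
    import torch.nn as nn

    model = nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(16, 10),            # 10 classes, chosen arbitrarily
    )
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

    images = torch.randn(8, 3, 64, 64)       # stand-in batch of images
    labels = torch.randint(0, 10, (8,))      # stand-in class labels

    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    print(f"loss: {loss.item():.4f}")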
 
● Software Engineering & Design patterns
            Python best practices
            Logging, Pytest, Automation (tox)
            REST APIs
            FastAPI (a minimal example follows this list)
            Git & CI/CD
            GCP
            Databases (MongoDB, Redis, SQLite)
            Cloud
■ SSH
■ Unix command line
            Docker
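
As an illustration of the REST API work above, here is a minimal FastAPI sketch that serves a stubbed prediction; the /predict route and its payload schema are assumptions made for the example:

    # Minimal sketch of serving a model prediction over a REST API with FastAPI.
    # The route name and payload fields are illustrative assumptions.
    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI()

    class PredictRequest(BaseModel):
        image_url: str

    @app.post("/predict")
    def predict(req: PredictRequest) -> dict:
        # A real service would fetch the image and run the model here.
        return {"image_url": req.image_url, "label": "stub", "score": 0.0}

    # Run locally with: uvicorn main:app --reload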

Responsibilities

  • In this role, you will develop end-to-end computer vision pipelines to derive insights from raw video and image (face and ID card) data. The scope of problems includes, but is not limited to, basic Computer Vision (with OpenCV), Image Classification, Object Detection, Segmentation, Keypoint Detection, and OCR. A service can have multiple ML models and 1000+ lines of code, so you should be able to write production-grade code (with logging, unit tests, profiling, etc.) and apply design patterns that keep code open for extension but closed for modification (see the sketch after this list).
  • Given a problem statement, develop and present a POC to stakeholders. This may include everything from data gathering and data preparation to experimenting with different model architectures and hyperparameters, and then packaging the model for use as a Python package or behind a REST API.
  • Work with cross-functional stakeholders to identify data sources and formalize data collection and validation by developing data pipelines to ingest, process, annotate, and analyze the data used to train ML models.
  • Collaborate with the DevOps team to deploy ML models, pipelines, and services to production. Knowledge of Git, CI/CD/CM, Docker, and networking is preferred.
  • Host and maintain, or develop, tools for data annotation, experiment tracking, ML model logging, versioning, and performance monitoring.
  • Proactively look for ways to improve model performance and the quality of the provided data. Provide ways to continuously monitor model performance and data quality, and suggest ways to improve both.
  • Look for ways to improve MLOps at Zoop.one, for example by developing scripts to automate data gathering, labelling and validation, model training, validation, and deployment.
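
As one concrete reading of "open for extension but closed for modification" (referenced in the first responsibility above), here is a possible sketch using a registry of pipeline stages; all names are illustrative, not part of an existing codebase:

    # A possible open/closed design: new pipeline stages are added by
    # registering new classes, without modifying the pipeline runner itself.
    # Every name here is a hypothetical example.
    from abc import ABC, abstractmethod

    class Stage(ABC):
        @abstractmethod
        def run(self, data: dict) -> dict: ...

    STAGES: dict[str, type[Stage]] = {}

    def register(name: str):
        def wrap(cls: type[Stage]) -> type[Stage]:
            STAGES[name] = cls
            return cls
        return wrap

    @register("ocr")
    class OcrStage(Stage):
        def run(self, data: dict) -> dict:
            data["text"] = "stub"        # a real OCR model would run here
            return data

    def run_pipeline(names: list[str], data: dict) -> dict:
        # The runner never changes when a new stage is registered.
        for name in names:
            data = STAGES[name]().run(data)
        return data

    print(run_pipeline(["ocr"], {"image": "id_card.png"}))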
