Data Scientist

Data Scientist with Bachelor’s Degree in Computer Science, Computer Information Systems, Information Technology, or a combination of education and experience equating to the U.S. equivalent of a Bachelor’s degree in one of the aforementioned subjects.

Job Duties and Responsibilities:

  • Define end-to-end machine learning pipelines for large-scale technology products and deeply technical products involving distributed processing and real-time, scalable systems.
  • Develop solutions to business problems using the data science life cycle.
  • Develop and maintain data analytics solutions and machine learning algorithms.
  • Build and leverage new and existing tools for Natural Language Processing (NLP) and intelligent document processing tasks.
  • Design and develop Spark applications in Python that stream multi-modal data (text, images, and videos) for distributed machine learning training.
  • Design and develop AWS cloud deployment scripts using AWS CloudFormation templates and Terraform for deploying data and ML pipelines.
  • Develop a proof of concept for multiple intents to demonstrate conversational flow, responses from an embedded document, and generative AI (ChatGPT).
  • Fine-tune applications and systems for high performance and higher-volume throughput, and use the AWS stack for data pre-processing.
  • Translate, load, and present unrelated data sets from various formats and sources, such as Avro, Parquet, JSON, text files, Kafka queues, and log data.
  • Develop and implement Generative AI models, with a strong understanding of techniques such as GPT, T5, Stable Diffusion and BERT.
  • Apply strong project management skills to deliver complex projects, including estimating effort and time, building a detailed work breakdown structure (WBS), managing the critical path, and using PM tools and platforms.
  • Build scalable, client-engagement-level processes for faster turnaround and higher accuracy.
  • Run regular project reviews and audits to ensure that projects are executed within the guardrails agreed upon by all stakeholders.
  • Manage client stakeholders and their expectations through a regular cadence of weekly meetings and status updates.

Technologies/Environment involved:

  • Distributed storage: AWS Cloud Storage (S3), Azure HDInsight, Google Cloud (GCP)
  • Database management: MongoDB, Cassandra, Postgres, Oracle, MS SQL Server, Redshift
  • Graph processing: Neo4j
  • Machine learning: Spark Machine Learning Library (MLlib), TensorFlow, Keras, PyTorch
  • Data processing: Spark, Hadoop MapReduce, Kafka, Storm, Airflow, Spark Streaming
  • Programming Languages: Java, Scala, Python [REST Framework], PySpark
  • DevOps tools: Bitbucket, Git, Apache Maven, Selenium, Jenkins, Docker

Work location is Portland, ME, with required travel to client locations throughout the USA.

Rite Pros is an equal opportunity employer (EOE).

Please mail resumes to:
Rite Pros, Inc.
565 Congress St, Suite # 305
Portland, ME 04101.