Internship Opportunity: National Research Council of Canada (NRC)

Background

This project’s objectives are to apply data-driven discovery in support of efforts to enhance the production of Canadian protein crops, and to devise machine learning methods for mining genomics datasets generated at the National Research Council (NRC). This collaboration at the intersection of AI and the biology of plant proteins will extract meaningful biological patterns and produce valuable information for future research and application in diverse agricultural fields. Specifically, the project’s primary objective is to develop models that predict molecular phenotypes, including transcript abundance and protein abundance, directly from DNA sequences.

The field of molecular phenotype prediction is in the early stages of leveraging advances in machine learning. Recent advances in genomics and mass spectrometry now allow the generation of datasets of sufficient size and quality to train complex models (e.g. deep neural networks) on biological phenomena such as transcription and translation. However, few studies have been published in this domain, so developing expertise in how best to approach the problem is a high priority for this project.

Project Description

In this project, we investigate ortholog contrast models and look to improve on our previous work. The existing literature and the novel approaches investigated at Amii show potential on both publicly available and NRC datasets. This work can be improved further by adjusting the existing architecture or by supplying it with new information. The NRC data is currently limited to sequence information from plant protein crops, specifically canola and pulses (peas, lentils), with priority given to seed-related traits, but additional information such as transcriptomic data and chloroplast DNA will become available over time. This additional information will be integrated into the existing model or into a novel architecture to achieve higher performance.

A major component of this project is the interpretability of the ML models: determining why a certain outcome is predicted. For example, which regions of the sequence matter most when predicting the outcome, and which additional information is most effective?

As a first step, the existing models that have been explored need to be understood well, and the initial results need to be reproduced. Following this demonstration, the focus will shift to improving model performance and integrating additional information. Possible avenues to explore include feature engineering, data expansion and novel architectures.

The outcome of this project will be a first generation of predictive models of molecular phenotype (also known as endophenotype) abundance. These models are expected to have wide-ranging utility for crop improvement. By providing a means to evaluate trait associations and modeling at a functional level (i.e. transcript, protein or gene/functional dosage), these types of models are expected to drive a paradigm shift in breeding and trait development.

The project will run over two years, and Amii will onboard a number of consecutive interns over that period. Each intern will work under the supervision of an Amii Lead Scientist for the duration of the internship. Internships start at 4 months, with the possibility of an extension of up to 12 months.

Hear from one of our interns, Ruchika:

“I had an amazing experience working as a machine learning intern at Amii. Although I worked on the project for 4 months, I did literature survey, understood the research objectives, reproduced an existing published state-of-the-art method, and developed a new model in this short span of time. Above accomplishments were possible due to Amii's supportive and collaborative environment. Amii has created an inclusive work environment where different perspectives, ideas, and opinions of team members are embraced. I am glad that I got this opportunity to apply my knowledge and skills in machine learning to an NRC funded project at Amii.”

Required Skills / Expertise

We’re looking for a talented and enthusiastic intern with strong knowledge of computational biology and machine learning.

Key responsibilities:

  • Build, train, and evaluate ML models
  • Undertake applied research on ML techniques to address the limitations in existing models

Requirements:

  • At least one year into a CS/ML graduate program (MSc or PhD)
  • Research and/or applied project experience in computational biology and related deep learning technologies (e.g. convolutional neural networks (CNNs), sequence models, attention networks, transfer learning)
  • Proficiency in the Python programming language and related libraries and toolkits (e.g. scikit-learn, pandas, Jupyter notebooks, PyTorch, Keras, TensorFlow, transformers)

Assets:

  • Experience working with data engineering workflows and databases
  • Publication record in peer-reviewed academic conferences or relevant journals in machine intelligence
  • Knowledge and experience in designing experimental frameworks for large datasets

Non-technical requirements:

  • Interdisciplinary team player enthusiastic about working together to achieve excellence
  • Capable of critical and independent thought
  • Able to communicate technical concepts clearly and advise on the application of machine intelligence
  • Intellectual curiosity and the desire to learn new things, techniques, and technologies

How to Apply

If this sounds like the opportunity you've been waiting for, don’t wait to apply! Please send your cover letter and resume through the Indeed listing by April 19, 2022.

Applicants must be legally eligible to work in Canada at the time of application.

Amii is proud to be an equal opportunity employer. We are committed to creating a diverse, inclusive and excellent workforce.
