By clicking “Register to Participate” you agree to our Terms and Privacy Policy. We will never sell your data or send you spam; your information is used for communication purposes only.



MAY 23, 2020 | 1:00PM EST



The future of cancer care is immunotherapy: harnessing the body's own immune system to eliminate tumors. In theory, T cells, the immune system's fighter cells, should recognize and kill growing tumors, but cancer cells send signals that cause T cells to malfunction and fail to control tumor growth.

But what if we could modify individual genes in T cells to stop this process, transforming T cells into tumor destroyers? While scientists have made breakthroughs in cancer immunotherapy and T cell engineering over the last two decades, there are roughly 20,000 individual gene modifications, or “perturbations,” researchers could make to affect T cell function. Experimentally testing so many perturbations, let alone combinations of perturbations, in the lab would be too costly and time-consuming. That's why the Eric and Wendy Schmidt Center at the Broad Institute, the Laboratory for Innovation Science at Harvard, and other partners are holding a data science challenge to bring together the machine learning community to develop algorithms that identify the best genetic changes in T cells to prevent malfunction and enable tumor killing.


Topcoder will host a series of challenges with specific problem statements and acceptance criteria to iterate on this problem. We will need machine learning specialists to help:

Use a training set of T cells with experimentally characterized perturbations to predict the effects of unseen, held-out perturbations.

Propose the best individual gene perturbations (among all 20,000 possibilities) to prevent T cell malfunction and enable tumor killing.

Propose a quantitative metric for ranking the efficacy of these proposed perturbations.

We will experimentally validate the predictions from the second task, choosing perturbations based on the top-scoring submissions from the first task and expert discussion. Datasets are hosted by Saturn Cloud and will be available for download. Each participant may optionally use the Saturn Cloud computing environment, which provides 100 free hours of compute per participant and a Python environment.
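To make the first task concrete, here is a minimal sketch, not the official pipeline, of predicting the effects of held-out perturbations. It assumes each perturbed gene can be represented by a feature vector (for example, a pretrained gene embedding) so that a model trained on characterized perturbations can generalize to unseen ones; the data here is synthetic.

```python
# Illustrative sketch: predict the expression effect of held-out gene
# perturbations from perturbations seen in training. All data is synthetic;
# the feature construction (gene embeddings) is an assumption, not the
# challenge's actual representation.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

n_genes, n_features, n_outputs = 50, 16, 8
# Hypothetical per-gene feature vectors (e.g., pretrained embeddings).
gene_features = rng.normal(size=(n_genes, n_features))
# Synthetic ground-truth effects of each perturbation on measured outputs.
true_map = rng.normal(size=(n_features, n_outputs))
effects = gene_features @ true_map + 0.1 * rng.normal(size=(n_genes, n_outputs))

# Hold out some perturbations entirely, mimicking the challenge setup:
# the model never sees any data for the held-out genes.
train_idx, test_idx = np.arange(40), np.arange(40, 50)

model = Ridge(alpha=1.0).fit(gene_features[train_idx], effects[train_idx])
pred = model.predict(gene_features[test_idx])

# Correlation between predicted and true effects for unseen perturbations.
corr = np.corrcoef(pred.ravel(), effects[test_idx].ravel())[0, 1]
```

The key idea is that held-out perturbations are never observed during training, so any signal must come from shared structure across genes.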



Individuals can compete alone or as a group. 

Topcoder members are permitted to form teams for this competition; individuals may also compete without a team. To form a team, a Topcoder member may recruit other Topcoder members and register the team by completing this Topcoder Teaming Form.

In order to submit to Challenge 2, teams must have submitted to Challenge 1.


The Cancer Immunotherapy Data Science Grand Challenge is divided into three parts that will be run as individual challenges on Topcoder. You can sign up for the host challenge to receive project-wide updates, but you must still register at the links below to participate in each individual challenge.

Create a Topcoder Profile (or log in to your existing account). 


Register for each of the challenges you’d like to participate in at the links below: 


Predict the effect of unseen, held-out genetic modifications, or "perturbations," to T cells.

Now - January 27: Official challenge registration is open

January 9 - 27: Challenge submission phase


Propose the best individual gene modifications to make T cells more effective for cancer immunotherapy. To participate in Challenge 2, you must make a valid submission to Challenge 1. 

Now - January 27: Official challenge registration is open

January 9 - 27: Challenge submission phase


Propose a “scoring function,” or metric, for ranking the effectiveness of a perturbation. 

Now - January 27: Official challenge registration is open

January 9 - 27: Challenge submission phase

February 4 - 10: Discussion phase (optional)
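As a toy illustration of the Challenge 3 task, one possible scoring function ranks a perturbation by how strongly it shifts mean expression toward a desired "effector" signature. The signature vector, data, and the cosine-similarity choice below are all assumptions for demonstration, not the challenge's required metric.

```python
# Hedged illustration of one candidate "scoring function" for ranking
# perturbations. Data and the effector signature are synthetic.
import numpy as np

rng = np.random.default_rng(1)
n_cells, n_genes = 100, 20

control = rng.normal(size=(n_cells, n_genes))   # unperturbed cells
effector_signature = rng.normal(size=n_genes)   # hypothetical desired direction

def score(perturbed: np.ndarray, control: np.ndarray,
          signature: np.ndarray) -> float:
    """Cosine similarity between the mean expression shift and the signature."""
    shift = perturbed.mean(axis=0) - control.mean(axis=0)
    return float(shift @ signature
                 / (np.linalg.norm(shift) * np.linalg.norm(signature)))

# Two hypothetical perturbations: one pushes cells along the signature,
# one only adds small random noise.
good = control + 0.5 * effector_signature
neutral = control + rng.normal(scale=0.05, size=(n_cells, n_genes))
```

Under this metric, `score(good, ...)` exceeds `score(neutral, ...)`, so the aligned perturbation ranks higher; a real submission would need to justify why its metric predicts improved tumor killing.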


Am I allowed to use outside data to train my model?

You can use external resources (e.g., outside transcriptional datasets, gene ontologies, pretrained embeddings); however, all external resources must be published or in the public domain and properly credited.

Will I have access to outside servers to train my model? 

We will make a cloud environment available for training your models; you are also allowed to use your own resources.

Will I have to share the complete model for Challenges 1 and 2 or just my predictions?

You will need to share your complete code, not just your predictions.

Are teams allowed? 

Yes, you can register as a team with up to four total participants. All team members must sign up through Topcoder and declare their other team members using the Teaming Form.

Do I need to compete on a team? 

No, you can also compete in this challenge as an individual!

Where will the experimental validation occur?

The top-scoring participants in Challenge 1 will have their Challenge 2 submissions tested in a lab at the Broad Institute.

If I’m a government employee, can I participate in this challenge?

Yes, but government employees are not allowed to receive prize money, so to participate as a government employee you must relinquish your prize. Individuals or teams wishing to relinquish prizes should fill out the Grand Challenge Team Form and select “I waive all monetary prizes / I relinquish” on the last question of part 1.

Do the cells come from the same mouse? And will the proposed perturbation be experimentally validated on the same mouse?

The cells come from the same mouse strain (C57BL/6 modified to express Cas9), but from different mice with tumors. The tumor model is B16 melanoma modified to express the protein ovalbumin. The cells were then harvested and pooled for the single cell experiment. The proposed perturbations will be validated in the same mouse and tumor strains.