First disease-specific (breast cancer) protein library opens new drug paths

BOSTON, MA--(February 8, 2006) In research that could significantly advance the pace of drug discovery in the fight against breast cancer, Harvard Medical School investigators announce in today's online Journal of Proteome Research that they have created the first publicly available library of reliably expressible proteins of a human disease, in this case for breast cancer.

Perhaps more significantly, these researchers introduced a subset of the library's 1,300 protein-expressing complementary DNAs into a model system mimicking cells of a human breast, allowing them to study on a broad scale how these proteins might contribute to the development of breast cancer. Through this comprehensive approach, they identified potentially novel functional activities for both well-known and lesser-known breast cancer-associated proteins.

"The process of carcinogenesis is complex and involves the activation of many different cellular programs," says Joan Brugge, PhD, Chair, HMS Department of Cell Biology, and co-principal investigator of this initiative, called Breast Cancer 1000. "A significant limitation for breast cancer research has been the inability to distinguish whether certain proteins that are altered in breast tumor cells are the cause or the effect of conversion of normal breast cells to malignancy. The systematic approach that we've enabled and demonstrated will allow researchers to track cancer-causing proteins in simulated environments, with the goal of learning how to impede them."

"The availability of this collection will enable pilot experimentation and accelerate the development of faster techniques for studying breast cancer in a mammalian setting," says Joshua LaBaer, MD, PhD, director of the Harvard Institute of Proteomics (a division of Harvard Medical School), and also co-principal investigator. "To advance breast cancer research quickly, we are making the BC1000 library publicly available. It can be viewed from the Harvard Institute of Proteomics website."

"Drug design teams in the pharmaceutical industry traditionally have not used proteomics approaches to screen for potential targets, primarily because systematic proteomic tools are in their infancy," said Steven Carr, PhD, who was not part of this research team, and who leads the Proteomics group at the Broad Institute (of Harvard University and Massachusetts Institute of Technology). "While this work is highly in-vitro and needs further validation, the tools and approaches demonstrated in this study show a potentially valuable screening tool for drug companies, primarily as a means to triage for novel targets to design drugs around," said Carr, who prior to joining the Broad was director of Computational and Structural Sciences at SmithKline Pharmaceuticals (now GlaxoSmithKline) and led protein science and proteomics groups at Millennium Pharmaceuticals. "This study helps lay the groundwork for new and refined proteomics tools for cancer and other diseases."

The American Cancer Society estimated that 211,240 new cases of invasive breast cancer would be diagnosed among women in 2005, as well as an estimated 58,490 additional cases of non-invasive (in situ) breast cancer. The ACS also estimated that approximately 40,410 women would die from breast cancer last year. Only lung cancer accounts for more cancer deaths in women.


The Breast Cancer 1000 library is a collection of complementary DNA (cDNA) associated with breast cancer. Complementary DNA is generated from mRNA, which is produced by genes and contains the instructions on how to produce proteins. However, mRNAs are unstable outside of a cell, and therefore scientists convert them to cDNA for long-term use. Researchers in BC1000 created a sequence-validated collection of roughly 1,300 breast cancer-related cDNAs ranging from well-studied breast cancer-causing genes to less conspicuous breast cancer-associated cDNAs.

Selection of the cDNAs for inclusion in the BC1000 library was a multipart effort. The first 200 genes were suggested by Boston area experts in breast cancer research. Another 50 genes were shown to be overexpressed in ductal carcinoma, one form of breast cancer. The remainder were identified by MedGene, a literature-mining software application developed by the Harvard Institute of Proteomics that searches all titles and abstracts in the Medline database to identify cDNAs co-cited with a particular disease and utilizes statistical methods to rank the relative strengths of these gene-disease relationships based on the frequency of total citation and co-citation.
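The co-citation ranking idea can be illustrated with a minimal sketch. It assumes a simple ratio of observed co-citations to the number expected if gene and disease were cited independently; the statistic MedGene actually uses is not specified here, and the names `CitationCounts`, `cocitation_score`, and `rank_genes`, along with every count below, are hypothetical and for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CitationCounts:
    gene: str
    gene_total: int  # records citing the gene at all (made-up count)
    co_cited: int    # records citing both the gene and the disease


def cocitation_score(counts, disease_total, corpus_size):
    """Observed co-citations divided by the number expected under independence."""
    expected = counts.gene_total * disease_total / corpus_size
    return counts.co_cited / expected if expected > 0 else 0.0


def rank_genes(all_counts, disease_total, corpus_size):
    """Return gene names ordered from strongest to weakest association."""
    scored = [(cocitation_score(c, disease_total, corpus_size), c.gene)
              for c in all_counts]
    return [gene for _, gene in sorted(scored, key=lambda pair: -pair[0])]


# Placeholder counts, not real Medline figures.
EXAMPLE = [
    CitationCounts("GENE_A", gene_total=2000, co_cited=400),
    CitationCounts("GENE_B", gene_total=50000, co_cited=500),
    CitationCounts("GENE_C", gene_total=300, co_cited=6),
]
RANKED = rank_genes(EXAMPLE, disease_total=100_000, corpus_size=10_000_000)
print(RANKED)
```

In this toy data, GENE_B has the most raw co-citations but ranks last because it is cited heavily everywhere, which is why a ranking of this kind weighs co-citation against total citation.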

"The work to isolate, sequence, and validate the BC1000 cDNAs was an immense undertaking, with multiple parties involved," says LaBaer. "While the library covers a broad spectrum of breast cancer-related genes, it is not all-inclusive. The addition of new genes to this collection, including genes more recently linked to breast cancer and genes more difficult to clone, is an ongoing effort."


To assess the range and functionality of the cDNAs in the library, the investigators introduced the first 265 constructed cDNAs into a line of immortalized breast epithelial cells and subjected these cells to a single screen to examine their relationships to cell migration, proliferation and morphogenesis. From this screen, the researchers identified cDNAs already known to play roles in each, validating this approach as a means to identify relevant cDNAs. They also received hits from less-studied breast cancer genes, demonstrating the capability of using unbiased functional proteomics approaches to identify novel genes related to various aspects of disease biology.

The screen also identified novel functional activities for cDNAs known to be involved in other aspects of carcinogenesis. For example, researchers identified several proteins that stimulate migratory behavior, specifically IL4, IL11 and IL13. These results support previous findings and implications that this class of proteins may also be involved in bone metastases, and reflect the diverse functional activities of proteins that contribute to migratory behavior.

"The migration findings are particularly important, as historically the roles of genes in the process of invasion and metastasis--the most devastating aspects of cancer--have been very difficult to test," says LaBaer. "But tools such as BC1000 make this research much more accessible."

Several unexpected cDNAs were also found capable of inducing migration cooperatively when a known cancer-associated cell-signaling pathway was also activated. For example, the proteins SGK (serum and glucocorticoid-regulated kinase-1) and TNFRSF10B (tumor necrosis factor receptor, 10B) were both identified as pro-migratory, although they were previously recognized for their involvement in cell survival. The finding that cDNAs known to be involved in other cellular processes may also play a role in migration suggests that this approach may help uncover unanticipated activities for previously identified proteins.

The screen also identified genes that predictably and strongly induce cell proliferation. But in addition to the known genes, several other proteins that had not previously been implicated in cell proliferation were identified.

The morphogenesis and migration screens produced the greatest number of hits from the BC1000 cDNAs. Of the 75 cDNAs that induced cellular migration in the preliminary, single-pass screen, 66 were retested and 41 of these reproducibly scored as valid hits. The 41 validated migration hits were also reassessed in a morphogenesis assay, in which breast epithelial cells are able to organize into structures that resemble the glandular units of the normal breast. Of these 41 migration hits, 20 induced alterations in the morphology of such structures. The majority of these cDNAs prevented the formation of the typical hollow, spherical masses, and many of the disorganized structures showed a protrusive behavior resembling certain aspects of invasive tumor cells.

"Our labs will be following up on these new hits and further characterizing their meaning to the field," say Brugge and LaBaer. "This open process is exciting and hopefully will lead to the development of new therapy concepts."

Using a defined cDNA collection like the BC1000 has several advantages over random, non-specific pooled cDNA libraries, whose limitations have historically restricted the kind of detailed investigation needed to fully interpret the causes of disease. In the BC1000 library, the identity of each clone is known; each clone is known to be of good quality, i.e., full-length and free of mutations; complex phenotypic assays are feasible because it is not necessary to sample millions of clones to compensate for the redundancy found in pooled cDNA libraries; and there is more assurance that rare cDNAs are represented.


This work received significant funding from The Breast Cancer Research Foundation, the Cell Migration Consortium, and a program project and SPORE grant from the National Cancer Institute.


For Joan Brugge it's personal. Her dogged pursuit of the genes and proteins associated with breast cancer stems from a family member's fight with cancer. While in college, Brugge's sister, Mary Pat, only one year older, was found to have a brain tumor. "In probing the doctors for causes of her disease, I was very frustrated because many of my simple questions couldn't be answered," Brugge remembers. Joan, who originally was interested in math, instead turned her attention to the study of biology. After earning her Ph.D. in Virology in 1975, Brugge, while working with Ray Erikson at the University of Colorado, isolated the protein coded for by the viral and cellular forms of the SRC gene, the first retroviral/cellular oncogene products to be identified. The study of the normal and oncogenic forms of this gene product has served as a model system to investigate cellular processes that regulate normal growth and the mechanisms involved in tumor formation. Brugge joined Harvard Medical School in July 1997, coming over from Cambridge-based ARIAD Pharmaceuticals where she was Scientific Director. Brugge brought to the medical school a combined passion for scientific understanding and a vision for knowledge that would lead to new therapies. Beyond the Breast Cancer 1000 project, her lab has developed a three-dimensional membrane model in which mammary epithelial cells can organize into structures resembling breast glands. This work will reveal how the identified breast cancer genes work in concert to bring about the disease.

Proteins, the final products of our genes, are both the machinery and the bricks and mortar of all cells. Disease is most often caused by a malfunction of proteins, and nearly all drugs act by modifying protein function. Historically, scientists have been forced to study proteins one at a time. The promise of proteomics is to accelerate the study of disease by allowing researchers to watch thousands of proteins interact within their natural cellular environment and see how diseases form when things go awry. But proteomic researchers, unlike genomic researchers, have been constrained by their tools. Josh LaBaer, MD, PhD, director of the Harvard Institute of Proteomics, is trying to change the landscape. HIP was among the first to assemble human proteome libraries, which will allow researchers to produce single proteins of interest or thousands of them at a time. HIP is also a leader in developing novel applications for the high-throughput study of proteins. Besides developing the technologies and automation to screen the effects of thousands of genes on the behavior of mammalian cells, its researchers have developed a new type of protein microarray that allows thousands of proteins to be individually produced on small microscope slides that can be used for drug and biomarker discovery.

Illustrates Power of Large Scale Testing of Disease-Associated Proteins in Simulated Environment


  • Cells transitioning to metastasis following exposure to proteins
  • Large-scale micro array testing facility


  • The Tireless Soldier: One Researcher's Personal Fight Against Cancer
  • Securing The Promise Of Proteomics

    Harvard Medical School has more than 7,000 full-time faculty working in 10 academic departments housed on the School's Boston quadrangle or in one of 48 academic departments at 18 Harvard teaching hospitals and research institutes. Those Harvard hospitals and research institutions include Beth Israel Deaconess Medical Center, Brigham and Women's Hospital, Cambridge Health Alliance, the CBR Institute for Biomedical Research, Children's Hospital Boston, Dana-Farber Cancer Institute, Forsyth Institute, Harvard Pilgrim Health Care, Joslin Diabetes Center, Judge Baker Children's Center, Massachusetts Eye and Ear Infirmary, Massachusetts General Hospital, Massachusetts Mental Health Center, McLean Hospital, Mount Auburn Hospital, Schepens Eye Research Institute, Spaulding Rehabilitation Hospital, and the VA Boston Healthcare System.

    Last reviewed: By John M. Grohol, Psy.D. on 30 Apr 2016