Starfish Enterprise: Finding RNA Patterns in Single Cells

For cinephiles, Space Jam was a 1996 comedy film pitting cartoon character Bugs Bunny and basketball player Michael Jordan against animated aliens. For neuroscientist Ed Lein, it was the name of a bioinformatics-themed meet-up — a type of ‘hackathon’.

In April, around 40 computational and transcriptional biologists turned up at the Allen Institute for Brain Science in Seattle, Washington, where Lein works. They came for coffee, coding and a common goal: to work out the strengths, weaknesses and analytical challenges of the growing methodological toolset known as in situ (or spatial) transcriptomics.

In situ transcriptomics is an alphabet soup of technologies — methods include MERFISH, seqFISH+, STARmap and FISSEQ — for mapping the gene-expression patterns of cells in their tissue context. Some rely on hybridization — the ability of short nucleic-acid probes to find their complements in the crowded cellular environment — whereas others are based on DNA sequencing. But all produce conceptually similar data — gene-expression values matched to the x and y coordinates of a cell.

Such data can reveal intercellular relationships that might otherwise be overlooked, such as which cells are talking to which, and their position relative to structural features and cells of interest. As Aviv Regev, a computational and systems biologist, and founding co-chair of the Human Cell Atlas (HCA) project at the Broad Institute of MIT and Harvard in Cambridge, Massachusetts, puts it: “Tell me who your neighbour is and I’ll tell you who you are.”
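To make the shape of such data concrete, here is a minimal sketch in Python. It assumes each cell is reduced to an identifier, an x and y position and a table of gene counts. The `Cell` class, the `nearest_neighbour` helper, the gene names and all the values are hypothetical illustrations, not part of any method named in this article.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Cell:
    """One cell: a position in the tissue plus its gene-expression values."""
    cell_id: str
    x: float
    y: float
    expression: dict = field(default_factory=dict)


def nearest_neighbour(cells, target):
    """Return the cell closest to `target` by straight-line distance."""
    others = [c for c in cells if c.cell_id != target.cell_id]
    return min(others, key=lambda c: math.hypot(c.x - target.x, c.y - target.y))


# Made-up example values; real data sets hold many more cells and genes.
cells = [
    Cell("a", 0.0, 0.0, {"GeneA": 12, "GeneB": 0}),
    Cell("b", 1.0, 0.5, {"GeneA": 0, "GeneB": 7}),
    Cell("c", 10.0, 10.0, {"GeneA": 3, "GeneB": 3}),
]

neighbour = nearest_neighbour(cells, cells[0])
print(neighbour.cell_id, neighbour.expression)
```

A linear scan like this is fine for a handful of cells; a real analysis would probably use a spatial index to find neighbours among thousands of cells.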

But so rapid is the field’s growth that researchers might struggle to decide which methods to use. And the plethora of data-analysis algorithms, pipelines and file formats can make it challenging to analyse and compare data. “The state of the field has been one of rampant technology development,” Lein says.

With funding from philanthropic organization the Chan Zuckerberg Initiative (CZI) and under the auspices of the HCA, Lein and others formed a research consortium in 2017 to benchmark the different methods. They called it SpaceTx, short for spatial transcriptomics. At the same time, programmers at the CZI began building a unified data-analysis tool and file format, called Starfish, to advance the HCA’s efforts and aid the wider transcriptional-biology community. (The name “is a bit of a joke”, explains Jeremy Freeman, who directs computational-biology efforts at the CZI in Redwood City, California. Many spatial methods rely on FISH, or fluorescence in situ hybridization. In programming, an asterisk or star indicates a wildcard. “The joke is that they’re all ‘something-FISH’.”)

Starfish is an open-source software suite that can read image files, register and remove the noise from pictures, find spots and identify the RNA molecules that they represent in nine different experimental strategies, with two more in development. The Space Jam event, Lein says, was an effort to bring developers and users — the spatial-transcriptomics specialists themselves — together to talk shop, troubleshoot and advance their methods. In so doing, the team exposed the subtle differences that can trip up those who want, for instance, to compare data across experiments. But it also provided a model for how to navigate a fast-growing technology.

In situ transcriptomics

Researchers studying gene expression have usually done so at the bulk level, extracting RNA from a piece of tissue and then analysing it in its entirety. Over the past decade, single-cell methods such as Drop-seq have allowed researchers to probe the differences between cells, but at the expense of spatial detail.

That’s where in situ transcriptomics comes in. These techniques mostly use fluorescence microscopy and DNA sequencing to reveal the presence and abundance of RNA molecules in cells within the tissues themselves. From there, researchers can work out the types of cell that are present, their spatial arrangement and their relationships to one another.
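As a rough illustration of how detected RNA molecules become per-cell abundances, the sketch below counts spots that fall inside each cell. It assumes, purely for simplicity, that each cell is a circle with a known centre and radius; real pipelines segment irregular cell outlines from images. The `count_spots_per_cell` function and all names and values are hypothetical.

```python
from collections import Counter, defaultdict


def count_spots_per_cell(spots, cells):
    """Tally RNA spots by gene for each cell.

    spots: iterable of (x, y, gene) tuples, one per detected molecule.
    cells: dict mapping cell_id -> (centre_x, centre_y, radius).
    Spots outside every cell are ignored.
    """
    counts = defaultdict(Counter)
    for x, y, gene in spots:
        for cell_id, (cx, cy, radius) in cells.items():
            # A spot belongs to the first cell whose circle contains it.
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                counts[cell_id][gene] += 1
                break
    return counts


# Made-up example values.
cells = {"cell_1": (0.0, 0.0, 5.0), "cell_2": (20.0, 0.0, 5.0)}
spots = [
    (1.0, 1.0, "GeneA"),
    (2.0, -1.0, "GeneA"),
    (19.0, 1.0, "GeneB"),
    (50.0, 50.0, "GeneA"),  # lies outside both cells
]

counts = count_spots_per_cell(spots, cells)
for cell_id, genes in counts.items():
    print(cell_id, dict(genes))
```

The result is a table of gene counts per cell, which, paired with each cell's position, is the starting point for identifying cell types and their arrangement.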

It’s like a selection of fruity desserts, Regev says. “If all bulk genomics is the fruit smoothie, then single-cell genomics is the fruit salad, and spatial genomics is the fruit tart,” she explains. “If you look at a fruit tart from the top, all the fruits are organized in these really beautiful patterns.”

Depending on the method, such data can resemble stars in a pitch-black sky, or colourful works of art. One study led by Simone Codeluppi, a bioimage informatician in the laboratory of Sten Linnarsson at the Karolinska Institute in Stockholm, for instance, used a cyclic variant of single-molecule FISH, called osmFISH (pronounced ‘awesome fish’), to map the architecture of the mouse somatosensory cortex. The result was an image of the cells coloured on the basis of their gene-expression patterns, a picture that is reminiscent of a stained-glass window¹.

But such data can also reveal insights. At the University of Cambridge, UK, neurobiologist and physician David Rowitch has used a method called RNAscope to study the spatial diversity and organization of astrocytes in the mouse brain². Astrocytes, Rowitch found, “adopt layer patterns in the cortex similar to, but out of register with, neurons”. Long Cai, who studies single-cell biology at the California Institute of Technology in Pasadena, and his team used a strategy called seqFISH+ to identify transcripts encoding interacting proteins on the surfaces of adjacent cells³.

Providing clarity

Both seqFISH+ and RNAscope rely on nucleic-acid hybridization; they leverage short, fluorescently labelled molecules to light up their target sequences in the cell. Other methods use DNA sequencing or even mass spectrometry (see ‘Alphabet soup’).

More than a dozen spatial-transcriptomics methods have been described, including six in 2019³⁻⁸. They differ in the number of RNAs that they can detect, their spatial resolution and the number of cells they can probe, but all provide the spatial localization detail that single-cell transcriptomics cannot. But spatial methods have shortcomings too, says Regev. Microscopy, for instance, is slow (sometimes involving weeks of continuous imaging), expensive and technically demanding. Many methods can access only a predefined fraction of the cellular transcriptome, and practical considerations can limit the number of cells that can be probed.
