Feature request: Neighborhood groups in milopy #49

MaximilianNuber · 2024-11-18T12:40:09Z

I like to use Milo for analyzing intra-celltype heterogeneity, particularly with neighborhood groups.
To my understanding, neighborhood groups are currently implemented only in R.
Therefore, I would like to ask if this feature will be added to milopy?

In case that there are no preparations for this, yet, would you be interested to implement neighborhood groups on basis of the following code? For any case, I would be thankful for any help to make it work correctly.

As an example I process the embryo gastrulation data from the MiloR tutorial in R, calculate neighborhood groups, transfer to python and calculate neighborhood groups for comparison with code I adapted from the clustering modules in scanpy.

Code for examples:

python-code for find_nhood_groups:

from contextlib import contextmanager, suppress
def get_igraph_from_adjacency(adjacency, directed=None):
    """Get igraph graph from adjacency matrix. Copy/pasted from scanpy."""
    import igraph as ig

    sources, targets = adjacency.nonzero()
    weights = adjacency[sources, targets]
    if isinstance(weights, np.matrix):
        weights = weights.A1
    g = ig.Graph(directed=directed)
    g.add_vertices(adjacency.shape[0])  # this adds adjacency.shape[0] vertices
    g.add_edges(list(zip(sources, targets)))
    with suppress(KeyError):
        g.es["weight"] = weights
    if g.vcount() != adjacency.shape[0]:
        logg.warning(
            f"The constructed graph has only {g.vcount()} nodes. "
            "Your adjacency matrix contained redundant nodes."
        )
    return g

def find_nhood_groups(
    adata, 
    SpatialFDR_threshold = 0.1,
    merge_discord = False,
    max_lfc_delta=None,
    overlap=1,
    subset_nhoods=None,
):
    """Get igraph graph from adjacency matrix. Adjusted from scanpy louvain clustering."""
    nhs = adata.obsm["nhoods"].copy()
    nhood_adj = adata.uns["nhood_adata"].obsp["nhood_connectivities"].copy()
    da_res = adata.uns["nhood_adata"].obs.copy()
    is_da = np.asarray(da_res.SpatialFDR <= SpatialFDR_threshold)

    # da_res.loc[is_da, 'logFC'].values @ (da_res.loc[is_da, 'logFC']).T.values

    if merge_discord is False:
        # discord_sign = np.sign(da_res.loc[is_da, 'logFC'].values @ (da_res.loc[is_da, 'logFC'])) < 0
        # Assuming da.res is a pandas DataFrame and is.da is a boolean array
        discord_sign = np.sign(da_res.loc[is_da, 'logFC'].values @ da_res.loc[is_da, 'logFC'].values.T) < 0
        # nhood_adj[is_da, is_da][discord_sign] <- 0
        nhood_adj[(is_da, is_da)][discord_sign] = 0

    if overlap > 1:
        nhood_adj[nhood_adj < overlap] = 0

    if max_lfc_delta is not None:  
        lfc_diff = np.array([da_res['logFC'].values - x for x in da_res['logFC'].values])
        nhood_adj[np.abs(lfc_diff) > max_lfc_delta] = 0

    # binarise
    # nhood_adj = np.asarray((nhood_adj > 0) + 0)
    nhood_adj = (nhood_adj > 0).astype(int)

    g = get_igraph_from_adjacency(nhood_adj, directed = False)

    weights = None
    part = g.community_multilevel(weights=weights)

    groups = np.array(part.membership)

    adata.uns["nhood_adata"].obs["nhood_groups"] = groups

Example code -- R processing

%%R
suppressMessages(library(MouseGastrulationData))
library(miloR)
library(SingleCellExperiment)
library(scater)
library(scran)
library(dplyr)
library(patchwork)

select_samples <- c(2,  3,  6, 15,
                    # 4, 19, 
                    10, 14, 20, 30
                    #31, 32
                    )
embryo_data = EmbryoAtlasData(samples = select_samples)

embryo_data <- embryo_data[,apply(reducedDim(embryo_data, "pca.corrected"), 1, function(x) !all(is.na(x)))]
embryo_data <- runUMAP(embryo_data, dimred = "pca.corrected", name = 'umap')

embryo_milo <- Milo(embryo_data)
embryo_milo <- buildGraph(embryo_milo, k = 30, d = 30, reduced.dim = "pca.corrected")
embryo_milo <- makeNhoods(embryo_milo, prop = 0.1, k = 30, d=30, refined = TRUE, reduced_dims = "pca.corrected")
# plotNhoodSizeHist(embryo_milo)

embryo_milo <- countCells(embryo_milo, meta.data = as.data.frame(colData(embryo_milo)), sample="sample")

embryo_design <- data.frame(colData(embryo_milo))[,c("sample", "stage", "sequencing.batch")]

## Convert batch info from integer to factor
embryo_design$sequencing.batch <- as.factor(embryo_design$sequencing.batch) 
embryo_design <- distinct(embryo_design)
rownames(embryo_design) <- embryo_design$sample

embryo_milo <- calcNhoodDistance(embryo_milo, d=30, reduced.dim = "pca.corrected")

da_results <- testNhoods(embryo_milo, design = ~ sequencing.batch + stage, design.df = embryo_design, reduced.dim = "pca.corrected")

embryo_milo <- buildNhoodGraph(embryo_milo)

## Plot single-cell UMAP
umap_pl <- plotReducedDim(embryo_milo, dimred = "umap", colour_by="stage", text_by = "celltype", 
                          text_size = 3, point_size=0.5) +
  guides(fill="none")

## Plot neighbourhood graph
nh_graph_pl <- plotNhoodGraphDA(embryo_milo, da_results, layout="umap",alpha=0.1) 

## Add log normalized count to Milo object
embryo_milo <- logNormCounts(embryo_milo)

da_results$NhoodGroup <- as.numeric(da_results$SpatialFDR < 0.1 & da_results$logFC < 0)
da_nhood_markers <- findNhoodGroupMarkers(embryo_milo, da_results, subset.row = rownames(embryo_milo)[1:10])

## Run buildNhoodGraph to store nhood adjacency matrix
embryo_milo <- buildNhoodGraph(embryo_milo)

## Find groups
da_results <- groupNhoods(embryo_milo, da_results, max.lfc.delta = 2)
plotNhoodGroups(embryo_milo, da_results, layout="umap")

Transfer to python:

adata = anndata2ri.rpy2py(r('embryo_data'))
adata.obsm["nhoods"] = scipy2ri.rpy2py(r('nhoods(embryo_milo)'))

nhoodCounts = scipy2ri.rpy2py(r('(nhoodCounts(embryo_milo))'))
nhood_adata = anndata.AnnData(X=nhoodCounts)

with localconverter(default_converter+numpy2ri.converter):
    nhood_adata.obs_names = (r('rownames(nhoodCounts(embryo_milo))'))

with localconverter(default_converter+numpy2ri.converter):
    nhood_adata.var_names = (r('colnames(nhoodCounts(embryo_milo))'))

with localconverter(default_converter+pandas2ri.converter): 
    da_results = (r('da_results'))

nhood_adata.obs = pd.concat([nhood_adata.obs, da_results], axis = 1)

adata.uns["nhood_adata"] = nhood_adata

with localconverter(default_converter+numpy2ri.converter):
    adata.uns["nhood_adata"].obs["index_cell"] = r('colnames(embryo_milo)[unlist(nhoodIndex(embryo_milo))]')

adata.uns["nhood_adata"].obsp["nhood_connectivities"] = scipy2ri.rpy2py(r('nhoodAdjacency(embryo_milo)'))

nhood_sizes = adata.obsm["nhoods"].sum(axis=0)
nhood_sizes = np.asarray(nhood_sizes)
nhood_sizes = nhood_sizes.ravel()
adata.uns["nhood_adata"].obs["Nhood_size"] = nhood_sizes
# adata.uns["nhood_adata"].obs["nhood_groups"] = adata.uns["nhood_adata"].obs["nhood_groups"].astype("str")

%%R -o milo_graph
suppressMessages(library(dplyr))
milo_graph = nhoodIndex(embryo_milo) %>%
    unlist() %>%
    reducedDim(embryo_milo, "umap")[., ] %>%
    as.data.frame

milo_graph = milo_graph.loc[adata.uns["nhood_adata"].obs["index_cell"], :]

milo_graph = milo_graph.reset_index(drop = True)
milo_graph.index = np.asarray(milo_graph.index)+1
adata.uns["nhood_adata"].obsm["X_milo_graph"] = milo_graph.values

find_nhood_groups(adata, max_lfc_delta = 2)
adata.uns["nhood_adata"].obs["nhood_groups"] = adata.uns["nhood_adata"].obs["nhood_groups"].astype("str")

plt_df = pd.DataFrame(adata.uns["nhood_adata"].obsm["X_milo_graph"],
                      columns = ["UMAP_"+str(i+1) for i in range(2)]
                     )

plt_df["nhood_groups"] = adata.uns["nhood_adata"].obs["nhood_groups"].values
plt_df["nhood_groups"] = plt_df["nhood_groups"].astype(str)

plt_df["size"] = np.asarray(adata.uns["nhood_adata"].X.sum(axis = 1)).ravel()
plt_df = plt_df.sort_values("nhood_groups")

label_df = label_df = (
    plt_df
    .groupby("nhood_groups")
    .median()
)

from legendkit import size_legend, cat_legend, vstack


# groups = plt_df["nhood_groups"].unique()
groups = []
[groups.append(n) for n in plt_df["nhood_groups"] if n not in groups]
print(groups)

palette = [color for color in sns.color_palette("Paired", 12) for _ in range(len(groups))]
sns.scatterplot(
    x=plt_df["UMAP_1"],
    y=plt_df["UMAP_2"],
    size = plt_df["size"], #.apply(lambda x: x*0.2),
    hue = plt_df["nhood_groups"],#.astype("category").cat.codes,
    legend = False,
    linewidth=0,
    palette = sns.color_palette("Paired", 12)
)
sns.despine()
slegend = size_legend(adata.uns["nhood_adata"].obs.Nhood_size.to_list(), loc = "out right center",
           title = "Nhood size", )

color = dict(
    colors = sns.color_palette("Paired", 12),
    labels = groups
)

ax = plt.gca()
ax.set_title("Python Neighborhood groups", fontsize = 18)
handles, labels = ax.get_legend_handles_labels()

clegend = cat_legend(**color, title = "Nhood group", handle = "circle")
vstack([clegend, slegend], # title="Stack Vertically",
       loc="out right center", spacing=10, frameon=False, ax=ax)

for _group, row in label_df.iterrows():
    plt.text(row["UMAP_1"], row["UMAP_2"], _group, fontsize = 10)

Best regards,
max

The text was updated successfully, but these errors were encountered:

emdann · 2024-11-20T01:56:11Z

Hi Max! Thank you for taking the time to code this up. I am soon going to archive this python package, since we've moved and improved all the functionality in here within the Pertpy toolbox https://github.com/scverse/pertpy/tree/main (see this tutorial for example usage). I would recommend posting a feature request in pertpy - you can reference this issue - and we can take it from there! It should be relatively easy to implement if there's sufficient interest in this feature

MaximilianNuber · 2024-11-21T13:01:09Z

Thank you for your reply!
I submitted the issue here: scverse/pertpy#680

Best regards, max

MaximilianNuber mentioned this issue Nov 21, 2024

Milo neighborhood groups as in MiloR scverse/pertpy#680

Open

MaximilianNuber closed this as completed Nov 21, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature request: Neighborhood groups in milopy #49

Feature request: Neighborhood groups in milopy #49

MaximilianNuber commented Nov 18, 2024 •

edited

Loading

emdann commented Nov 20, 2024

MaximilianNuber commented Nov 21, 2024

Feature request: Neighborhood groups in milopy #49

Feature request: Neighborhood groups in milopy #49

Comments

MaximilianNuber commented Nov 18, 2024 • edited Loading

emdann commented Nov 20, 2024

MaximilianNuber commented Nov 21, 2024

MaximilianNuber commented Nov 18, 2024 •

edited

Loading