Agony
Authors: James Bannon, Julia Nguyen, Matthew Boccalon, Jermiah Joseph
Contact: bhklab.jermiahjoseph@gmail.com
Description: Agony implementation
Introduction
The notion of graph agony was developed out of a desire to find hierarchies in directed networks [2]. Intuitively, a hierarchy -- in an organization or social network, for example -- involves assigning levels or ranks to people such that the ranks are ordered. For example, the CEO of a company should rank higher than an intern.
Agony works by trying to find a hierarchy that best reflects that structure of a given directed network. At a very high level the algorithms in [2], [3], and [4] operate on the same way:
- Input: A directed graph $G=(V,E)$.
- Output: A rank function $r^*:V\to \mathbb{N}$, and a value $A(G)$ that is the agony of the graph $G$.
The values of $r^*$ and $A(G)$ are found by solving the following minimization problem:
$$\underset{r:V\to\mathbb{N}}{\min}\sum_{(u,v)\in E}\max(0,r(u)-r(v)+1)$$
where $r^*$ is the function that minimizes the above sum and $A(G)$ is the value achieved.
Immediate Next Steps
The most immediate next steps involve implementing the ability to compute the weighted and unweighted versions of agony as described in [3] and [4], respectively. We should do this at least in first in pure python.
Below are some next steps along with tentative assignments to project members.
TO DO:
- Read over Tiers for Peers [4] and define the most general form of the agony problem. (James)
- Re-write the linear program agony code such that it's more readable. (James)
- Write the C++
code in python
likely using the heapq
module for the priority queue. R
also has a priority queue implemented. (James, Jermiah, Matthew)
- De-bug and benchmark the python
implementation against its C++
counterpart. (Jermiah, Matthew)
- Investigate the weighted agony algorithm defined in [4]. (James, Jermiah, Matthew). The author of [4] has made a C++ implementation available. The files are also in this repo in the agony_cpp
folder.
Project Ideas
Software Paper
If we can write a decent python
and/or R
package we can release it as its own software paper. It might be worth talking to Ben about this because it's not clear if re-writing publicly available code is a worthy paper. That said, making a library available publicly in pure python
or R
is maybe worth circulating/publishing as a matter of record.
Pharmacogenomics Papers
The way I see it, there are two ways that we can proceed with a PGx style paper. One way involves using the agony itself as a biomarker. The second involves using agony for rank aggregation.
Agony As A Biomarker
In the unweighted case, the agony is bounded above by $|E|$ and the authors in [2] define a measure of hierarchy of a graph via $h(G)=1-\frac{1}{|E|}A(G)$. It follows then that the agony value $A(G)$, even weighted, can be thought of as a measure of organizational disorder. Since cancer is defined by escaping the normally tightly controlled regulatory set up of the cell cycle [12] it makes sense that cancer samples should have higher agony. The steps for this project would be something like the following.
-
Pick a collection of samples that can be easily and meaningfully partitioned (e.g., responders vs. nonresponders to ICIs, cancer vs. normal samples).
-
For each group construct a directed gene regulatory network (GRN) using one of the existing methods: [1], [5], [8], [9]
-
Compare the agony for the graphs across the two groups.
Alternatively we could do this on a patient-level by looking at single-cell GRN inference ([7]). Another option would be to look at the agony of cell-cell communication networks [11].
Agony For Drug Recommendation
This is as presented in the PGx meeting. K-Nearest neighbors using, e.g.[10].
Set Up
Prerequisites
Pixi is required to run this project. If you haven't installed it yet, follow these instructions
Installation
- Clone this repository to your local machine
- Navigate to the project directory
- Set up the environment using Pixi:
pixi install
Usage
agony bindings are available in the agony
module.
I've setup two tasks to showcase this
pixi run help
pixi run cycle_dfs
Documentation
Click here to view the full documentation.
References
📄 Annotated List of Papers
[1.] Inferring Genome-Wide Interaction Networks Using the Phi-Mixing Coefficient
^[This paper introduces a novel algorithm based on the φ-mixing coefficient from probability theory to infer genome-wide interaction networks. The method allows for the construction of weighted and directed networks that can contain cycles, and it has been applied to study subtypes of lung cancer, including small cell (SCLC) and non-small cell (NSCLC), as well as normal lung tissue. There is a github repo with an implementation that is a combination of C and Matlab.]
[2.] Finding Hierarchy in Directed Online Social Networks
^The authors define a measure of hierarchy within directed online social networks and present an efficient algorithm to compute this measure. This work provides insights into the structural organization of social networks and how hierarchical relationships can be identified and quantified. oai_citation:0‡ACM Digital Library
[3.] Faster Way to Agony: Discovering Hierarchies in Directed Graphs
^This paper presents an improved algorithm for computing "agony," a metric used to quantify the hierarchical structure in directed graphs. The proposed method reduces the computational complexity from O(nm²) to O(m²), making it more practical for analyzing large-scale networks. oai_citation:1‡arXiv
[4.] Tiers for Peers: A Practical Algorithm for Discovering Hierarchy in Weighted Networks
[5.] Reconstructing Directed Gene Regulatory Network by Only Gene Expression Data
[6.] Resolution of Ranking Hierarchies in Directed Networks
^[ranked stochastic block models]
[7.] SCENIC: single-cell regulatory network inference and clustering
^[A paper that infers single cell gene regulatory networks. ]
[8.] Inferring Regulatory Networks from Expression Data Using Tree-Based Methods
^[The GENIE3 algorithm for inferring directed gene regulatory networks. There exists a github repo with implementations in C, R, Matlab, and python. This is used in the SCENIC pipeline.]
[9.] bLARS: An Algorithm to Infer Gene Regulatory Networks
^[Another algorithm for inferring directed gene regulatory networks from gene expression.]
[10.] Integrate Any Omics: Towards genome-wide data integration for patient stratification
^[a Deep Learning/AI paper for learning patient similarity.]
[11.] Screening cell–cell communication in spatial transcriptomics via collective optimal transport
^[cell cell coms]
[12.] Intrinsic disorder in cell-signaling and cancer-associated proteins
^[cell disorder]