Use DrugBank test data as a benchmark to evaluate Path Finder. #2315

Open
mohsenht opened this issue Jul 15, 2024 · 4 comments
@mohsenht
Collaborator

No description provided.

@mohsenht mohsenht self-assigned this Jul 15, 2024
@mohsenht
Collaborator Author

mohsenht commented Sep 23, 2024

Use containment similarity as the metric, computed in two ways:
Per path (to rank individual paths): |nodes in path P ∩ intermediate nodes| / |nodes in P|
Full result set (to rank the full set): |nodes in all PathFinder paths ∩ intermediate nodes| / |intermediate nodes|
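
The two ratios above can be sketched in a few lines of Python. This is only an illustration, not PathFinder's actual code: the function names are hypothetical, each path is assumed to be a plain list of node CURIEs, and repeated nodes within a path are assumed to count once (set semantics). Whether the source and target nodes of a path should be included in the counts is left open here. The `EXAMPLE:unrelated` identifier is a made-up placeholder; the other IDs come from the Lepirudin sample record.

```python
from typing import Iterable, Set


def path_containment(path_nodes: Iterable[str],
                     intermediate_nodes: Iterable[str]) -> float:
    """|nodes in P ∩ intermediate nodes| / |nodes in P| for one path P."""
    path: Set[str] = set(path_nodes)
    if not path:
        return 0.0  # assumption: an empty path scores zero
    return len(path & set(intermediate_nodes)) / len(path)


def result_set_containment(paths: Iterable[Iterable[str]],
                           intermediate_nodes: Iterable[str]) -> float:
    """|nodes in all paths ∩ intermediate nodes| / |intermediate nodes|."""
    intermediate: Set[str] = set(intermediate_nodes)
    if not intermediate:
        return 0.0  # assumption: no reference nodes means nothing to recover
    found: Set[str] = set()
    for path in paths:
        found.update(path)
    return len(found & intermediate) / len(intermediate)


# A small subset of the sample record's intermediate nodes.
intermediate = {"NCBIGene:2147", "CHEBI:8583", "MONDO:0018048"}

# Hypothetical PathFinder output: two paths from the drug to an indication.
paths = [
    ["PUBCHEM.COMPOUND:118856773", "NCBIGene:2147", "MONDO:0018048"],
    ["PUBCHEM.COMPOUND:118856773", "EXAMPLE:unrelated", "MONDO:0018048"],
]

# Rank individual paths by the per-path ratio, best first.
ranked = sorted(paths, key=lambda p: path_containment(p, intermediate),
                reverse=True)
for p in ranked:
    print(round(path_containment(p, intermediate), 3), p)

# Score the whole result set against the reference intermediate nodes.
print(round(result_set_containment(paths, intermediate), 3))
```

The per-path ratio rewards paths made up mostly of known intermediate nodes, while the full-set ratio measures how much of the reference set the returned paths recover together.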

@mohsenht
Collaborator Author

mohsenht commented Oct 11, 2024

DrugBank sample test data:

{
  "PUBCHEM.COMPOUND:118856773": {
    "KG2_ID": "PUBCHEM.COMPOUND:118856773",
    "name": "Lepirudin",
    "category": "biolink:SmallMolecule",
    "drug_bank_id": "DB00001",
    "description": "Lepirudin is a recombinant hirudin formed by 65 amino acids that acts as a highly specific and direct thrombin inhibitor.[L41539,L41569] Natural hirudin is an endogenous anticoagulant found in _Hirudo medicinalis_ leeches.[L41539] Lepirudin is produced in yeast cells and is identical to natural hirudin except for the absence of sulfate on the tyrosine residue at position 63 and the substitution of leucine for isoleucine at position 1 (N-terminal end).[A246609] \r\n\r\nLepirudin is used as an anticoagulant in patients with heparin-induced thrombocytopenia (HIT), an immune reaction associated with a high risk of thromboembolic complications.[A3, L41539] HIT is caused by the expression of immunoglobulin G (IgG) antibodies that bind to the complex formed by heparin and platelet factor 4. This activates endothelial cells and platelets and enhances the formation of thrombi.[A246609] Bayer ceased the production of lepirudin (Refludan) effective May 31, 2012.[L41574]",
    "indication": "Lepirudin is indicated for anticoagulation in adult patients with acute coronary syndromes (ACS) such as unstable angina and acute myocardial infarction without ST elevation. In patients with ACS, lepirudin is intended for use with [aspirin].[L41539] Lepirudin is also indicated for anticoagulation in patients with heparin-induced thrombocytopenia (HIT) and associated thromboembolic disease in order to prevent further thromboembolic complications.[L41539]",
    "pharmacodynamics": "Lepirudin is a recombinant hirudin that acts as a highly specific thrombin inhibitor. Its activity is measured by anti-thrombin units (ATUs) that correspond to the amount of lepirudin required to neutralize a unit of the World Health Organization \u03b1-thrombin (89/588) standard. The activity of lepirudin is 16,000 ATU/mg.[L41539,L41569] A single molecule of lepirudin binds to a molecule of thrombin, blocking its thrombogenic activity. This drug increases activated partial thromboplastin time (aPTT)  and PT (INR) values in a dose-dependent manner, and its mode of action is independent of antithrombin III.[L41539,L41569] Platelet factor 4 does not inhibit lepirudin.[L41539,L41569]\r\n\r\nThe pharmacodynamic effect of lepirudin was evaluated by measuring an increase in aPTT. No saturable effect was observed at the highest tested dose (0.5 mg/kg, IV bolus).[L41539] Thrombin time was considered an unsuitable routine test for lepirudin monitoring due to the high values detected (200 seconds) even at low doses.[L41539] The concomitant use of thrombolytic therapy and lepirudin is not recommended due to the high risk of bleeding that may be life-threatening. In patients with a risk of bleeding, a physician should weigh the risks of lepirudin administration against its benefits.  There is also an especially high risk of bleeding in patients who weigh less than 50 kg, and a lower dosage is required. Patients with renal impairment have a higher risk of hemorrhagic adverse events.[L41539]",
    "mechanism_of_action": "Lepirudin is a direct thrombin inhibitor used as an anticoagulant in patients for whom heparin is contraindicated.[L41539,A3] Thrombin is a serine protease that participates in the blood-clotting cascade, and it is formed by the cleavage of pro-thrombin. Active thrombin cleaves fibrinogen and generates fibrin monomers that polymerize to form fibrin clots.[A246624]\r\n\r\nLepirudin binds to the catalytic and substrate-binding sites of thrombin, forming a stable, irreversible and non-covalent complex.[A246609] This blocks the protease activity of thrombin and inhibits the coagulation process. Each molecule of lepirudin binds to a single molecule of thrombin,[L41539] and unlike [heparin], it is able to inhibit thrombin in both its clot-bound or free states.[A246609]",
    "metabolism": "As a polypeptide, lepirudin is expected to be metabolized by the sequential cleavage of amino acids by kidney exoproteases, which have carboxypeptidase and dipeptidase-like activity.[L41539,L41544] The C-terminal cleavage of lepirudin aminoacids (aminoacids 1 to 65) produces four metabolites with anti-thrombotic activity: M1 (aminoacids 1 to 64), M2 (aminoacids 1 to 63), M3 (aminoacids 1 to 62), and M4 (aminoacids 1 to 61).[L41544]",
    "transporters": {
      "names": [],
      "ids": []
    },
    "enzymes": {
      "names": [],
      "ids": []
    },
    "targets": {
      "names": [
        "Prothrombin",
        "F2"
      ],
      "ids": [
        "BE0000048",
        "P00734"
      ]
    },
    "carriers": {
      "names": [],
      "ids": []
    },
    "pathways": {
      "ids": [
        "SMPDB:SMP0000278"
      ],
      "enzymes": {
        "ids": [
          "UniProtKB:P00734",
          "UniProtKB:P00748",
          "UniProtKB:P02452",
          "UniProtKB:P03952",
          "UniProtKB:P03951",
          "UniProtKB:P00740",
          "UniProtKB:P00451",
          "UniProtKB:P12259",
          "UniProtKB:P00742",
          "UniProtKB:P02671",
          "UniProtKB:P02675",
          "UniProtKB:P02679",
          "UniProtKB:P00488",
          "UniProtKB:P05160",
          "UniProtKB:P00747",
          "UniProtKB:P00750",
          "UniProtKB:P08709",
          "UniProtKB:P13726",
          "UniProtKB:Q9BQB6",
          "UniProtKB:P38435"
        ]
      }
    },
    "indication_NER_aligned": {
      "MONDO:0005542": {
        "name": "acute coronary syndromes",
        "category": "biolink:Disease"
      },
      "MONDO:0006805": {
        "name": "unstable angina",
        "category": "biolink:Disease"
      },
      "MONDO:0004781": {
        "name": "acute myocardial infarction",
        "category": "biolink:Disease"
      },
      "MONDO:0018048": {
        "name": "heparin-induced thrombocytopenia",
        "category": "biolink:Disease"
      },
      "HP:0001907": {
        "name": "associated thromboembolic disease",
        "category": "biolink:PhenotypicFeature"
      },
      "UMLS:C0009566": {
        "name": "complications",
        "category": "biolink:Disease"
      }
    },
    "mechanistic_intermediate_nodes": {
      "PUBCHEM.COMPOUND:118856773": {
        "name": "Lepirudin",
        "category": "biolink:SmallMolecule"
      },
      "UMLS:C0019573": {
        "name": "hirudin",
        "category": "biolink:Protein"
      },
      "UMLS:C0002520": {
        "name": "lepirudin aminoacids",
        "category": "biolink:Protein"
      },
      "UMLS:C0003440": {
        "name": "thrombin inhibitor",
        "category": "biolink:Protein"
      },
      "ttd.target:Hirudin": {
        "name": "hirudin",
        "category": "biolink:SmallMolecule"
      },
      "UMLS:C0003280": {
        "name": "anticoagulant",
        "category": "biolink:Drug"
      },
      "MONDO:0001191": {
        "name": "leeches",
        "category": "biolink:Disease"
      },
      "PUBCHEM.COMPOUND:1117": {
        "name": "sulfate",
        "category": "biolink:SmallMolecule"
      },
      "UMLS:C0023401": {
        "name": "leucine",
        "category": "biolink:Protein"
      },
      "PUBCHEM.COMPOUND:6306": {
        "name": "isoleucine",
        "category": "biolink:SmallMolecule"
      },
      "PUBCHEM.COMPOUND:6057": {
        "name": "tyrosine",
        "category": "biolink:SmallMolecule"
      },
      "UMLS:C1323338": {
        "name": "N-terminal end",
        "category": "biolink:MolecularActivity"
      },
      "CHEBI:32789": {
        "name": "tyrosine residue",
        "category": "biolink:SmallMolecule"
      },
      "MONDO:0018048": {
        "name": "heparin-induced thrombocytopenia",
        "category": "biolink:Disease"
      },
      "UMLS:C0003314": {
        "name": "immune reaction",
        "category": "biolink:MolecularActivity"
      },
      "UMLS:C0332167": {
        "name": "high risk",
        "category": "biolink:PhenotypicFeature"
      },
      "UMLS:C0009566": {
        "name": "complications",
        "category": "biolink:Disease"
      },
      "UMLS:C4319571": {
        "name": "high risk",
        "category": "biolink:PhenotypicFeature"
      },
      "PUBCHEM.COMPOUND:22833565": {
        "name": "heparin",
        "category": "biolink:SmallMolecule"
      },
      "PR:000012569": {
        "name": "platelet factor 4",
        "category": "biolink:Protein"
      },
      "UMLS:C0225336": {
        "name": "endothelial cells",
        "category": "biolink:Cell"
      },
      "UMLS:C0005821": {
        "name": "platelets",
        "category": "biolink:Cell"
      },
      "UMLS:C0087086": {
        "name": "thrombi",
        "category": "biolink:Disease"
      },
      "MONDO:0005542": {
        "name": "acute coronary syndromes",
        "category": "biolink:Disease"
      },
      "MONDO:0006805": {
        "name": "unstable angina",
        "category": "biolink:Disease"
      },
      "MONDO:0004781": {
        "name": "acute myocardial infarction",
        "category": "biolink:Disease"
      },
      "NCBIGene:39836": {
        "name": "ST",
        "category": "biolink:Gene"
      },
      "NCBIGene:833655": {
        "name": "ACS",
        "category": "biolink:Gene"
      },
      "HP:0001907": {
        "name": "associated thromboembolic disease",
        "category": "biolink:PhenotypicFeature"
      },
      "UMLS:C0309872": {
        "name": "prevent",
        "category": "biolink:Drug"
      },
      "PATO:0000070": {
        "name": "amount",
        "category": "biolink:PhenotypicFeature"
      },
      "UMLS:C0233645": {
        "name": "blocking",
        "category": "biolink:Disease"
      },
      "UMLS:C0013227": {
        "name": "drug",
        "category": "biolink:Drug"
      },
      "PUBCHEM.COMPOUND:23939": {
        "name": "PT",
        "category": "biolink:SmallMolecule"
      },
      "UMLS:C0003438": {
        "name": "antithrombin III",
        "category": "biolink:Protein"
      },
      "NCBIGene:42549": {
        "name": "INR",
        "category": "biolink:Gene"
      },
      "PR:000003252": {
        "name": "antithrombin III",
        "category": "biolink:Protein"
      },
      "PUBCHEM.COMPOUND:145068": {
        "name": "No",
        "category": "biolink:SmallMolecule"
      },
      "UMLS:C0442726": {
        "name": "detected",
        "category": "biolink:PhenotypicFeature"
      },
      "UMLS:C0427596": {
        "name": "Thrombin time",
        "category": "biolink:PhenotypicFeature"
      },
      "NCIT:C26791": {
        "name": "bleeding",
        "category": "biolink:PhenotypicFeature"
      },
      "UMLS:C4303743": {
        "name": "life-threatening",
        "category": "biolink:Disease"
      },
      "UMLS:C2826244": {
        "name": "life-threatening",
        "category": "biolink:PhenotypicFeature"
      },
      "MONDO:0001106": {
        "name": "renal impairment",
        "category": "biolink:Disease"
      },
      "UMLS:C0877248": {
        "name": "adverse events",
        "category": "biolink:Disease"
      },
      "UMLS:C0030956": {
        "name": "polypeptide",
        "category": "biolink:Protein"
      },
      "UMLS:C1524026": {
        "name": "metabolized",
        "category": "biolink:PhysiologicalProcess"
      },
      "UMLS:C0010813": {
        "name": "cleavage",
        "category": "biolink:PhysiologicalProcess"
      },
      "UMLS:C1150160": {
        "name": "dipeptidase-like activity",
        "category": "biolink:MolecularActivity"
      },
      "UMLS:C0870883": {
        "name": "metabolites",
        "category": "biolink:SmallMolecule"
      },
      "CHEBI:8583": {
        "name": "Prothrombin",
        "category": "biolink:ChemicalEntity"
      },
      "NCBIGene:2147": {
        "name": "F2",
        "category": "biolink:Gene"
      }
    }
  }
}
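
As a rough sketch of how a record shaped like the one above could be turned into a benchmark case, the snippet below reads the `indication_NER_aligned` and `mechanistic_intermediate_nodes` keys and returns their IDs as sets. The function name and the `drop_endpoints` option are hypothetical. The option exists because, in the sample, the drug itself and its indications also appear among the intermediate nodes; whether to strip them before scoring is an assumption to settle, not something decided in this thread. `SAMPLE` is a heavily trimmed copy of the Lepirudin record.

```python
import json
from typing import Dict, Iterator, Set, Tuple

# Trimmed copy of the sample record; only the keys used below are kept.
SAMPLE = """
{
  "PUBCHEM.COMPOUND:118856773": {
    "KG2_ID": "PUBCHEM.COMPOUND:118856773",
    "name": "Lepirudin",
    "indication_NER_aligned": {
      "MONDO:0018048": {"name": "heparin-induced thrombocytopenia",
                        "category": "biolink:Disease"}
    },
    "mechanistic_intermediate_nodes": {
      "PUBCHEM.COMPOUND:118856773": {"name": "Lepirudin",
                                     "category": "biolink:SmallMolecule"},
      "NCBIGene:2147": {"name": "F2", "category": "biolink:Gene"},
      "CHEBI:8583": {"name": "Prothrombin",
                     "category": "biolink:ChemicalEntity"},
      "MONDO:0018048": {"name": "heparin-induced thrombocytopenia",
                        "category": "biolink:Disease"}
    }
  }
}
"""


def iter_benchmark_cases(
    records: Dict[str, dict], drop_endpoints: bool = False
) -> Iterator[Tuple[str, Set[str], Set[str]]]:
    """Yield (drug_id, indication_ids, intermediate_ids) for each record."""
    for drug_id, record in records.items():
        indications = set(record.get("indication_NER_aligned", {}))
        intermediates = set(record.get("mechanistic_intermediate_nodes", {}))
        if drop_endpoints:
            # Optional: remove the drug and its indications so that only
            # true "in-between" nodes count toward the score.
            intermediates = intermediates - {drug_id} - indications
        yield drug_id, indications, intermediates


for drug_id, indications, intermediates in iter_benchmark_cases(
        json.loads(SAMPLE), drop_endpoints=True):
    print(drug_id, sorted(indications), sorted(intermediates))
```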

@mohsenht
Collaborator Author

Hi @dkoslicki,

I’m considering testing only the core PathFinder algorithm against this data, without incorporating the Expander results (one-hop and two-hop, #2398). This would let us improve PathFinder without introducing any bias from the Expander.

Do you agree with this approach of tuning PathFinder separately, without using the Expander results, which we plan to merge at a higher level later?

@dkoslicki
Member

Yes, @mohsenht, that makes sense, and it will make the results faster and more reproducible, as we won't need to rely on external KPs.
