From e34c575147e77eb1970078111e8a681b56333f81 Mon Sep 17 00:00:00 2001
From: Gal Barel
Date: Fri, 29 May 2020 09:44:28 +0200
Subject: [PATCH] added running times to tutorial and note about input network connectedness

---
 Tutorial.ipynb | 33 ++++++++++++++++++++++++---------
 1 file changed, 24 insertions(+), 9 deletions(-)

diff --git a/Tutorial.ipynb b/Tutorial.ipynb
index 74d228a..73d1566 100644
--- a/Tutorial.ipynb
+++ b/Tutorial.ipynb
@@ -46,7 +46,9 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "The PPI network should be provided in a form of edge list, a file with two columns, where each row represents an interaction."
+    "The PPI network should be provided in a form of edge list, a file with two columns, where each row represents an interaction.\n",
+    "\n",
+    "The PPI network must be connected (that is, form a single connected component) for NetCore to be applied. If it is not, NetCore extracts the largest connected component and uses only the nodes and edges in that component. The same procedure is applied when generating network permutations."
    ]
   },
   {
@@ -157,11 +159,13 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "**Run-time memory warning:** Executing the next funtion will be executed assuming it can use 64 cores (or more). If you're running this on a computer with less cores, make sure to first change the ```num_cores``` argument accordingly. \n",
+    "**Run-time memory warning:** The next function is executed assuming it can use 64 cores (or more). If you're running this on a computer with fewer cores, make sure to first change the ```num_cores``` argument accordingly. Generating 1 permutation (with swap_factor=100) takes just under 45 minutes on one core. Therefore, it is recommended to use multiple cores and generate at least 100 permutations. The permutations need to be generated only once and can then be reused.\n",
     "\n",
     "To generate permutations in a parallel fashion, the user can change the ```num_cores``` argument in the function to be larger than ```1``` and then multiple processes will be generated (as many as the number of given cores).\n",
     "\n",
-    "**PLEASE NOTE:** the function takes long to run. It is recommented to use multiple cores and generate at least 100 permutations. This can be done once only and later be repeatedly used.\n",
+    "Use ```num_perm``` to change the number of permutations that are created.\n",
+    "\n",
+    "Use ```swap_factor``` to change the number of attempted edge swaps, which is given by: number_of_edges X swap_factor. (For more details, please see [the documentation page of the NetworkX package](https://networkx.github.io/documentation/stable/reference/algorithms/generated/networkx.algorithms.swap.connected_double_edge_swap.html).)\n",
     "\n",
     "The permutations files will be saved in a new sub-directory of the given output path, and will be named as the given network name with a suffix of *edge_permutations*. So in the follwoing example the sub-directory will be named *CPDB_high_confidence_edge_permutations*."
    ]
@@ -287,6 +291,7 @@
     " net_name=\"CPDB_high_confidence\",\n",
     " output_path=\"data/\",\n",
     " num_perm=100,\n",
+    " swap_factor=100,\n",
     " num_cores=64)"
    ]
   },
@@ -513,6 +518,8 @@
    "source": [
     "**Run-time memory warning:** NetCore is computes the fractional power of a matrix using [scipy](http://lagrange.univ-lyon1.fr/docs/scipy/0.17.1/generated/scipy.linalg.fractional_matrix_power.html) which might take long and be intesive to process. It is recommened to use a machine with at least 64 cores. \n",
     "\n",
+    "On a computer with 64 cores and with 100 permutations, the total running time of NetCore is just under 60 minutes. Each additional permutation increases the running time by ~30 seconds.\n",
+    "\n",
     "To run NetCore the ```netcore.py``` script must be executed using ```python3``` with at least 4 arguments. The user can chose to input either a weights file or a seed file (or both), but at least one must be provided. "
    ]
   },
@@ -527,17 +534,15 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
+    "``` -pd``` : The path to the directory where the permutation files of the network are stored\n",
+    "\n",
     "``` -e``` : Network file in an edge list format\n",
     "\n",
     "``` -w``` : Weights file for some or all nodes in the network\n",
     "\n",
-    "*or*\n",
-    "\n",
-    "``` -s``` : Seed file with seed nodes for binary propagation and/or module identification\n",
-    "\n",
-    "``` -pd``` : The path to the directory where the permutation files of the network are stored\n",
+    "*and/or*\n",
     "\n",
-    "``` -o``` (optional) : Dirctory for output files (if not give, default is the current directory) \n"
+    "``` -s``` : Seed file with seed nodes for binary propagation and/or module identification\n"
    ]
   },
   {
@@ -1191,6 +1196,15 @@
     "### Optional arguments (and their default values)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Set the directory for output files (if not given, the default is the current directory) \n",
+    "\n",
+    "``` -o path ```"
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -1236,6 +1250,7 @@
     "Set the **thresholds** for the **pvalue** and **weight** for the module identification step (if a weight thershold is not given, it will be calculated according to the propagation weights):\n",
     "\n",
     "```-pt 0.01```\n",
+    "\n",
     "```-wt None```"
    ]
   },
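
Reviewer note on the connectedness paragraph added in the first hunk: the sketch below illustrates what "extract the largest component" means for an edge list. It is not NetCore's implementation. The function name `largest_component_edges` and the toy edge list are hypothetical and chosen only for illustration, and the sketch assumes the network is given as a list of undirected node pairs.

```python
from collections import defaultdict, deque


def largest_component_edges(edges):
    """Return the edges that lie in the largest connected component.

    `edges` is a list of (node, node) pairs describing an undirected network.
    Hypothetical helper for illustration; not part of NetCore.
    """
    # Build an undirected adjacency map from the edge list.
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen = set()
    largest = set()
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search collects every node reachable from `start`.
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in component:
                    component.add(neighbour)
                    queue.append(neighbour)
        seen |= component
        if len(component) > len(largest):
            largest = component

    # Both endpoints of an edge share a component, so checking one is enough.
    return [(u, v) for u, v in edges if u in largest]


# Toy network: a triangle plus one isolated pair (assumed data).
toy_edges = [("A", "B"), ("B", "C"), ("C", "A"), ("X", "Y")]
kept = largest_component_edges(toy_edges)
```

Here the pair `("X", "Y")` is dropped because it is not reachable from the triangle, which mirrors the behaviour the tutorial text describes for a disconnected input network.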