new tmt table
marinakiweler committed Feb 7, 2019
1 parent 75eaba1 commit 37af5a3
Showing 18 changed files with 132 additions and 30 deletions.
2 changes: 2 additions & 0 deletions DESCRIPTION
@@ -18,10 +18,12 @@ Imports:
assertive.sets,
assertive.strings,
assertive.types,
+dplyr,
jsonlite,
pathological,
rlist,
stringi,
+tidyr,
magrittr
Depends:
R (>= 2.10)
34 changes: 24 additions & 10 deletions R/load_term_matching_table_function.R
@@ -3,16 +3,16 @@
#' @param instrument_list list of instruments in the raw file, get it with "instrument_names()".
#' @param origin_key Specifies which information is required.
#' If empty, all information is used.
-#' "jpr" for the requirements of the Journal of Proteome Research.
-#' "mcp" for the requirements of the Molecular and Cellular Proteomics.
+#' "jpr_guidelines_ms" for the requirements of the Journal of Proteome Research.
+#' "miape" for The Minimal Information about a Proteomics Experiment (MIAPE) from the Proteomics Standards Initiative.
#'
#'
#' @return A table needed to extract the meta data with match_terms().
#'
#' @examples
#' term_matching_table <- create_term_match_table(
#' instrument_list = c("Thermo_EASY-nLC", "Q_Exactive_-_Orbitrap_MS"),
-#' origin_key = "jpr")
+#' origin_key = "jpr_guidelines_ms")
#'
#' @export

@@ -33,21 +33,35 @@ create_term_match_table <- function(instrument_list = c("Thermo_EASY-nLC", "Q Ex
#' @param origin_key Specifies which information is required.
#' If empty, all information is used.
#' "jpr_guidelines_ms" for the requirements of the Journal of Proteome Research.
-#' "mcp_guidelines_ms" for the requirements of the Molecular and Cellular Proteomics.
+#' "miape" for The Minimal Information about a Proteomics Experiment (MIAPE) from the Proteomics Standards Initiative.
#'
#'
#' @return Table with tmt part for this instrument.

-tmt_one_instrument <- function(instrument, origin_key = "")
+tmt_one_instrument <- function(instrument, origin_key = "all")
{
instrument %<>% gsub(pattern = " ", replacement = "_" )

assertive.sets::assert_is_subset(instrument, names(MARMoSET::tmt_list))
-assertive.properties::assert_is_non_empty(grep(MARMoSET::tmt_list[[instrument]][["origin"]], pattern = origin_key))

-sub_index <- MARMoSET::tmt_list[[instrument]][["origin"]] %>%
-grep(pattern = origin_key)
+okey_is_ok <- function(x){ assertive.properties::has_rows(MARMoSET::tmt_list[[instrument]] %>%
+tidyr::separate_rows(origin, sep=';') %>%
+dplyr::filter(origin == origin_key)) ||
+(origin_key == "" || origin_key == "all")}

-rbind( MARMoSET::tmt_list[[instrument]][sub_index,]) %>%
-return()
+assertive.base::assert_engine( okey_is_ok, origin_key,
+msg = "origin_key is not a valid key!",
+severity = "stop",
+what = "any")
+
+sub_index <- MARMoSET::tmt_list[[instrument]]
+
+if(origin_key != "" && origin_key != "all")
+{
+sub_index %<>%
+tidyr::separate_rows(origin, sep=';') %>%
+dplyr::filter(origin == origin_key)
+}
+
+return(sub_index)
}
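
The filtering introduced in this hunk can be illustrated on a toy table. This is only a sketch in base R, assuming that `origin` holds one or more keys separated by `;` (as the `separate_rows(origin, sep=';')` call suggests). The helper `filter_by_origin()` and the table rows are hypothetical and not part of the package.

```r
# Toy stand-in for one entry of tmt_list; the rows are invented for illustration.
toy_tmt <- data.frame(
  term = c("pump_model", "flow_rate", "ms_model"),
  origin = c("jpr_guidelines_ms;miape", "miape", "jpr_guidelines_ms"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the rows whose origin contains the requested key.
filter_by_origin <- function(tbl, origin_key = "all") {
  # "" and "all" both mean: no filtering.
  if (origin_key %in% c("", "all")) {
    return(tbl)
  }
  # Split "a;b" into one row per key (base R analogue of tidyr::separate_rows).
  keys <- strsplit(tbl$origin, ";", fixed = TRUE)
  long <- tbl[rep(seq_len(nrow(tbl)), lengths(keys)), , drop = FALSE]
  long$origin <- unlist(keys)
  # Exact comparison, as in dplyr::filter(origin == origin_key).
  result <- long[long$origin == origin_key, , drop = FALSE]
  if (nrow(result) == 0) {
    stop("origin_key is not a valid key!")
  }
  rownames(result) <- NULL
  result
}

jpr_rows <- filter_by_origin(toy_tmt, "jpr_guidelines_ms")
all_rows <- filter_by_origin(toy_tmt)
```

Unlike the earlier `grep()` approach, the exact comparison does not select rows whose key merely contains `origin_key` as a substring.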
4 changes: 2 additions & 2 deletions R/output_functions.R
@@ -14,7 +14,7 @@
#'
#' term_matching_table <- create_term_match_table(
#' instrument_list = c("Thermo_EASY-nLC", "Q_Exactive_-_Orbitrap_MS"),
-#' origin_key = "jpr")
+#' origin_key = "jpr_guidelines_ms")
#'
#' vector_of_group_tables <- match_terms(flat_json, term_matching_table)
#'
@@ -74,7 +74,7 @@ utils::globalVariables('.')
#'
#' term_matching_table <- create_term_match_table(
#' instrument_list = c("Thermo_EASY-nLC", "Q_Exactive_-_Orbitrap_MS"),
-#' origin_key = "jpr")
+#' origin_key = "jpr_guidelines_ms")
#'
#' vector_of_group_tables <- match_terms(flat_json, term_matching_table)
#'
4 changes: 2 additions & 2 deletions R/term_matching_functions.R
@@ -15,7 +15,7 @@
#'
#' term_matching_table <- create_term_match_table(
#' instrument_list = instrument_names(json, 1),
-#' origin_key = "jpr")
+#' origin_key = "jpr_guidelines_ms")
#'
#' vector_of_group_tables <- match_terms(flat_json, term_matching_table)
#'
@@ -52,7 +52,7 @@ match_terms <- function(flat_json, term_matching_table)
#'
#' term_matching_table <- create_term_match_table(
#' instrument_list = instrument_names(json, 1),
-#' origin_key = "jpr")
+#' origin_key = "jpr_guidelines_ms")
#'
#' meta_data_table <- one_group_match_terms(flat_json, term_matching_table, 1)
#'
4 changes: 2 additions & 2 deletions README.Rmd
@@ -75,12 +75,12 @@ flat_json <- flatten_json(json = json)
```

Raw files contain a large amount of metadata, but journals require only part of it, so the relevant entries have to be selected. A table that links each required piece of information to its location in the flattened JSON is therefore useful. Such a table, referred to here as a 'term matching table', can be created with `create_term_match_table()`, which takes two arguments. The first, `instrument_list`, takes a vector with the names of the instruments represented in the JSON file.
-The second one, `origin_key` specifies which requirements should be met. If it stays empty, all journals are selected. While `'jpr'` stands for the requirements of the Journal of Proteome Research, `'mcp'` chooses the requirements of the Molecular and Cellular Proteomics.
+The second, `origin_key`, specifies which requirements should be met. If it is left empty, all information is used. `'jpr_guidelines_ms'` selects the requirements of the Journal of Proteome Research, and `'miape'` selects the Minimal Information about a Proteomics Experiment (MIAPE) from the Proteomics Standards Initiative.

```{r}
term_matching_table <- create_term_match_table(
instrument_list = c('Thermo EASY-nLC', 'Q Exactive - Orbitrap_MS'),
-origin_key = 'jpr')
+origin_key = 'jpr_guidelines_ms')
```

The names of the instruments can be shown for each group with `instrument_names()` with the json and the group number as arguments.
4 changes: 2 additions & 2 deletions README.md
@@ -66,12 +66,12 @@ To allow an easier access to the JSON file it needs to be flattened, this works
flat_json <- flatten_json(json = json)
```

-Since raw files include a huge amount of meta data and only several of this information is required by journals there is the need to sort out. Therefore a table linking which information is essential and where to find it in the flattened JSON is useful. Such a here refered to as 'term matching table' can be created with `term_matching_table()` by submitting two arguments. The first, `instrument_list` takes a vector with the names of the instruments represented in the JSON file. The second one, `origin_key` specifies which requirements should be met. If it stays empty, all journals are selected. While `'jpr'` stands for the requirements of the Journal of Proteome Research, `'mcp'` chooses the requirements of the Molecular and Cellular Proteomics.
+Raw files contain a large amount of metadata, but journals require only part of it, so the relevant entries have to be selected. A table that links each required piece of information to its location in the flattened JSON is therefore useful. Such a table, referred to here as a 'term matching table', can be created with `create_term_match_table()`, which takes two arguments. The first, `instrument_list`, takes a vector with the names of the instruments represented in the JSON file. The second, `origin_key`, specifies which requirements should be met. If it is left empty, all information is used. `'jpr_guidelines_ms'` selects the requirements of the Journal of Proteome Research, and `'miape'` selects the Minimal Information about a Proteomics Experiment (MIAPE) from the Proteomics Standards Initiative.

``` r
term_matching_table <- create_term_match_table(
instrument_list = c('Thermo EASY-nLC', 'Q Exactive - Orbitrap_MS'),
-origin_key = 'jpr')
+origin_key = 'jpr_guidelines_ms')
```

The names of the instruments can be shown for each group with `instrument_names()` with the json and the group number as arguments.
Binary file modified data/tmt_list.rda
Binary file not shown.
6 changes: 3 additions & 3 deletions man/create_term_match_table.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion man/grapes-greater-than-grapes.Rd
2 changes: 1 addition & 1 deletion man/grapes-less-than-greater-than-grapes.Rd
2 changes: 1 addition & 1 deletion man/match_terms.Rd
2 changes: 1 addition & 1 deletion man/one_group_match_terms.Rd
2 changes: 1 addition & 1 deletion man/save_all_groups.Rd
2 changes: 1 addition & 1 deletion man/save_group_table.Rd
2 changes: 1 addition & 1 deletion man/tmt_one_instrument.Rd

2 changes: 2 additions & 0 deletions vignettes/.gitignore
@@ -0,0 +1,2 @@
*.html
*.R
84 changes: 84 additions & 0 deletions vignettes/manipulate_term_match_table.Rmd
@@ -0,0 +1,84 @@
---
title: "Term matching tables"
author: "Marina Kiweler"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Term matching tables}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
library(MARMoSET)
```

## Structure

The term matching table links each piece of information in the output table to the place where it can be found.
General term matching tables can be created with `create_term_match_table()` by specifying the instruments used and the origin key.

This function draws on the included list of data frames `tmt_list`, in which each table is designed for one instrument. Called with a single instrument name, it extracts just that data frame. Called with more than one instrument (usually two), it binds the tables together.

```{r}
tmt_LC_pump <- create_term_match_table(
instrument_list = c("Thermo EASY-nLC"))
# no origin key
```

The table always consists of 6 columns, and their order matters. The number of rows differs depending on the chosen requirements and the type of instrument.
```{r}
colnames(tmt_LC_pump)
```

The first column of the resulting table is always `term`, a short and unique description of the row. The second, `term_verbose`, contains a more detailed and readable title.

```{r}
head(tmt_LC_pump[1:2])
```

The third column, `origin`, contains keys that determine which requirements a row satisfies. `create_term_match_table()` reads these keys and compares them to the value of `origin_key`.

```{r}
head(tmt_LC_pump[3])
```
The fourth column, `handle_type`, specifies how to interpret the fifth column, `handle`.
`"list_path"` indicates that `handle` is a path in the flattened JSON.
`"literal"` means that the value of `handle` is copied as is.
With `"parameter"` the row stays empty because the information is not represented in the JSON.

```{r}
head(tmt_LC_pump[4:5])
```
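
How the three `handle_type` values might be resolved can be sketched as follows. The helper `resolve_handle()` and the toy `flat_json` list (including its value) are hypothetical and only mirror the description above; the `Group.1.` prefix follows the path example given at the end of this vignette. The package's own `match_terms()` may work differently.

```r
# Toy flattened JSON; the value is invented for illustration.
flat_json <- list(
  "Group.1.Instruments.Thermo EASY-nLC.InstrumentFriendName" = "EASY-nLC 1000"
)

# Hypothetical helper that resolves one row of a term matching table.
resolve_handle <- function(handle_type, handle, flat_json, group = 1) {
  switch(
    handle_type,
    # Look up the handle in the flattened JSON, prefixed with the group.
    list_path = {
      value <- flat_json[[paste0("Group.", group, ".", handle)]]
      if (is.null(value)) NA_character_ else as.character(value)
    },
    # Copy the handle itself.
    literal = handle,
    # Nothing to look up: the row stays empty.
    parameter = NA_character_,
    stop("unknown handle_type: ", handle_type)
  )
}

from_json <- resolve_handle(
  "list_path", "Instruments.Thermo EASY-nLC.InstrumentFriendName", flat_json)
copied <- resolve_handle("literal", "Thermo Fisher Scientific", flat_json)
left_empty <- resolve_handle("parameter", "", flat_json)
```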

The last column shows an example of a value the row could have.

```{r}
head(tmt_LC_pump[c(2,6)])
```

## Create a custom combination of the available requirements

First extract a term matching table for the instruments used, without specifying the origin key.

```{r}
full_tmt <- create_term_match_table(
instrument_list = c("Thermo EASY-nLC", "Q Exactive - Orbitrap_MS"))
# no origin key
```

Rows can now be subset or deleted with standard R tools.
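
A minimal sketch of such subsetting, using a toy table instead of the real `full_tmt`. The column names follow the structure described above, but the name of the sixth column (`example`) and all row contents are assumptions made for illustration.

```r
# Toy term matching table; rows and the column name `example` are assumed.
full_tmt_toy <- data.frame(
  term = c("lc_model", "lc_vendor", "column_length"),
  term_verbose = c("LC model", "LC vendor", "Column length"),
  origin = c("jpr_guidelines_ms;miape", "miape", "jpr_guidelines_ms"),
  handle_type = c("list_path", "literal", "parameter"),
  handle = c("Instruments.Thermo EASY-nLC.InstrumentFriendName",
             "Thermo Fisher Scientific", ""),
  example = c("EASY-nLC 1000", "Thermo Fisher Scientific", ""),
  stringsAsFactors = FALSE
)

# Keep only rows that can be filled from the JSON or copied literally.
fillable <- full_tmt_toy[full_tmt_toy$handle_type != "parameter", ]

# Delete a single row by its term.
without_vendor <- full_tmt_toy[full_tmt_toy$term != "lc_vendor", ]
```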


New rows can be added too.

`term_verbose`, `handle_type` and `handle` must be filled in, each in its own column.
`term_verbose` should be a short description of the entry, as it will show up as the key in the final table.
If `handle` contains a string that only needs to be copied to the output table, the `handle_type` column should contain `"literal"`.

If the row is meant to show an entry of the JSON, `handle_type` needs to be `"list_path"` and `handle` should contain the path to the information in the flattened JSON after the group number. For example, the entry `flat_json[["Group.1.Instruments.Thermo EASY-nLC.InstrumentFriendName"]]` can be accessed with `handle` set to `"Instruments.Thermo EASY-nLC.InstrumentFriendName"`.
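
Adding rows can be sketched with `rbind()`. The example below appends one `"literal"` row and one `"list_path"` row to a toy table; the handle of the latter is derived from the full key by stripping the group prefix. The existing row, the origin key `"custom"`, the sixth column name `example` and all values are hypothetical.

```r
# Toy table with a single existing row (contents invented for illustration).
existing <- data.frame(
  term = "lc_vendor",
  term_verbose = "LC vendor",
  origin = "miape",
  handle_type = "literal",
  handle = "Thermo Fisher Scientific",
  example = "Thermo Fisher Scientific",
  stringsAsFactors = FALSE
)

# Derive the handle from a full key by removing the "Group.<n>." prefix.
full_key <- "Group.1.Instruments.Thermo EASY-nLC.InstrumentFriendName"
handle_from_key <- sub("^Group\\.[0-9]+\\.", "", full_key)

# New rows need the same columns, in the same order, as the existing table.
new_rows <- data.frame(
  term = c("lab_name", "lc_friend_name"),
  term_verbose = c("Laboratory", "LC instrument name"),
  origin = c("custom", "custom"),
  handle_type = c("literal", "list_path"),
  handle = c("Example Lab", handle_from_key),
  example = c("Example Lab", "EASY-nLC 1000"),
  stringsAsFactors = FALSE
)

custom_tmt <- rbind(existing, new_rows)
```

Because the column order matters, building the new rows as a data frame with identical column names keeps `rbind()` from silently misaligning values.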

4 changes: 2 additions & 2 deletions vignettes/using_MARMoSET.Rmd
@@ -79,12 +79,12 @@ flat_json <- flatten_json(json = json)
```

Raw files contain a large amount of metadata, but journals require only part of it, so the relevant entries have to be selected. A table that links each required piece of information to its location in the flattened JSON is therefore useful. Such a table, referred to here as a 'term matching table', can be created with `create_term_match_table()`, which takes two arguments. The first, `instrument_list`, takes a vector with the names of the instruments represented in the JSON file.
-The second one, `origin_key` specifies which requirements should be met. If it stays empty, all journals are selected. While `'jpr'` stands for the requirements of the Journal of Proteome Research, `'mcp'` chooses the requirements of the Molecular and Cellular Proteomics.
+The second, `origin_key`, specifies which requirements should be met. If it is left empty, all information is used. `'jpr_guidelines_ms'` selects the requirements of the Journal of Proteome Research, and `'miape'` selects the Minimal Information about a Proteomics Experiment (MIAPE) from the Proteomics Standards Initiative.

```{r}
term_matching_table <- create_term_match_table(
instrument_list = c('Thermo EASY-nLC', 'Q Exactive - Orbitrap_MS'),
-origin_key = 'jpr')
+origin_key = 'jpr_guidelines_ms')
```

The names of the instruments can be shown for each group with `instrument_names()` with the json and the group number as arguments.