dep's Introduction

DEP package

Overview of the analysis

This package provides an integrated analysis workflow for robust and reproducible analysis of mass spectrometry proteomics data for differential protein expression or differential enrichment. It requires tabular input (e.g. txt files) as generated by quantitative analysis software for raw mass spectrometry data, such as MaxQuant or IsobarQuant. Functions are provided for data preparation, filtering, variance normalization and imputation of missing values, as well as statistical testing of differentially enriched / expressed proteins. It also includes tools to check intermediate steps in the workflow, such as normalization and missing values imputation. Finally, visualization tools are provided to explore the results, including heatmap, volcano plot and barplot representations. For scientists with limited experience in R, the package also contains wrapper functions that cover the complete analysis workflow and generate a report. Even easier to use are the interactive Shiny apps that are provided by the package.

Installation

Install and load the package:

if (!requireNamespace("BiocManager", quietly=TRUE))
    install.packages("BiocManager")
BiocManager::install("DEP")

library("DEP")

More information can be found in the vignette:

browseVignettes("DEP")

dep's People

Contributors

arnesmits, const-ae, ericloud, hpages, kayla-morrell, link-ny, nturaga, vobencha

dep's Issues

Adjusted P-values lower than raw P values

Hi Arne,

Thanks for the great solution for proteome analysis!

I'm finding that some of my more significant hits are showing lower P-values after adjustment (e.g. 1E-06 to 4E-13). The model I'm fitting is a 3 contrast model.

I dug into DEP's code and it seems it uses limma's default method for adjustment (BH). I've never had this issue with limma before, does DEP take the adjusted P-values from limma and further modify them somehow?

Best wishes,

Luke Dabin
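
For reference, a Benjamini-Hochberg adjustment can never return a value below the raw p-value, so numbers like the ones above suggest that a different adjustment is in play. A warning printed by test_diff in another issue on this page mentions fdrtool::fdrtool, which hints that the adjusted values may come from fdrtool rather than from a BH step; that reading is an assumption and is not confirmed here. A minimal base R sketch of the BH property, using made-up p-values:

```r
# Made-up p-values: 95 null-like values plus a few strong hits
set.seed(1)
p_raw <- c(runif(95), 1e-06, 1e-05, 1e-04, 1e-03, 1e-02)

# Benjamini-Hochberg adjustment as implemented in base R
p_bh <- p.adjust(p_raw, method = "BH")

# BH multiplies each sorted p-value by n / rank (>= 1), so this is TRUE
all(p_bh >= p_raw)
```

If BH-adjusted values are wanted, one option is to run p.adjust() on the raw p-value column of the results table yourself.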

Batch correction?

I have proteomics data from three cell lines, each of which was exposed to two conditions (control and a drug). I've followed your vignette through normalization and differential enrichment analysis, and have plotted a PCA, which shows major differences between cell lines and much smaller differences between conditions. I'm concerned that any condition-level differences are swamped by the cell-line differences, and in fact I have no significant proteins.

How do you suggest that I deal with this? In differential expression analysis using DESeq2 or limma I would simply add a model term to indicate the cell line of each sample. Can I do something similar in DEP? Or should I rather do an explicit batch correction using (e.g.) ComBat and then do a DEA on the residuals?

Thanks,
Patrick Turko
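
To illustrate why a cell-line term matters, here is a base R sketch on one simulated protein. The effect sizes and noise level are invented, and lm() stands in for the limma fit. Another issue on this page calls test_diff with a design_formula argument; whether that argument accepts an extra cell-line term in your DEP version is an assumption to verify.

```r
# Simulated design: 3 cell lines x 2 conditions x 3 replicates
set.seed(42)
samples <- expand.grid(replicate = 1:3,
                       condition = c("control", "drug"),
                       cell_line = c("A", "B", "C"))

# Invented effects: large cell-line shifts, small drug effect
cell_shift <- c(A = 0, B = 4, C = -3)
samples$intensity <- 20 +
  unname(cell_shift[as.character(samples$cell_line)]) +
  1 * (samples$condition == "drug") +
  rnorm(nrow(samples), sd = 0.3)

# Without and with the cell line as a blocking term
fit_naive <- lm(intensity ~ condition, data = samples)
fit_block <- lm(intensity ~ condition + cell_line, data = samples)

p_naive <- summary(fit_naive)$coefficients["conditiondrug", "Pr(>|t|)"]
p_block <- summary(fit_block)$coefficients["conditiondrug", "Pr(>|t|)"]
c(naive = p_naive, with_cell_line = p_block)
```

In the naive model the cell-line shifts end up in the residual variance, which is what hides the drug effect.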

Spearman correlation with plot_cor?

Is there any way to substitute Spearman correlation method for the default Pearson correlation method? I can't seem to figure out which columns are being pulled from the SummarizedExperiment object to construct the underlying correlation matrix. Therefore I can't build out my own correlation matrix. Any help would be appreciated.
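
One workaround is to compute the matrix directly with base R's cor(). The sketch below uses a simulated matrix in place of the real data. With a DEP object the matrix would presumably come from SummarizedExperiment::assay(), and whether plot_cor uses all proteins or only the significant ones is not stated in this thread.

```r
# Toy intensity matrix standing in for SummarizedExperiment::assay(dep):
# rows are proteins, columns are samples (values are simulated)
set.seed(1)
mat <- matrix(rnorm(60, mean = 25), nrow = 10,
              dimnames = list(paste0("prot", 1:10),
                              c(paste0("ctrl_", 1:3), paste0("trt_", 1:3))))

# Sample-by-sample Spearman correlation matrix
cor_spearman <- cor(mat, method = "spearman")
round(cor_spearman, 2)
```

The resulting matrix can be passed to any heatmap function of your choice.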

plot_heatmap function not working as data_results gives 0 significant proteins

I believe there's a bug in the DEP package, as I manage to get as far as the PCA, and am able to plot individual proteins as bar plots which show significance, but the data_results %>% filter(significant) %>% nrow() line results in 0, and when I run the plot_heatmap function
plot_heatmap(dep, type = "centered", kmeans = TRUE,
k = 6, col_limit = 4, show_row_names = FALSE,
indicate = c("condition", "replicate"))
I get: Error in sample.int(m, k) : invalid first argument

Please can someone help?
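
The sample.int error appears to be what base R's kmeans() raises when asked to pick k starting centers from a matrix with no rows. That would fit the 0 significant proteins reported above, assuming plot_heatmap clusters only the significant proteins (not confirmed here). A base R sketch of the failure and of a guard; safe_kmeans is a hypothetical helper, not part of DEP:

```r
# Hypothetical guard: only cluster when there are at least k rows
safe_kmeans <- function(mat, k) {
  if (nrow(mat) < k) {
    message("Only ", nrow(mat), " rows available; skipping k-means with k = ", k)
    return(NULL)
  }
  kmeans(mat, centers = k)
}

# An empty matrix, as you would get after filtering for 0 significant proteins
empty <- matrix(numeric(0), nrow = 0, ncol = 6)

# Plain kmeans() fails on it; capture the message instead of stopping
err <- tryCatch(kmeans(empty, centers = 6),
                error = function(e) conditionMessage(e))
err

# The guarded version returns NULL instead of erroring
res_empty <- safe_kmeans(empty, 6)
```

If no protein passes the thresholds, it may be worth checking the alpha and lfc values given to add_rejections() before plotting.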

Different p-value in each independent run?

I use test_diff to conduct differential enrichment analysis, and I had imputed my data using the MLE method in advance. I found that for the same protein, the p-values derived from different runs diverged from each other, as did the numbers of significant proteins. Is this normal?
The figures demonstrate part of the results from two runs, one contains 19 differentially expressed proteins while the other contains only 3.
[Images: results tables from the two runs]
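
Run-to-run differences like this are what you would expect if any step draws random numbers, for example a stochastic imputation. Whether the MLE method used here is stochastic is an assumption. The base R sketch below uses a hypothetical left-shifted Gaussian imputation, impute_left_shifted, to show that fixing the seed makes the result reproducible:

```r
# Hypothetical imputation: replace NAs with draws from a left-shifted Gaussian
impute_left_shifted <- function(x, shift = 1.8, scale = 0.3) {
  miss <- is.na(x)
  x[miss] <- rnorm(sum(miss),
                   mean = mean(x, na.rm = TRUE) - shift * sd(x, na.rm = TRUE),
                   sd = scale * sd(x, na.rm = TRUE))
  x
}

x <- c(25.1, 24.8, NA, 26.0, NA, 25.5)

set.seed(123)
run1 <- impute_left_shifted(x)
set.seed(123)
run2 <- impute_left_shifted(x)   # same seed, same imputed values
run3 <- impute_left_shifted(x)   # no reseed, different imputed values

identical(run1, run2)
identical(run1, run3)
```

Calling set.seed() once at the top of the analysis script is usually enough to make the p-values repeatable between runs.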

test_diff returns p-values in wrong order

Hi Arne,

first thanks for the helpful package.

I was recently playing around with the test_diff function, comparing different imputation methods, when I noticed it sometimes returns the p-values in the wrong order. Below I have created a small reproducible example, to show the problem:

set.seed(1)
suppressPackageStartupMessages({
  library(SummarizedExperiment)
  library(DEP)
  library(tidyverse)
})

syn_data <- matrix(c(
  rep(42, times=6),
  c(1:3, 11:13),
  c(5:7, 8:6)), ncol=6, byrow = TRUE)


# This is important, because if the row names are sorted
# alphabetically there is no error!
rownames(syn_data) <- paste0("name", sample(1:nrow(syn_data)))
colnames(syn_data) <- c(paste0("cond_a-", 1:3), paste0("cond_b-", 1:3))
syn_data
#>       cond_a-1 cond_a-2 cond_a-3 cond_b-1 cond_b-2 cond_b-3
#> name2       42       42       42       42       42       42
#> name3        1        2        3       11       12       13
#> name1        5        6        7        8        7        6

syn_exp_df <- data.frame(sample=colnames(syn_data)) %>%
  mutate(label=sample) %>%
  separate(sample, into=c("condition", "replicate"), remove=FALSE, sep="-")

syn_row_df <- data.frame(name=rownames(syn_data), ID=seq_len(nrow(syn_data)))

syn_se <- SummarizedExperiment(syn_data, colData=syn_exp_df, rowData=syn_row_df)


syn_res <- test_diff(syn_se, type="manual", test="cond_a_vs_cond_b")
#> Tested contrasts: cond_a_vs_cond_b
#> Warning in fdrtool::fdrtool(res$t, plot = FALSE, verbose = FALSE): There
#> may be too few input test statistics for reliable FDR calculations!
#> Warning: Censored sample for null model estimation has only size 2 !
rowData(syn_res)
#> DataFrame with 3 rows and 7 columns
#>           name        ID cond_a_vs_cond_b_CI.L cond_a_vs_cond_b_CI.R
#>       <factor> <integer>             <numeric>             <numeric>
#> name2    name1         3     -3.12424921339827      1.12424921339827
#> name3    name2         1  -0.00823518949045511   0.00823518949045511
#> name1    name3         2     -12.1242492133983     -7.87575078660174
#>       cond_a_vs_cond_b_diff cond_a_vs_cond_b_p.adj cond_a_vs_cond_b_p.val
#>                   <numeric>              <numeric>              <numeric>
#> name2                    -1      0.239304093291693      0.268028898798365
#> name3                     0      0.666666666666667                      1
#> name1                   -10   4.44089209850063e-16   0.000140843608180863
assay(syn_res)
#>       cond_a-1 cond_a-2 cond_a-3 cond_b-1 cond_b-2 cond_b-3
#> name2       42       42       42       42       42       42
#> name3        1        2        3       11       12       13
#> name1        5        6        7        8        7        6

Created on 2018-11-15 by the reprex package (v0.2.1)

As you can see, the rownames of rowData(syn_res) do not match the name column of that dataframe, and the p-value of 1, which I would expect in the first row where each element is 42, appears in the second row.

From what I understand the problem is that in the test_diff() function you use the merge method, but don't set sort=FALSE.

Best, Constantin
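
The reordering can be reproduced in base R without DEP. The sketch below reuses the row names from the reprex with rounded p-values. It shows that merge() sorts on the key by default, and one way to restore the original order with match(). The documentation of merge() describes the row order under sort = FALSE as unspecified, so an explicit match() is the more defensive choice.

```r
# Row annotation in the (unsorted) order of the assay
row_data <- data.frame(name = c("name2", "name3", "name1"),
                       ID = 1:3, stringsAsFactors = FALSE)

# Test results keyed by name (values rounded from the reprex)
test_res <- data.frame(name = c("name2", "name3", "name1"),
                       p.val = c(1, 0.00014, 0.27),
                       stringsAsFactors = FALSE)

# Default merge() sorts the result on the key column
merged <- merge(row_data, test_res, by = "name")
merged$name

# Restore the original row order explicitly
aligned <- merged[match(row_data$name, merged$name), ]
rownames(aligned) <- aligned$name
aligned
```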

How to extract proteins from PCs in plot_pca in DEP package

I am using DEP package to analyze proteomics data. I did PCA for my samples (see the following plot) and wish to extract proteins in PC1 for further analysis. However, the objects x and y generated by the following code do not contain the information of the principal component (only the coordinates). May I ask for a solution?

[Screenshot: PCA plot of the samples]

x <- plot_pca(dep_MDA231, x = 1, y = 2, n = 500, point_size = 4,plot = T)
y <- plot_pca(dep_MDA231, x = 1, y = 2, n = 500, point_size = 4,plot = F)

sessionInfo( )
R version 4.2.2 (2022-10-31)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Ventura 13.2

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] magick_2.7.3                DEP_1.20.0                  forcats_1.0.0              
 [4] stringr_1.5.0               dplyr_1.1.0                 purrr_1.0.1                
 [7] readr_2.1.4                 tidyr_1.3.0                 tibble_3.1.8               
[10] ggplot2_3.4.1               tidyverse_1.3.2             SummarizedExperiment_1.28.0
[13] Biobase_2.58.0              GenomicRanges_1.50.2        GenomeInfoDb_1.34.9        
[16] IRanges_2.32.0              S4Vectors_0.36.1            BiocGenerics_0.44.0        
[19] MatrixGenerics_1.10.0       matrixStats_0.63.0         

loaded via a namespace (and not attached):
  [1] googledrive_2.0.0      colorspace_2.1-0       rjson_0.2.21          
  [4] ellipsis_0.3.2         circlize_0.4.15        XVector_0.38.0        
  [7] GlobalOptions_0.1.2    fs_1.6.1               clue_0.3-64           
 [10] rstudioapi_0.14        farver_2.1.1           mzR_2.32.0            
 [13] affyio_1.68.0          DT_0.27                fansi_1.0.4           
 [16] mvtnorm_1.1-3          lubridate_1.9.2        xml2_1.3.3            
 [19] codetools_0.2-18       ncdf4_1.21             doParallel_1.0.17     
 [22] impute_1.72.3          jsonlite_1.8.4         broom_1.0.3           
 [25] cluster_2.1.4          vsn_3.66.0             dbplyr_2.3.0          
 [28] png_0.1-8              shinydashboard_0.7.2   shiny_1.7.4           
 [31] BiocManager_1.30.19    compiler_4.2.2         httr_1.4.4            
 [34] backports_1.4.1        fastmap_1.1.0          assertthat_0.2.1      
 [37] Matrix_1.5-1           gmm_1.7                gargle_1.3.0          
 [40] limma_3.54.1           cli_3.6.0              later_1.3.0           
 [43] htmltools_0.5.4        tools_4.2.2            gtable_0.3.1          
 [46] glue_1.6.2             GenomeInfoDbData_1.2.9 affy_1.76.0           
 [49] Rcpp_1.0.10            MALDIquant_1.22        cellranger_1.1.0      
 [52] vctrs_0.5.2            preprocessCore_1.60.2  iterators_1.0.14      
 [55] tmvtnorm_1.5           rvest_1.0.3            mime_0.12             
 [58] timechange_0.2.0       lifecycle_1.0.3        XML_3.99-0.13         
 [61] googlesheets4_1.0.1    zoo_1.8-11             zlibbioc_1.44.0       
 [64] MASS_7.3-58.1          scales_1.2.1           MSnbase_2.24.2        
 [67] promises_1.2.0.1       pcaMethods_1.90.0      hms_1.1.2             
 [70] ProtGenerics_1.30.0    sandwich_3.0-2         parallel_4.2.2        
 [73] RColorBrewer_1.1-3     ComplexHeatmap_2.14.0  gridExtra_2.3         
 [76] stringi_1.7.12         foreach_1.5.2          BiocParallel_1.32.5   
 [79] shape_1.4.6            rlang_1.0.6            pkgconfig_2.0.3       
 [82] bitops_1.0-7           imputeLCMD_2.1         mzID_1.36.0           
 [85] lattice_0.20-45        labeling_0.4.2         htmlwidgets_1.6.1     
 [88] tidyselect_1.2.0       norm_1.0-10.0          plyr_1.8.8            
 [91] magrittr_2.0.3         R6_2.5.1               generics_0.1.3        
 [94] DelayedArray_0.24.0    DBI_1.1.3              pillar_1.8.1          
 [97] haven_2.5.1            withr_2.5.0            MsCoreUtils_1.10.0    
[100] RCurl_1.98-1.10        modelr_0.1.10          crayon_1.5.2          
[103] fdrtool_1.2.17         utf8_1.2.3             tzdb_0.3.0            
[106] GetoptLong_1.0.5       grid_4.2.2             readxl_1.4.2          
[109] reprex_2.0.2           digest_0.6.31          xtable_1.8-4          
[112] httpuv_1.6.8           munsell_0.5.0
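
One way to get at the proteins behind PC1 is to rerun the PCA yourself with prcomp() and read the loadings from its rotation matrix. The sketch below uses a simulated matrix in place of the real assay. It assumes that plot_pca runs a PCA on the n most variable proteins, which is inferred from its n argument and not confirmed here.

```r
# Simulated matrix standing in for the assay: proteins x samples
set.seed(1)
mat <- matrix(rnorm(200 * 6), nrow = 200,
              dimnames = list(paste0("prot", 1:200), paste0("sample", 1:6)))

# Keep the n most variable proteins
n_top <- 50
row_var <- apply(mat, 1, var)
top <- mat[order(row_var, decreasing = TRUE)[seq_len(n_top)], ]

# PCA with samples as observations and proteins as variables
pca <- prcomp(t(top), scale. = FALSE)

# Loadings of each protein on PC1, ranked by absolute contribution
pc1_loadings <- pca$rotation[, "PC1"]
pc1_ranked <- pc1_loadings[order(abs(pc1_loadings), decreasing = TRUE)]
head(pc1_ranked, 10)
```

The names of pc1_ranked are the protein identifiers, so the top entries can be used to subset the data for further analysis.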

Interaction term

Thanks for the great tool.

Is there an easy way to add an interaction term to the analysis?

For example:
data_diff <- test_diff(data_imp, type = "all", design_formula = formula(~ 0 + condition + genotype + condition:genotype))
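
To see what such a formula produces, the design matrix can be built in base R. The sample table below is invented, and whether test_diff builds valid contrasts from a design with an interaction column is not confirmed in this thread.

```r
# Invented sample table: 2 conditions x 2 genotypes x 3 replicates
coldata <- expand.grid(replicate = 1:3,
                       condition = c("ctrl", "trt"),
                       genotype = c("wt", "ko"))

# Design matrix for the formula from the question
design <- model.matrix(~ 0 + condition + genotype + condition:genotype,
                       data = coldata)
colnames(design)

# The design should have full column rank to be estimable
qr(design)$rank == ncol(design)
```

The interaction column name contains a colon, which is not a syntactically valid R name, so contrast strings that refer to it may need extra care.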

Wrong results if the data is not sorted alphabetically

During the standard analysis, filter_missval() silently sorts the rowData alphabetically. If this step is skipped, the rownames and the name column of the rowData initially contain the same information, but add_rejections() later sorts on the name column without reordering the rownames, so the two no longer match. This creates problems later on when functions such as plot_single() use subset(), which works on rownames, and then try to use the name column.

For example, with the steps from ?add_rejections:

# Load example
data <- UbiLength
data <- data[data$Reverse != "+" & data$Potential.contaminant != "+",]
data_unique <- make_unique(data, "Gene.names", "Protein.IDs", delim = ";")

# Make SummarizedExperiment
columns <- grep("LFQ.", colnames(data_unique))
exp_design <- UbiLength_ExpDesign
se <- make_se(data_unique, columns, exp_design)

# Filter, normalize and impute missing values
# filt <- filter_missval(se, thr = 0)  # <- no filtering step: no sorting!
norm <- normalize_vsn(se)
imputed <- impute(norm, fun = "MinProb", q = 0.01)

# Test for differentially expressed proteins
diff <- test_diff(imputed, "control", "Ctrl")
dep <- add_rejections(diff, alpha = 0.05, lfc = 1)

Then the resulting object has wrong names:

SummarizedExperiment::rowData(dep, use.names = TRUE) |>
  as.data.frame() |> head() |> dplyr::select(name, ID)
#>             name       ID
#> RBM47   A7E2Y1-3 A7E2Y1-3
#> UBA6        AAAS   Q9NRG9
#> ILVBL      AAMDC Q9H7C9-3
#> FAM92A1     AAR2   Q9Y312
#> ACO2       ABCB6   H7BXK9
#> RPRD1B     ABCB7 O75027-3


plot_single(dep, proteins = c("A7E2Y1-3","AAAS"))
#> Warning: 6 parsing failures.
#> row col           expected actual
#>   1  -- value in level set  ACTR3
#>   2  -- value in level set  ACTR3
#>   3  -- value in level set  ACTR3
#>   4  -- value in level set  TUFM 
#>   5  -- value in level set  TUFM 
#> ... ... .................. ......
#> See problems(...) for more details.

For users: always use filter_missval(), even with a high threshold.
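
A quick sanity check before plotting is to compare the rownames with the name column. The sketch uses a toy data frame and a hypothetical helper, names_aligned. For a DEP object the inputs would presumably be rownames(dep) and SummarizedExperiment::rowData(dep)$name.

```r
# Hypothetical helper: TRUE when rownames and the name column agree row by row
names_aligned <- function(row_data) {
  identical(rownames(row_data), as.character(row_data$name))
}

# Toy row annotation in unsorted order
row_data <- data.frame(name = c("B", "C", "A"), ID = c("id2", "id3", "id1"),
                       row.names = c("B", "C", "A"),
                       stringsAsFactors = FALSE)
names_aligned(row_data)   # rownames and names agree

# Sorting only the name column reproduces the mismatch described above
broken <- row_data
broken$name <- sort(broken$name)
names_aligned(broken)
```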

Adding an option to output loadings plot in plot_pca()

Hi,

I've noticed that there was no option to view a loadings plot in the plot_pca() function, so I have been working on making some minor changes to the code to support this feature:

  • A new loadings parameter has been added to the plot_pca() function.
  • When loadings = TRUE, a ggplot object or dataframe will be the output of plot_pca() depending on whether the plot parameter is true or false.
  • geom_point is not used for the loadings plot; instead, label=TRUE regardless of user settings when calling plot_pca()
  • Changes have been made to the documentation comments.
  • Added a line to test_8_plot_functions_explore.R to ensure that loadings = TRUE and plot = TRUE returns a ggplot object.

Arguments Missing

ExpDesign.txt

This is a question and not necessarily an issue. I have run into some problems because of my limited understanding of your package. I have followed the instructions in the vignette closely, and it seems that some of the functions expect specific column names and are not flexible about them. I have an output file from an LFQ MaxQuant analysis in which some of the column names you mention do not exist, and this has prevented me from creating a SummarizedExperiment object.

My question is, how can I create a summarized experiment object to allow me to continue with my analysis ?

Below is a documentation of some errors that I have encountered.

dataset <- read_csv(file)
Parsed with column specification:
cols(
`Protein IDs Majority protein IDs Peptide counts (all) Peptide counts (razor+unique) Peptide counts (unique) Fasta headers Number of proteins Peptides Razor + unique peptides Unique peptides Peptides 6 Peptides 62 Peptides 63 Peptides 8 Peptides 82 Peptides 83 Peptides 9 Peptides 92 Peptides 93 Peptides G Peptides G2 Peptides G3 Razor + unique peptides 6 Razor + unique peptides 62 Razor + unique peptides 63 Razor + unique peptides 8 Razor + unique peptides 82 Razor + unique peptides 83 Razor + unique peptides 9 Razor + unique peptides 92 Razor + unique peptides 93 Razor + unique peptides G Razor + unique peptides G2 Razor + unique peptides G3 Unique peptides 6 Unique peptides 62 Unique peptides 63 Unique peptides 8 Unique peptides 82 Unique peptides 83 Unique peptides 9 Unique peptides 92 Unique peptides 93 Unique peptides G Unique peptides G2 Unique peptides G3 Sequence coverage [%] Unique + razor sequence coverage [%] Unique sequence coverage [%] Mol. weight [kDa] Sequence length S...
)

dataset<-filter(dataset,Potential.contaminant != "+")
Error in filter_impl(.data, quo) :
Evaluation error: object 'Potential.contaminant' not found.

dataset$Gene.names %>% duplicated() %>% any()
[1] FALSE
Warning message:
Unknown or uninitialised column: 'Gene.names'.

data_unique <- make_unique(data, "Gene.names", "Protein.IDs", delim = ";")
Error: proteins is not a data frame

data_se <- make_se(dataset, LFQ_columns, experimental_design)
Error: 'name' and/or 'ID' columns are not present in 'dataset'.
Run make_unique() to obtain the required columns

data_unique <- make_unique(dataset,"Protein.IDs", delim = ";")
Error in eval(assertion, env) :
argument "ids" is missing, with no default

data_se_parsed <- make_se_parse(dataset, LFQ_columns)
Error: 'name' and/or 'ID' columns are not present in 'dataset'.
Run make_unique() to obtain the required columns

sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252 LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
[5] LC_TIME=German_Germany.1252

attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base

other attached packages:
[1] DEP_1.0.1 bindrcpp_0.2 readr_1.1.1 dplyr_0.7.4
[5] rpart.plot_2.1.2 rpart_4.1-11 SummarizedExperiment_1.8.0 DelayedArray_0.4.1
[9] matrixStats_0.52.2 Biobase_2.38.0 GenomicRanges_1.30.0 GenomeInfoDb_1.14.0
[13] IRanges_2.12.0 S4Vectors_0.16.0 BiocGenerics_0.24.0

loaded via a namespace (and not attached):
[1] nlme_3.1-131 ProtGenerics_1.10.0 bitops_1.0-6 doParallel_1.0.11 RColorBrewer_1.1-2
[6] MSnbase_2.4.0 tools_3.4.1 R6_2.2.2 DT_0.2 affyio_1.48.0
[11] tmvtnorm_1.4-10 lazyeval_0.2.1 colorspace_1.3-2 GetoptLong_0.1.6 mnormt_1.5-5
[16] compiler_3.4.1 MassSpecWavelet_1.44.0 preprocessCore_1.40.0 sandwich_2.4-0 scales_0.5.0
[21] mvtnorm_1.0-6 psych_1.7.8 affy_1.56.0 stringr_1.2.0 digest_0.6.12
[26] foreign_0.8-69 XVector_0.18.0 pkgconfig_2.0.1 htmltools_0.3.6 limma_3.34.2
[31] htmlwidgets_0.9 rlang_0.1.4 GlobalOptions_0.0.12 impute_1.52.0 shiny_1.0.5
[36] BiocInstaller_1.28.0 shape_1.4.3 bindr_0.1 zoo_1.8-0 mzID_1.16.0
[41] BiocParallel_1.12.0 RCurl_1.95-4.8 magrittr_1.5 GenomeInfoDbData_0.99.1 MALDIquant_1.17
[46] Matrix_1.2-10 Rcpp_0.12.13 munsell_0.4.3 imputeLCMD_2.0 vsn_3.46.0
[51] stringi_1.1.6 MASS_7.3-47 zlibbioc_1.24.0 plyr_1.8.4 grid_3.4.1
[56] shinydashboard_0.6.1 lattice_0.20-35 splines_3.4.1 multtest_2.34.0 circlize_0.4.2
[61] hms_0.4.0 mzR_2.12.0 xcms_3.0.0 ComplexHeatmap_1.17.1 rjson_0.2.15
[66] reshape2_1.4.2 codetools_0.2-15 XML_3.98-1.9 glue_1.2.0 pcaMethods_1.70.0
[71] data.table_1.10.4-3 httpuv_1.3.5 foreach_1.4.3 gtable_0.2.0 RANN_2.5.1
[76] purrr_0.2.4 tidyr_0.7.2 norm_1.0-9.5 assertthat_0.2.0 ggplot2_2.2.1
[81] mime_0.5 xtable_1.8-2 broom_0.4.3 survival_2.41-3 tibble_1.3.4
[86] iterators_1.0.8 gmm_1.6-1
proteinGroups.txt
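
The column specification printed above lists every header in one single column name, which is what happens when a tab-delimited file is read with a comma-separated reader. The base R sketch below uses a small invented file to show the difference. The assumption is that proteinGroups.txt is tab-delimited, which is the usual MaxQuant output but is not shown in this thread.

```r
# Write a tiny tab-delimited file with MaxQuant-style headers (invented values)
tmp <- tempfile(fileext = ".txt")
writeLines(c("Protein IDs\tGene names\tPotential contaminant\tLFQ intensity A_1",
             "P12345\tGENE1\t\t1000",
             "Q67890\tGENE2\t+\t2000"), tmp)

# Comma-separated reader: everything ends up in one column
wrong <- read.csv(tmp, stringsAsFactors = FALSE)
ncol(wrong)

# Tab-delimited reader: four columns, names converted to Gene.names style
right <- read.delim(tmp, stringsAsFactors = FALSE)
colnames(right)

# The contaminant filter from the vignette now finds its column
kept <- right[right$Potential.contaminant != "+", ]
nrow(kept)
```

The later make_unique() error, "proteins is not a data frame", comes from the call that passes data rather than dataset. Nothing in this session assigns a data frame to data, so the name presumably resolves to the base R function data().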

non-unique values when setting 'row.names'

I am getting the following error even though I have followed the introduction

Error in .rowNamesDF<-(x, value = value) :
duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names': ‘baseline_1_1’, ‘baseline_1_3’, ‘BCG_1_1’, ‘BCG_1_2’, ‘PPD_3_2’, ‘PPD_3_3’

My experimental design and protein group files are here. I am desperately looking forward to your assistance, since this is the last analysis to go into my thesis before I graduate.

duplicate 'row.names' are not allowed

If I have ("LFQ.intensity.T163PPD" "LFQ.intensity.T164B1")

my experimental design should look like this

label condition replicate
T163PPD PPD 1

Is this correct? It keeps giving an error, since I have over 20 samples from the proteinGroups file:

duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names'
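
The duplicated values in the warning of the previous issue ('baseline_1_1', 'BCG_1_2', ...) look like condition and replicate pasted together. This suggests, as an assumption, that sample IDs are built from those two columns and collide when two samples share the same pair. A base R sketch for finding and fixing such collisions; the design table is invented apart from the two labels quoted above:

```r
# Invented design with a repeated condition/replicate pair
exp_design <- data.frame(
  label     = c("T163PPD", "T164B1", "T165PPD", "T166B1"),
  condition = c("PPD", "baseline", "PPD", "baseline"),
  replicate = c(1, 1, 1, 2),
  stringsAsFactors = FALSE
)

# Which condition_replicate combinations occur more than once?
ids <- paste(exp_design$condition, exp_design$replicate, sep = "_")
dup <- unique(ids[duplicated(ids)])
dup

# One possible fix: renumber replicates within each condition
exp_design$replicate <- ave(seq_along(exp_design$condition),
                            exp_design$condition, FUN = seq_along)
ids_fixed <- paste(exp_design$condition, exp_design$replicate, sep = "_")
anyDuplicated(ids_fixed) == 0
```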

filter_proteins based on proportion in one group

Hi, I noticed that in your implementation of filter_proteins, there are three types, "complete", "condition" and "fraction". I was wondering is it possible to filter based on the proportion of missing values in at least one condition, instead of the actual "thr" used in "condition".

For example, I want to keep proteins that appear 50% (without missing values) in at least one condition. Is there a way to implement this using DEP, or could you add this characteristic?

Thanks a lot!
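
Until something like this exists in the package, the filter can be computed on the assay matrix directly. The base R sketch below uses a hypothetical helper, keep_by_fraction, and invented data. With a DEP object the logical vector would presumably be used to subset the rows of the SummarizedExperiment.

```r
# Hypothetical helper: keep proteins quantified in at least min_frac of the
# samples of at least one condition
keep_by_fraction <- function(mat, condition, min_frac = 0.5) {
  frac <- do.call(cbind, lapply(unique(condition), function(cond) {
    rowMeans(!is.na(mat[, condition == cond, drop = FALSE]))
  }))
  apply(frac, 1, max) >= min_frac
}

condition <- rep(c("ctrl", "trt"), each = 4)
mat <- rbind(
  p1 = c(20, 21, 22, 23, NA, NA, NA, NA),  # complete in ctrl
  p2 = c(20, NA, NA, NA, 19, NA, NA, NA),  # 25% in both conditions
  p3 = c(20, 21, NA, NA, NA, NA, NA, NA)   # 50% in ctrl
)

keep <- keep_by_fraction(mat, condition, min_frac = 0.5)
keep
```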

Error in .ebayes(fit = fit, proportion = proportion, stdev.coef.lim = stdev.coef.lim, : No finite residual standard deviations

Hello,

I am trying to run the DEP package on a MaxQuant output for a paper (x) that used 21 different treatments on all 5 patient samples. I ran MaxQuant as if each sample was its own experiment and generated a proteinGroups.txt file lfq-proteinGroups.txt.

This was the code I used to create the design object. I wasn't sure of how to fill out the condition and replicate since I want to compare all of them against each other equally.

#create experiment design object 
label <- c("V1", "V2", "V3", "V4", "V5")
condition <- c("1","2","3","4","5")
replicate <- c("1","2","3","4","5")
experimental_design <- data.frame(label, condition, replicate)

When going through the DEP package example (x), I ran into an issue when testing all possible comparisons of samples in the differential enrichment analysis portion of the tutorial.

type = "all"

# Test all possible comparisons of samples
data_diff_all_contrasts <- test_diff(data_imp, type = "all", control = NULL)

Error in .ebayes(fit = fit, proportion = proportion, stdev.coef.lim = stdev.coef.lim, :
No finite residual standard deviations

type = "manual"

# Test manually defined comparisons
data_diff_manual <- test_diff(data_norm, type = "manual", 
                              test = c("X1", "X2"))

Error in .ebayes(fit = fit, proportion = proportion, stdev.coef.lim = stdev.coef.lim, :
No finite residual standard deviations

I am unsure of what mistake I made and if it has to do with my use of the DEP package or the MaxQuant output. Please let me know where I went wrong.

Complete code

# Loading a package required for data handling
library("dplyr")

# Read raw file
raw = read.delim("~/proteomics-practice/data/lfq-proteinGroups.txt", stringsAsFactors = FALSE, colClasses = "character")

# The data is provided with the package
data <- raw

# We filter for contaminant proteins and decoy database hits, which are indicated by "+" in the columns "Potential.contaminant" and "Reverse", respectively. 
data <- filter(data, Reverse != "+", Potential.contaminant != "+")

# Are there any duplicated gene names?
data$Gene.names %>% duplicated() %>% any()

# Make a table of duplicated gene names
data %>% group_by(Gene.names) %>% summarize(frequency = n()) %>% 
  arrange(desc(frequency)) %>% filter(frequency > 1)

#make_unique(): generates unique identifiers for a proteomics dataset based on "name" and "id" columns.
data_unique <- make_unique(data, "Gene.names", "Protein.IDs", delim = ";")

# Are there any duplicated names?
data_unique$name %>% duplicated() %>% any()

#get LFQ column numbers
LFQ_columns <- grep("LFQ.", colnames(data_unique)) 

#make LFQ values numeric
data_unique$LFQ.intensity.V1 <- as.numeric(data_unique$LFQ.intensity.V1)
data_unique$LFQ.intensity.V2 <- as.numeric(data_unique$LFQ.intensity.V2)
data_unique$LFQ.intensity.V3 <- as.numeric(data_unique$LFQ.intensity.V3)
data_unique$LFQ.intensity.V4 <- as.numeric(data_unique$LFQ.intensity.V4)
data_unique$LFQ.intensity.V5 <- as.numeric(data_unique$LFQ.intensity.V5)

#create experiment design object 
label <- c("V1", "V2", "V3", "V4", "V5")
condition <- c("1","2","3","4","5")
replicate <- c("1","2","3","4","5")
experimental_design <- data.frame(label, condition, replicate)

#Generate a SummarizedExperiment object using an experimental design
data_se <- make_se(data_unique, LFQ_columns, experimental_design)

# Generate a SummarizedExperiment object by parsing condition information from the column names
LFQ_columns <- grep("LFQ.", colnames(data_unique)) # get LFQ column numbers
data_se_parsed <- make_se_parse(data_unique, LFQ_columns)

# Let's have a look at the SummarizedExperiment object
data_se

# Plot a barplot of the protein identification overlap between samples
plot_frequency(data_se)

# Filter for proteins that are identified in all replicates of at least one condition
data_filt <- filter_missval(data_se, thr = 0)

# Less stringent filtering:
# Filter for proteins that are identified in 2 out of 3 replicates of at least one condition
data_filt2 <- filter_missval(data_se, thr = 1)

# Plot a barplot of the number of identified proteins per samples
plot_numbers(data_filt2)

# Plot a barplot of the protein identification overlap between samples
plot_coverage(data_filt)

# Normalize the data
data_norm <- normalize_vsn(data_filt)

# Visualize normalization by boxplots for all samples before and after normalization
plot_normalization(data_filt, data_norm)

# Impute missing data using random draws from a Gaussian distribution centered around a minimal value (for MNAR)
data_imp <- impute(data_norm, fun = "MinProb", q = 0.01)

# Impute missing data using random draws from a manually defined left-shifted Gaussian distribution (for MNAR)
data_imp_man <- impute(data_norm, fun = "man", shift = 1.8, scale = 0.3)

# Impute missing data using the k-nearest neighbour approach (for MAR)
data_imp_knn <- impute(data_norm, fun = "knn", rowmax = 0.9)

# Plot intensity distributions before and after imputation
plot_imputation(data_norm, data_imp)

# Test all possible comparisons of samples
data_diff_all_contrasts <- test_diff(data_imp, type = "all", control = NULL)
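
A likely cause is the design itself. With five samples in five different conditions there is one observation per group, so no residual degrees of freedom are left to estimate a variance. The base R sketch below shows this with model.matrix(); linking it to the .ebayes error is an interpretation and is not confirmed in this thread.

```r
# Design from the question: every sample is its own condition
condition <- c("1", "2", "3", "4", "5")
design_single <- model.matrix(~ 0 + condition)
df_single <- nrow(design_single) - qr(design_single)$rank
df_single   # residual degrees of freedom

# Same calculation for an invented design with 3 replicates per condition
condition_rep <- rep(c("ctrl", "trt"), each = 3)
design_rep <- model.matrix(~ 0 + condition_rep)
df_rep <- nrow(design_rep) - qr(design_rep)$rank
df_rep
```

Note also that the manual contrast elsewhere on this page is written as "cond_a_vs_cond_b", so test = c("X1", "X2") may not be in the expected format.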

TMT analysis

Hey!

I'm a newbie in R and I am trying to use DEP for the first time and I have a couple of questions:

  1. I was wondering if it is possible to use DEP with an output from MaxQuant. I know the workflow for TMT analysis requires the output from IsobarQuant, but I imagined that if I can get the same columns in my MQ output it could work. If yes, what columns would I need to change/add? So far I tried the following columns but it doesn't seem to work (see #2 comment):
 [1] "Protein.IDs"     "Protein.names"   "Gene.names"     
 [4] "FC_126C"         "FC_128C"         "FC_129C"        
 [7] "FC_127B"         "FC_128B"         "FC_130B"        
[10] "PSM"             "Unique.peptides"
  2. I'm trying to follow the "Introduction to DEP" code for TMT and I get the following error when using the function 'TMT':

Error in make.unique(ifelse(name == "" | is.na(name), ID, name)) :
'names' must be a character vector

  3. I have two treatments and a control condition in my dataset. Can DEP handle it? Or does it have to be max 2 conditions (control and treated)?

Thank you,

Debs
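
The make.unique() error in point 2 is what base R gives when the name or ID columns are factors rather than character vectors, for example after reading the file with stringsAsFactors = TRUE. That this is the cause here is an assumption. A base R sketch with invented values:

```r
# Invented name and ID columns stored as factors
gene_names <- factor(c("ACTB", "ACTB", ""))
ids <- factor(c("P1", "P2", "P3"))

# make.unique() refuses anything that is not a character vector
err <- tryCatch(make.unique(gene_names),
                error = function(e) conditionMessage(e))
err

# Converting to character first avoids the error
nm <- as.character(gene_names)
fixed <- make.unique(ifelse(nm == "" | is.na(nm), as.character(ids), nm))
fixed
```

Reading the input with stringsAsFactors = FALSE, as done in other code on this page, has the same effect.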

Remove 1 sample and all pvalue adj bump to 1.

Hi DEP's authors,

I have some trouble with the adjusted p-values computed by fdrtool.

In my design, I have 7 samples of cond1 and 19 samples of cond2.
If I use all these samples, DEP works with 1844 proteins:

  • 154 proteins are significant
  • protein x has:
    • pvalue: 2.03691168653476E-06
    • pvalue adj: 3.32E-13

If I remove 1 sample from my cond2 (7 vs 18), DEP works with 1893 proteins:

  • no proteins are significant; all adjusted p-values jump to 1.
  • protein x has:
    • pvalue: 4.14508026818263E-06
    • pvalue adj: 1

It doesn't seem to be a bug, but I'm quite surprised that removing one sample changes the adjusted p-values so much!
So I'm wondering:

  1. Why did you choose this method over p.adjust() or the qvalue package?
  2. Is the fdrtool package appropriate in all cases for calculating adjusted p-values?
  3. If the answer to 2 is no, do you have an idea why my case doesn't work?

Bonus: would it be useful to offer the possibility to change the method? (I can work on it.)

Thanks for your work and your help,
Eric.

test_gsea(): EnrichR website not responding

Hi!

I've been running into an issue with the test_gsea() function for some weeks. The function aborts with the error message "EnrichR website not responding". The website itself is up and running. The function used to work just fine a month or two ago and suddenly stopped working. Is this a known issue?

Appreciate everyone's time taken to help with this query!

Best
Philipp

TMT analysis question using DEP

When I use DEP to analyze data from a MaxQuant result, the DEP description says "TMTdata <- example_data" and "Exp_Design <- example_Exp_Design", but I still cannot find these objects. So I don't know how to handle a TMT dataset processed by MaxQuant. Could you give me some suggestions?

Error on make_se

Hello there,

First of all, thanks for the amazing tool!

I'm trying to load some LFQ data into DEP following the instructions in the vignette. All going ok until I try to make the summarized experiment when I get the following error:

 Error: specified 'columns' should be numeric
 Run make_se_parse() with the appropriate columns as argument

I checked the class of each parameter needed for make_se, and they are in the correct classes!

Here is the proteinGroup file and the ExperimentalDesign files.

This is my code:

library("DEP")
library("dplyr")

wd <- "/home/rtorreglosa/Documents/LFQ_analysis/"
setwd(wd)

data_c <- read.table(file = "LFQ_DEP_analysis_adjusted.txt", header=T, sep="\t", stringsAsFactors = FALSE)

data_c_filt <- filter(data_c, Potential.contaminant != "+")

data_c_filt$Gene.names %>% duplicated() %>% any()

data_c %>% group_by(Gene.names) %>% summarize(frequency = n()) %>% 
arrange(desc(frequency)) %>% filter(frequency > 1)

data_c_unique <- make_unique(data_c_filt, "Gene.names", "Protein.IDs", delim = ";")

data_c_filt$name %>% duplicated() %>% any()

data_c_filt$Razor...unique.peptides

LFQ_columns_c <- grep("LFQ.", colnames(data_c_unique))

experimental_design_c <- read.table(file = "LFQ_Exp_Design.txt", header=T, sep="\t", stringsAsFactors = FALSE)

data_c_se <- make_se(data_c_unique, LFQ_columns_c, experimental_design_c)

Any ideas on how to solve this issue are welcome!

Thanks!

Ramon
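
One reading of this error is that it refers to the contents of the selected data columns, not to the class of the index vector. If any LFQ column was read as character, the check would fail even though LFQ_columns_c itself is numeric. This is an assumption, but another issue on this page converts its LFQ columns with as.numeric() before calling make_se. A base R sketch with invented data:

```r
# Invented table in which one LFQ column was read as character
proteins <- data.frame(
  name = c("A", "B", "C"),
  LFQ.intensity.A_1 = c("1000", "NaN", "2000"),
  LFQ.intensity.A_2 = c(1500, 0, 2500),
  stringsAsFactors = FALSE
)

# Anchored pattern so that only the intended columns match
lfq_cols <- grep("^LFQ\\.intensity", colnames(proteins))

# Which of the selected columns are not numeric?
is_num <- vapply(proteins[lfq_cols], is.numeric, logical(1))
names(is_num)[!is_num]

# Coerce all selected columns to numeric
proteins[lfq_cols] <- lapply(proteins[lfq_cols],
                             function(x) as.numeric(as.character(x)))
all(vapply(proteins[lfq_cols], is.numeric, logical(1)))
```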

Cannot download DEP package

The downloaded binary packages are in
/var/folders/n0/hj3srgsx5sd4p8_8p8_5v8wm0000gn/T//RtmpOb525A/downloaded_packages

if (!requireNamespace("BiocManager", quietly=TRUE))
+ install.packages("BiocManager")

BiocManager::install("DEP")
'getOption("repos")' replaces Bioconductor standard repositories, see
'help("repositories", package = "BiocManager")' for details.
Replacement repositories:
CRAN: https://cran.rstudio.com/
Bioconductor version 3.18 (BiocManager 1.30.22), R 4.3.1 (2023-06-16)
Old packages: 'htmlwidgets', 'MSnbase'
Update all/some/none? [a/s/n]:
install.packages("DEP")
Update all/some/none? [a/s/n]:
a
trying URL 'https://cran.rstudio.com/bin/macosx/big-sur-arm64/contrib/4.3/htmlwidgets_1.6.2.tgz'
Content type 'application/x-gzip' length 803888 bytes (785 KB)
==================================================
downloaded 785 KB

trying URL 'https://bioconductor.org/packages/3.18/bioc/bin/macosx/big-sur-arm64/contrib/4.3/MSnbase_2.27.1.tgz'
Content type 'application/x-gzip' length 8598247 bytes (8.2 MB)

downloaded 8.2 MB

The downloaded binary packages are in
/var/folders/n0/hj3srgsx5sd4p8_8p8_5v8wm0000gn/T//RtmpOb525A/downloaded_packages
Warning message:
package(s) not installed when version(s) same as or greater than current; use force = TRUE to re-install: 'DEP'

install.packages("DEP", force = TRUE)
Error in install.packages : Updating loaded packages

Restarting R session...

install.packages("DEP", force = TRUE)
Warning in install.packages :
package ‘DEP’ is not available for this version of R

A version of this package for your version of R might be available elsewhere,
see the ideas at
https://cran.r-project.org/doc/manuals/r-patched/R-admin.html#Installing-packages

filter_missval function

Hi! I have a question. I don't completely understand how the function filter_missval() works, specifically the argument 'thr', despite the explanation given in the source code: "The dataset is filtered for proteins that have a maximum of 'thr' missing values in at least one condition."

Can you be more specific, please?

If I run it with thr = 0, I should see the same number of proteins for all my conditions, right? That is not what I see.
Thanks!
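
To make the quoted sentence concrete, here is a minimal base R sketch that applies the rule exactly as worded ("a maximum of 'thr' missing values in at least one condition") to a toy matrix. The helper keep_by_rule and the data are hypothetical; this is a reading of the documentation sentence, not DEP's actual implementation.

```r
# Toy intensities: 4 proteins x 6 samples, conditions A and B with 3 replicates each.
intensities <- matrix(
  c(10, 11, 12, 20, 21, 22,   # complete in both conditions
    10, 11, 12, NA, NA, NA,   # complete in A, entirely missing in B
    10, NA, 12, 20, NA, 22,   # one missing value in each condition
    NA, NA, 12, NA, 21, NA),  # two missing values in each condition
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("prot", 1:4),
                  c("A_1", "A_2", "A_3", "B_1", "B_2", "B_3"))
)
condition <- c("A", "A", "A", "B", "B", "B")

# Hypothetical helper: keep a protein if at least one condition
# has no more than `thr` missing values.
keep_by_rule <- function(mat, condition, thr) {
  # Rows are proteins, columns are conditions, entries are missing-value counts.
  missing_per_condition <- sapply(unique(condition), function(cond) {
    rowSums(is.na(mat[, condition == cond, drop = FALSE]))
  })
  apply(missing_per_condition <= thr, 1, any)
}

kept_thr0 <- keep_by_rule(intensities, condition, thr = 0)
kept_thr1 <- keep_by_rule(intensities, condition, thr = 1)
print(kept_thr0)
print(kept_thr1)
```

Under this reading, thr = 0 keeps prot2 even though it is entirely missing in condition B, because condition A is complete. That would explain why the number of quantified proteins can still differ between conditions after filtering.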

Error when making SE

Hi Arne,

Not sure if this is an issue for here or for the Bioconductor forum, so let me know if I should re-post there. I'm trying to load some MaxQuant data into DEP following the instructions in the vignette. Everything goes fine until I try to make the SummarizedExperiment, when I get the following error:

data_se<- make_se(data_unique, LFQ_columns, experimental_design)
Error: words is not a character vector

I've checked my input files and I can't see any formatting differences between them and the test data, apart from the fact that I'm using the full proteinGroups file.

Here are the input proteinGroups file, the experimental design, and a markdown document with my code.

Any ideas as to what's happening are welcome!

Thanks!

Sophie

plot_frequency function run-off on plot

Great package, thanks for your time. I'm new to this, so apologies if this is a rookie mistake or posted in the wrong place.

I'm loading a rather large dataset of multiple conditions with >30 samples of LFQ data. When I pass it to plot_frequency, the output gets cut off, both with plot = TRUE and with plot = FALSE. I'm not sure which would be harder to implement, but being able to modify the width of the bars or to plot a range of the data would be nice.

Either way, thanks.

EDIT: Clarified the issue.

Cannot install on mac

Hello,

I am having trouble installing and loading DEP. Everything I have tried is listed below.

I used either BiocManager::install('DEP', type = 'source', dependencies = TRUE) or BiocManager::install('DEP'). The message is the following:

'getOption("repos")' replaces Bioconductor standard repositories, see
'?repositories' for details

replacement repositories:
    CRAN: https://cran.r-project.org/


Bioconductor version 3.14 (BiocManager 1.30.16), R 4.1.2 (2021-11-01)

Installing package(s) 'DEP'

also installing the dependencies ‘ncdf4’, ‘mzR’, ‘MSnbase’


Warning message in .inet_warning(msg):
“installation of package ‘ncdf4’ had non-zero exit status”
Warning message in .inet_warning(msg):
“installation of package ‘mzR’ had non-zero exit status”
Warning message in .inet_warning(msg):
“installation of package ‘MSnbase’ had non-zero exit status”
Warning message in .inet_warning(msg):
“installation of package ‘DEP’ had non-zero exit status”
Updating HTML index of packages in '.Library'

Making 'packages.html' ...
 done

I am using macOS Monterey Version 12.2.1
I am using R in the Jupyter notebook in an Anaconda environment. I had exactly the same environment in Windows and used DEP successfully.
I cannot install DEP using RStudio either in this macOS.
I tried to install using devtools through github and got the same error.
I installed Xcode and GCC, but that did not solve the problem.
I can use other R and Bioconductor packages with no problem.

Error in using interactive analysis using the DEP Shiny apps

Hi there,

I am facing an issue when trying to start the interactive session analysis.

library("DEP")

# For LFQ analysis
run_app("LFQ")
Loading required package: shiny

Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

filter, lag

The following objects are masked from ‘package:base’:

intersect, setdiff, setequal, union

Loading required package: MatrixGenerics
Loading required package: matrixStats

Attaching package: ‘matrixStats’

The following object is masked from ‘package:dplyr’:

count

Attaching package: ‘MatrixGenerics’

The following objects are masked from ‘package:matrixStats’:

colAlls, colAnyNAs, colAnys, colAvgsPerRowSet,
colCollapse, colCounts, colCummaxs, colCummins,
colCumprods, colCumsums, colDiffs, colIQRDiffs,
colIQRs, colLogSumExps, colMadDiffs, colMads,
colMaxs, colMeans2, colMedians, colMins,
colOrderStats, colProds, colQuantiles, colRanges,
colRanks, colSdDiffs, colSds, colSums2,
colTabulates, colVarDiffs, colVars,
colWeightedMads, colWeightedMeans,
colWeightedMedians, colWeightedSds,
colWeightedVars, rowAlls, rowAnyNAs, rowAnys,
rowAvgsPerColSet, rowCollapse, rowCounts,
rowCummaxs, rowCummins, rowCumprods, rowCumsums,
rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
rowMadDiffs, rowMads, rowMaxs, rowMeans2,
rowMedians, rowMins, rowOrderStats, rowProds,
rowQuantiles, rowRanges, rowRanks, rowSdDiffs,
rowSds, rowSums2, rowTabulates, rowVarDiffs,
rowVars, rowWeightedMads, rowWeightedMeans,
rowWeightedMedians, rowWeightedSds,
rowWeightedVars

Loading required package: GenomicRanges
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

clusterApply, clusterApplyLB, clusterCall,
clusterEvalQ, clusterExport, clusterMap, parApply,
parCapply, parLapply, parLapplyLB, parRapply,
parSapply, parSapplyLB

The following objects are masked from ‘package:dplyr’:

combine, intersect, setdiff, union

The following objects are masked from ‘package:stats’:

IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

anyDuplicated, append, as.data.frame, basename,
cbind, colnames, dirname, do.call, duplicated,
eval, evalq, Filter, Find, get, grep, grepl,
intersect, is.unsorted, lapply, Map, mapply,
match, mget, order, paste, pmax, pmax.int, pmin,
pmin.int, Position, rank, rbind, Reduce, rownames,
sapply, setdiff, sort, table, tapply, union,
unique, unsplit, which.max, which.min

Loading required package: S4Vectors

Attaching package: ‘S4Vectors’

The following objects are masked from ‘package:dplyr’:

first, rename

The following objects are masked from ‘package:base’:

expand.grid, I, unname

Loading required package: IRanges

Attaching package: ‘IRanges’

The following objects are masked from ‘package:dplyr’:

collapse, desc, slice

Loading required package: GenomeInfoDb
Loading required package: Biobase
Welcome to Bioconductor

Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages
'citation("pkgname")'.

Attaching package: ‘Biobase’

The following object is masked from ‘package:MatrixGenerics’:

rowMedians

The following objects are masked from ‘package:matrixStats’:

anyMissing, rowMedians

Attaching package: ‘shinydashboard’

The following object is masked from ‘package:graphics’:

box

Error: 'imputeMethods' is not an exported object from 'namespace:MSnbase'

The DEP library loads without problems after installation, and the MSnbase package is also installed.
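
A quick way to confirm what the error message says is to ask the installed namespace directly whether it exports the symbol. The sketch below uses only base R; exports_symbol is a hypothetical helper written for illustration, and the result for MSnbase depends on the version installed on the machine where it runs.

```r
# Hypothetical helper: TRUE/FALSE if the package is installed, NA if it is not.
exports_symbol <- function(pkg, symbol) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(NA)
  }
  symbol %in% getNamespaceExports(pkg)
}

# Sanity check with a package that ships with every R installation.
stats_has_sd <- exports_symbol("stats", "sd")
print(stats_has_sd)

# The symbol named in the error above.
print(exports_symbol("MSnbase", "imputeMethods"))
```

A FALSE for the second call would mean the installed MSnbase does not export imputeMethods, so the installed DEP and MSnbase versions may not match.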

The session info for the same:

sessionInfo(package = NULL)
R version 4.1.1 (2021-08-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.2 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3

locale:
[1] LC_CTYPE=en_IN.UTF-8 LC_NUMERIC=C LC_TIME=en_IN.UTF-8
[4] LC_COLLATE=en_IN.UTF-8 LC_MONETARY=en_IN.UTF-8 LC_MESSAGES=en_IN.UTF-8
[7] LC_PAPER=en_IN.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_IN.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] shinydashboard_0.7.1 SummarizedExperiment_1.22.0 Biobase_2.52.0
[4] GenomicRanges_1.44.0 GenomeInfoDb_1.28.1 IRanges_2.26.0
[7] S4Vectors_0.30.0 BiocGenerics_0.38.0 MatrixGenerics_1.4.2
[10] matrixStats_0.60.1 tibble_3.1.4 dplyr_1.0.7
[13] shiny_1.6.0 DEP_1.14.0

loaded via a namespace (and not attached):
[1] ProtGenerics_1.24.0 bitops_1.0-7 doParallel_1.0.16 RColorBrewer_1.1-2
[5] MSnbase_2.18.0 tools_4.1.1 DT_0.18 utf8_1.2.2
[9] R6_2.5.1 affyio_1.62.0 tmvtnorm_1.4-10 DBI_1.1.1
[13] colorspace_2.0-2 GetoptLong_1.0.5 withr_2.4.2 tidyselect_1.1.1
[17] compiler_4.1.1 preprocessCore_1.54.0 Cairo_1.5-12.2 DelayedArray_0.18.0
[21] sandwich_3.0-1 scales_1.1.1 mvtnorm_1.1-2 readr_2.0.1
[25] affy_1.70.0 digest_0.6.27 XVector_0.32.0 pkgconfig_2.0.3
[29] htmltools_0.5.2 fastmap_1.1.0 limma_3.48.3 htmlwidgets_1.5.3
[33] rlang_0.4.11 GlobalOptions_0.1.2 impute_1.66.0 shape_1.4.6
[37] generics_0.1.0 zoo_1.8-9 mzID_1.30.0 BiocParallel_1.26.2
[41] RCurl_1.98-1.4 magrittr_2.0.1 GenomeInfoDbData_1.2.6 MALDIquant_1.20
[45] Matrix_1.3-4 Rcpp_1.0.7 munsell_0.5.0 fansi_0.5.0
[49] MsCoreUtils_1.4.0 imputeLCMD_2.0 lifecycle_1.0.0 vsn_3.60.0
[53] MASS_7.3-54 zlibbioc_1.38.0 plyr_1.8.6 grid_4.1.1
[57] promises_1.2.0.1 crayon_1.4.1 lattice_0.20-44 hms_1.1.0
[61] circlize_0.4.13 mzR_2.26.1 ComplexHeatmap_2.8.0 pillar_1.6.2
[65] rjson_0.2.20 codetools_0.2-18 XML_3.99-0.7 glue_1.4.2
[69] pcaMethods_1.84.0 BiocManager_1.30.16 httpuv_1.6.2 tzdb_0.1.2
[73] png_0.1-7 vctrs_0.3.8 foreach_1.5.1 tidyr_1.1.3
[77] gtable_0.3.0 purrr_0.3.4 clue_0.3-59 norm_1.0-9.5
[81] assertthat_0.2.1 ggplot2_3.3.5 mime_0.11 xtable_1.8-4
[85] later_1.3.0 ncdf4_1.17 iterators_1.0.13 gmm_1.6-6
[89] cluster_2.1.2 ellipsis_0.3.2

P adjusted bigger than p value?

Hi there,
I am attaching my code, my inputs and my results after using DEP.
I do not understand how, for one condition, the p value (column E of the output file "data_results.csv") is bigger than its corresponding adjusted p value (column H), which in the end gives me an inflated number of significant proteins (column K).
Thank you

setwd("/media/mariano/Elements1/Jean/12022024_Jesus/")
library(DEP)
library(tidyverse)
library(data.table)
set.seed(123)
data <- read.table("Jesus 06-01-2024 proteinGroups.csv", sep = "\t", header = TRUE, stringsAsFactors = FALSE,dec=".")
#data <- dplyr::filter(data, Reverse != "+",is.na(Potential.contaminant))
data$Gene.names %>% duplicated() %>% any()
data %>% group_by(Gene.names) %>% summarize(frequency = n()) %>%
  arrange(desc(frequency)) %>% filter(frequency > 1)
data_unique <- make_unique(data, "Gene.names", "Protein.IDs", delim = ";")
data_unique$name %>% duplicated() %>% any()
LFQ_data <- dplyr::select(data_unique,contains("LFQ"))
colnames(LFQ_data)<-gsub("LFQ.intensity.","",as.character(colnames(LFQ_data)))
colnames(LFQ_data)<-gsub("^0","",as.character(colnames(LFQ_data)))
LFQ_columns <- grep("LFQ", colnames(data_unique))
exp_design <- read.table(file = "Exp_design.csv", sep = "\t", header = T)
exp_design$label <- as.character(exp_design$label)
colnames(data_unique)<-gsub("LFQ.intensity.","",as.character(colnames(data_unique)))
colnames(data_unique)<-gsub("^0","",as.character(colnames(data_unique)))
data_se <- DEP::make_se(data_unique, columns=LFQ_columns, exp_design)
plot_frequency(data_se)
plot_numbers(data_se)
plot_coverage(data_se)
data_filt <- filter_missval(data_se, thr = 2)
plot_coverage(data_filt)
plot_numbers(data_filt)
data_norm <- normalize_vsn(data_filt)
plot_normalization(data_filt, data_norm)
plot_missval(data_filt)
plot_detect(data_filt)
data_imp <- impute(data_norm, fun = "MinProb", q = 0.01)
plot_imputation(data_norm, data_imp)
data_diff_all_contrasts <- test_diff(data_imp, type = "all")
dep <- add_rejections(data_diff_all_contrasts, alpha = 0.05)
plot_pca(dep, x = 1, y = 2, n = 7669, point_size = 3, label = F, indicate = "condition") +
  geom_text(label=exp_design$sample_name)
data_results <- get_results(dep)
data_results %>% filter(significant) %>% nrow()
colnames(data_results)
write.table(data_results,"data_results.csv", sep = "\t", col.names = T, row.names = F, quote = F)
plot_heatmap(dep, type = "centered", kmeans = F, 
             col_limit = 4, show_row_names = F,
             indicate = c("condition"), row_font_size = 3, column_labels = exp_design$sample_name)
plot_volcano(dep, contrast = "WT_vs_KO", label_size = 2, add_names = TRUE)
save.image(file="Data.RData")

data_results.csv
Jesus 06-01-2024 proteinGroups.csv
Exp_design.csv
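
For reference, with the Benjamini-Hochberg procedure as implemented in base R's p.adjust, an adjusted p value is never smaller than the raw p value it was derived from. The sketch below uses made-up p values and does not claim that DEP computes its adjusted column this way. If DEP uses a different procedure, this property need not hold.

```r
# Made-up raw p values, for illustration only.
raw_p <- c(1e-06, 4e-04, 3e-03, 0.02, 0.2, 0.7)

# Benjamini-Hochberg adjustment from the stats package.
bh_p <- p.adjust(raw_p, method = "BH")

# Compare raw and adjusted values side by side.
comparison <- data.frame(raw_p = raw_p, bh_p = bh_p, adjusted_is_larger = bh_p >= raw_p)
print(comparison)
```

Running the same comparison on the p value and adjusted p value columns of data_results.csv would show which rows break this expectation.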

Handling Technical and Biological Replicates

Hi, I have an experimental design as follows:

WT_1 WT_2 WT_3 WT_4 KO_1 KO_2 KO_3 KO_4

*_1 and *_2 are biological replicates while *_3 and *_4 are technical replicates.

Could you explain how to handle technical and biological replicates in DEP?
I understand that I can use the option type = "all" to make pairwise comparisons:
data_diff_all_contrasts <- test_diff(data_imp, type = "all")
I have done that, but is it enough to take the replicates into account, or do I need to add the design_formula option to the call?
If design_formula needs to be included, could you explain how to set it up for my experimental design?
Thank you very much
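
As background for the design_formula question, the base R sketch below shows what a formula such as ~ 0 + condition expands to for the eight samples listed above. The label, condition and replicate columns are assumed to mirror a DEP experimental design table. Whether an extra term is the right way to model technical replicates is not settled by this sketch.

```r
# Experimental design for the eight samples in the question (assumed layout).
design <- data.frame(
  label = c("WT_1", "WT_2", "WT_3", "WT_4", "KO_1", "KO_2", "KO_3", "KO_4"),
  condition = rep(c("WT", "KO"), each = 4),
  replicate = rep(1:4, times = 2),
  stringsAsFactors = FALSE
)

# One indicator column per condition, no intercept.
mm <- model.matrix(~ 0 + condition, data = design)
rownames(mm) <- design$label
print(mm)
```

With this formula, all four samples of a condition are treated as exchangeable replicates, so the model does not distinguish biological from technical replicates.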
