
vite's Introduction

Please use GitHub issues to report bugs and to request features

Installation

  1. In order to install this package, you must be able to compile C++ source packages on your system:

    • Windows: Install the Rtools package
    • macOS: Install Xcode from Apple, which is freely available on the App Store. Depending on the version of Xcode you are using, you might also need to install the "Command Line Tools" package separately. Please refer to the Xcode documentation
    • Linux: Install GCC. Refer to the documentation of your distribution to find the specific package name
  2. Make sure you have a recent (>= 1.9) version of devtools installed

install.packages("devtools")
  3. Install the flowCore package
source("http://bioconductor.org/biocLite.R")
biocLite("flowCore")
  4. Install vite with the following command
devtools::install_github("ParkerICI/vite")
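
The biocLite.R script used above to install flowCore has been retired by Bioconductor. On R 3.5 or later, Bioconductor packages are normally installed through the BiocManager package instead. A minimal sketch of that alternative, assuming a recent R installation and network access:

```r
# Install BiocManager from CRAN only if it is not already available
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

# Install flowCore from Bioconductor
BiocManager::install("flowCore")
```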

Usage

This package enables the analysis of single-cell data using graphs, both unsupervised graphs and Scaffold maps. While the package is designed to work with clusters generated by the grappolo package, any kind of tabular data can be used as input

The documentation of each function can be accessed directly within R. The following snippets demonstrate typical usage. Please refer to the full documentation for a complete breakdown of all the options

Creating an unsupervised graph

# Use as input files that have been generated using grappolo
input.files <- c("A.clustered.txt", "B.clustered.txt")

# Optional: Define a table of sample-level metadata. All the nodes derived from the corresponding cluster file will
# have vertex properties corresponding to this metadata ("response" and "pfs" in this example)
metadata.tab <- data.frame(filename = input.files, response = c("R", "NR"), pfs = c(12, 7))

# Define which columns contain variables that are going to be used to calculate similarities between the nodes
col.names <- c("foo", "bar", "foobar")


# The clusters in each one of the input files will be pooled together in a single graph
# This function also performs graph clustering by community detection. The community assignments are contained in
# the "community_id" vertex property of the resulting graph
G <- vite::get_unsupervised_graph_from_files(input.files, metadata.tab = metadata.tab, 
            metadata.filename.col = "filename", col.names = col.names, filtering.threshold = 15)

# Write the resulting graph in graphml format. 
vite::write_graph(G, "unsupervised.graphml")

By default the get_unsupervised_graph_from_files function will also process the data contained in the rds files generated by the grappolo package. A folder called clusters_data will be created, containing a sub-folder for each sample. Each subfolder contains multiple rds files, one for each cluster, containing the data for individual cells in that cluster. This data is used for visualization by the panorama package
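
The per-cluster rds files can be inspected directly with base R. The sketch below assumes the layout described above (clusters_data, then one sub-folder per sample, then one rds file per cluster) and that each file holds a table with one row per cell. The helper name read_cluster_data and the sample folder name "A" are hypothetical and are not part of vite.

```r
# Hypothetical helper (not part of vite): load every per-cluster rds file
# found in one sample sub-folder and return them as a named list
read_cluster_data <- function(sample.dir) {
    rds.files <- list.files(sample.dir, pattern = "\\.rds$", full.names = TRUE,
                            ignore.case = TRUE)
    clusters <- lapply(rds.files, readRDS)
    names(clusters) <- tools::file_path_sans_ext(basename(rds.files))
    clusters
}

# Example call; the sample folder name "A" is an assumption
sample.dir <- file.path("clusters_data", "A")
if (dir.exists(sample.dir)) {
    clusters <- read_cluster_data(sample.dir)
    # Number of rows (cells) stored for each cluster
    print(sapply(clusters, nrow))
}
```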

Running a Scaffold analysis

This code snippet demonstrates how to construct Scaffold maps. This assumes that the data for the landmark nodes, i.e. the gated populations, is in a subfolder called gated. The gated populations have to be provided as single FCS files (one for each population). The software will split the name of the FCS files using "_" as separator and the last field will be used as the population name. For instance if a file is called foo_bar_foobar_Bcells.fcs the corresponding population will be called Bcells in the Scaffold result.

# Use as input files that have been generated using grappolo
input.files <- c("A.clustered.txt", "B.clustered.txt")

# Define which columns contain variables that are going to be used to calculate similarities between the nodes
col.names <- c("foo", "bar", "foobar")

# Load the data for the landmarks
landmarks.data <- vite::load_landmarks_from_dir("gated/", asinh.cofactor = 5, transform.data = TRUE)
    
# Run the analysis. By default results will be saved in a directory called "scaffold_result"
vite::run_scaffold_analysis(input.files, ref.file = input.files[1], 
                        landmarks.data = landmarks.data, col.names = col.names)
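
The naming rule for landmark files described above can be illustrated in a few lines of base R. This is only a sketch of the rule as stated (split the file name on "_" and keep the last field). population_name is a hypothetical helper, and vite itself may implement the rule differently.

```r
# Hypothetical helper (not part of vite): derive the population name from
# the name of a gated FCS file by taking the last "_"-separated field
population_name <- function(fcs.file) {
    base <- tools::file_path_sans_ext(basename(fcs.file))
    fields <- strsplit(base, "_", fixed = TRUE)[[1]]
    fields[length(fields)]
}

# The example file name from the text above
print(population_name("gated/foo_bar_foobar_Bcells.fcs"))
```

Running this on the landmark files before the analysis is a quick way to check that every file maps to the intended population name.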

By default the output of the analysis will be saved in a folder called scaffold_result. The directory will contain a graphml file for each Scaffold map and two sub-folders called clusters_data and landmarks_data.

These folders contain downsampled single-cell data for the clusters and landmarks, to be used for visualization. The clusters_data folder will contain a separate sub-folder for each graphml file, containing the data specific to that sample. The data is split in multiple rds files, one for each cluster (or landmark in landmarks_data).

If the Scaffold analysis was constructed from data that was pooled before clustering (i.e. using grappolo::cluster_fcs_files_groups), the clusters_data folder will also contain a subfolder called pooled, containing the pooled data, in addition to the sample-specific folders described above.

Using the GUI

A GUI is available to launch either an unsupervised graph analysis or a Scaffold analysis. The GUI allows you to specify all the input options in a graphical environment, instead of having to write R code.

To launch the GUI type the following in your R console

vite::vite_GUI()

When the GUI starts you will be prompted to select a working directory. This directory must contain all the files that you want to include in the analysis. Select any file in that directory, and the directory that contains the file will be selected as working directory.

Note that if you are running an unsupervised graph analysis and you are including a file with sample metadata, this file needs to include a column called filename that matches each row of the metadata table to the name of one of the clustered.txt files you are using as input for the analysis. Please refer to the documentation of the R function get_unsupervised_graph_from_files for more information about metadata
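
A metadata file with the required filename column can be prepared from R. The sketch below reuses the example values from the unsupervised graph snippet. The tab-delimited format and the output file name are assumptions, since the format expected by the GUI is not stated here.

```r
# Names of the clustered files used as input for the analysis
input.files <- c("A.clustered.txt", "B.clustered.txt")

# One row per input file; "filename" links each row to its clustered file
metadata.tab <- data.frame(filename = input.files,
                           response = c("R", "NR"),
                           pfs = c(12, 7),
                           stringsAsFactors = FALSE)

# Every input file should have a matching metadata row
stopifnot(all(input.files %in% metadata.tab$filename))

# Written to a temporary directory here; in practice the file would go in the
# working directory selected in the GUI (tab-delimited format is an assumption)
metadata.file <- file.path(tempdir(), "metadata.txt")
write.table(metadata.tab, metadata.file, sep = "\t", row.names = FALSE, quote = FALSE)
```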

vite's People

Contributors

pfgherardini

vite's Issues

Bug report - complete_forceatlas2test - overlap_method

Hello,

I noticed that the complete_forceatlas2 function does not accept overlap.method = "expand". I debugged the issue and found a small typo in the if statement in complete_forceatlas2, as follows:

This condition:

else if(overlap_method == "expand")
G <- adaptive_expand(G, overlap.iter)

needs to be changed to:

else if(overlap.method == "expand")
G <- adaptive_expand(G, overlap.iter)

Thanks,

Ghaith,

Export data as fcs files containing community_id data

Hi,

I want to use the community clustering (by vite) to make a set of landmark files for a second map. I could not find the community id in the rds data generated by vite. Is it possible to generate fcs files with this information?

I used the following code to make the fcs files;

rdsfilelist <- list.files("path-to-data/", full.names = TRUE)

# Read every rds file and stack the per-cluster tables into a single table
data <- do.call(rbind, lapply(rdsfilelist, readRDS))

flow.frame <- as_flowFrame(as.matrix(data))
write_flowFrame(flow.frame, "your_file.fcs")

Can't install vite, compiler problem with Mac?

Hi,

I have installed both grappolo and panorama successfully. However, I encountered an issue while trying to install vite.
After spending half a day trying to figure out what is happening, I believe it is because R/RStudio doesn't recognize the Xcode Command Line Tools that I have installed on my Mac.

I am running a Macbook pro with the following specs:

[Screenshot: Snip20200709_2]

One approach I found online is to uninstall Xcode, R and RStudio, then install the Command Line Tools again via
' xcode-select --install '
in the terminal. Finally, I reinstalled R and RStudio via
'brew cask install r '
'brew cask install rstudio '.

Unfortunately, this didn't resolve the issue, and I still get the same error messages asking me to install Build Tools (I clicked Yes each time, but nothing changed).

[Screenshot: Snip20200709_4]
[Screenshot: Snip20200709_3]

I also have gcc installed via brew.
Here is the Command Line Tools library on my Mac.

[Screenshot: Snip20200709_5]

Has anyone encountered this type of issue before and knows how to resolve it?

Thanks in advance!

Creating landmarks

Hi Federico

I'm trying to build my own landmark file but unlike the sample data from ParkerICI to try Scaffold, I use different cytof panels for every cell type (T cells, B cells, etc)

Is there any way to use all these fcs files generated from different panels to create a single landmark file?

Thanks
Juan

Error in `[.data.frame`(tab, , col.names) : undefined columns selected

Hi,

I'm trying to execute "run_scaffold_analysis" function but I got the following error message:

Processing /XXXX/XXX/XX/Tissue.pooled.clustered.txt
Running with Edge weight: 7.000000
Error in `[.data.frame`(tab, , col.names) : undefined columns selected

Could you give me some tips to solve this problem? You will find my code in the attached text file.
The input FCS files come from a FACSymphony.

vite.R.txt
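
This error generally means that at least one entry of col.names is not a column of the table being read. A quick way to find the offending names is sketched below, assuming the clustered file is a tab-delimited text file with a header row. missing_columns is a hypothetical helper, not a vite function.

```r
# Hypothetical helper (not part of vite): report which requested column names
# are absent from the header of a clustered file
missing_columns <- function(clustered.file, col.names) {
    # check.names = FALSE keeps the header exactly as written in the file
    tab <- read.table(clustered.file, header = TRUE, sep = "\t",
                      check.names = FALSE, stringsAsFactors = FALSE)
    setdiff(col.names, names(tab))
}

# Example call; the file name and column names are placeholders
clustered.file <- "A.clustered.txt"
if (file.exists(clustered.file))
    print(missing_columns(clustered.file, c("foo", "bar", "foobar")))
```

An empty result means every requested column is present; any names that are printed need to be corrected in col.names or in the input file.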

vite::

Hi Federico,

Just FYI.
The command to open the GUI is misleading, because it opens the GUI even if the library is not loaded.
Someone like me, who is not very familiar with coding, will have a hard time figuring out why it is not working, since the code to load the library is not written anywhere on the instruction page.

Cheers!
Chiara

Problem with forceatlas2.R stopping.tolerance

[What follows actually seems to be a mistake on my part]
Dear Pier Federico,

I have noticed a small problem with your layout_forceatlas2() function that could be easily solved.
Your package is really useful to me but it appears that the force atlas function is pretty slow on relatively large networks (above 2000 nodes).
In your function, the stopping.tolerance parameter affects the speed of the layout. However, the function does not really let the user set this parameter, because it is overwritten within the function:

v.count <- igraph::vcount(G)

    if(v.count >= 2000)
        barnes.hut <- TRUE
    if(v.count > 2000)
        stopping.tolerance <- 0.01
    else if(v.count > 800)
        stopping.tolerance <- 0.005
    else
        stopping.tolerance <- 0.001

We are thus stuck with a 0.01 value. The ForceAtlas2 creators advise a value of 0.1 below 5000 nodes and 1 above. So being able to change the tolerance value could really help to speed up the function. I imagine this could be fixed easily by adding a fixing.tolerance parameter, set to FALSE by default:

v.count <- igraph::vcount(G)

    if(v.count >= 2000)
        barnes.hut <- TRUE

if(!fixing.tolerance){
    if(v.count > 2000)
        stopping.tolerance <- 0.01
    else if(v.count > 800)
        stopping.tolerance <- 0.005
    else
        stopping.tolerance <- 0.001
}

Many thanks for your work on this package.
Best,

Aurélien
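
An alternative to a separate flag is to treat a missing tolerance as "use the size-based default". The sketch below only illustrates that idea with the thresholds quoted in this post. choose_stopping_tolerance is a hypothetical helper and does not describe how layout_forceatlas2 actually behaves.

```r
# Hypothetical helper (not part of vite): use the caller's tolerance when one
# is supplied, otherwise fall back to the size-based defaults quoted above
choose_stopping_tolerance <- function(v.count, stopping.tolerance = NULL) {
    if (!is.null(stopping.tolerance))
        return(stopping.tolerance)
    if (v.count > 2000)
        0.01
    else if (v.count > 800)
        0.005
    else
        0.001
}

# Default for a large graph, then an explicit override
print(choose_stopping_tolerance(3000))
print(choose_stopping_tolerance(3000, stopping.tolerance = 0.1))
```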

error in scaffold map

Dear Federico,

I have a problem generating a Scaffold map. To show you the error I have attached the relevant part of the code below. I would like to ask whether you have ever experienced the same problem. Let me quickly explain how I prepare the files for analysis: I take the results of FlowSOM clustering and write txt and rds files (the same file format that grappolo generates). I don't have any problems making an unsupervised graph. However, when I add landmark files I can't finish the analysis. Those files were written as a flowSet and exported from R.

Thank you in advance!

vite::run_scaffold_analysis(input.files, ref.file = "clustered_single_samples/allevents.fcs.clustered.txt", landmarks.data, col.names)
Processing clustered_single_samples/allevents.fcs.clustered.txt
Running with Edge weight: 7.000000
Hard removing some connections to landmarks using threshold: 1.000000
First iteration
Stopping tolerance: 0.001000

Total number of iterations: 2844
Second iteration with prevent overalp
Stopping tolerance: 0.001000

Total number of iterations: 3141
Adding inter-cluster connections with markers:
ICAM PD_L2 CD163 ..........

Weight factor:0.700000
First iteration
Stopping tolerance: 0.001000

Total number of iterations: 13410
Second iteration with prevent overalp
Stopping tolerance: 0.001000

Total number of iterations: 3294

Error in V(g.temp)$highest_scoring_edge[i] <- max.edge :
replacement has length zero

Can't install vite; command line issue?

Hi,

I've managed to install both grappolo and panorama, but I can't get vite to work. I have installed Xcode and the appropriate command line tools. If I paste "xcode-select -p" in the terminal I get a path to Xcode.app/.... . If I run "xcode-select --install" I get:
xcode-select: error: command line tools are already installed, use "Software Update" to install updates.

When I try to install:

devtools::install_github("ParkerICI/vite")
Downloading GitHub repo ParkerICI/vite@master
✔ checking for file ‘/private/var/folders/_8/fpg279xx6lg2r8dhph4jpl0d91tsw8/T/RtmpPob3VF/remotes51833b151a/ParkerICI-vite-33cecc0/DESCRIPTION’ ...
─ preparing ‘vite’:
✔ checking DESCRIPTION meta-information ...
─ cleaning src
─ checking for LF line-endings in source and make files and shell scripts
─ checking for empty or unneeded directories
─ building ‘vite_0.4.6.tar.gz’
Warning: invalid uid value replaced by that for user 'nobody'
Warning: invalid gid value replaced by that for user 'nobody'

Installing package into ‘/Users/mmasg/Library/R/3.5/library’
(as ‘lib’ is unspecified)

  • installing source package ‘vite’ ...
    ** libs
    /opt/local/bin/g++-mp-4.8 -arch x86_64 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I"/Users/mmasg/Library/R/3.5/library/Rcpp/include" -I/usr/local/include -fPIC -Wall -g -O2 -c RcppExports.cpp -o RcppExports.o
    In file included from /opt/local/include/gcc48/c++/cmath:44:0,
    from /Users/mmasg/Library/R/3.5/library/Rcpp/include/Rcpp/platform/compiler.h:100,
    from /Users/mmasg/Library/R/3.5/library/Rcpp/include/Rcpp/r/headers.h:59,
    from /Users/mmasg/Library/R/3.5/library/Rcpp/include/RcppCommon.h:29,
    from /Users/mmasg/Library/R/3.5/library/Rcpp/include/Rcpp.h:27,
    from RcppExports.cpp:4:
    /opt/local/lib/gcc48/gcc/x86_64-apple-darwin13/4.8.5/include-fixed/math.h:45:23: fatal error: sys/cdefs.h: No such file or directory
    #include <sys/cdefs.h>
    ^
    compilation terminated.
    make: *** [RcppExports.o] Error 1
    ERROR: compilation failed for package ‘vite’
  • removing ‘/Users/mmasg/Library/R/3.5/library/vite’
    Error in i.p(...) :
    (converted from warning) installation of package ‘/var/folders/_8/fpg279xx6lg2r8dhph4jpl0d91tsw8/T//RtmpPob3VF/file5181d8c05ae/vite_0.4.6.tar.gz’ had non-zero exit status

Warning: Error in read.table: more columns than column names

Files were clustered with grappolo.
When attempting to read the files into vite, this error comes up:

Warning: Error in read.table: more columns than column names
47: stop
46: read.table
45: [C:\Users\XXX\Documents\R\win-library\3.4\vite\shinyGUI/server.R#151]
2: shiny::runApp
1: vite_GUI

What causes this issue and what's the best way to fix it?
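
This read.table error typically means that at least one row has more fields than the header line. A sketch for locating such rows with base R follows, assuming the clustered file is tab-delimited. find_ragged_lines is a hypothetical helper and the file name is a placeholder.

```r
# Hypothetical helper: return the positions of lines whose number of fields
# differs from the header. Positions count non-blank lines, header = 1
find_ragged_lines <- function(path, sep = "\t") {
    # quote = "" and comment.char = "" make quotes and "#" count as plain text
    n.fields <- count.fields(path, sep = sep, quote = "", comment.char = "")
    which(n.fields != n.fields[1])
}

# Example call; the file name is a placeholder
clustered.file <- "A.clustered.txt"
if (file.exists(clustered.file))
    print(find_ragged_lines(clustered.file))
```

If the result is empty, the header and all rows agree on the number of fields, and the cause is more likely to be in how the file is parsed (for example the separator or a "#" character in a column name).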

Error in run_scaffold_analysis

Hello,
Thank you a lot for creating and maintaining this package.
I have a problem when I try to run run_scaffold_analysis:

library(vite)
files_list <- list.files(path = ".", pattern = "*.fcs", full.names = TRUE, ignore.case = TRUE)
input.files <- c("group1.clustered.txt", "group2.clustered.txt")
# Backticks are required because the frame name starts with a digit
col.names <- as.character(fs@frames$`683_myelo.fcs`@parameters@data$desc)
landmarks.data <- load_landmarks(files.list = files_list, asinh.cofactor = 5, transform.data = TRUE)
run_scaffold_analysis(input.files, ref.file = input.files[1],
                      landmarks.data = landmarks.data, col.names = col.names)
###############
Error in dimnames(x) <- dn : length of 'dimnames' [2] not equal to array extent

I have found that the problem is inside the function get_distances_from_landmarks, in this line:

colnames(dd) <- tab$cellType

When I looked inside the tab and dd objects I found this:
tab: [image]
dd: [image]

So the program is trying to assign the one-element vector tab$cellType = "myelo" to colnames(dd), which requires 50 elements.

I would like to ask you for help with this issue.

I'm using R 4.0.5, RStudio Server 1.4.1006 and Ubuntu 20.04.
You can download files from:
https://www.dropbox.com/s/dcia2xlhy11prjw/scaffold_issue.tar.gz?dl=0

Best,
Karol Jacek
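
The mismatch described in this report can be reproduced in isolation with base R. This is only a sketch: the 50 columns and the single name "myelo" come from the report, while the number of rows is arbitrary and the objects are stand-ins for the ones inside vite.

```r
# Stand-in for dd: a matrix with 50 columns (the row count is arbitrary)
dd <- matrix(0, nrow = 3, ncol = 50)
# Stand-in for tab$cellType: a single population name
cell.type <- "myelo"

# Assigning one name to 50 columns fails; capture the error message
msg <- tryCatch({
    colnames(dd) <- cell.type
    "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# A check like this would flag the mismatch before the assignment
print(length(cell.type) == ncol(dd))
```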
