renozao / faqr

Frequently Asked Questions on R: my personal Ask Just Once system for my friends' R problems...

faqr's Issues

sapply vs. apply

Hi,
I want to raise a painful subject...
Although the apply functions should make my life easier, they always confuse me.
I would like to take a simple data.frame and do a simple calculation on each of its rows.

Thanks,
Rachelly.
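
A minimal sketch of the row-wise case, assuming an all-numeric data.frame (the data and column names below are made up for illustration). apply() with MARGIN = 1 walks over rows, sapply() walks over whatever vector you hand it (here the row indices), and for simple sums the vectorised helpers are usually enough.

```r
# Toy data.frame standing in for the real one (column names are made up)
df <- data.frame(a = c(1, 2, 3), b = c(10, 20, 30), c = c(100, 200, 300))

# apply() with MARGIN = 1 visits each row; the data.frame is converted to a
# matrix first, so every row arrives as a plain vector
row_spread <- apply(df, 1, function(r) max(r) - min(r))

# sapply() over the row indices is an alternative that lets the function
# pick columns by name
row_spread2 <- sapply(seq_len(nrow(df)), function(i) df$c[i] - df$a[i])

# For plain sums or means the vectorised helpers are shorter and faster
row_totals <- rowSums(df)
```

Because apply() goes through a matrix, a data.frame with mixed column types would be coerced to character; in that case the sapply() form over row indices tends to be the safer choice.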

How to get the GSE accession from a given eset

I have an eset and want to know which GSE it was taken from. Is there a function on expression sets that supplies this information?
(I need this because I receive the eset inside a function and can't know the GSE in advance.)
Thanks!

Converting a value of a variable to a variable

I sometimes need to use the value of a variable as a variable in its own right, and don't know how to convert it.
For example, when I want to draw a plot using the following code:
ggplot(DF, aes(Disease, GeneOrPW, fill = Type)) + ...

Here DF is a data.frame that contains the columns "Disease", "Type", "PW_corr" and "Gene_corr", and
GeneOrPW is a variable that contains either the string "PW_corr" or the string "Gene_corr".

The error I currently get when trying to run the ggplot code is:
Error in eval(expr, envir, enclos) : object 'GeneOrPW' not found

I tried to change it to:
ggplot(DF_merged, aes(Disease, eval(parse(GeneOrPW)), fill = Type)) +
And got the error:
Error in parse(GeneOrPW) : object 'GeneOrPW' not found

Thanks!
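
One way to sketch this, assuming a reasonably recent ggplot2 (3.0.0 or later) where the `.data` pronoun is available inside aes(). The data below are invented; only the column names come from the question. In base R the same idea is `DF[[GeneOrPW]]`, which looks a column up by the value of the variable.

```r
# Invented data with the column names from the question
DF <- data.frame(
  Disease   = c("A", "B", "A", "B"),
  Type      = c("t1", "t1", "t2", "t2"),
  PW_corr   = c(0.1, 0.4, 0.3, 0.8),
  Gene_corr = c(0.2, 0.5, 0.6, 0.9)
)
GeneOrPW <- "PW_corr"

# Base R: [[ ]] selects the column whose name is stored in GeneOrPW
values <- DF[[GeneOrPW]]

# ggplot2: .data[[...]] does the same lookup inside aes()
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(DF, aes(Disease, .data[[GeneOrPW]], fill = Type)) +
    geom_col(position = "dodge") +
    ylab(GeneOrPW)
}
```

As for the second attempt: the first argument of parse() is a file, so a string has to be passed as parse(text = GeneOrPW).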

Non circular source in R

Is there a way to avoid circular sourcing of two files in R?
That is, if I have two files that source each other, I want to avoid endless sourcing when I run one of them. In C there is a trick (include guards) that prevents this and makes sure each file is included only once.

Thanks!
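
Base R has no built-in include guard that I know of, but one can be sketched with a small registry of files that were already sourced. `source_once` and `.sourced_files` below are hypothetical names, not part of R; the idea is simply to record the file before sourcing it, so a nested call on the same file becomes a no-op.

```r
# Registry of files that were already sourced (hypothetical helper state)
.sourced_files <- new.env()

# Hypothetical replacement for source() that loads each file at most once
source_once <- function(file, ...) {
  key <- normalizePath(file, mustWork = TRUE)
  if (exists(key, envir = .sourced_files, inherits = FALSE)) {
    return(invisible(FALSE))  # already loaded: skip, which breaks the cycle
  }
  # Record the file *before* sourcing it, so a circular call back is skipped
  assign(key, TRUE, envir = .sourced_files)
  source(file, ...)
  invisible(TRUE)
}
```

Both files would then call source_once("other_file.R") instead of source(). The entry point should go through source_once() as well; otherwise the first file is not registered and runs a second time before the cycle stops.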

Column annotations in aheatmap

Hi,
I don't really understand how to add column annotations in aheatmap; see the code below and the attached data file. I tried to create a data.frame like the one shown in the example, but it didn't work. I would appreciate your help.
Thanks!
Rachelly.

library(NMF)
data = read.table("shira.csv", sep=",", row.names=1, stringsAsFactors = FALSE)
colnames(data) = as.character(data[1,])
data=data[-1,]
# converting columns to numeric type
for (i in 1:ncol(data))
  data[,i] = as.numeric(data[,i])

color_per_group = factor(c("red","blue","yellow","orange","green","black"))
names(color_per_group)= unique(colnames(data))
ColAnn = data.frame(Var1 = color_per_group[colnames(data)], Var2 = colnames(data))
ColAnn$Var2 = as.character(ColAnn$Var2)

aheatmap(data, labRow = ColAnn)  # error
aheatmap(data, labRow = colnames(data)) #error

Possible to export long table to multiple pages?

My problem: the table has more rows than can fit on one page, and R simply crops all the rows that don't fit on the first page.

My table("sorted_uni_table") looks like this:

And my output function is textplot() from the gplots library:

textplot(sorted_uni_table,valign="top",halign="center",cex=1)

10x!
Ariel
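
A possible workaround, sketched under the assumption that the table is a data.frame or matrix and that about 40 rows fit on a page at cex = 1 (both the stand-in table and that number are made up). The rows are split into page-sized chunks and each chunk is drawn with its own textplot() call, which starts a new page on a PDF device.

```r
# Stand-in for sorted_uni_table (invented data)
tbl <- data.frame(id = 1:100, value = round(seq(0, 1, length.out = 100), 2))

rows_per_page <- 40  # assumed page capacity; tune it for the font size in use

# List of row-index vectors, one per page
chunks <- split(seq_len(nrow(tbl)), ceiling(seq_len(nrow(tbl)) / rows_per_page))

if (requireNamespace("gplots", quietly = TRUE)) {
  pdf(file.path(tempdir(), "long_table.pdf"))
  for (idx in chunks) {
    # Each call draws one chunk on a fresh page
    gplots::textplot(tbl[idx, , drop = FALSE],
                     valign = "top", halign = "center", cex = 1)
  }
  invisible(dev.off())
}
```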

Add text to existing PDF files

I have an existing PDF file and want to add to each page a title taken from a character vector. Any idea how to do this?

I used the Unix command "gs" to create the original PDF, which is a merge of specific pages from different PDF files, but couldn't find a way to add a stamped title with this command:
args = paste0(" -q -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -dFirstPage=2 -dLastPage=2 -sOutputFile=summary_adj_GS.pdf ", files)
system2("gs", args)

GEO Annotations

Hi,
I downloaded a GEO annotation for a GSE, and it seems there are more probes in the GSE than in the annotation. How is this possible?
Thanks,
Rachelly.

library(GEOquery)
# forward slashes: a single backslash in an R string is read as an escape
mm_gse <- getGEO("gse7187", destdir="C:/Program Files/R/GSEs")[[1]]
library("moe430a.db")
MM_p2genes <- unlist(as.list(moe430aENTREZID[mappedkeys(moe430aENTREZID)]))
length(featureNames(mm_gse))
[1] 22690
dim(moe430aENTREZID)
[1] 21085 2

Legend for aheatmap circle sizes

Hi,
I use the type = "circle" and y parameters in aheatmap and get the desired figure, with data points varying in size according to the y parameter. But no legend for the circle size is drawn.
Is there any way to draw such a legend?
Thanks,
Rachelly.

2 questions about aheatmap

First, I need a different color for NA values. I tried to use na.color, but that doesn't work well because I have columns that contain only NAs, and then I get the following error:
Error in hclust(d, method = hclustfun) :
NA/NaN/Inf in foreign function call (arg 11)
Is there a work-around for this?

Second, I would like to have a fixed range for the colors I use in the heatmap, since I plot different heatmaps and want to be able to compare them (currently, each matrix has a different range of values, so the colors are not comparable). Is there a way to do this?

Thanks,
Rachelly.
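
A sketch of the data preparation in base R only, with two invented matrices. It rests on two assumptions: that leaving the all-NA columns out of the clustered matrix is acceptable, and that the same vector of break points can then be passed to every heatmap call (aheatmap documents a breaks argument, but check how it interacts with the number of colors in your version).

```r
# Two invented matrices; column 3 of m1 contains only NAs
m1 <- matrix(c(1, 2, NA,
               4, 5, NA,
               7, 8, NA), nrow = 3, byrow = TRUE)
m2 <- matrix(c(-3, 0, 2, 6, 1, 10), nrow = 3)

# Columns with no observed value give hclust() nothing to compute distances from
all_na_cols <- colSums(!is.na(m1)) == 0
m1_clean <- m1[, !all_na_cols, drop = FALSE]

# One range and one set of break points shared by all heatmaps
shared_range <- range(m1_clean, m2, na.rm = TRUE)
shared_breaks <- seq(shared_range[1], shared_range[2], length.out = 101)
```

With 101 break points the palette would need 100 colors; that pairing is the usual convention for heatmap functions, stated here as an assumption rather than as aheatmap's exact rule.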

Printing messages in R

I want to print a message containing a few variables on one line, with some text in between. Is it possible to do this elegantly, as in C?

The closest solution I found is

x<-1
y<-2
print(c("x=",x,"y=",y))
[1] "x=" "1" "y=" "2"

But this is not so elegant, since the variables are not really characters and there are quote marks all around. Printing a list doesn't give the desired result either:

print(list("x=",x,"y=",y))
[[1]]
[1] "x="
[[2]]
[1] 1
[[3]]
[1] "y="
[[4]]
[1] 2

Thanks,
Rachelly.
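
The usual base R tools for this are cat(), sprintf() and message(). A short sketch with the x and y from the question:

```r
x <- 1
y <- 2

# cat() pastes its arguments together and prints them without quotes
# or [1] prefixes; the newline has to be added explicitly
cat("x =", x, "y =", y, "\n")

# sprintf() works like C's printf and returns the formatted string
line <- sprintf("x = %d, y = %.2f", x, y)
cat(line, "\n")

# message() writes to stderr, which suits progress or diagnostic output
message("x = ", x, ", y = ", y)
```

sprintf() is the closest match to the C style, since the format string keeps the text and the placeholders in one place.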

More probes in bioc annotations than in the GSE

Hi,
I have a question which is in some way the "opposite" of what I asked in issue #3.
When I check which probes are mapped to the gene "404636" in GPL5188 using GPL5188ENTREZID, I get the following list:

mapp = unlist(as.list(GPL5188ENTREZID[mappedkeys(GPL5188ENTREZID)]))
mapp[mapp=="404636"]
3266973 3266988 3266989 3266997 3267000 3267001 3267002 3267003 3267004 3267005
"404636" "404636" "404636" "404636" "404636" "404636" "404636" "404636" "404636" "404636"

But it appears that only one probe was used in this platform in my GSE:
http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?view=data&acc=GSM1163714&id=15498&db=GeoDb_blob99

How is it possible that so many probes are available but, in the end, are not on the chip?

Thanks,
Rachelly.

Get the feature names while using esApply

I'm using esApply to calculate the correlation between every pair of features in an E-set.

CreateCoXPRES_DB = function(eset, path)
{
  esApply(eset, 1, FUN=function(x)
  {
    res = esApply(eset, 1, function(y) cor(x,y))
    out = paste(y, res, sep="\t")
    print(x)
    write.table(out, paste(path,"x",sep=""), quote=FALSE, append=TRUE)
  })
}

Since I want to track x and y in the function above, to know which genes each correlation applies to, I want to get the fData. The output of the above code is:

CreateCoXPRES_DB(mm_tmp, paste(path, "MM", sep=""))
GSM00001 GSM00002 GSM00003 GSM00004 GSM00005 GSM00006
6.901178 5.474642 5.404250 5.865132 5.786788 5.891266
Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘fData’ for signature ‘"numeric"’

So it seems that esApply converts the rows to numeric vectors, so the eset metadata cannot be accessed from inside the function.
The documentation I've looked at doesn't seem to address this issue: http://rgm3.lab.nig.ac.jp/RGM/R_rdfile?f=Biobase/man/esApply.Rd&d=R_BC

Is there a way to work around this besides rewriting the function with a normal apply?

Thanks!
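
One way around it, sketched on a plain matrix that stands in for exprs(eset) (the data and feature names are invented): cor() on the transposed expression matrix computes all pairwise correlations in one call, and the result keeps the feature names as dimnames, so nothing has to be tracked by hand.

```r
set.seed(1)
# Stand-in for exprs(eset): features in rows, samples in columns (invented)
expr <- matrix(rnorm(4 * 6), nrow = 4,
               dimnames = list(paste0("gene", 1:4), paste0("GSM", 1:6)))

# cor() correlates columns, so transpose to correlate features with each other
cors <- cor(t(expr))

# Long table: one line per pair of features, names included
pair_table <- data.frame(
  feature_x   = rownames(cors)[row(cors)],
  feature_y   = colnames(cors)[col(cors)],
  correlation = as.vector(cors)
)
```

The full matrix has one cell per pair of features, so for tens of thousands of probes it can take several gigabytes; in that case computing it for blocks of features at a time would be the safer variant.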

Colour scale position in aheatmap

I want to move the color scale from its default position at the top-right corner of the data matrix to, say, the bottom-right corner. The vignette at http://renozao.github.io/NMF/devel/vignettes/aheatmaps.pdf says this can be achieved via the layout parameter, e.g. aheatmap(x, layout = '_'). When I try this I get

Error in aheatmap(hm, annCol = tracks, annColors = annColors, main = paste0(titleprefix, :
unused argument (layout = "_")

I'm using NMF_0.20.6. Is layout available in the development version only?
Thanks

Annotation data for GSE33377 (GPL5175)

I'm trying to find annotation data for GSE33377 (GPL5175) with no success.

I've already tried several things:

I used http://ailun.stanford.edu/platformAnnotation.php with the feature list, but it returned an empty file.
I tried to download annotation data from Ensembl (BioMart), but the feature list was incomplete (it covered only 10% of the probes).
I could not locate any annotation package (AnnotationDbi) for this GSE (http://www.bioconductor.org/packages/2.13/data/annotation/).

Thanks

Partial matching when subsetting a dataframe

This really struck me: when subsetting a data.frame with the [] operator, partially matched row names are picked! Is this how it's supposed to work?
I vaguely remember discussing this with Renaud once, but I don't remember what conclusion we reached.
Thanks!
Rachelly.

x=data.frame("A"=c(1,2,3,4), "B"=c(1,2,3,4), "C"=c(1,2,3,4), row.names = c("123","345","1201","22"))

x
     A B C
123  1 1 1
345  2 2 2
1201 3 3 3
22   4 4 4

x["123",]
    A B C
123 1 1 1

x["120",]
     A B C
1201 3 3 3

x["12",] # A row is not found, because there are 2 possible matches
    A  B  C
NA NA NA NA

x[rownames(x) == "120",] # At least this works!!
[1] A B C
<0 rows> (or 0-length row.names)
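
As far as I can tell this is documented behaviour: the help page for `[.data.frame` states that the [ and [[ extraction methods partially match row names. A small sketch of an exact-match lookup built on match(), which never matches partially (`rows_exact` is a hypothetical helper name):

```r
x <- data.frame(A = c(1, 2, 3, 4), B = c(1, 2, 3, 4), C = c(1, 2, 3, 4),
                row.names = c("123", "345", "1201", "22"))

# Hypothetical helper: select rows by exact row-name match only
rows_exact <- function(df, ids) {
  idx <- match(ids, rownames(df))       # exact matching; unknown ids give NA
  df[idx[!is.na(idx)], , drop = FALSE]  # index by position, so no partial matching
}

no_hit  <- rows_exact(x, "120")           # zero rows: "120" is not a row name
two_hit <- rows_exact(x, c("22", "123"))  # rows returned in the requested order
```

Indexing by position (or by a logical vector, as in the last example of the question) sidesteps the character lookup entirely, which is where the partial matching happens.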

Merge in R

I want to merge two data.frames by their row.names. When I do so I get an extra column named "Row.names":

> x
  [,1] [,2] [,3]
a    1    3    5
b    2    4    6
> y
    [,1] [,2] [,3]
aaa   11   13   15
b     12   14   16
> merge(x, y, by=0, all=TRUE)
  Row.names V1.x V2.x V3.x V1.y V2.y V3.y
1         a    1    3    5   NA   NA   NA
2       aaa   NA   NA   NA   11   13   15
3         b    2    4    6   12   14   16

Is there an easy way to make merge keep the row.names as row.names rather than as a new column?

I did this in a naïve way. I'm sure there is a way to do it inside the merge function, but I didn't find it on the web:

> new_mat=merge(x, y, by=0, all=TRUE)
> rows = new_mat[,1]
> row.names(new_mat)= rows
> new_mat = new_mat[,2:length(colnames(new_mat))]
> new_mat
    V1.x V2.x V3.x V1.y V2.y V3.y
a      1    3    5   NA   NA   NA
aaa   NA   NA   NA   11   13   15
b      2    4    6   12   14   16

Thanks,
Rachelly.
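
I am not aware of a merge() argument that keeps the row names in place, so a small wrapper is one option. A sketch with the x and y from the question rebuilt as matrices (`merge_by_rownames` is a hypothetical helper name):

```r
x <- matrix(1:6, nrow = 2, dimnames = list(c("a", "b"), NULL))
y <- matrix(11:16, nrow = 2, dimnames = list(c("aaa", "b"), NULL))

# Hypothetical wrapper: merge on row names, then move the "Row.names"
# column that merge() creates back into the row names
merge_by_rownames <- function(x, y, ...) {
  m <- merge(x, y, by = 0, ...)
  rownames(m) <- as.character(m$Row.names)
  m$Row.names <- NULL
  m
}

merged <- merge_by_rownames(x, y, all = TRUE)
```

Extra arguments such as all = TRUE are passed straight through to merge(), so the wrapper can be used wherever merge(x, y, by = 0, ...) was used before.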

Functions with the same name in R

Hi,
I just ran into a stupid bug in my code, where 2 functions had the same name.
I found the bug at run time and didn't get any warning about it when sourcing the files the functions were in. I guess that when I sourced the files, the second function simply overrode the first one, as with any other variable...

How does compilation work in R? Is there any compilation at all?
Is there a smart way of finding these mistakes before run time?

Thanks,
Rachelly.
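
In R a function definition is an ordinary assignment, so a second definition silently replaces the first and there is no compile step that would flag it. One way to catch this before run time is to parse the files without evaluating them and look for names that are assigned more than once at top level. A sketch (both function names below are hypothetical):

```r
# Hypothetical helper: names assigned at top level in a file, found by
# parsing the file without running it
top_level_names <- function(file) {
  exprs <- parse(file, keep.source = FALSE)
  is_assign <- vapply(exprs, function(e) {
    is.call(e) &&
      as.character(e[[1]])[1] %in% c("<-", "=") &&
      is.name(e[[2]])
  }, logical(1))
  vapply(exprs[is_assign], function(e) as.character(e[[2]]), character(1))
}

# Hypothetical helper: names defined more than once across the given files
find_duplicate_definitions <- function(files) {
  defs <- unlist(lapply(files, top_level_names))
  unique(defs[duplicated(defs)])
}
```

Calling find_duplicate_definitions(c("file1.R", "file2.R")) before sourcing would list the clashing names. It only sees plain top-level assignments, so definitions made through assign() or inside other calls are not covered.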
