ggally's Introduction

GGally: Extension to ggplot2

ggplot2 is a plotting system for R based on the grammar of graphics. GGally extends ggplot2 by adding several functions to reduce the complexity of combining geoms with transformed data. These functions include a pairwise plot matrix, a scatterplot matrix, a parallel coordinates plot, a survival plot, and several functions to plot networks.

Installation

To install this package from GitHub or CRAN, do the following from the R console:

# Github
library(devtools)
install_github("ggobi/ggally")
# CRAN
install.packages("GGally")

ggally's People

Contributors

bbolker, bigbeardesktop, briatte, cpsievert, dicook, edwinth, eibanez, elbamos, ewallace, ewenharrison, fawda123, fubar2, gitter-badger, gokceneraslan, heike, jakob-r, jcrowley11, jonathan-g, larmarange, muschellij2, ogdenkev, otoomet, owenjonesuob, schloerke, simonpcouch, steltenpower, teunbrand, treysp, vlepori, yihui

ggally's Issues

ggpairs: no possibility to change axis and variable label font sizes

Right now when using ggpairs for drawing > 4*4 scatterplot matrices, the axis and variable labels on diagonal become very small for printing purposes. Is it possible to adjust the font size in ggpairs()?

I already attempted to adjust the font size directly in ggally_diagAxis and ggpairs functions but with no success. The hard-coded font changes work when diagAxis is called independently but not when called by ggpairs.

2D density plots not aligned with other plots

Data shown in a ggpairs 2D density plot is not aligned with the data shown in other plots (e.g. a scatter plot, or a diagonal 1D density plot; see the red arrow), and the axes of a 2D density plot are not aligned with the axes of the other plots (see the red ovals).

To reproduce:

set.seed(0)
x<-data.frame(
  x1=c(rnorm(100),rnorm(1,1)),
  x2=c(rnorm(100),rnorm(1,10)),
  x3=c(rnorm(100),rnorm(1,10)))
ggpairs(x,
  upper=list(continuous="density"),
  diag=list(continuous='density'),
  lower=list(continuous='points'))

ggcorr: have ggcorr accept cor results

... would be nice if I could just pass cor results to ggcorr (as it is the case in corrplot) – that gives users more options for which correlation coefficient to use etc.

(this is a duplicate of briatte/ggcorr#2 because I wasn't sure that repo was still active)

ggplot2 1.1.0

A new version of ggplot2 will be out soon. It breaks many different things in GGally--by which I mean, many of the tests now fail.

I would be very surprised if the new version definitely broke any of GGally's functions, but it sure does break quite a few tests right now (including several tests for ggnetwork, for instance).

PR #104 makes ggnet + ggnet2 compatible with ggplot2 1.1.0.

factor titles missing off left side in ggally_cor with colour grouping

When using ggally_cor(), if your colour factors are at all long, in a small graph, then they drop off the page to the left, as the text is right aligned, and anchored near the centre of the box.

See for example:
ggally_cor(data=esoph[,c(1,4,5)], mapping=aes(x=ncases, y=ncontrols, colour='agegp'))

(you might need to make the graph smaller)

Overlap in ggally_diagAxis

If you have three values of a categorical variable in ggpairs, then the middle one will overlap the title in the diagonal square. There needs to be a way to dodge the title up/down. Perhaps something like a diagAxis.title.position = c(0.5, 0.5) option to ggally_diagAxis() would work?

scatmat doesn't like non-factor colour columns

This is a wish-list item/easy to work around, or a request for documentation enhancement, I guess ...

library("GGally")
data(flea)
scatmat(flea,columns=2:4,color="species")  ## fine
flea$species <- as.character(flea$species)
scatmat(flea,columns=2:4,color="species")
## Error in eval(expr, envir, enclos) : object 'colorcolumn' not found
packageVersion("GGally")  ## 1.0.0

The documentation does say "factor variable":

color: an option to group the dataset by the factor variable and color them by different colors. Defaults to ‘NULL’

but I've become so lazy/used to the idea that character variables will Just Work that it would be nice if this worked too ...
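A minimal workaround sketch, assuming the error is caused only by the colour column being character rather than factor: convert character columns to factors before plotting. The helper name chars_to_factors and the toy data frame are hypothetical stand-ins, not part of GGally.

```r
# Hypothetical helper: convert every character column of a data frame to a factor.
chars_to_factors <- function(df) {
  is_chr <- vapply(df, is.character, logical(1))
  df[is_chr] <- lapply(df[is_chr], factor)
  df
}

# Toy stand-in for the flea data, with a character "species" column.
toy <- data.frame(
  species = c("a", "b", "a"),
  tars1 = c(191, 185, 200),
  stringsAsFactors = FALSE
)

toy_fixed <- chars_to_factors(toy)
class(toy_fixed$species)
```

With the real data, scatmat(chars_to_factors(flea), columns = 2:4, color = "species") should then take the same path as the first call in the report, though that has not been checked against every GGally version.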

Make "[" and "[<-" functions

Would remove the use of "getPlot" and "putPlot"... they're clumsy.

Would allow for multiple sets and multiple gets... just like a matrix does.
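To illustrate the idea, here is a rough sketch of S3 "[" and "[<-" methods on a toy grid class. The class name toy_grid, its constructor, and the row-major storage are assumptions made for illustration; this is not how the ggpairs object is actually stored, and it only handles a single cell at a time.

```r
# Toy container: an n-by-n grid of "plots" (plain strings here), stored row-major.
new_toy_grid <- function(n) {
  structure(
    list(n = n, cells = as.list(rep(NA_character_, n * n))),
    class = "toy_grid"
  )
}

# Getter, so that grid[i, j] replaces a getPlot(grid, i, j) style call.
"[.toy_grid" <- function(x, i, j) {
  x$cells[[(i - 1) * x$n + j]]
}

# Setter, so that grid[i, j] <- value replaces a putPlot(grid, value, i, j) style call.
"[<-.toy_grid" <- function(x, i, j, value) {
  x$cells[[(i - 1) * x$n + j]] <- value
  x
}

g <- new_toy_grid(2)
g[1, 2] <- "scatter"
g[1, 2]
```

Supporting several cells in one call, as a matrix does, would need vectorised index handling on top of this.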

Overall correlation in ggpairs with colour grouping?

Asked at http://stackoverflow.com/questions/12137614/overall-correlation-in-ggpairs-with-colour-grouping, but I don't think it's possible at the moment. It'd be great if there was an option to include the total correlation when grouping by colour (in black, I guess), in the ggally_cor() plot.

Had a look at the code, but the ddply line is a bit beyond me at the moment, and I'm not sure if this would be good default behaviour anyway. Perhaps a switch include_all=FALSE?

close/adjust gap between subplots

It would be nice to be able to squash the subplots within a ggpairs plot together ... I spent a while poking at the code but couldn't come up with an easy way to do it (since I don't yet understand the code well enough). Among other things, you would make Edward Tufte happy ...

diagonal bar invalid when summarizing negative values

Summarizing negative values using ggpairs with diag=list(continuous='bar') yields a spurious plot (I would expect a histogram, but the diagonal plots show bars going below the axis). Maybe, I'm missing something (?).

To reproduce:

set.seed(0)
x<-data.frame(x1=c(rnorm(100),rnorm(10,5)),x2=c(rnorm(100),rnorm(10,5)))
ggpairs(x,diag=list(continuous='bar'))

Note:
Warning messages:
1: In loop_apply(n, do.ply) : Stacking not well defined when ymin != 0
2: In loop_apply(n, do.ply) : Stacking not well defined when ymin != 0

test coverage for ggnet

Hey @briatte! Is it possible for you to add some tests for ggnet?

I do not know how to reach all of the awesome features.

My typical workflow for testing a specific function:

library(devtools); library(testthat)
library(covr); # install_github("jimhester/covr")

#See percentage
load_all(); function_coverage("ggnet", test_file("tests/testthat/test-ggnet.R"))
# Loading GGally
# ggnet : .........
# 
# Package Coverage: 64.20%
# ./R/ggnet.R: 64.20%

# See missing lines
load_all(); zero_coverage(function_coverage("ggnet", test_file("tests/testthat/test-ggnet.R")))
# Loading GGally
# ggnet : .........
# 
#                                                      filename first_line
#1  /Users/barret/Copy/git/R/ggobi_org/ggally/ggally/R/ggnet.R        105
#2  /Users/barret/Copy/git/R/ggobi_org/ggally/ggally/R/ggnet.R        124
# ...
#    last_line value
#1        105     0
#2        126     0
# ...

Thank you!

Group separately to colour

Why does ggally use colour for grouping/faceting? For points diagrams, this is irrelevant. For 'facethist' though, it would be nice to be able to have a stacked bar chart, which means no grouping, but filling by colour. I guess that gets a bit confusing with the boxplots though...

pass a function directly to code

alter the plot object to be a list

  • function
    ** will contain parameters with wrapper function. avoids MASSIVE confusion
  • params - NO
  • data pointer?
  • geoms to be added?

Since functions are going to be able to be passed directly, we will not be able to match the names as in issue #12 . Therefore, the parameters should be closely tied to the function definition.

So... something like...

upper = list(
    # will create the ggally_points function with parameters c(size = 5, color = "grey90")
    continuous = wrap_with_params("points", size = 5, color = "grey90"),

    # will wrap my_custom_function with parameter c(size = 3), independent of the "points" params
    discrete = wrap_with_params(my_custom_function, size = 3)
)

ggparcoord: "uniminmax" scaling error when data frame columns contain only one single value

Dear ggally developers,

I found the issue described above when I attempted to draw a parallel coordinate plot from a data frame in which some columns hold a single value for all observations / rows.

Since I programmed a quick no bells and whistles fix I'll submit a pull request with the solution that worked for me. Maybe it is useful or maybe you can cherrypick something out of it.

Thank you for the work you put into developing and maintaining this package. It helps me a lot.

Best regards,
Christoph
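A small sketch of why a constant column is a problem for min-max scaling, assuming "uniminmax" rescales each column as (x - min) / (max - min). The function names are hypothetical, and the guarded variant is only one possible choice, not the fix submitted in the pull request.

```r
# Min-max scaling as commonly defined: (x - min) / (max - min).
scale_unit_naive <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

# Guarded variant (an assumption): map a constant column to 0.5
# instead of dividing by zero.
scale_unit_guarded <- function(x) {
  rng <- max(x) - min(x)
  if (rng == 0) {
    return(rep(0.5, length(x)))
  }
  (x - min(x)) / rng
}

constant <- c(3, 3, 3)
naive_result <- scale_unit_naive(constant)      # 0 / 0 in every position
guarded_result <- scale_unit_guarded(constant)
varying_result <- scale_unit_guarded(c(1, 2, 3))
```

Placing constant columns at the midpoint keeps them visible on the plot; dropping them with a warning would be another reasonable design.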

attributes not passed through to e.g. correlation plots

These two graphs look identical:

ggpairs(iris)
ggpairs(iris,corSize=7)

These work as expected:
ggally_cor(iris,aes(x=Sepal.Length,y=Sepal.Width),corSize=20)
ggally_cor(iris,aes(x=Sepal.Length,y=Sepal.Width),corSize=7)

Is there any way to pass attributes through to the correlation functions etc?

spearman corMethod

I'm trying to use the corMethod parameter (from ggally_cor via ggpairs) to set the correlation method from pearson to spearman

ggpairs(.....
lower=list(continuous="cor", corMethod="spearman", combo="dot", discrete="facetbar", params=c(cex = 3) ),
...)

the corMethod="spearman" seems not to be honored, instead giving pearson correlations.

looking into the source I see:

ggally_cor
...
function (data, mapping, corAlignPercent = 0.6, corMethod = "pearson",
corUse = "complete.obs", ...)
{
corMethod <- as.character(substitute(corMethod))
...

however the transformation as.character(substitute(... seems to be modifying corMethod to be 'corMethod'

corMethod="spearman"
as.character(substitute(corMethod))
[1] "corMethod"

Is this what is desired? is something more complicated being passed to ggally_cor as corMethod?

- Nathan
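The behaviour can be reproduced without GGally. The sketch below uses toy functions (inner, wrapper, inner_checked, and wrapper_checked are hypothetical names) to show that as.character(substitute(...)) captures the expression the caller wrote, so a wrapper that forwards its own argument passes along the argument's name rather than its value. The match.arg() variant is one conventional alternative, not necessarily what the package should adopt.

```r
# Toy function mimicking the quoted pattern.
inner <- function(corMethod = "pearson") {
  as.character(substitute(corMethod))
}

# Called directly with a literal, the pattern returns the value.
direct <- inner(corMethod = "spearman")

# Called through a wrapper that forwards its own argument,
# substitute() captures the symbol `corMethod`, not its value.
wrapper <- function(corMethod = "pearson") {
  inner(corMethod = corMethod)
}
forwarded <- wrapper(corMethod = "spearman")

# Alternative: evaluate the argument and validate it against known choices.
inner_checked <- function(corMethod = c("pearson", "kendall", "spearman")) {
  match.arg(corMethod)
}
wrapper_checked <- function(corMethod = "pearson") {
  inner_checked(corMethod)
}
checked <- wrapper_checked("spearman")

print(c(direct = direct, forwarded = forwarded, checked = checked))
```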

Better way to pass parameters to ggpairs subplots?

It seems parameters are given ungainly names (corAlignPercent instead of just AlignPercent) because there's no easy way to pass them to a specific subplot?

Might it not be better to have a params argument structured as a two-level list? ie.
params = list(cor = c(AlignPercent), diagAxis = c(position='identity'))

str_c vs. paste, str_detect vs. grepl

GGally imports stringr for two functions, str_c and str_detect.

Any particular reason why paste, paste0 and grepl are not sufficient here? They should be, unless there's some vectorisation needed.

Since stringr relies on stringi, it's quite a heavy import (unlike reshape which is a lightweight one; re #92). I very much like stringr and stringi, but I'm thinking in streamlining terms here.
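For reference, a short sketch of the base R equivalents for simple cases. The input strings are made up for illustration; the main caveats are argument order and missing-value handling.

```r
parts <- c("cor", "Size")

# Base equivalent of joining pieces with no separator.
joined <- paste0(parts, collapse = "")

# grepl() takes the pattern first and the character vector second,
# the reverse of str_detect(string, pattern).
hits <- grepl("^cor", c("corSize", "size"))

# One behavioural difference: paste0() turns NA into the text "NA",
# whereas stringr::str_c() would return a missing value instead.
with_na <- paste0("x", NA)

print(joined)
print(hits)
print(with_na)
```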

non-square matrices

Is there any way to produce non-square matrices, i.e. to plot one set of variables against another? e.g. in survey research I might want to plot age, sex etc as background variables against foreground variables like political attitudes, and I don't want to see the background variables plotted against themselves in the same plot. So I would like to just see age, sex in the rows of the matrix and e.g. a set of attitudes variables in the columns.

Cannot parse column names containing forward slash characters

For example, the ggpairs function chokes on data frames having forward slash characters in column names.

library("ggplot2")
library("GGally")

x <- 1:5
x <- data.frame(x, log(x))
colnames(x) <- c("col_1", "col/2")

# OK
ggplot2::plotmatrix(x)

# Error in `[.data.frame`(data, , c(xCol, yCol)) :
#   undefined columns selected
GGally::ggpairs(x)

# OK
colnames(x) <- c("col_1", "col_2")
GGally::ggpairs(x)

Output of sessionInfo() (run after the above snippet):

R version 3.0.1 (2013-05-16)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GGally_0.4.5    reshape_0.8.4   plyr_1.8        ggplot2_0.9.3.1

loaded via a namespace (and not attached):
 [1] colorspace_1.2-4   dichromat_2.0-0    digest_0.6.4       grid_3.0.1         gtable_0.1.2       labeling_0.2
 [7] MASS_7.3-26        munsell_0.4.2      proto_0.3-10       RColorBrewer_1.0-5 reshape2_1.2.2     scales_0.2.3
[13] stringr_0.6.2      tools_3.0.1
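Until the parsing is fixed, one possible workaround is to sanitise the column names before plotting. The sketch below uses base R's make.names(); keeping the original names in original_names for later relabelling is a suggestion, not something ggpairs does itself.

```r
x <- data.frame(1:5, log(1:5))
colnames(x) <- c("col_1", "col/2")

# Keep the original names in case they are needed for labels later.
original_names <- colnames(x)

# make.names() replaces characters that are invalid in R names with ".".
colnames(x) <- make.names(original_names, unique = TRUE)
colnames(x)
```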

ggparcoord can't handle aes that are outside of columns

This fails:

ggparcoord(mtcars, columns = c(1,3:6), mapping = aes(size = gear))

The structure of the ggparcoords drops all columns at the beginning. Then, it adds them back as necessary.

Either add all aesthetic columns back (which can be tricky), or keep all columns all the time (which would be best).

tell (at)matloff when done.

datetime variables

Hi,

ggpairs doesn't work with the datetime scale. Given that it is just a continuous scale with fancy labeling it should work but it results in the error:

Error in Math.difftime(1816, base = 10) : 
  'log' not defined for "difftime" objects

I assume that is a known limitation? Otherwise I'd gladly provide a MWE. :)

Thx
Stefan

GGally 0.5.0 on R 3.1.2

ggcorr errors out on matrix

it would appear that ggcorr doesn't like matrices.

ggcorr(data.matrix(nba[,-1]))

yields:

Error in cor(data[1:ncol(data)], use = method) : 
  supply both 'x' and 'y' or a matrix-like 'x'

That seems odd, because the documentation says something matrix-like is expected, which in this case, I have actually supplied.

(this is a duplicate of briatte/ggcorr#1 because I wasn't sure that repo was still active)
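The error message points at data[1:ncol(data)], which behaves differently for data frames and matrices. A base R sketch with a made-up 4 x 2 matrix shows the difference; the comma-based indexing at the end is offered as one possible fix, not as the maintainers' chosen one.

```r
m <- matrix(c(1, 2, 3, 4, 2, 4, 6, 9), ncol = 2)
df <- as.data.frame(m)

# On a data frame, single-bracket indexing without a comma selects columns.
df_dim <- dim(df[1:ncol(df)])

# On a matrix, the same indexing selects the first elements of the
# underlying vector, so cor() receives a plain vector and fails.
m_elems <- m[1:ncol(m)]
msg <- tryCatch(cor(m_elems), error = function(e) conditionMessage(e))

# Indexing with a comma selects columns for both matrices and data frames.
m_cols <- m[, seq_len(ncol(m)), drop = FALSE]
cor_ok <- cor(m_cols)

print(df_dim)
print(m_elems)
print(msg)
```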

Cannot set font size of correlation plots

The accepted answer on this Stack Overflow thread seems not to work anymore in GGally 0.4.5. (I don't know if it worked before.)

Additionally, methods 1) and 3) from this lonely thread don't work either.

Here is a sample snippet:

num.vars <- 2
num.rows <- 50
require(GGally)
require(data.table)

tmp <- data.table(replicate(num.vars, runif(num.rows)),
              class = as.factor(sample(0:1,size=num.rows, replace=TRUE)))

tmp.plot <- ggpairs(data=tmp, diag=list(continuous="density"), columns=1:num.vars,
                colour="class", axisLabels="show", params=list(corSize=9))
print(tmp.plot)

Additionally, is there a list of which params are accepted by ggpairs and where they're passed to? I looked through both the ggplot and ggpairs manuals, but I'm still firing blind.

randomly failing test

@briatte, ggnet/ggnet2 fail randomly and I don't know why.

gg-plots : ............
ggcorr : .........
gglyph : ............................
ggmatrix_add : ....
ggmatrix_getput : .......
ggmatrix : ....................
ggnet : ..1
ggnet2 : ................2
ggnetworkmap : ...........................................
ggpairs : .......................................................................
ggparcoord : .............................................................
ggscatmat : ......
ggsurv : ...........................
utils : .....


1. Error: examples -------------------------------------------------------------
Not a graph object
1: withCallingHandlers(eval(code, new_test_environment), error = capture_calls)
2: eval(code, new_test_environment)
3: eval(expr, envir, enclos)
4: gplot.layout.circle(n) at test-ggnet.R:47
5: as.edgelist.sna(d)
6: is.bipartite(x)
7: stop("Not a graph object")
8: .handleSimpleError(function (e) 
   {
       e$calls <- head(sys.calls()[-seq_len(frame + 7)], -2)
       signalCondition(e)
   }, "Not a graph object", quote(is.bipartite(x)))

2. Error: examples -------------------------------------------------------------
Not a graph object
1: withCallingHandlers(eval(code, new_test_environment), error = capture_calls)
2: eval(code, new_test_environment)
3: eval(expr, envir, enclos)
4: ggnet2(n, size = "degree") at test-ggnet2.R:106
5: sna::degree(net, gmode = is_dir, cmode = ifelse(x == "degree", "freeman", x)) at /Users/barret/Copy/git/R/ggobi_org/ggally/ggally/R/ggnet2.R:484
6: as.edgelist.sna(dat)
7: is.bipartite(x)
8: stop("Not a graph object")
9: .handleSimpleError(function (e) 
   {
       e$calls <- head(sys.calls()[-seq_len(frame + 7)], -2)
       signalCondition(e)
   }, "Not a graph object", quote(is.bipartite(x)))

Error: Test failures

I then restart R and re-test the package and there are no errors, but a new warning.

Loading required package: devtools
Loading GGally
Loading required package: testthat
gg-plots : ............
ggcorr : .........
gglyph : ............................
ggmatrix_add : ....
ggmatrix_getput : .......
ggmatrix : ....................
ggnet : network: Classes for Relational Data
Version 1.11.2 created on 2014-09-23.
copyright (c) 2005, Carter T. Butts, University of California-Irvine
                    Mark S. Handcock, University of California -- Los Angeles
                    David R. Hunter, Penn State University
                    Martina Morris, University of Washington
                    Skye Bender-deMoll, University of Washington
 For citation information, type citation("network").
 Type help("network-package") to get started.

sna: Tools for Social Network Analysis
Version 2.3-2 created on 2014-01-13.
copyright (c) 2005, Carter T. Butts, University of California-Irvine
 For citation information, type citation("sna").
 Type help(package="sna") to get started.


Attaching package: ‘sna’

The following object is masked from ‘package:network’:

    %c%


Attaching package: ‘ggplot2’

The following object is masked from ‘package:GGally’:

    rel

.............................................
ggnet2 : ...................................................
ggnetworkmap : Loading required package: igraph

Attaching package: ‘igraph’

The following objects are masked from ‘package:sna’:

    %c%, betweenness, bonpow, closeness, degree, dyad.census, evcent,
    hierarchy, is.connected, neighborhood, triad.census

The following objects are masked from ‘package:network’:

    %c%, %s%, add.edges, add.vertices, delete.edges, delete.vertices,
    get.edge.attribute, get.edges, get.vertex.attribute, is.bipartite,
    is.directed, list.edge.attributes, list.vertex.attributes,
    set.edge.attribute, set.vertex.attribute

Loading required package: maps
Loading required package: geosphere
Loading required package: sp
Loading required package: mapproj
...........................................
ggpairs : .......................................................................
ggparcoord : ........................................Loading required package: scagnostics
Loading required package: rJava
.....................
ggscatmat : ......
ggsurv : Loading required package: survival
Loading required package: splines
...........................
utils : .....

Warning message:
The shape palette can deal with a maximum of 6 discrete values because
more than 6 becomes difficult to discriminate; you have 10. Consider
specifying shapes manually. if you must have them. 

Any ideas?

ggtable

Hi,

I wrote this function a few months ago, to plot information that is often shown as tables of counts or percentages (typically, crosstabulations in surveys).

Do you think it would make a good addition to GGally? I cannot remember why I did not submit it at the time. Please let me know :)

Update: now I remember that we already discussed it.

Error in unit(...) with example

When I try to run the example code, I get the following error:

> ggpairs(iris[,3:5])
[1] "box"
[1] FALSE
Error in unit(x, default.units) : 'x' and 'units' must have length > 0

EDIT: when I run this, it draws the top row ("Petal.Length", Corr: 0.963, a box plot) and the first panel of the second row - a scatterplot.

It works OK with ggpairs(iris[,1:3]), however

I'm using ggplot2_0.9.1 and GGally_0.3.2

axisLabels

change axisLabels to boolean

add variable to hide / show plot axis

show diag fn of axisLabels

this is a much cleaner setup

ggplotGrob is missing?

I hope someone is watching this list! I tried reporting this to the maintainer listed on CRAN but got no response. I see that Di Cook updated about a week ago. It still doesn't work for me; with GGally_0.3.2 I get:

> ggpairs(iris[, 3:5], upper = list(continuous = "density",
+   combo = "box"), lower = list(continuous = "points", combo = "dot"),
+   diag = list(continuous = "bar", discrete = "bar"))
Error in ggally_diagAxis(ggally_data, aes(x = Petal.Length)) :
  could not find function "ggplotGrob"

when trying to run the example. Thanks to anyone who has time to fix this or point me in the right direction! Bryan

Correct scale of density plots with multiple groups

Hi,

I want to use ggpairs to have a diagonal with density plots of each variable separated by a grouping variable. I'm not being able to get the right plots because of a scale issue. To illustrate my point, I'll use the following artificial dataset:

group=as.numeric(cut(runif(100),c(0,1/2,1),c(1,2)))
x=rnorm(100,group,1)
x[group==1]=(x[group==1])^2
y=2*x+rnorm(100,0,0.1)
data=data.frame(group=as.factor(group),x=x,y=y)

Using ggpairs, I get the following plot

library(ggplot2)
library(GGally)    
ggpairs(data,columns = 2:3,colour="group")

Now, compare the top left plot to the density plot of variable x obtained using plain ggplot2:

ggplot(data, aes(x = x, colour = group)) + geom_density() 

We can see that the y scale of the red and blue curves in ggpairs (the first figure) are not the same, which may lead to misleading conclusions. How can I correct this in ggpairs? Is this a bug?

log scales for ggpairs()?

I'm wondering whether it's possible to do a ggpairs() plot using log scales. I thought I could do

ggpairs(dataframe) + scale_y_log10() + scale_x_log10()

but that seems to not be the case. I also tried the following:

x <- ggpairs(dataframe)
for (i in 1:ncol(dataframe))
  for (j in 1:i)
    x <- putPlot(x, getPlot(x, i, j) + scale_y_log10() + scale_x_log10(), i, j)

but I get a bunch of warnings:

Scale for 'y' is already present. Adding another scale for 'y', which will replace the existing scale.
Scale for 'x' is already present. Adding another scale for 'x', which will replace the existing scale.
Scale for 'y' is already present. Adding another scale for 'y', which will replace the existing scale.
Scale for 'x' is already present. Adding another scale for 'x', which will replace the existing scale.
...

and the output doesn't look so great - the numerical axes labels in the diagonal aren't visible.
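One workaround that avoids touching the scales is to log-transform the numeric columns before plotting. This is a sketch with a hypothetical helper, log10_columns, and a toy data frame; the trade-off is that the axes then show log10 values rather than the original units.

```r
# Hypothetical helper: log10-transform every numeric column and rename it
# so the transformation is visible in the plot labels.
log10_columns <- function(df) {
  is_num <- vapply(df, is.numeric, logical(1))
  df[is_num] <- lapply(df[is_num], log10)
  names(df)[is_num] <- paste0("log10_", names(df)[is_num])
  df
}

toy <- data.frame(
  a = c(1, 10, 100),
  b = c(1000, 100, 10),
  g = c("x", "y", "z")
)

logged <- log10_columns(toy)
names(logged)
```

The transformed data frame can then be passed to ggpairs() as usual, for example ggpairs(log10_columns(dataframe)).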

reshape vs. reshape2

@briatte makes a good point.

I know reshape2 is faster. My main motivation was to use the tips dataset and to use melt. I honestly haven't made a big/any effort into reshape2.

@briatte, can you take a look for issues or make a pull request that upgrades reshape to reshape2?

Thank you,
Barret
