Giter VIP home page Giter VIP logo

rsqlserver's Introduction

RSQLServer

This package is archived as there is now an excellent, much better supported package odbc.

rsqlserver's People

Contributors

gitter-badger avatar imanuelcostigan avatar mnel avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

rsqlserver's Issues

Remove dplyr:: prefixes

The functions that make qualified calls to dplyr functions shouldn't work unless dplyr is first attached. This means that:

  1. RSQLServer's dplyr backend will not work unless dplyr itself is attached.
  2. This also means that library(RSQLServer) should rarely if ever be called. Should be installed, then call library(DBI) and library(dplyr) to use the SQL Server DBI backend and dplyr backend respectively.

DBI compliance

d> DBI::dbiCheckCompliance("SQLServer")
Compliance check for SQLServer
SQLServerDriver: OK
SQLServerConnection: NOT OK
  dbQuoteString, dbQuoteIdentifier
SQLServerResult: OK

problem with copying build_query from dplyr

build_query is a slightly revised version of build_query in dplyr, and the difference is basically the change from limit parameter to top papameter.

the problem is, most verbs in dplyr such as select, filter, setdiff will call update, which in turn will call build_query, but the build_query is just an ordinary function, not a generic function, and because of the namespace mechanism, the build_query called in these verbs is dplyr:::build_query, not RSQLServer:::build_query. At present, RSQLServer:::build_query is just called by RSQLServer:::head, as it's defined in RSQLServer.

to solve the confict between top and limit, it's not neccensary to keep a RSQLServer version of build_query. it will work that using limit parameter to generate TOP keyword in select or FETCH clause depending on whether offset is not null and db.version>=11 in sql_select.SQLServerConnection, like this,

if (!is.null(limit)) {
assertthat::assert_that(assertthat::is.number(limit))
limit <- build_sql("TOP ", dplyr::escape(limit))
}

assertthat::assert_that(is.character(select), length(select) > 0L)
out$select <- dplyr::build_sql("SELECT ", limit, " ",
dplyr::escape(select, collapse = ", ", con = con))
assertthat::assert_that(assertthat::is.string(from))
out$from <- dplyr::build_sql("FROM ", from, con = con)

if to include a RSQLServer version of build_query is neccesary, it's better to have a method of update for SQLServerConnection.

awful performance with big tables

when create tbl with tables more than millions of rows, it needs a long time to get tbl object returned. it's unreasonable and after profiling the source code, i found the S3 method db_query_fields.SQLServerConnection ate most of the time consumed, which in turn spent most of the time on "dbSendQuery(con, paste0("SELECT * FROM ", sql)".
it might be more reasonable to add "WHERE 0=1" or something similar to the end of the query to be sent, just like the db_query_fields method for DBIConnection.
printing tbl of big tables are also time consuming, but i havn't profiled them thoroughly.

Optimise db_query_fields methods

Investigate use of SELECT TOP 0 * FROM <TABLE> query per MS recommendation

Use TOP (or OFFSET and FETCH) instead of SET ROWCOUNT to limit the number of rows returned. These methods are preferred over using SET ROWCOUNT for the following reasons: As a part of a SELECT statement, the query optimizer can consider the value of expression in the TOP or FETCH clauses during query optimization. Because SET ROWCOUNT is used outside a statement that executes a query, its value cannot be considered in a query plan.

jTDS Driver not found on OS X or Windows 7

Summary

dplyr-flavoured RSQLServer is unable to locate the jTDS driver included in the package on OS X (and on other *nix systems, I suspect). Given that the jTDS drivers are cross-platform, I suggest some basic platform checks to properly load the SQL Server drivers.

Reproduction

When attempting to use the package on OS X, the following error is encountered:

> dbConnect(SQLServer(), server = <serverName>, database = <databaseName>, user = <user>, password = <password>)
Error in is(object, Cl) : 
  error in evaluating the argument 'drv' in selecting a method for function 'dbConnect': Error in .jfindClass(as.character(driverClass)[1]) : class not found

Similarly, when the SQLServer() driver is invoked with the Package name, the same error is returned:

> RSQLServer::SQLServer()
Error in .jfindClass(as.character(driverClass)[1]) : class not found

However, these same commands on Windows run without error.

Solution

As described above, I believe it's a simple cross-platform bug (filesystem formatting and escapes vary). I've used the following pattern to circumvent similar issues in the past:

if(Sys.info()['sysname'] == "Darwin"){
  JDBC("com.microsoft.sqlserver.jdbc.SQLServerDriver",
       "~/Downloads/sqljdbc_4.0/enu/sqljdbc4.jar")
} else if(Sys.info()['sysname'] == "Windows"){
  JDBC("com.microsoft.sqlserver.jdbc.SQLServerDriver",
            "C:\\Program Files\\Microsoft JDBC Driver 4.0 for SQL Server\\sqljdbc_4.0\\enu\\sqljdbc4.jar")
} else {
  error("Sorry: your platform is not currently supported.")
}

Sys.info() is part of the base package and does not incur additional dependencies. "Darwin" refers to the old open-source variant of OS Xโ€”I don't believe it would work for other *nix systems.

Implement sql_select method for SQLServerConnection

The default sql_select() method that is called when dispatched on a SQLServerConnection object is the one defined for DBIConnection. However, this permits passing a limit argument which is unsupported by SQL Server (see #17). As a result, I need to implement the sql_select() method for the SQLServerConnection object that asserts that the limit argument is invalid and that optionally should use TOP argument or OFFSET and FETCH combinations. See SO thread

dplyr::arrange method fails

The ORDER BY clause is invalid in views, inline functions, derived tables, and subqueries, unless TOP is also specified.

dplyr arrange method fails

The ORDER BY clause is invalid in views, inline functions, derived tables, and subqueries, unless TOP is also specified.

sqlserver type bit will be transformed to char

Hi, firstly, thanks for this great package. It helps us a lot.

However, I notice that the type bit in sqlserver will be transformed to char type. I think it should be integer, because the value is 1 or 0.

Is it a bug or is something you intended to do?

By the way, the date and datetime type always return as char, too. Can it be identified as date or datetime?

Thanks!

dbSendQuery expects result set, returns error although statement succeeds

It looks like dbSendQuery is expecting a result set, although per the DBI docs, it isn't supposed to. While the SQL executes on the database correctly, the function returns an error.

Here's a minimal example where I create a schema and a table, insert values into it, and then read the table back into R.

con <- dbConnect(SQLServer(), ....)
dbSendQuery(con, "create schema sc")
dbSendQuery(con, "create table sc.table1 (
    n int,
    x real
)")
dbSendQuery(con, "insert into sc.table1 values
    (1, 2.5),
    (2, 0.0),
    (3, 3.14159)
")
dbReadTable(con, "sc.table1")

All the dbSendQuery calls return errors like

Error in .verify.JDBC.result(r, "Unable to retrieve JDBC result set for ", :
Unable to retrieve JDBC result set for create table sc.table1 (
n int,
x real
) (The executeQuery method must return a result set.)

but the final dbReadTable shows that the table was created anyway:

  n       x
1 1 2.50000
2 2 0.00000
3 3 3.14159

db_create_table and db_insert_into need modification

Both methods fail. Reason is that dbSendQuery expects a ResultSet however both functions' underlying statements do not return a ResultSet. RJDBC expects statements running through dbSendQuery to return ResultSets and provides a different non-DBI method for statements that do not (dbSendUpdate). Related to #26

copy_to not work

here is the error message

Error in .verify.JDBC.result(r, "Unable to retrieve JDBC result set for ", :
Unable to retrieve JDBC result set for BEGIN TRANSACTION (The executeQuery method must return a result set.)

Determine dbListTables behaviour

Seems like JDBC behaviour is to return all sorts of tables including table types such as: "TABLE", "VIEW", "SYSTEM TABLE", "GLOBAL TEMPORARY", "LOCAL TEMPORARY", "ALIAS", "SYNONYM".

In particular, should dbListTables return system tables? DBI help doc only states that:

This should, where possible, include temporary tables.

However as described in #29 this isn't possible using JDBC.

Behaviour for other DB backends:

  1. SQLite: returns temporary and non-temporary TABLEs and VIEWs
  2. MySQL: returns "non-temporary" tables because dbListTables() executes the SQL statement SHOW TABLES
  3. PostgreSQL: returns non-system tables

Use working example in README

See comment to #3.

@jangorecki: are you be able to send through a pull request per your suggestion. I would suggest reworking the README file such that it makes use of the AdventureWorks2012 Azure hosted DB.

db.version not being extract in dbGetInfo method for SQLServerConnection

Currently reads:

setMethod(f = 'dbGetInfo', signature = 'SQLServerConnection',
  definition = function (dbObj, ...) {
    meta <- rJava::.jcall(dbObj@jc, "Ljava/sql/DatabaseMetaData;",
      "getMetaData")
    list(db.product.name = rJava::.jcall(meta, "S", "getDatabaseProductName"),
      db.version = rJava::.jcall(meta, "I", "getDatabaseMajorVersion"),
      user = rJava::.jcall(meta, "S","getUserName"))
  }
)

Master branch fails install

devtools::install_github('imanuelcostigan/RSQLServer')

Downloading github repo imanuelcostigan/RSQLServer@master
Installing RSQLServer
"C:/R/R-31~1.2/bin/x64/R" --vanilla CMD INSTALL  \
  "C:/Users/xxx/AppData/Local/Temp/Rtmpw1Jy5f/devtools7287d683aae/imanuelcostigan-RSQLServer-6453ff7"  \
  --library="C:/R/R-3.1.2/library" --install-tests 

Loading required package: bitops
* installing *source* package 'RSQLServer' ...
** R
Error in .install_package_code_files(".", instdir) : 
files in 'C:/Users/xxx/AppData/Local/Temp/Rtmpw1Jy5f/devtools7287d683aae/imanuelcostigan-RSQLServer-6453ff7/R' missing from 'Collate' field:
  src-sqlserver.R
ERROR: unable to collate and parse R files for package 'RSQLServer'
* removing 'C:/R/R-3.1.2/library/RSQLServer'
* restoring previous 'C:/R/R-3.1.2/library/RSQLServer'
Error: Command failed (1)

Maybe I should try installing the dplyr branch?

Switch to using jTDS

Seems to support SQL Server 2000 >=. MS JDBC driver backward compatibility is messy

build wrong sql in db_query_fields

code for building sql in db_query_fields.SQLServerConnection is

fields <- build_sql("SELECT * FROM ", dplyr::ident(sql),
" WHERE 0=1", con = con)

dplyr::ident is used to add "ident" sub-class for sql, but as sql is not always an object of class ident, it may just be an ordinary object of class sql, the dplyr::ident will add abundant quotations to the sql built, like select * from "(some sql)" where 0=1

Implement head.tbl_sqlserver

Default method head.tbl_sql() calls build_query() with the limit argument. This argument is unsupported by SQL Server is not defined in the ANSI SQL spec but is a reserved word. So I will need to work around this by implementing a specific method for head() for SQL Server connections.

sql_select will add "ORDER BY NULL" clause

the code processing order_by in sql_select.SQLServerConnection is

if (is.null(order_by)) {
if (!is.null(top)) {
# MS best practice: use ORDER BY clause when using TOP clause
# This is the only way to predictably indicate which rows are affected by
# TOP.
# Source: http://msdn.microsoft.com/en-us/library/ms189463.aspx
out$order_by <- dplyr::build_sql("ORDER BY ",
dplyr::escape(order_by, collapse = ", ", con = con))
}
} else {
assertthat::assert_that(is.character(order_by), length(order_by) > 0L)
out$order_by <- dplyr::build_sql("ORDER BY ",
dplyr::escape(order_by, collapse = ", ", con = con))
}

when order_by is null and top is not null, out$order_by will be "ORDER BY NULL", which wil eventually get an error from sql server

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.