sss's Introduction

sss

The sss package provides functions to import triple-s XML files into R. The package supports sss files in both .asc and .csv format.

Installation

You can install the released version of sss from CRAN using:

install.packages("sss")

And the development version from GitHub with:

# install.packages("remotes")
remotes::install_github("andrie/sss")

System dependencies

A previous version of this package imported the XML package, but from version 0.1 the package imports xml2. The xml2 package depends on the libxml2 library. If you run your code on Linux, you may have to install libxml2 manually:

  • libxml2-dev (Debian, Ubuntu)
  • libxml2-devel (Red Hat, CentOS, Fedora)

The triple-s standard

triple-s is a standard to transfer survey data between applications.

Example

This is a basic example which shows you how to solve a common problem:

library(sss)
## basic example code
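
The example above is still a placeholder. A minimal sketch of a typical import might look like the following. It assumes that the package ships sample files under sampledata/ (the names sample-1.sss and sample-1.asc are an assumption) and that read.sss() takes the metadata file first and the data file second; check ?read.sss for the exact arguments.

```r
library(sss)

# Paths to the sample files; assumed to be installed under sampledata/
sss_file <- system.file("sampledata/sample-1.sss", package = "sss")
asc_file <- system.file("sampledata/sample-1.asc", package = "sss")

# read.sss() combines the triple-s metadata (.sss) with the data file
# and returns a data frame
d <- read.sss(sss_file, asc_file)

# The variable labels are stored as an attribute of the data frame
str(d)
attributes(d)$variable.labels
```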

sss's Issues

self closing tags on values cause issues

The triple-s files I get from our survey programming team sometimes have placeholder values programmed: they've reserved some values, but there is no label. Those come through in the XML as self-closing tags. This causes problems with the function getSSScodes.

I think I've convinced myself that this is actually a problem in the dependency xml2. It doesn't seem to handle the self-closing tags. I've put together an example XML snippet that, when put through the relevant lines of getSSScodes, reproduces the error I'm seeing.

x<-xml2::read_xml("<variable ident='412' type='single' formname='GotHere' formlabel='GotHere' formtype='single' source='p407369808' routingid='352' fieldwidth='2'>
  <name>GotHere</name>
  <label>GotHere</label>
  <position start='73553' finish='73554'/>
  <values> 
    <value code='1'>INTRO PAGE</value>
    <value code='2'>SCREENER PAGE</value>
    <value code='3'>CONTENT PAGE</value>
    <value code='4'>DEMO PAGE</value>
    <value code='5'>FOLLOW UP PAGE</value>
    <value code='6'/>
    <value code='7'/>
    <value code='8'/>
    <value code='9'/>
  </values>
  </variable>")

xx <- xml2::xml_find_all(x, "values/value")
size <- length(xx)

# xml_contents() returns no node for an empty <value/>,
# so codevalues has 5 entries against 9 codes
data.frame(
  ident = rep(unname(xml2::xml_attr(x, "ident")), size),
  code = as.character(xml2::xml_attrs(xx)),
  codevalues = as.character(xml2::xml_contents(xx)),
  stringsAsFactors = FALSE
)

I get the error: Error in data.frame(ident = rep(unname(xml2::xml_attr(x, "ident")), size), : arguments imply differing number of rows: 9, 5

Even worse, if the xml happened to have the right number of placeholder tags, the content could be recycled without error or warning:


x2<-xml2::read_xml("<variable ident='412' type='single' formname='GotHere' formlabel='GotHere' formtype='single' source='p407369808' routingid='352' fieldwidth='2'>
  <name>GotHere</name>
  <label>GotHere</label>
  <position start='73553' finish='73554'/>
  <values> 
    <value code='1'>INTRO PAGE</value>
    <value code='2'>SCREENER PAGE</value>
    <value code='3'>CONTENT PAGE</value>
    <value code='4'>DEMO PAGE</value>
    <value code='5'>FOLLOW UP PAGE</value>
    <value code='6'/>
    <value code='7'/>
    <value code='8'/>
    <value code='9'/>
    <value code='10'/>
  </values>
  </variable>")

xx2 <- xml2::xml_find_all(x2, "values/value")
size <- length(xx2)

# 10 codes but only 5 labels: data.frame() silently recycles the labels
data.frame(
  ident = rep(unname(xml2::xml_attr(x2, "ident")), size),
  code = as.character(xml2::xml_attrs(xx2)),
  codevalues = as.character(xml2::xml_contents(xx2)),
  stringsAsFactors = FALSE
)

Produces the following dataframe with recycled labels:

   ident code     codevalues
1    412    1     INTRO PAGE
2    412    2  SCREENER PAGE
3    412    3   CONTENT PAGE
4    412    4      DEMO PAGE
5    412    5 FOLLOW UP PAGE
6    412    6     INTRO PAGE
7    412    7  SCREENER PAGE
8    412    8   CONTENT PAGE
9    412    9      DEMO PAGE
10   412   10 FOLLOW UP PAGE
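
A possible workaround, sketched here under the assumption that an empty string is an acceptable label for a placeholder value: xml_contents() pools the child nodes of the whole node set, so an empty <value/> contributes nothing, whereas xml_attr() and xml_text() return exactly one element per node. This is not necessarily how getSSScodes is or should be implemented; the shortened XML and the variable names are illustrative only.

```r
library(xml2)

# Shortened version of the snippet above: two labelled values, two placeholders
x <- read_xml("<variable ident='412' type='single'>
  <name>GotHere</name>
  <values>
    <value code='1'>INTRO PAGE</value>
    <value code='2'>SCREENER PAGE</value>
    <value code='6'/>
    <value code='7'/>
  </values>
</variable>")

xx <- xml_find_all(x, "values/value")

# xml_attr() and xml_text() are vectorised over the node set and return one
# element per <value> node, so all columns have the same length
codes <- data.frame(
  ident = rep(xml_attr(x, "ident"), length(xx)),
  code = xml_attr(xx, "code"),
  codevalues = xml_text(xx),
  stringsAsFactors = FALSE
)
print(codes)
```

With this approach the placeholder codes keep their own rows and get an empty label, so nothing is recycled.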

label.table

Thanks for this package!

Is it possible to modify the read.sss function so that it returns not only the attribute variable.labels but also an attribute associated with the codes (this attribute could be named label.table)?

For instance, we should have

attributes(d)$variable.labels
 [1] "Number of visits"                  "Attractions visited"               "Attractions visited"              
 [4] "Attractions visited"               "Attractions visited"               "Attractions visited"              
 [7] "Attractions visited"               "Attractions visited"               "Attractions visited"              
[10] "Attractions visited"               "Other attractions visited"         "Two favourite attractions visited"
[13] "Two favourite attractions visited" "Miles travelled"                   "Would come again"                 
[16] "When is that most likely to be"    "Case weight"    

In addition, attributes(d)$label.table should be a list (each element either NULL or a named vector), for instance:

attributes(d)$label.table[[1]]
        Sherwood Forest       Nottingham Castle "Friar Tuck" Restaurant      "Maid Marion" Cafe           Mining museum 
                      1                       2                       3                       4                       5 
                  Other 
                      9 

Thanks!

Dominique
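
The requested structure could be derived from a code table in the ident/code/codevalues layout that appears elsewhere in these issues. The following is only a sketch in base R: make_label_table() is a hypothetical helper, not part of sss, and the rows of the example table are made up.

```r
# Hypothetical code table in the ident/code/codevalues layout;
# the rows are invented for illustration
codes <- data.frame(
  ident = c("2", "2", "2", "15", "15"),
  code = c("1", "2", "9", "1", "2"),
  codevalues = c("Sherwood Forest", "Nottingham Castle", "Other", "Yes", "No"),
  stringsAsFactors = FALSE
)

# Build one named integer vector per variable:
# the names are the labels, the values are the codes
make_label_table <- function(codes) {
  lapply(
    split(codes, codes$ident),
    function(df) setNames(as.integer(df$code), df$codevalues)
  )
}

label_table <- make_label_table(codes)
label_table[["2"]]
```

Variables without any codes do not appear in the result at all, so looking them up with label_table[["..."]] returns NULL, which matches the "NULL or a named vector" shape described above.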

Write SSS data

Hello,

would it be possible to implement a function that writes a triple-s data file?

Best and thanks,
Robert

Problem importing sss files

Hi, thank you for the package. While trying to understand how it works, I have run into problems importing. I have an sss file and a csv file with the same file name, and I used the read.sss() function. R creates the data frame and the variables seem to be there, but the number of observations, for example, is not correct. If I try to view the data frame in RStudio with View(), I receive an error such as "no data in column ...".
I tried the sample files inside the sss package and they work fine.
I also tried the files downloaded from the official triple-s website and I receive the same error. You could use the latter files to reproduce my error.
I am using R 4, RStudio 2022.02.3, sss version 0.2.1 and xml2 version 1.3.3.

Thank you for any help you could provide

Data label issue with <text> modules in xml code

Hello,
thank you for this great package.
I found a problem with parsing the metadata for my triple-s file. It only affects the data labels for my variables. The variable declaration in my XML looks like this:

...
<variable ident="5" type="single" format="numeric">
<name>typtel</name>
<label>
<text xml:lang="fr-FR" mode="analysis">TYPTEL-Telephone Type</text>
</label>
<position start="16" />
<values>
<value code="1">Fixed<text xml:lang="fr-FR" mode="analysis">Fixed</text></value>
<value code="2">Mobile<text xml:lang="fr-FR" mode="analysis">Mobile</text></value>
</values>
</variable>
...

I can parse the XML code, but in R I can see that this in fact returns a code data frame with invalid entries:

ident code codevalues
5     1    Fixed
5     2    <text xml:lang="fr-FR" mode="analysis">Fixed</text>
5     1    Mobile
5     2    <text xml:lang="fr-FR" mode="analysis">Mobile</text>

Is there a way to solve this with the package, without having to delete all of the <text> entries (which gives the right results)?

Thank you for your help
Julia
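
Outside the package, one way to pull clean labels from such a file is to select only the text nodes that are direct children of each <value> and to ignore the nested <text> elements. The sketch below uses xml2 directly and assumes that the label of interest is the plain text before the <text> element; it does not show how sss parses the file internally.

```r
library(xml2)

x <- read_xml('<variable ident="5" type="single" format="numeric">
  <name>typtel</name>
  <label>
    <text xml:lang="fr-FR" mode="analysis">TYPTEL-Telephone Type</text>
  </label>
  <position start="16" />
  <values>
    <value code="1">Fixed<text xml:lang="fr-FR" mode="analysis">Fixed</text></value>
    <value code="2">Mobile<text xml:lang="fr-FR" mode="analysis">Mobile</text></value>
  </values>
</variable>')

xx <- xml_find_all(x, "values/value")

# The XPath "text()" matches text nodes that are direct children of each
# <value>; xml_find_first() returns one result per node, so lengths line up
codes <- data.frame(
  ident = rep(xml_attr(x, "ident"), length(xx)),
  code = xml_attr(xx, "code"),
  codevalues = xml_text(xml_find_first(xx, "text()")),
  stringsAsFactors = FALSE
)
print(codes)

# The variable label only exists inside <text>, so take the full text of
# <label> and trim the surrounding whitespace
label <- xml_text(xml_find_first(x, "label"), trim = TRUE)
```

A <value> that has no direct text child would yield NA rather than shifting the other labels.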
