r-hyperspec / hyspc.read.txt
Import ASCII formatted data into hyperSpec
Home Page: https://r-hyperspec.github.io/hySpc.read.txt/
License: MIT License
For functions that return a hyperSpec object, link to the hyperSpec class as follows:
#' @return [hyperSpec][hyperSpec::hyperSpec-class()] object.
In read_*() functions, we currently use file, files, filename or con to indicate the path to a file or a connection. We should use file consistently. Update the argument name in these functions:
NOTE: some functions use specific argument names like filex. Please mention such functions by commenting on this issue. We should decide on the names of these arguments separately.
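One way to make such a rename without breaking existing calls is to keep the old name as a deprecated alias for a while. The sketch below is only an illustration under that assumption: `read_txt_example()` is a hypothetical function, not one from this package, and its "parsing" step is a plain `readLines()` placeholder.

```r
# Hypothetical reader: `file` is the canonical argument; `filename` is kept
# only as a deprecated alias during the transition.
read_txt_example <- function(file = NULL, ..., filename = NULL) {
  if (!is.null(filename)) {
    if (!is.null(file)) {
      stop("Use only `file`; `filename` is a deprecated alias.")
    }
    warning("Argument `filename` is deprecated; use `file` instead.",
      call. = FALSE
    )
    file <- filename
  }
  if (is.null(file)) {
    stop("Argument `file` is missing.")
  }
  # Placeholder for the real parsing step: return the raw lines.
  readLines(file, ...)
}

# Usage: both calls read the same file, the second one warns.
tmp_file <- tempfile(fileext = ".txt")
writeLines(c("first line", "second line"), tmp_file)
new_style <- read_txt_example(tmp_file)
old_style <- suppressWarnings(read_txt_example(filename = tmp_file))
```

Whether a transition period is wanted at all is a separate decision; if not, the alias branch can simply be dropped.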
Some error/warning messages refer to hyperSpec while they should refer to hySpc.read.txt. This should be fixed.
A good start is to fix messages with the pattern:
packageDescription("hyperSpec")
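To locate the affected messages, the sources can be searched for this pattern. The following is a minimal base-R sketch; the helper name `find_pattern()` and the assumption that the package sources live in an `R/` directory are mine, not taken from the issue.

```r
# Search all .R files below `dir` for a fixed string and report
# file, line number and the matching line.
find_pattern <- function(dir, pattern) {
  files <- list.files(dir,
    pattern = "\\.[Rr]$", full.names = TRUE, recursive = TRUE
  )
  hits <- lapply(files, function(f) {
    lines <- readLines(f, warn = FALSE)
    idx <- grep(pattern, lines, fixed = TRUE)
    if (length(idx) == 0L) {
      return(NULL)
    }
    data.frame(
      file = f, line = idx, text = trimws(lines[idx]),
      stringsAsFactors = FALSE
    )
  })
  hits <- do.call(rbind, hits)
  if (is.null(hits)) {
    # No match anywhere: return an empty table with the same columns.
    hits <- data.frame(
      file = character(), line = integer(), text = character(),
      stringsAsFactors = FALSE
    )
  }
  hits
}

# Possible usage from the package root (assumes sources are in "R/"):
# find_pattern("R", 'packageDescription("hyperSpec")')
```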
This issue could potentially be raised in several places, but it's going here for now.
As this package is reaching a completion point in the very near future, it's a good time to update fileio.Rmd that resides in hyperSpec (I will mention this issue over there). I'm assigning to Erick as his fresh eyes will be a real advantage.
hySpc.read.txt appropriately so it is current.
read_txt_Shimadzu() needs overhaul:
read_txt_Shimadzu() working with the new UV-VIS-NIR example spectrum
This is a copy of the issue cbeleites/hyperSpec#67
Related:
This is a copy of the issue cbeleites/hyperSpec#67
@cbeleites the gh-pages branch is updated correctly now. So could you please enable GitHub Pages for r-hyperspec/hySpc.read.Witec and add the link to the pkgdown website to the description of the repo?
In DESCRIPTION, use title case; otherwise it will not pass CRAN checks:
Title: File import Functions for Spectra in various ASCII/text file formats
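A quick way to get a title-case suggestion is `tools::toTitleCase()` from base R. The sketch below just applies it to the title quoted above; treat the result as a suggestion to review by hand rather than as the definitive wording.

```r
# Current title as quoted in the issue.
title <- "File import Functions for Spectra in various ASCII/text file formats"

# Title-case suggestion produced by base R's tools package.
suggested <- tools::toTitleCase(title)
print(suggested)

# TRUE if the title is already in the form toTitleCase() would produce.
already_title_case <- identical(title, suggested)
```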
There are non-ASCII files in the directories spc.Kaisermap (needs to go to hySpc.read.spc) and spc.Witec (several file formats, including ASCII, which should be kept, but the directory should be renamed accordingly).
This should be tackled only after PRs #44 and #45 are merged.
spc.Kaisermap/, please?
spc.Witec and notify me how far you get?
On my machine, it takes around 2 minutes to read the GCxGC-qMS.txt file in the read_txt_Shimadzu() unit test:
system.time({
  filename <- system.file(
    "extdata",
    "txt.Shimadzu/GCxGC-qMS.txt",
    package = "hySpc.read.txt"
  )
  spc <- read_txt_Shimadzu(filename)
})
#> user system elapsed
#> 124.50 17.56 143.90
143.90/60 ≈ 2.398 min
It is way too long.
Either the function is too slow or the file is too large. What should we do about this, @sangttruong, @bryanhanson, @cbeleites?
Should we skip this test for now?
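If skipping is the chosen route, one option is to make the slow test opt-in instead of removing it. The sketch below assumes a hypothetical environment variable `HYSPC_RUN_SLOW_TESTS` (not something the package defines) and uses the `testthat` skip helpers `skip_on_cran()` and `skip_if_not()`; it is an illustration, not the project's agreed approach.

```r
# Hypothetical switch: slow tests run only when the environment variable
# HYSPC_RUN_SLOW_TESTS is set to "true".
run_slow_tests <- function() {
  identical(tolower(Sys.getenv("HYSPC_RUN_SLOW_TESTS")), "true")
}

if (requireNamespace("testthat", quietly = TRUE)) {
  testthat::test_that("read_txt_Shimadzu() reads the large GCxGC-qMS file", {
    testthat::skip_on_cran()
    testthat::skip_if_not(
      run_slow_tests(),
      "slow test; set HYSPC_RUN_SLOW_TESTS=true to run it"
    )
    # Reached only when the slow tests are explicitly enabled.
    filename <- system.file(
      "extdata", "txt.Shimadzu/GCxGC-qMS.txt",
      package = "hySpc.read.txt"
    )
    spc <- hySpc.read.txt::read_txt_Shimadzu(filename)
    testthat::expect_s4_class(spc, "hyperSpec")
  })
}
```

This keeps the test in the suite for occasional full runs while everyday `devtools::test()` calls stay fast. It does not address whether the function itself is slow.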
Related:
As far as I can see, the following issues present in hyperSpec should be addressed here at this time.
Please close over there as you tick them off.
In addition to current unit tests, add template #55 for these functions:
(the current test should remain in the files)
This is the copy of cbeleites/hyperSpec#80
$SpectrumHeader$XDataKind
I believe this is what @GegznaV mentioned during the video call today. I will assign this to Erick as he's the one most up to date on this repo.
Rename all the functions like read.txt.Witec() into read_txt_Witec(). In particular, function names should be the same as the names of the files that contain those functions, e.g., in https://github.com/r-hyperspec/hySpc.read.Witec/blob/develop/R/read_txt_Witec.R
https://github.com/r-hyperspec/hySpc.read.Witec/blob/16a57f09b1c149e43d3f31d9284f740bc808e825/R/read_txt_Witec.R#L24
Several functions have not been renamed yet, including read.mat.Witec(), read.dat.Witec(), and read.txt.Witec().
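If the old dotted names should keep working for a while after the rename, base R's `.Deprecated()` can forward them to the new functions. The sketch below uses a hypothetical function pair (`read_txt_Example()` / `read.txt.Example()`) with a placeholder body; it does not reproduce the real Witec readers.

```r
# New-style name (hypothetical function with a placeholder body).
read_txt_Example <- function(file) {
  readLines(file)
}

# Old-style name kept as a thin wrapper that warns and forwards the call.
read.txt.Example <- function(...) {
  .Deprecated(new = "read_txt_Example")
  read_txt_Example(...)
}

# Usage: the old name still works but raises a deprecation warning.
example_file <- tempfile(fileext = ".txt")
writeLines("1\t2\t3", example_file)
via_new <- read_txt_Example(example_file)
via_old <- suppressWarnings(read.txt.Example(example_file))
```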
There are some messages left that encourage users to contact the maintainer in case of an error or bug. They usually contain code:
maintainer("hyperSpec")
or similar. A reference to open a GH issue in this (and not hyperSpec's) repo should replace the suggestion to contact the maintainer.
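A small helper could build the replacement text in one place. The sketch below is an assumption-laden illustration: `issue_hint()` is a hypothetical name, it reads the `BugReports` field via `utils::packageDescription()` when the package is installed, and the fallback URL is my guess at this repository's issue tracker.

```r
# Build a "please open an issue" hint for error messages.
# `fallback` is an assumed URL, used when the package is not installed
# or its DESCRIPTION has no BugReports field.
issue_hint <- function(pkg = "hySpc.read.txt",
                       fallback = "https://github.com/r-hyperspec/hySpc.read.txt/issues") {
  # packageDescription() returns NA (with a warning) for unknown packages.
  desc <- suppressWarnings(utils::packageDescription(pkg))
  url <- fallback
  if (is.list(desc) && !is.null(desc$BugReports)) {
    url <- desc$BugReports
  }
  paste0("Please report this problem by opening an issue at ", url)
}

# Possible usage inside a reader function:
# stop("Unexpected file format. ", issue_hint())
```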
This issue is to continue discussion that started as:
The original message:
This is the code from fileio.Rmd which could be used to create unit tests: fileio-for-unit-tests.zip as stated in untangle historic mix with unit tests into vignette and unit tests
# ============================================================================
# =========== R code extracted from fileio.Rmd vignette ======================
# ============================================================================
## ----setup-io, include=FALSE---------------------------------------------------------
# Packages -------------------------------------------------------------------
library(hyperSpec)
library(R.matlab)
## ----array---------------------------------------------------------------------------
data <- array(1:24, 4:2)
wl <- c(550, 630)
x <- c(1000, 1200, 1400)
y <- c(1800, 1600, 1400, 1200)
data
## ----array-import--------------------------------------------------------------------
d <- dim(data)
dim(data) <- c(d[1] * d[2], d[3])
x <- rep(x, each = d[1])
y <- rep(y, d[2])
spectra <- new("hyperSpec",
spc = data,
data = data.frame(x, y), wavelength = wl
)
## ------------------------------------------------------------------------------------
y <- seq_len(d[1])
x <- seq_len(d[2])
## ----readcollapse--------------------------------------------------------------------
files <- Sys.glob("fileio/spc.Kaisermap/*.spc")
files <- files[seq(1, length(files), by = 2)] # import low wavenumber region only
spc <- lapply(files, read.spc)
length(spc)
spc[[1]]
spc <- collapse(spc)
spc
## ----read.txt.t----------------------------------------------------------------------
file <- read.table("fileio/txt.t/Triazine 5_31.txt", header = TRUE, dec = ",", sep = "\t")
triazine <- new("hyperSpec",
wavelength = file[, 1], spc = t(file[, -1]),
data = data.frame(sample = colnames(file[, -1])),
labels = list(
.wavelength = expression(2 * theta / degree),
spc = "I / a.u."
)
)
triazine
## ----plot-triazine, fig.cap=CAPTION--------------------------------------------------
plot(triazine[1])
## ------------------------------------------------------------------------------------
read.jdx("fileio/jcamp-dx/shimadzu.jdx", encoding = "latin1", keys.hdr2data = TRUE)
read.jdx("fileio/jcamp-dx/virgilio.jdx")
## ------------------------------------------------------------------------------------
read.jdx("fileio/jcamp-dx/virgilio.jdx", ytol = 1e-9)
## ----nist-aes------------------------------------------------------------------------
file <- readLines("fileio/NIST/mercurytable2.htm")
# file <- readLines("http://physics.nist.gov/PhysRefData/Handbook/Tables/mercurytable2.htm")
file <- file[-(1:grep("Intensity.*Wavelength", file) - 1)]
file <- file[1:(grep("</pre>", file) [1] - 1)]
file <- gsub("<[^>]*>", "", file)
file <- file[!grepl("^[[:space:]]+$", file)]
colnames <- file[1]
colnames <- gsub("[[:space:]][[:space:]]+", "\t", file[1])
colnames <- strsplit(colnames, "\t")[[1]]
if (!all(colnames == c("Intensity", "Wavelength (Å)", "Spectrum", "Ref. "))) {
stop("file format changed!")
}
tablestart <- grep("^[[:blank:]]*[[:alpha:]]+$", file) + 1
tableend <- c(tablestart[-1] - 2, length(file))
tables <- list()
for (t in seq_along(tablestart)) {
tmp <- file[tablestart[t]:tableend[t]]
tables[[t]] <- read.fwf(textConnection(tmp), c(5, 8, 12, 15, 9))
colnames(tables[[t]]) <- c("Intensity", "persistent", "Wavelength", "Spectrum", "Ref. ")
tables[[t]]$type <- gsub("[[:space:]]", "", file[tablestart[t] - 1])
}
tables <- do.call(rbind, tables)
levels(tables$Spectrum) <- gsub(" ", "", levels(tables$Spectrum))
Hg.AES <- list()
for (s in levels(as.factor(tables$Spectrum))) {
Hg.AES[[s]] <- new("hyperSpec",
wavelength = tables$Wavelength[tables$Spectrum == s],
spc = tables$Intensity[tables$Spectrum == s],
data = data.frame(Spectrum = s),
label = list(
.wavelength = expression(lambda / ring(A)),
spc = "I"
)
)
}
## ----plot-hg-aes, fig.cap=CAPTION----------------------------------------------------
plot(collapse(Hg.AES), lines.args = list(type = "h"), col = 1:2)
## ------------------------------------------------------------------------------------
library(R.matlab)
## ----eval=FALSE----------------------------------------------------------------------
## **V. Gegznas's notes**:
## FIXME: file `spectra.mat` is not present.
spc.mat <- readMat("fileio/spectra.mat")
## ----eval=FALSE----------------------------------------------------------------------
## **V. Gegznas's notes**:
## FIXME: there are issues in downloading `Rcompression` package as omegahat.org
## does not update as quickly as a new version of R is released.
install.packages("Rcompression", repos = "http://www.omegahat.org/R")
## ----read.mat.Cytospec-blocks--------------------------------------------------------
read.mat.Cytospec("fileio/mat.cytospec/cytospec.mat", blocks = TRUE)
## ----read.mat.Cytospec---------------------------------------------------------------
read.mat.Cytospec("fileio/mat.cytospec/cytospec.mat", blocks = 1)
## ----read.ENVI-----------------------------------------------------------------------
spc <- read.ENVI("fileio/ENVI/example2.img")
spc
## ------------------------------------------------------------------------------------
read.spc("fileio/spc.Kaisermap/ebroAVII.spc", keys.hdr2data = TRUE)
## ------------------------------------------------------------------------------------
read.spc("fileio/spc.Kaisermap/ebroAVII.spc", keys.log2data = TRUE)
## ----read.spc.list-old---------------------------------------------------------------
barbiturates <- read.spc("fileio/spc/BARBITUATES.SPC")
## ------------------------------------------------------------------------------------
class(barbiturates)
## ------------------------------------------------------------------------------------
length(barbiturates)
## ------------------------------------------------------------------------------------
barbiturates <- collapse(barbiturates, collapse.equal = FALSE)
barbiturates
## ------------------------------------------------------------------------------------
barbiturates[[1:10, , 25 ~ 30]]
## ----eval=FALSE----------------------------------------------------------------------
header <- list(
samples = 64 * no.images.in.row,
lines = 64 * no.images.in.column,
bands = no.data.points.per.spectrum,
`data type` = 4,
interleave = "bip"
)
## ----readENVINicolet-----------------------------------------------------------------
spc <- read.ENVI.Nicolet("fileio/ENVI/example2.img", nicolet.correction = TRUE)
spc ## dummy sample with all intensities zero
## ----Kaiser.txt.comma----------------------------------------------------------------
## 1. import as character
tmp <- scan("fileio/txt.Kaiser/test-lo-4.txt", what = rep("character", 4), sep = ",")
tmp <- matrix(tmp, nrow = 4)
## 2. concatenate every two columns by a dot
wl <- apply(tmp[1:2, ], 2, paste, collapse = ".")
spc <- apply(tmp[3:4, ], 2, paste, collapse = ".")
## 3. convert to numeric and create hyperSpec object
spc <- new("hyperSpec", spc = as.numeric(spc), wavelength = as.numeric(wl))
spc
## ----readspcKaiserMap----------------------------------------------------------------
files <- Sys.glob("fileio/spc.Kaisermap/*.spc")
spc.low <- read.spc.KaiserMap(files[seq(1, length(files), by = 2)])
spc.high <- read.spc.KaiserMap(files[seq(2, length(files), by = 2)])
wl(spc.high) <- wl(spc.high) + 1340
spc
## ----read.txt.Renishaw---------------------------------------------------------------
paracetamol <- read.txt.Renishaw("fileio/txt.Renishaw/paracetamol.txt", "spc")
paracetamol
## ------------------------------------------------------------------------------------
read.txt.Renishaw("fileio/txt.Renishaw/laser.txt.gz", data = "ts")
## ----read.txt.Renishaw-file----------------------------------------------------------
read.txt.Renishaw("fileio/txt.Renishaw/chondro.txt", nlines = 1e5, nspc = 875)
## ----read.txt.Renishaw-compressed, results='hide'------------------------------------
read.txt.Renishaw("fileio/txt.Renishaw/chondro.gz")
read.txt.Renishaw("fileio/txt.Renishaw/chondro.xz")
read.txt.Renishaw("fileio/txt.Renishaw/chondro.lzma")
read.txt.Renishaw("fileio/txt.Renishaw/chondro.gz")
read.txt.Renishaw("fileio/txt.Renishaw/chondro.bz2")
read.zip.Renishaw("fileio/txt.Renishaw/chondro.zip")
## ----Horiba--------------------------------------------------------------------------
spc <- read.txt.Horiba("fileio/txt.HoribaJobinYvon/ts.txt",
cols = list(
t = "t / s", spc = "I / a.u.",
.wavelength = expression(Delta * tilde(nu) / cm^-1)
)
)
spc
## ----testHoriba-1--------------------------------------------------------------------
spc <- read.txt.Horiba.xy("fileio/txt.HoribaJobinYvon/map.txt")
if (any(dim(spc) != c(141, 4, 616)) ||
any(abs(spc) < .Machine$double.eps^.5) ||
is.null(spc$x) || any(is.na(spc$x)) ||
is.null(spc$y) || any(is.na(spc$y)) ||
length(setdiff(wl(spc), 1:616)) == 0L) {
stop("error in testing read.txt.Horiba.xy. Please contact ", maintainer("hyperSpec"))
}
spc
## ----testHoriba-2--------------------------------------------------------------------
spc <- read.txt.Horiba.t("fileio/txt.HoribaJobinYvon/ts.txt")
if (any(dim(spc) != c(100, 3, 1024)) ||
is.null(spc$t) || any(is.na(spc$t)) ||
length(setdiff(wl(spc), 1:1024)) == 0L) {
stop("error in testing read.txt.Horiba.t. Please contact ", maintainer("hyperSpec"))
}
spc
rm(spc)
## ------------------------------------------------------------------------------------
read.asc.Andor("fileio/asc.Andor/ASCII-Andor-Solis.asc")
## ----witec-spc, results='hide'-------------------------------------------------------
read.spc("fileio/spc.Witec/Witec-timeseries.spc")
read.spc("fileio/spc.Witec/Witec-Map.spc")
## ----witec-dat, results='hide'-------------------------------------------------------
read.dat.Witec("fileio/txt.Witec/Witec-timeseries-x.dat")
read.dat.Witec(
filex = "fileio/txt.Witec/Witec-Map-x.dat",
points.per.line = 5, lines.per.image = 5, type = "map"
)
## ----witec-txt, include=FALSE--------------------------------------------------------
read.txt.Witec("fileio/txt.Witec/Witec-timeseries_no.txt")
## ----witec-txt-textfiles, include=FALSE----------------------------------------------
headline <- c(
"with exported labels and units headerlines:",
"\nwith exported labels headerline:",
"\nwith exported units headerline:",
"\nwithout headerline:"
)
files <- c(
"fileio/txt.Witec/Witec-timeseries_full.txt",
"fileio/txt.Witec/Witec-timeseries_label.txt",
"fileio/txt.Witec/Witec-timeseries_unit.txt",
"fileio/txt.Witec/Witec-timeseries_no.txt"
)
for (f in seq_along(files)) {
cat(headline[f], "\n")
tmp <- format(as.matrix(read.table(files[f], sep = "\t")[1:4, 1:3]))
apply(tmp, 1, function(l) cat(l, "\n"))
}
## ----witec-txt-map, include=FALSE----------------------------------------------------
read.txt.Witec("fileio/txt.Witec/Witec-Map_full.txt",
type = "map", hdr.label = TRUE, hdr.units = TRUE)
read.txt.Witec("fileio/txt.Witec/Witec-Map_label.txt",
type = "map", hdr.label = TRUE, hdr.units = FALSE)
read.txt.Witec("fileio/txt.Witec/Witec-Map_unit.txt",
type = "map", hdr.label = FALSE, hdr.units = TRUE)
read.txt.Witec("fileio/txt.Witec/Witec-Map_unit.txt",
type = "map", hdr.label = FALSE, hdr.units = TRUE,
points.per.line = 5
)
read.txt.Witec("fileio/txt.Witec/Witec-Map_no.txt",
type = "map", hdr.label = FALSE, hdr.units = FALSE)
read.txt.Witec("fileio/txt.Witec/Witec-Map_no.txt",
type = "map", hdr.label = FALSE, hdr.units = FALSE,
lines.per.image = 5
)
read.txt.Witec("fileio/txt.Witec/Witec-Map_no.txt",
type = "map", hdr.label = FALSE, hdr.units = FALSE,
points.per.line = 5, lines.per.image = 5
)
## ----witec-txt-Graph, results='hide'-------------------------------------------------
read.txt.Witec.Graph("fileio/txt.Witec/Witec-timeseries (Header).txt")
read.txt.Witec.Graph("fileio/txt.Witec/Witec-Map (Header).txt", type = "map")
read.txt.Witec.Graph("fileio/txt.Witec/nofilename (Header).txt", encoding = "latin1")
## ----comment="", eval=TRUE, echo=FALSE, class.output="add-border sourceCode r"-------
writeLines(readLines("read.txt.PerkinElmer.R"))
## ----read.txt.PerkinElmer, message=FALSE---------------------------------------------
source("read.txt.PerkinElmer.R")
read.txt.PerkinElmer(Sys.glob("fileio/txt.PerkinElmer/flu?.txt"), skip = 54)
Please close this issue when it is no longer relevant.
Review and update vignette hySpc-read-txt.Rmd to match the current version of the package.
Why is 0temp_fileio_optional.R needed? It seems to me like the wrong way to import a function from another package.
Some files are in tests/testthat/, others are in the inst/extdata/ directory.
Is this on purpose or accidental?
As I understand it, files in inst/extdata/ can be used in both tests and examples but count towards the size of the installed R package.
And what about tests/testthat/: do files in this directory count towards the size of the installed package?
R finds these files when devtools::test() is called, but there is an issue when running tests manually.
Test whether the GHA script can be simplified to deploy to just the master branch. This is issue r-hyperspec/pkg-repo#3
Change the name of the package to reflect future enhancements (i.e., reading in various .txt files)
For #27, I get this issue locally
Improve unit tests for read_txt_Renishaw() by using the template from #55. For details see:
It is enough to do that with the paracetamol data/file only.
To automatically deploy the results of building and checking, we need to:
pkgdown
hySpc.read.Witec_xxx.tar.gz over to hySpc.pkgs
I've researched this some, here's a few resources I've found:
File microbenchmark.R does not contain any code that is used. It should be removed.
Test extra data columns t, x, y, etc. in functions:
Related:
I suggest this basic template for unit tests of read_*() functions. In each situation, function/file-format-specific unit tests should be added. Written as an RStudio snippet:
snippet hut
	# Unit tests -----------------------------------------------------------------
	hySpc.testthat::test(${1:function_to_test}) <- function() {
	  local_edition(3)
	  filename <- system.file(
	    "extdata",
	    "${2:path_to_file_to_read}",
	    package = "hySpc.read.txt"
	  )
	  expect_silent(spc <- ${1:function_to_test}(filename))
	  n_wl <- nwl(spc)
	  n_rows <- nrow(spc)
	  n_cols <- ncol(spc)
	  test_that("${3:file format}: hyperSpec obj. dimensions are correct", {
	    expect_equal(n_wl, ___)
	    expect_equal(n_rows, ___)
	    expect_equal(n_cols, ___)
	  })
	  test_that("${3:file format}: extra data are correct", {
	    # @data colnames
	    expect_equal(colnames(spc), c("spc", "filename", ___))
	    # @data values
	    # (Add tests, if relevant or remove this row)
	  })
	  test_that("${3:file format}: labels are correct", {
	    expect_equal(spc@label\$.wavelength, NULL)
	    expect_equal(spc@label\$spc, NULL)
	    expect_equal(spc@label\$filename, "filename")
	  })
	  test_that("${3:file format}: spectra are correct", {
	    # Dimensions of spectra matrix (@data$spc)
	    expect_equal(dim(spc@data\$spc), c(___, ___))
	    # Column names of spectra matrix
	    expect_equal(colnames(spc@data\$spc)[1], "___")
	    expect_equal(colnames(spc@data\$spc)[10], "___")
	    expect_equal(colnames(spc@data\$spc)[n_wl], "___") # last name
	    # Values of spectra matrix
	    expect_equal(unname(spc@data\$spc[1, 1]), ___)
	    expect_equal(unname(spc@data\$spc[2, 10]), ___)
	    expect_equal(unname(spc@data\$spc[n_rows, n_wl]), ___) # last spc value
	  })
	  test_that("${3:file format}: wavelengths are correct", {
	    expect_equal(spc@wavelength[1], ___)
	    expect_equal(spc@wavelength[10], ___)
	    expect_equal(spc@wavelength[n_wl], ___)
	  })
	}
This is what this template would look like for the read_asc_Andor() function:
# Unit tests -----------------------------------------------------------------
hySpc.testthat::test(read_asc_Andor) <- function() {
  local_edition(3)
  filename <- system.file(
    "extdata",
    "asc.Andor/ASCII-Andor-Solis.asc",
    package = "hySpc.read.txt"
  )
  expect_silent(spc <- read_asc_Andor(filename))
  n_wl <- nwl(spc)
  n_rows <- nrow(spc)
  n_cols <- ncol(spc)
  test_that("Andor Solis .asc: hyperSpec obj. dimensions are correct", {
    expect_equal(n_wl, 63)
    expect_equal(n_rows, 5)
    expect_equal(n_cols, 2)
  })
  test_that("Andor Solis .asc: extra data are correct", {
    # @data colnames
    expect_equal(colnames(spc), c("spc", "filename"))
    # @data values
    # (Add tests, if relevant or remove this row)
  })
  test_that("Andor Solis .asc: labels are correct", {
    expect_equal(spc@label$.wavelength, NULL)
    expect_equal(spc@label$spc, NULL)
    expect_equal(spc@label$filename, "filename")
  })
  test_that("Andor Solis .asc: spectra are correct", {
    # Dimensions of spectra matrix (@data$spc)
    expect_equal(dim(spc@data$spc), c(5, 63))
    # Column names of spectra matrix
    expect_equal(colnames(spc@data$spc)[1], "161.408")
    expect_equal(colnames(spc@data$spc)[10], "200.184")
    expect_equal(colnames(spc@data$spc)[n_wl], "423.651") # last name
    # Values of spectra matrix
    expect_equal(unname(spc@data$spc[1, 1]), 3404)
    expect_equal(unname(spc@data$spc[2, 10]), 3405)
    expect_equal(unname(spc@data$spc[n_rows, n_wl]), 3415) # last spc value
  })
  test_that("Andor Solis .asc: wavelengths are correct", {
    expect_equal(spc@wavelength[1], 161.40845)
    expect_equal(spc@wavelength[10], 200.18387)
    expect_equal(spc@wavelength[n_wl], 423.65106)
  })
}
What is your opinion, @sangttruong, @bryanhanson, @cbeleites? Is this template OK for you?
Move read.mat.Witec() to hySpc_read_mat
There are tests in files other than read_ini.R that call the read_ini() function. But it would be good to have tests in the same file where the function is defined.
Update all functions and syntax in unit tests to be compatible with the testthat 3rd edition.
==> devtools::test()
Loading hySpc.read.txt
[...snip...]
Testing hySpc.read.txt
✓ | OK F W S | Context
⠋ | 0 1 |
══ Results ═══════════════════════════════════════════════════════════════════════════
OK: 0
Failed: 0
Warnings: 0
Skipped: 1
ff18032 had 68 tests that were passing (locally).
Review, fix, add and improve unit tests for these functions and their helpers:
In
file <- hyperSpec::read.ini(paste0(tmpdir, "/Witec_TrueMatch.txt"))
and in other unit tests that use read.ini() (deprecated function from hyperSpec), function read_ini() must be used.
In functions that read from ASCII files (some of them are not moved to this package yet, see cbeleites/hyperSpec#263) the authors of those functions (that are not C. Beleites) are mentioned. These people must appear in the list of contributors of this package.
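In an R package, contributors are usually listed in the Authors@R field of DESCRIPTION as `person()` entries with the role "ctb". The sketch below shows the shape of such entries; the names are placeholders, not the actual contributors, who would have to be collected from the function sources.

```r
# Placeholder contributor entries; the real names must come from the
# function documentation, they are NOT known from this issue.
contributors <- c(
  utils::person("First", "Contributor",
    role = "ctb",
    comment = "Author of a hypothetical reader function"
  ),
  utils::person("Second", "Contributor", role = "ctb")
)

# Text form of the entries, restricted to name and role.
contributor_lines <- format(contributors,
  include = c("given", "family", "role")
)
print(contributor_lines)
```

These entries would be appended to the existing `c(...)` in Authors@R, next to the current authors and maintainer.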