scio-read's Issues

Data from the literature on the SCiO spectrometer

I have glanced through all discussions about the raw data. I am surprised that nobody cited the US patent (https://patents.google.com/patent/US9377396B2) describing the main operating principle of the SCiO spectrometer. It is not new; I read it several years ago. According to the patent, the device works as follows:

  • diffuse light impinges upon an optical filter which has a relatively narrow bandpass (about 27nm);
  • a part of the light passes through the filter, then through a convex lens, and finally through a micro-lens array;
  • as a result, a series of circles appears on a CMOS matrix (each circle corresponds to a particular wavelength).

The SCiO spectrometer has 12 filters and 12 independent regions on the CMOS matrix. So, it is no surprise that 12 · 27 = 324 ≈ 331.

Knowing the main operating principle, one can assume that a file with the raw data is an image, not a spectrum. However, the patent mentioned above says nothing about the data structure, so I decided to dig deeper and came across another US patent (https://patents.google.com/patent/US10330531B2), which confirms my guess. The bad news is that the raw data are both compressed and encrypted: "… the compressed encrypted raw data signal can be transmitted via Bluetooth to the handheld device. Compression of raw data may be necessary since raw intensity data will generally be too large to transmit via Bluetooth in real time. … The data generated by the optical system described herein typically contains symmetries that allow significant compression of the raw data into much more compact data structures". How is compression performed? Is the whole image zipped? Or only some parts? Could some data processing (e.g., averaging) also be involved?

The encrypted data are sent from the SCiO to a smartphone. Then the data are redirected to the server without any transformation: "The encrypted, compressed raw data signal from the spectrometer may be received by the UI of the handheld device … The UI may then transmit the data to the cloud server. … The cloud server can receive compressed, encrypted data and/or metadata from the handheld device. A processor or communication interface of the cloud server can then decrypt the data, and a digital signal processing unit of the cloud server can perform signal processing on the decrypted signal to transform the signal into spectral data". Experimental data obtained by @kebasaa (see the 01_rawdata/log_extracted/ folder) confirm this statement from the patent.

Everything that has been said above applies to the following data: sample, sample_dark, sample_white, and sample_white_dark. One can assume that sample_dark and sample_white_dark are recorded with the LED turned off and are used to estimate the dark current. It is clear (see the 01_rawdata/app_researcher_output/SCIO_scans_from_tech_support.csv file) that the data stored in sample_white are needed to normalize the final spectrum. Unfortunately, I currently have no idea what sample_white_gradient and sample_gradient are. I disagree with @earwickerh, who said that "sampleGradient … is the raw spectral data from the SCIO's internal white reference". The white reference is stored in sample_white. According to the log files (in the 01_rawdata/log_files/ folder), there is a special parameter called isDisableGradientSampling. I assume that gradient sampling is an optional extra rather than a required step, but this is only my guess.
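
The assumed roles of these four arrays correspond to the usual dark-corrected white normalization. Below is a minimal sketch of that idea, assuming the four signals have already been decoded into equally sized intensity arrays (which is not currently possible for the raw data). The function name `normalize` and the toy numbers are hypothetical and are not taken from the SCiO software.

```python
import numpy as np

def normalize(sample, sample_dark, sample_white, sample_white_dark):
    """Dark-correct the sample and the white reference, then take their ratio."""
    s = np.asarray(sample, dtype=float) - np.asarray(sample_dark, dtype=float)
    w = np.asarray(sample_white, dtype=float) - np.asarray(sample_white_dark, dtype=float)
    # Leave NaN wherever the dark-corrected white reference is zero
    out = np.full(s.shape, np.nan)
    np.divide(s, w, out=out, where=w != 0)
    return out

# Toy intensities for illustration only, not real SCiO data
r = normalize([110, 60, 35], [10, 10, 10], [210, 110, 10], [10, 10, 10])
print(r)
```

The third toy value is deliberately degenerate (white equals white_dark) to show that such points are marked as NaN instead of raising a division warning.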

To summarize, decompressing and decoding the raw data is not enough to get a spectrum. A mathematical model should be created to transform an image into a spectrum. Unfortunately, I cannot help with extracting an image from the raw data; it is beyond my qualifications. Is it even possible? Nevertheless, I hope that someday I will participate in creating a mathematical model for this project.

sample, sampleDark and sampleGradient

Nice to see you're still tinkering with this. In the readme it says: "Every SCIO bluetooth LE message contains 3 parts: sample, sampleDark and sampleGradient (No clue so far what that those mean or how to convert them)." Not sure if that's up to date, but hope the below is helpful.

sample: This is the raw spectral data from the sample. It represents the light that is reflected off the sample and detected by the SCIO.

sampleDark: This is the raw spectral data from the SCIO's internal dark current reference. It represents the background signal that is detected when there is no light present.

sampleGradient: This is the raw spectral data from the SCIO's internal white reference. It represents the signal detected when the SCIO is measuring a known white reference.

To calculate the reflectance values of the sample, you need to subtract the sampleDark data from the sample data and divide the result by the difference between the sampleGradient data and the sampleDark data. This is expressed by the equation R = (S - D) / (G - D), where R is the reflectance value, S is the sample data, D is the sampleDark data, and G is the sampleGradient data.

Let's take the -bark.txt sample:

Packet 1:
  Header: 01 ba 02 90 01 8f 1c 07 34 02 00 02 8e
  Sample: 06 36 7b 2e 4f 3d 1c 0e 06 04 04 06 12 ...
  SampleDark: 05 34 75 26 48 36 1c 0c 05 04 05 05 0d ...
  SampleGradient: 0b 4f 9c 3e 6c 4c 28 11 0a 09 0a 0c ...
  
Packet 2:
  Header: 02 ba 02 90 01 8f 1c 07 34 02 00 02 8f
  Sample: 06 37 7c 2f 4f 3e 1c 0d 07 04 04 05 12 ...
  SampleDark: 05 34 75 26 48 36 1c 0c 05 04 05 05 0d ...
  SampleGradient: 0b 4f 9c 3e 6c 4c 28 11 0a 09 0a 0c ...

Each packet consists of a header and three data sections: sample, sampleDark, and sampleGradient. The header contains information about the packet, including a packet identifier (01 or 02), the protocol identifier (ba), and the length of the data sections (in this case, 02 90). The sample, sampleDark, and sampleGradient data sections are each 400 bytes long and contain spectral data measured by the SCIO spectrometer.
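
The header layout described above can be checked with a few lines of Python. This is only a sketch of that interpretation: the field positions follow the description in this comment, the byte order of the length field is an assumption (both readings are shown), and the variable names are hypothetical.

```python
# Header of packet 1, copied from the excerpt above
header = bytes.fromhex("01 ba 02 90 01 8f 1c 07 34 02 00 02 8e")

packet_id = header[0]      # described above as the packet identifier
protocol_id = header[1]    # described above as the protocol identifier

# The byte order of the length field is not stated, so try both readings
length_be = int.from_bytes(header[2:4], "big")     # 0x0290
length_le = int.from_bytes(header[2:4], "little")  # 0x9002

print(len(header), packet_id, hex(protocol_id), length_be, length_le)
```

Read as big-endian, the length field is 656, and as little-endian it is 36866. Neither equals 3 × 400 = 1200, so the meaning of these header bytes should be treated as unverified.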

To calculate reflectance values from the sample, sampleDark, and sampleGradient data, you need to perform the following steps:

  1. Subtract the sampleDark data from the sample data to obtain the corrected sample signal.
  2. Subtract the sampleDark data from the sampleGradient data to obtain the corrected gradient signal.
  3. Divide the corrected sample signal by the corrected gradient signal to obtain the reflectance values.

The example code below should extract the data from the log, convert the hex strings to numpy arrays, and perform the reflectance calculation using numpy array operations.

import numpy as np

# Load the raw data from the log file
with open("log_20200604-bark.txt", "r") as f:
    lines = f.readlines()

# Collect the hex bytes of each labeled section of the first packet.
# Assumes lines look like "Sample: 06 36 7b ..." as in the excerpt above.
sections = {}
for line in lines:
    label, sep, payload = line.strip().partition(":")
    if sep and label in ("Sample", "SampleDark", "SampleGradient") and label not in sections:
        # Keep only two-character tokens; this skips a trailing "..."
        tokens = [t for t in payload.split() if len(t) == 2]
        sections[label] = np.array([int(t, 16) for t in tokens], dtype=float)

# Trim to a common length so the arrays can be combined element-wise
n = min(len(a) for a in sections.values())
sample = sections["Sample"][:n]
sampleDark = sections["SampleDark"][:n]
sampleGradient = sections["SampleGradient"][:n]

# Perform the reflectance calculation R = (S - D) / (G - D)
correctedSample = sample - sampleDark
correctedGradient = sampleGradient - sampleDark
# Leave NaN where the corrected gradient is zero instead of dividing by zero
reflectance = np.full(n, np.nan)
np.divide(correctedSample, correctedGradient, out=reflectance, where=correctedGradient != 0)

# Print the first 10 reflectance values
print(reflectance[:10])

The reflectance values will be dimensionless ratios, because the detector units cancel in the division...
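
As a quick check that does not need the log file, the same formula R = (S - D) / (G - D) can be applied to the first 12 bytes of packet 1 quoted above. This is only a worked example under the assumption made in this comment, namely that each byte is an independent intensity value; the helper name `hex_to_array` is hypothetical.

```python
import numpy as np

def hex_to_array(text):
    """Convert a space-separated hex string into a float array."""
    return np.array([int(tok, 16) for tok in text.split()], dtype=float)

# First 12 bytes of each section of packet 1, copied from the excerpt above
S = hex_to_array("06 36 7b 2e 4f 3d 1c 0e 06 04 04 06")
D = hex_to_array("05 34 75 26 48 36 1c 0c 05 04 05 05")
G = hex_to_array("0b 4f 9c 3e 6c 4c 28 11 0a 09 0a 0c")

# G - D is nonzero for all 12 values here, so plain division is safe
R = (S - D) / (G - D)
print(np.round(R, 3))
```

The first value is (6 - 5) / (11 - 5) = 1/6, and the eleventh is (4 - 5) / (10 - 5) = -0.2. A negative value may simply be noise, or it may indicate that single bytes are not the right unit, so the byte-wise reading should be treated with caution.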

researcher license

I have a researcher license and can download spectra. How can I send them to you?

How to connect the SCiO scanner?

I have a SCIO scanner without a developer license. I'm following your project. I launched "python read_spectrum.py" and got errors in file1.txt!
Is there a guide on how to make SCIO work with your project?
Thank you!

An issue with base64 representation

I came across this project yesterday. I appreciate that you have started it! Before participating in the discussion, I want to play with the data a little. But I failed to read the base64 representation of raw data. It would seem that there is a flaw in the encoding.

Example 1 - 01_rawdata/scan_json/scan_dark1.json
The scan.sample value in the base64 format is not readable either in R or in any web application (e.g., https://base64.guru/converter/decode/hex).
This is my R code:

# 'scan.sample' from 'scan_dark1.json'
sample_enc <- "AAAAAOENdN5rhhIMjW6NSelqd0UdBZB1EQTFEwegFbU_hRN4FNk1t-ak93dD1BexDYHwDZqlJ0xCekuaaykdmfwdbofnmLQ3Xn5enVY-peUCkqZGDnsJJ63_vW4Q4VnJm9ujqz99m_Q1GSLxxm9vb4B2yKFV7Wc_0UomRILwk_oAoD9afPn6pKS-i7bPs2O9QRKUdhv2EEwPCguv4JN_4t62DDYRY9ZHUHwkXH2e0i7iPOws-HZ7YBwa4BiqlGNZ7TmTFhMuWDP1C7u2EOAXpEUN1oXMqO4n_cUhwNTRZjDPEHP6AvPBLZWAOkzPNJ8mr0FPO6c6sVm-tbuyd57_KyUl5f1AgqPUSJ_9S5OzcfvvorcpAO88Jo9SJwuRRYrM7UJSJRnpJP612ICQYI5Nv98fkOuqj2pAf6-myKzQ2C88uH0RC9eyTFXcuuzDtprfEXRXcK5DWRN05CZ1A4Ax-_MYvAHuv-t6R9KPXKcuKVb9y_NAP7FKbtyzyBc_JJX64N8PotU8qtdEda083_FHpxsW_U1ZnMYFhyG0XPDIS6mt23lKIsVbp7xZNFOSlAn7Qctu6gAStKQnhUnifHh4ySObVxjbrDJ3egwKkWG9eaTatb9DeebyRR7JWU0HwYEHDSu9KR94pSL7FW94O_tem7wn72fE05JdNplp6ziFV0AEkbkvDhGQj8CwgylieRBDr7V1IUnI07UV1_WlJj7AkwR27Dej4IhPss4LA_J_w88x0E_qVAtvYIEhslQkaOMJphioMc31hobIYVVLvM-C9naUwQVUyqkJbK-A5cD_d607EJgwRFeFw7urzoa3SULKU8kOPRi3gHoE_0-sEYelLqcYY-w5CSEPKU1gQzDmDilNlhFL6ClISbKv_78aK_KCw1BtYbS6ftah032AmfRlFhtiqj88CFqgAuPJybSnssSutxqxMx7W-Lsdwd7kKyc4cHAxD_eFpR3_9fIZ5ncQc0B5ceGREgJVoGlWDJWtGOJmCv55h3cqRphbAYRppniXJShGQZ0-Lf_WaiUQs9oGCKQdy5KabaKLTE7LDXPxlk8x5vCzyVwuZF3Ds-p7PGHKhaqXUhK-C5PChaNSg7RloYjLH2UQvy_eLqFYDRe9tDSiDnnT-5vSTE6t3rDp6xWEabfzbirQiCI_M6_rslws62by5JQ31pAiNrxhvWWNw3Zr0lV_og2csSRcD8PBVLvJpH9jdeE4JEXg8wyFGY8TegZaOaN1qKq3f5NfejcMKex3L3p9Iu4FouxlTGeJnuz7Prg6e9zwzLTW0E6-cQsgNompVAUzX-vaeg_zyFtnK5ADQc9jWmlFCq6Lcrqyba8hkOL21ikHsMfyuEqv4y6k3_-wRDda__W5AGNVOl1fCqMgLQ26bAtx6SSwQ7Fjbk0cZGKrz7RmA8jmD4uCk2QpgR5WKg2TqiwwFC5m491iZjbn3BKL-Sxf5ZoQNys96wHBmHBQ1AXGDSkEiVh12Nq4NP2CNXTvaHXnCsq8eR3FtrVa-wkVDXDLJb6EcHTrI2BmZWYEA_iWuPwKkooDT2gCZkbEl_ZEjcqr9KZGdypcqAmWU3VW2niiNoOrNONDeHPLbNrTCdg5WWywZb3C6WVUHeFqKWpV8iHbvkvGsLARUSHGQKA178X2OMAJy4tQjcw1Q9jjGpcM5slm5-zzC72b71Znb6uvqhz3s3SbOnU6Ql1QiKyNVa0OQlbbOIDihdOFW4W9pdPEQGClgzco06Wu8tlL14KdePe5WP3myVUeDO-sP7YfiKefYBWjwE4hr6sKajLP7lBkUEMzClu6BE_agrdK1VYhtkuGdyuKHj-BuTSRvhhoa52ugBOAJT3b0W2UaimJb2zRDyTr9p5moen3IsI3iHuIAfOdgSinH7B73coqYHTYSydRtoAkOIY1tR-U62_AVjLJ0flnggJA9EctYeFpTIjstXRugvDtvS1YA7YGnyx-6oX1YLLXnWib1fDTC
G1_UFlXrRODb2Glx5IHFlWGP2LMCT29JIM4xlgNn4JqrN1InioXodrHgBR7dc5TCm8np3_gRxVvUcy8c0qJxCgNmzEBlapZh4xX_5DBR9zMEAAas8WbSJxpbf7OU7tC9Ks_jOByVHSJynKgK0P4hiYE7GdFO5fVCV4Dp2iMuq-uopvAdJ2e4RhupUqVe_iMx06gZL1ySvT9fEIbBfP-o6YzV-VfnSwHjolVIxmoZDZ-DIxnOsBePYXRonwfViqYERvecBAMf0dhSiftl9yNb1Y5mYExdffCo4I3G5YZbxo3gT_j9l9nEwDguqofs-hfb4dD-DzWQGmAi95JuWdzgmkfeljCOdDlRqQx68DkJAfsWGrfDGycTbTh8QB5Z_dM6Bv05DsXBYimNiEZuFg8_2AURtJnUL5M"


# For the 'jsonlite' package, decoding failed and an error appeared.
sample_dec_1 <- jsonlite::base64_dec(sample_enc)
#> Error in jsonlite::base64_dec(sample_enc) : Error in base64 decode


# For the 'base64enc' package, the string was decoded. But after re-encoding,
# the two strings are not identical.
sample_dec_2 <- base64enc::base64decode(sample_enc)
length(sample_dec_2)
#> [1] 1740
identical(sample_enc, base64enc::base64encode(sample_dec_2))
#> [1] FALSE


# For the 'openssl' package, decoding failed and the output vector is empty.
sample_dec_3 <- openssl::base64_decode(sample_enc)
length(sample_dec_3)
#> [1] 0
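
One possible explanation, offered as a guess and not as a confirmed diagnosis: the string in Example 1 contains the characters "-" and "_", which belong to the URL-safe base64 alphabet (RFC 4648, section 5) and not to the standard one, which uses "+" and "/". A strict standard-alphabet decoder may reject or skip these characters. The sketch below is in Python (the language of the project's scripts) and decodes only the first 56 characters of the string above. The helper name `decode_urlsafe` is hypothetical.

```python
import base64

# First 56 characters of 'scan.sample' from scan_dark1.json (truncated for brevity)
prefix = "AAAAAOENdN5rhhIMjW6NSelqd0UdBZB1EQTFEwegFbU_hRN4FNk1t-ak"

def decode_urlsafe(text: str) -> bytes:
    """Decode base64url text, restoring '=' padding if it was stripped."""
    text = "".join(text.split())               # drop whitespace and line breaks
    padded = text + "=" * (-len(text) % 4)     # pad to a multiple of 4 characters
    return base64.urlsafe_b64decode(padded)

uses_urlsafe_alphabet = any(c in "-_" for c in prefix)
raw = decode_urlsafe(prefix)
roundtrip = base64.urlsafe_b64encode(raw).decode("ascii")

print(uses_urlsafe_alphabet, len(raw), raw[:6].hex(), roundtrip == prefix)
```

If this reading is right, a round-trip comparison only succeeds when the re-encoding also uses the URL-safe alphabet. Re-encoding with a standard-alphabet encoder would produce "+" and "/" and therefore a different string even when the decoded bytes are correct.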

Example 2 - 01_rawdata/log_extracted/scan_from_log_20211020_105957.json
Both b64_data.sample and b64_data.sample_gradient, converted into hex, match raw_data.sample and raw_data.sample_gradient respectively. However, b64_data.sample_dark converted into hex does not match raw_data.sample_dark. As before, I decoded the base64 using several R packages as well as a web application.

I have a couple of requests:

  1. Please keep the hex representation in JSON files.
  2. Add hex representation into files from the 01_rawdata/scan_json/ folder.
  3. The more files we have, the better. I would be grateful if you could upload some more examples containing both the raw data and the spectra returned by the server (01_rawdata/log_extracted/).

Thank you in advance!

Offer to help

Apologies if 'New Issue' is not the preferred communication channel, but I wanted to share that I have one of these devices. I came across this repo when trying to figure out how to make sense of the product and its data. In any case, I don't have the technical expertise to crack or decode the data (but thank you all for trying). If there's anything you think I might be able to contribute by having another device on hand, please let me know.
