
aind-data-schema-models's People

Contributors

github-actions[bot], helen-m-lin, jtyoung84, sun-flow

aind-data-schema-models's Issues

Schemas are brittle with respect to manufacturers

User story

As scientists, we may want to use a new piece of hardware. Right now, if that hardware comes from a manufacturer that is not listed in aind_data_schema_models.Organizations, we need to either use Other or submit a PR to this repo and then bump all of the schemas we are using. With complex schemas, this could require significant updates. Am I missing something? Is there a better way to handle this issue?
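The "use Other" fallback described above might look roughly like the sketch below. The `Manufacturer` enum and `resolve_manufacturer` helper are hypothetical stand-ins written for illustration; they are not the API of aind_data_schema_models.

```python
from enum import Enum
from typing import Optional, Tuple


class Manufacturer(str, Enum):
    """Hypothetical stand-in for a fixed manufacturer list."""

    KNOWN_VENDOR = "Known Vendor"
    OTHER = "Other"


def resolve_manufacturer(name: str) -> Tuple[Manufacturer, Optional[str]]:
    """Return the matching member, or OTHER plus the original name as a note."""
    try:
        # Enum lookup by value raises ValueError for unlisted names.
        return Manufacturer(name), None
    except ValueError:
        # Keep the unlisted name so it is not lost when falling back.
        return Manufacturer.OTHER, name
```

Carrying the original name alongside `OTHER` keeps the information until the list is extended, but it does not remove the need for a PR and a schema bump.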

Sprint Ready Checklist

  • 1. Acceptance criteria defined
  • 2. Team understands acceptance criteria
  • 3. Team has defined solution / steps to satisfy acceptance criteria
  • 4. Acceptance criteria is verifiable / testable
  • 5. External / 3rd Party dependencies identified
  • 6. Ticket is prioritized and sized

Import enums from aind_data_schema.data_description into this package

Is your feature request related to a problem? Please describe.
I'd like to import a few data models into packages other than aind-data-schema. It'd be nice if they were in this repo instead of aind-data-schema.

Describe the solution you'd like
Import this stuff here:

from datetime import datetime
from enum import Enum


class RegexParts(str, Enum):
    """regular expression components to be re-used elsewhere"""

    DATE = r"\d{4}-\d{2}-\d{2}"
    TIME = r"\d{2}-\d{2}-\d{2}"


class DataRegex(str, Enum):
    """regular expression patterns for different kinds of data and their properties"""

    DATA = f"^(?P<label>.+?)_(?P<c_date>{RegexParts.DATE.value})_(?P<c_time>{RegexParts.TIME.value})$"
    RAW = (
        f"^(?P<platform_abbreviation>.+?)_(?P<subject_id>.+?)_(?P<c_date>{RegexParts.DATE.value})_(?P<c_time>"
        f"{RegexParts.TIME.value})$"
    )
    DERIVED = (
        f"^(?P<input>.+?_{RegexParts.DATE.value}_{RegexParts.TIME.value})_(?P<process_name>.+?)_(?P<c_date>"
        f"{RegexParts.DATE.value})_(?P<c_time>{RegexParts.TIME.value})"
    )
    ANALYZED = (
        f"^(?P<project_abbreviation>.+?)_(?P<analysis_name>.+?)_(?P<c_date>"
        f"{RegexParts.DATE.value})_(?P<c_time>{RegexParts.TIME.value})$"
    )
    NO_UNDERSCORES = "^[^_]+$"
    NO_SPECIAL_CHARS = '^[^<>:;"/|? \\_]+$'
    NO_SPECIAL_CHARS_EXCEPT_SPACE = '^[^<>:;"/|?\\_]+$'


class DataLevel(str, Enum):
    """Data level name"""

    DERIVED = "derived"
    RAW = "raw"
    SIMULATED = "simulated"


class Group(str, Enum):
    """Data collection group name"""

    BEHAVIOR = "behavior"
    EPHYS = "ephys"
    MSMA = "MSMA"
    OPHYS = "ophys"


def datetime_to_name_string(dt):
    """Take a date and time object, format it as a string"""
    return dt.strftime("%Y-%m-%d_%H-%M-%S")


def datetime_from_name_string(d, t):
    """Take date and time strings, generate date and time objects"""
    d = datetime.strptime(d, "%Y-%m-%d").date()
    t = datetime.strptime(t, "%H-%M-%S").time()
    return datetime.combine(d, t)


def build_data_name(label, creation_datetime):
    """Construct a valid data description name"""
    dt_str = datetime_to_name_string(creation_datetime)
    return f"{label}_{dt_str}"
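
For anyone checking how these pieces fit together, here is a small self-contained sketch that builds a name and parses it back with the same pattern as DataRegex.RAW. The `parse_raw_name` helper and the sample label `ecephys_123456` are made up for illustration and are not part of either package.

```python
import re
from datetime import datetime

# Same components as RegexParts.DATE / RegexParts.TIME above.
DATE = r"\d{4}-\d{2}-\d{2}"
TIME = r"\d{2}-\d{2}-\d{2}"
# Same pattern as DataRegex.RAW above.
RAW = (
    f"^(?P<platform_abbreviation>.+?)_(?P<subject_id>.+?)"
    f"_(?P<c_date>{DATE})_(?P<c_time>{TIME})$"
)


def build_data_name(label, creation_datetime):
    """Construct a name of the form <label>_<date>_<time>."""
    return f"{label}_{creation_datetime.strftime('%Y-%m-%d_%H-%M-%S')}"


def parse_raw_name(name):
    """Hypothetical helper: split a raw data name into its parts."""
    match = re.match(RAW, name)
    if match is None:
        raise ValueError(f"Not a valid raw data name: {name!r}")
    parts = match.groupdict()
    created = datetime.strptime(
        f"{parts['c_date']}_{parts['c_time']}", "%Y-%m-%d_%H-%M-%S"
    )
    return parts["platform_abbreviation"], parts["subject_id"], created


name = build_data_name("ecephys_123456", datetime(2023, 1, 2, 3, 4, 5))
print(name)
print(parse_raw_name(name))
```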

Describe alternatives you've considered
Importing from aind-data-schema, but that requires importing all the other dependencies.

Publish schemas to s3

Is your feature request related to a problem? Please describe.
As a user, I'd like to pull the data models from a database.

Describe the solution you'd like
Push the schemas to S3 first.

Missing fields

Describe the bug

  1. Missing Fujinon in LENS_MANUFACTURERS (organizations.py)

DataRegex for special chars are not working for `/`

Describe the bug

  • DataRegex.NO_SPECIAL_CHARS and DataRegex.NO_SPECIAL_CHARS_EXCEPT_SPACE are not working for / chars.
  • The result is that fields that should not allow / are allowing it without raising validation errors.
  • This is causing validation errors in the Metadata data entry app with: Invalid regular expression: /^[^<>:;"/|?\_]+$/u: Invalid escape

To Reproduce
Steps to reproduce the behavior:

  1. Create a pydantic model that uses DataRegex.NO_SPECIAL_CHARS or DataRegex.NO_SPECIAL_CHARS_EXCEPT_SPACE for the pattern of a field.
  2. Try adding a string that has / in that field.
  3. Observe that validation errors are not showing up, meaning that / is being allowed.

Alternatively, validate any data_description JSON using the Metadata entry app and observe the Invalid escape error quoted above.

Expected behavior
The DataRegex class should have enums with the correct regex to match all special characters.
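
One possible direction, offered as a sketch and not as the fix adopted in the repo: JavaScript's unicode mode (the `u` flag in the error above) only permits backslash escapes of regex syntax characters, so `\_` is rejected as an invalid escape. Inside a character class `_` needs no backslash at all. The `PORTABLE` pattern below is a hypothetical variant that drops it; in Python's `re` module it accepts and rejects the same strings as the original.

```python
import re

# Original pattern from DataRegex.NO_SPECIAL_CHARS
# (the Python string '\\_' is the regex escape \_).
ORIGINAL = '^[^<>:;"/|? \\_]+$'
# Hypothetical variant: "_" written without a backslash.
PORTABLE = '^[^<>:;"/|? _]+$'

samples = ["abc", "a/b", "a_b", "a b", "a:b"]
for s in samples:
    original_ok = re.fullmatch(ORIGINAL, s) is not None
    portable_ok = re.fullmatch(PORTABLE, s) is not None
    print(f"{s!r}: original={original_ok} portable={portable_ok}")
```

Note that neither pattern excludes a literal backslash. If that is intended, the backslash would have to be doubled in the regex itself; the issue does not say whether it is.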

NCBI taxonomy ids

The registry_id for our species is currently only the number (e.g. 10090 for mouse). The correct ID is NCBI:txid10090.

  • Fix the registry_ids for the species in our models.
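
A minimal sketch of the conversion, assuming the only change needed is adding the `NCBI:txid` prefix shown above. The function name `to_ncbi_registry_id` is hypothetical and does not exist in the package.

```python
def to_ncbi_registry_id(taxon_id) -> str:
    """Hypothetical helper: format a numeric taxonomy id as NCBI:txid<number>."""
    prefix = "NCBI:txid"
    text = str(taxon_id).strip()
    if text.startswith(prefix):
        # Already in the prefixed form; leave it unchanged.
        return text
    if not text.isdigit():
        raise ValueError(f"Expected a numeric taxonomy id, got {taxon_id!r}")
    return f"{prefix}{text}"


print(to_ncbi_registry_id(10090))
```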
