
Comments (8)

westontrillium commented on June 20, 2024

@isabelle-dr Interesting data! Is that based on feeds where those fields are defined, not just present as an empty column?

I agree that we should think about how we justify raising this specific field to "recommended" status before other fields/features that are equally, if not more, universally useful. We should also keep in mind that making a component "recommended" essentially says that the best practice is to include it, and that any feed that omits it is therefore not aligned with best practices. That can have implications for vendors' contractual obligations to provide data of a certain quality ("...shall provide GTFS in accordance with best practices...") and for regional regulatory requirements placed upon agencies.

I like the idea of encouraging this field's use. Maybe I'm making mountains out of molehills, but one issue I see in designating it as a recommended field is that it includes a "no information" option (empty/0). It's unclear what would be gained by introducing this recommendation, since there will always be the possibility that leaving out a 1 or 2 value in that column is justified and intentional: the information may be genuinely unknown, or too unreliable to state outright that bikes are allowed or disallowed. Is limiting it to "recommended if explicitly allowed/disallowed" enough of a mitigation? How does one independently verify with confidence that "explicit allowance/disallowance of bikes" is not applicable for the trip, and that best practices are therefore not violated by the absence of a 1 or 2? At the very least, a validation tool couldn't do it; it would have to just give a warning that the feed may not follow best practices for every instance where the field is empty or not present.
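To illustrate the validator concern, here is a minimal sketch of what such a check could look like. It is not how any existing validator works; the function name `trips_without_bike_info` and the constant `NO_INFO_VALUES` are made up for this example. It treats an empty value, a 0, or a missing column as "no information", and it has no way to tell an intentional "unknown" from an omission.

```python
from __future__ import annotations

import csv
import io

# Assumption for this sketch: empty and "0" both mean "no information",
# as described in the comment above.
NO_INFO_VALUES = {"", "0"}


def trips_without_bike_info(trips_csv: str) -> list[str]:
    """Return the trip_ids whose bikes_allowed value carries no information.

    A missing bikes_allowed column is treated the same as an empty value.
    """
    reader = csv.DictReader(io.StringIO(trips_csv))
    flagged = []
    for row in reader:
        value = (row.get("bikes_allowed") or "").strip()
        if value in NO_INFO_VALUES:
            flagged.append(row["trip_id"])
    return flagged


sample = "trip_id,bikes_allowed\na,1\nb,0\nc,\nd,2\n"
for trip_id in trips_without_bike_info(sample):
    print(f"warning: trip {trip_id} has no bike information")
```

Every flagged trip gets the same warning, whether the agency left the value out by accident or because the information really is unknown.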

from transit.

gcamp commented on June 20, 2024

I agree in principle that bikes_allowed should be recommended, but I find it strange that this field gets that treatment before other generally useful optional fields (like trip/stop_headsign or route_color).

Maybe there should also be a gradation of recommendations, to differentiate fields that serve a specific use case (bike trip planning) from more general optional fields that are used in every use case. Maybe "Recommended" and "Suggested"?


isabelle-dr commented on June 20, 2024

> Is that based on feeds where those fields are defined, not just present as an empty column?

@westontrillium correct. Our current logic is: the column is present and there are values defined for at least one record (having values for certain records but not all of them seems very rare so we kept it simple).
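For readers who want to reproduce this kind of count, the rule described above (column present, and a value defined for at least one record) could be sketched roughly as follows. This is only an illustration, not the Mobility Database's actual code; the function name `feature_is_defined` is hypothetical.

```python
import csv
import io


def feature_is_defined(csv_text: str, column: str) -> bool:
    """Return True if `column` exists and is non-empty in at least one record."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # The column must be present in the header row.
    if reader.fieldnames is None or column not in reader.fieldnames:
        return False
    # One non-empty value is enough; short rows yield None, treated as empty.
    return any((row.get(column) or "").strip() for row in reader)


print(feature_is_defined("trip_id,bikes_allowed\nt1,1\nt2,\n", "bikes_allowed"))
print(feature_is_defined("trip_id,bikes_allowed\nt1,\nt2,\n", "bikes_allowed"))
```

The first call has one defined value and counts as using the feature; the second has only an empty column and does not.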


isabelle-dr commented on June 20, 2024

Thank you for providing feedback on this issue!

This issue made me want to look at the most common optional features in the Mobility Database: ~1500 GTFS Static datasets across 79 countries. Since the US accounts for approximately half of these right now, I also checked the non-US data specifically. These are the 6 most represented optional features in both cases.

| Feature | All data | Non-US data |
|---|---|---|
| Shapes | 84% | 72% |
| Headsigns | 82% | 79% |
| Route Colors | 73% | 57% |
| Wheelchair Accessibility | 62% | 41% |
| Feed Information | 62% | 43% |
| Location Types | 60% | 46% |

We believe adding recommendations based on what we see in practice is in line with the spirit of the GTFS Best Practices (cc @antrim for more context if needed), so I'm thinking of opening two new issues to make the Route Colors and Headsigns fields recommended. Would you support this?


markstos commented on June 20, 2024

Is there a way to review how many GTFS feeds are providing "bikes allowed" in practice currently? If it's a rarely used field, it could be annoying for operators across the world to update their feeds to add "bikes_allowed:false".

If it's already a popular field, it seems like a clear win.

If there are a bunch of operators that do allow bikes in practice but don't reflect that in their GTFS feeds, then I think it could also be a win if it pushes a number of operators to have more accurate and complete feeds.


isabelle-dr commented on June 20, 2024

@markstos good question.

We have 367 datasets that have bikes_allowed in the Mobility Database, out of 1527 GTFS Schedule datasets -> 25%.

route_color is at 73%, headsigns at 82%


bdferris-v2 commented on June 20, 2024

In light of MobilityData/GTFS_Schedule_Best-Practices#56, I don't suppose I can argue against making the field recommended 😇 But I do hear arguments that there may be other fields that might take higher priority.


isabelle-dr commented on June 20, 2024

Now, to answer @westontrillium's good points:

> the effect on contractual obligations ("...shall provide GTFS in accordance with best practices...")

I think this is a desired effect of this type of change, no? We could potentially use the Canonical GTFS Schedule Validator's versioning system, which has a release whenever it is updated to follow spec evolution. This way, the requirements of a contract tied to a given validator version wouldn't change when GTFS evolves (but again, maybe this is what we want?).

> How does one independently verify with confidence that "explicit allowance/disallowance of bikes" is not applicable for the trip, and therefore best practices are not violated with absence of 1 or 2?

I think we can split quality evaluation into:

  1. the data is modeled properly (i.e. the data can be re-used confidently, no bus is going back in time, foreign IDs exist, the station hierarchy makes sense, etc.).
  2. the data is comprehensive (i.e. it contains fields/values that improve what riders are seeing).
  3. the data is aligned with the real world (e.g. the stop is actually at this lat/lon, the fare is accurate, bike allowance info is, in fact, not available yet for this agency, etc.).

The Spec and Best Practices play a role in 1 and 2, and these can mostly be checked automatically. For 2, in our validator, there is metadata (the dataset contains Fares), or warnings for fields recommended explicitly (missing_recommended_file for feed_info.txt).
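As a rough illustration of the kind of automatic check meant for point 2, a "missing recommended file" warning could be sketched like this. It is a simplified stand-in, not the validator's implementation; `RECOMMENDED_FILES` and `missing_recommended_files` are names invented for the example, and the list contains only the one file named above.

```python
from __future__ import annotations

import io
import zipfile

# Assumption for this sketch: only feed_info.txt, the example named above.
RECOMMENDED_FILES = ("feed_info.txt",)


def missing_recommended_files(feed_zip: bytes) -> list[str]:
    """Return the recommended file names that are absent from a GTFS zip."""
    with zipfile.ZipFile(io.BytesIO(feed_zip)) as archive:
        present = set(archive.namelist())
    return [name for name in RECOMMENDED_FILES if name not in present]


# Build a tiny in-memory feed that lacks feed_info.txt.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("trips.txt", "trip_id\nt1\n")

for name in missing_recommended_files(buffer.getvalue()):
    print(f"warning: missing_recommended_file: {name}")
```

A check like this only covers points 1 and 2. It says nothing about whether the data matches the real world (point 3).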

Verifying 3 is along the lines of the Grading Scheme.
I think the verification that the dataset accurately represents info accessible in the real world should be done outside the spec. I'd avoid statements such as: "it is recommended to add values 1 and 2 to bikes_allowed if the info is available".
