This is somewhat related to #42. To confirm my understanding of the current method:
Total Vehicles = Count of unique device_id by day from Device_Status with an event_type of reserved, available, or unavailable at least once in that day
Active Vehicles = Count of unique device_id by day from Trips
Is that right? In the interest of comparing notes, SFMTA has taken a few approaches to counting devices so far, depending on the purpose, and each has its pros and cons. We started with a simple count of unique device_ids for each day, but since providers may swap out vehicles mid-day, this ended up being misleading: the daily count can exceed the number of devices on the street at any one time.
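To make sure we are comparing the same thing, here is a minimal sketch of the two daily counts as I understand them. It assumes the Device_Status and Trips records have already been reduced to plain tuples, and that a trip is assigned to the day it starts on; the function names and tuple layout are made up for illustration and are not part of any spec.

```python
from collections import defaultdict
from datetime import datetime

# States that count toward Total Vehicles, per the definition above.
ON_STREET = {"available", "reserved", "unavailable"}


def daily_total_vehicles(status_events):
    """Count unique device_ids per day with at least one on-street event.

    status_events: iterable of (device_id, event_type, timestamp) tuples
    (assumed layout, timestamp is a datetime).
    """
    seen = defaultdict(set)
    for device_id, event_type, ts in status_events:
        if event_type in ON_STREET:
            seen[ts.date()].add(device_id)
    return {day: len(ids) for day, ids in seen.items()}


def daily_active_vehicles(trips):
    """Count unique device_ids per day that appear in Trips.

    trips: iterable of (device_id, start_time) tuples (assumed layout).
    Assumption: a trip belongs to the day on which it starts.
    """
    seen = defaultdict(set)
    for device_id, start in trips:
        seen[start.date()].add(device_id)
    return {day: len(ids) for day, ids in seen.items()}
```

Note that a device swapped out at noon and its replacement both land in the same day's set, which is the effect that made the simple daily count misleading for us.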
Hourly snapshots of on-street devices = Given a "snapshot date/time" (e.g., 4/25 at 8am), total count of unique device_id from Device_Status that have sent an event within 48 hours of the snapshot date/time and with a latest event of available, reserved, or unavailable
Hourly snapshots of revenue devices = Given a "snapshot date/time" (e.g., 4/25 at 8am), total count of unique device_id from Device_Status that have sent an event within 48 hours of the snapshot date/time and with a latest event of available or reserved
The only difference between the two is whether to include the unavailable devices. When measuring cap adherence, we include the unavailable devices (what we've been calling "on-street devices"); when looking at actual service and what's available to customers (what we've been calling "revenue devices"), we exclude them. The 48-hour window is also adjustable.
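A rough sketch of how the two snapshot counts could be computed, in case it helps to compare notes. This is not our production code: it assumes each Device_Status record has been reduced to a (device_id, event_type, timestamp) tuple, it reads "within 48 hours" as the 48 hours leading up to the snapshot, and the name `snapshot_count` is made up for illustration.

```python
from datetime import datetime, timedelta

ON_STREET = {"available", "reserved", "unavailable"}  # cap adherence
REVENUE = {"available", "reserved"}                   # service to customers


def snapshot_count(status_events, snapshot, states, window=timedelta(hours=48)):
    """Count devices whose latest event at or before `snapshot` is in `states`.

    status_events: iterable of (device_id, event_type, timestamp) tuples
    (assumed layout, timestamp is a datetime).
    Assumption: the window looks backward from the snapshot only.
    """
    latest = {}  # device_id -> (timestamp, event_type)
    for device_id, event_type, ts in status_events:
        if ts > snapshot:
            continue  # events after the snapshot are not visible to it
        if device_id not in latest or ts > latest[device_id][0]:
            latest[device_id] = (ts, event_type)

    # A device counts only if its latest event is recent enough and
    # leaves it in one of the requested states.
    return sum(
        1
        for ts, event_type in latest.values()
        if snapshot - ts <= window and event_type in states
    )
```

Calling it once with `ON_STREET` and once with `REVENUE` for the same snapshot time gives the two figures; passing a different `window` changes how long a silent device is still counted.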
We've also been calculating:
Revenue Hours = sum of total time in an available or reserved state, which is derived from Device_Status
Using the hours helps to account for the change in devices over the course of the day. Given different service models, this may or may not matter. Some providers pull devices from service at night, some rebalance and/or replace devices over the course of the day, and others may just leave devices out all day with minimal rebalancing. The hours would account for this so that we can compare and/or aggregate providers in an apples-to-apples way.
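For what it's worth, a minimal sketch of the revenue-hours calculation. It rests on assumptions that are not spelled out above: each Device_Status record is a (device_id, event_type, timestamp) tuple, a device stays in a state until its next event, and a state still open at the end of the reporting period is cut off there. The name `revenue_hours` and the period arguments are illustrative only.

```python
from collections import defaultdict
from datetime import datetime

REVENUE = {"available", "reserved"}


def revenue_hours(status_events, period_start, period_end):
    """Sum the hours devices spend in a revenue state within the period.

    status_events: iterable of (device_id, event_type, timestamp) tuples
    (assumed layout, timestamp is a datetime).
    Assumption: a state lasts until the device's next event, or until
    period_end if there is no later event.
    """
    by_device = defaultdict(list)
    for device_id, event_type, ts in status_events:
        by_device[device_id].append((ts, event_type))

    total_seconds = 0.0
    for events in by_device.values():
        events.sort(key=lambda e: e[0])  # chronological order per device
        for i, (ts, event_type) in enumerate(events):
            if event_type not in REVENUE:
                continue
            next_ts = events[i + 1][0] if i + 1 < len(events) else period_end
            # Clip the interval to the reporting period.
            start = max(ts, period_start)
            end = min(next_ts, period_end)
            if end > start:
                total_seconds += (end - start).total_seconds()
    return total_seconds / 3600.0
```

Because the result is a sum of time rather than a count of IDs, a device swapped out mid-day and its replacement contribute only the hours each was actually in service.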
The snapshot approach is a little more intuitive, but revenue hours are a little more flexible. We're currently working with both.