The following proposed changes to fideslang follow mo


DataUse & DataCategory Simplifications about fideslang HOT 15 CLOSED

iabtechlab commented on May 29, 2024

DataUse & DataCategory Simplifications

from fideslang.

Comments (15)

NevilleS commented on May 29, 2024 1

I like that suggestion, to add user.account.address and user.contact.address as subcategories, so we have those for city/street/etc.


brentonmallen1 commented on May 29, 2024 1

I forgot there was already a user.location


SteveDMurphy commented on May 29, 2024

@mfbrown @NevilleS - @brentonmallen1 suggested adding another layer for address prior to the more detailed values (i.e. street

So instead of user.account.street we would have something like user.account.address.street

Any thoughts here? I can include these changes as well if we want to go that route 👍🏽

The only issue I can think of off the top of my head would be if there was a separate use for something like city or state that didn't align with address somehow?
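To make the extra layer concrete, here is a minimal sketch that treats category keys as plain dotted strings. The helper names `ancestors` and `is_under` are hypothetical and are not part of fideslang; the point is only that the nested key gains an `address` parent that the flat key lacks.

```python
def ancestors(key: str) -> list[str]:
    """Return every parent of a dotted key, nearest parent first."""
    parts = key.split(".")
    return [".".join(parts[:i]) for i in range(len(parts) - 1, 0, -1)]


def is_under(key: str, parent: str) -> bool:
    """True if key equals parent or sits somewhere below it."""
    return key == parent or key.startswith(parent + ".")


flat = "user.account.street"
nested = "user.account.address.street"

print(ancestors(flat))    # ['user.account', 'user']
print(ancestors(nested))  # ['user.account.address', 'user.account', 'user']

# Only the nested form can be targeted by an address-level parent.
print(is_under(nested, "user.account.address"))  # True
print(is_under(flat, "user.account.address"))    # False
```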


brentonmallen1 commented on May 29, 2024

A potential means to abstract that out a bit further would be using “location”, like `user.account.location.city` or `user.account.location.address` along with `user.account.location.city`


NevilleS commented on May 29, 2024

Let's do it. I think it's a good improvement and all those address fields (city/street/postal code/etc) were really looking for a home


cilliankieran commented on May 29, 2024

As part of a large(ish) dataset labeling exercise I've come across the same questions in the last 24 hours.

To list some observations from that dataset labeling exercise and some comments on what's said here:

- contact today encapsulates address labels as well as email and phone_number.
  - The first issue with this is that if you have a field that is an entire address (i.e. street, city, state, zip, country), you can't label it correctly: contact isn't semantically appropriate, and it also encapsulates email and phone.
  - We should introduce user.contact.address and move phone and email up to user.contact.
- We are also missing a building, suite, or apartment # equivalent, so basically some version of unit_number or unit (not very intuitive?).
- On your point about the edge cases of city and state not being part of an address: I would say they should be labeled user.location. However, our current use of location in this context is actually for lat/long data, so I'm not sure this is right. Perhaps it should be location with optional subcategories for location.city, location.state and location.coordinates?
- A final question though: what's the thinking in having both user.account.address and user.contact.address? Do we not risk complicating the understanding for manual labeling?
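The whole-address labeling problem can be sketched by asking for the most specific category that covers every part of a field. This is a rough illustration only: `common_parent` is a hypothetical helper, and the keys are simplified stand-ins for the current and proposed shapes discussed here, not an exact copy of the taxonomy.

```python
def common_parent(keys: list[str]) -> str:
    """Return the deepest dotted prefix shared by all keys."""
    split_keys = [key.split(".") for key in keys]
    shared = []
    for level in zip(*split_keys):
        if len(set(level)) != 1:
            break
        shared.append(level[0])
    return ".".join(shared)


# Simplified stand-in for today's shape: address parts sit directly under contact.
current = ["user.contact.street", "user.contact.city", "user.contact.postal_code"]

# Proposed shape: address parts share an address parent.
proposed = [
    "user.contact.address.street",
    "user.contact.address.city",
    "user.contact.address.postal_code",
]

print(common_parent(current))   # user.contact (also covers email and phone)
print(common_parent(proposed))  # user.contact.address
```

With the proposed shape, a single column holding a full address has a label that does not also imply email or phone.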


brentonmallen1 commented on May 29, 2024

as I understand the new changes, user.account and user.contact are branches that will remain. Currently, they both have repeating categories - i.e. user.account.contact.postal_code and user.contact.postal_code - as a remnant of removing the provided/derived and identifiable/non-identifiable paths.
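One way to see the repetition is to list the leaves that appear under both branches. The sketch below is only illustrative: `repeated_leaves` is a hypothetical helper and the key list is a small made-up sample, not the full taxonomy.

```python
from collections import defaultdict


def repeated_leaves(keys: list[str], branches: list[str]) -> list[str]:
    """Return the sub-paths that occur under more than one branch."""
    seen = defaultdict(set)
    for key in keys:
        for branch in branches:
            prefix = branch + "."
            if key.startswith(prefix):
                seen[key[len(prefix):]].add(branch)
    return sorted(leaf for leaf, found in seen.items() if len(found) > 1)


# Made-up sample of keys for illustration.
sample_keys = [
    "user.account.contact.postal_code",
    "user.account.contact.email",
    "user.contact.postal_code",
    "user.contact.email",
    "user.contact.phone_number",
]

print(repeated_leaves(sample_keys, ["user.account.contact", "user.contact"]))
# ['email', 'postal_code']
```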


cilliankieran commented on May 29, 2024

Yes, I think the wrinkle is probably from moving account --> user.account, as in the prior structure account represented something not owned by a user directly (pretty distinct) and I'm wondering if nesting these is now creating a doubling of these branches unnecessarily. I'm going to go back and look at the ISO spec and get back on this later today...


brentonmallen1 commented on May 29, 2024

just for some perspective on how these choices can impact things in a practical example, here's a label mapping I'm attempting to do for a separate effort.

https://docs.google.com/spreadsheets/d/1myvSpNPXT5U78B5XZb5p9mFAu1sIJqwsMs6vW36xkE8/edit#gid=0


cilliankieran commented on May 29, 2024

That's helpful @brentonmallen1. I think it's worth noting that this decision impacts both the classifier work (which is how you're thinking about it here) and the manual labeling, including the cognitive load expected of any dev trying to manually annotate something.

As a reminder for all, these are the broad distinctions (based on ISO 19944) of the difference between account and user:

Account data
A class of data specific to each customer of the service that is required to sign up for, purchase or administer the service. This data includes information such as names, addresses, payment information, etc. Account data is generally under the control of the service provider.

User data
This includes content directly created by users and all data, including all text, sound, software or image files that customers provide to the service, or are provided to the cloud service on behalf of customers, through the capabilities of the service or application. This includes directly provided or derived data.

I understand why we agreed to move account under user. I just want to sanity check whether we understand the impact on the thinking of a developer using the tool: I would have to understand the distinction between a user's contact information and a user's account-related contact information.

Fwiw, to double check myself on this I've asked Damien (aka my evil twin brother, Chief Privacy Officer at Twitter and advisor to Ethyca) to play out a thought experiment and answer for us whether the distinction of account related to any specific single element of personal data would matter in policy enforcement, risk evaluation or mapping. He said he'd get back to me on this overnight....


NevilleS commented on May 29, 2024

Hey all - I followed up with @cilliankieran separately as well and we agree with the general recommendation here: let's remove account entirely. This removes the duplication of ...email as a subcategory and will force us to declare "Account" data with a separate dimension like data_use, or a different kind of grammar like an "attribute", etc.

So to be laser-clear, I think the list of changes here would be:

1. Remove identifiable, nonidentifiable, derived, and provided subcategories
   - Remove the parent categories, combining the subcategories into one
2. Rename provide.system to provide.service
3. Remove account entirely, and all subcategories (e.g. account.contact)
4. Add user.contact.address as a parent category for the address-related categories like ...city, ...street, etc.

Do those four points sound right @SteveDMurphy?
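As a rough sketch of what those four changes could mean for existing labels, the snippet below rewrites old dotted keys into the proposed shape. Everything in it is an assumption for illustration: the function names, the example keys, and the set of address leaves are hypothetical, and returning `None` for account keys is just one reading of "remove account entirely" (such fields would need relabeling by hand).

```python
from typing import Optional

# Segments dropped by point 1.
REMOVED_SEGMENTS = {"identifiable", "nonidentifiable", "derived", "provided"}

# Illustrative set of address-related leaves for point 4 (assumed, not exhaustive).
ADDRESS_LEAVES = {"street", "city", "state", "postal_code", "country"}


def migrate_category(key: str) -> Optional[str]:
    """Rewrite an old data category key, or return None if it was removed."""
    parts = [p for p in key.split(".") if p not in REMOVED_SEGMENTS]
    if "account" in parts:
        return None  # point 3: account and its subcategories no longer exist
    is_contact_leaf = parts[:2] == ["user", "contact"] and len(parts) == 3
    if is_contact_leaf and parts[2] in ADDRESS_LEAVES:
        parts.insert(2, "address")  # point 4: nest under user.contact.address
    return ".".join(parts)


def migrate_data_use(key: str) -> str:
    """Point 2: rename provide.system (and anything below it) to provide.service."""
    old, new = "provide.system", "provide.service"
    if key == old or key.startswith(old + "."):
        return new + key[len(old):]
    return key


print(migrate_category("user.provided.identifiable.contact.city"))
# user.contact.address.city
print(migrate_category("user.provided.identifiable.contact.email"))
# user.contact.email
print(migrate_category("account.contact.email"))
# None
print(migrate_data_use("provide.system"))
# provide.service
```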


SteveDMurphy commented on May 29, 2024

All good to me @NevilleS, it certainly feels like this better achieves the goal of reducing hesitancy when annotating a dataset while providing more value as described by @brentonmallen1 - thanks for driving this to a conclusion and for the thoughts and feedback from everyone!


NevilleS commented on May 29, 2024

ship it!


brentonmallen1 commented on May 29, 2024

would location be a more generic parent than address?


SteveDMurphy commented on May 29, 2024

> would location be a more generic parent than address?

I feel like your original suggestion of address is pretty comprehensive for the purposes here (street, city, etc.). I think I remember someone mentioning in a live meeting that location was generally more in line with point or coordinate data as well.
