Comments (15)
I like that suggestion, to add `user.account.address` and `user.contact.address` as subcategories, so we have those for city/street/etc.
from fideslang.
I forgot there was already a user.location
from fideslang.
@mfbrown @NevilleS - @brentonmallen1 suggested adding another layer for `address` prior to the more detailed values (i.e. `street`).
So instead of `user.account.street` we would have something like `user.account.address.street`.
Any thoughts here? I can include these changes as well if we want to go that route 👍🏽
The only issue I can think of off the top of my head would be if there was a separate use for something like `city` or `state` that didn't align with `address` somehow?
from fideslang.
Let's do it. I think it's a good improvement and all those address fields (city/street/postal code/etc) were really looking for a home
from fideslang.
As part of a large(ish) dataset labeling exercise I've come across the same questions in the last 24 hours.
To list some observations from that dataset labeling exercise and some comments on what's said here:
- `contact` today encapsulates address labels as well as `email` and `phone_number`
- The first issue with this is that if you have a field that is an entire address, i.e. `street`, `city`, `state`, `zip`, `country`, you can't label it correctly - `contact` isn't semantically appropriate and it also encapsulates `email` and `phone`.
- We should introduce `user.contact.address` and move `phone` and `email` up to `user.contact`
- We are also missing building, suite or apartment # equivalent. So basically some version of `unit_number` or `unit` (not very intuitive?)
- On your point of the edge cases of `city` and `state` not part of an address. I would say they should be labeled `user.location`, however our current use of location in this context is actually for lat/long data, so not sure this is right. Perhaps it should be `location` with optional sub categories for `location.city`, `location.state` and `location.coordinates`?
- A final question though, what's the thinking in having `user.account.address` and `user.contact.address` - do we not risk complicating the understanding for manual labeling?
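To make the shape of this proposal easier to picture, here is a minimal Python sketch. It does not use the fideslang package; the key set, the `user.location.*` placement, and the helpers `is_under` and `children` are all hypothetical names chosen for illustration, assuming categories are plain dotted strings.

```python
# Hypothetical dotted keys illustrating the proposal above.
# This is NOT the actual fideslang taxonomy, only an assumed layout.
PROPOSED_KEYS = {
    "user.contact",
    "user.contact.email",
    "user.contact.phone_number",
    "user.contact.address",
    "user.contact.address.street",
    "user.contact.address.city",
    "user.contact.address.state",
    "user.contact.address.postal_code",
    "user.contact.address.country",
    "user.contact.address.unit",
    "user.location",
    "user.location.city",
    "user.location.state",
    "user.location.coordinates",
}


def is_under(key: str, ancestor: str) -> bool:
    """Return True if `key` equals `ancestor` or is nested beneath it."""
    return key == ancestor or key.startswith(ancestor + ".")


def children(parent: str, keys: set[str]) -> list[str]:
    """Return the direct children of `parent`, sorted by name."""
    depth = parent.count(".") + 1
    return sorted(k for k in keys if is_under(k, parent) and k.count(".") == depth)
```

With this layout a field holding an entire address can be labeled `user.contact.address` without also implying email or phone, and `children("user.contact", PROPOSED_KEYS)` lists `address`, `email` and `phone_number` as siblings.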
from fideslang.
as I understand the new changes, `user.account` and `user.contact` are branches that will remain. Currently, they both have repeating categories - i.e. `user.account.contact.postal_code` and `user.contact.postal_code` - as a remnant of removing the `provided/derived` and `identifiable/non-identifiable` paths.
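As a rough illustration of the repetition described here, the sketch below groups keys by their final segment and reports any leaf that shows up under more than one parent. The key list and the `repeated_leaves` helper are hypothetical, written in plain Python rather than against the fideslang package.

```python
from collections import defaultdict

# Hypothetical keys modelled on the duplication described above.
KEYS = [
    "user.account.contact.postal_code",
    "user.account.contact.email",
    "user.contact.postal_code",
    "user.contact.email",
    "user.contact.phone_number",
]


def repeated_leaves(keys: list[str]) -> dict[str, list[str]]:
    """Map each leaf name that appears in more than one key to its full keys."""
    by_leaf: dict[str, list[str]] = defaultdict(list)
    for key in keys:
        # The leaf is everything after the last dot.
        by_leaf[key.rsplit(".", 1)[-1]].append(key)
    return {leaf: sorted(full) for leaf, full in by_leaf.items() if len(full) > 1}
```

Running `repeated_leaves(KEYS)` on this sample flags `postal_code` and `email` but not `phone_number`, which is the kind of doubling an annotator would otherwise have to choose between by hand.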
from fideslang.
Yes, I think the wrinkle is probably from moving `account` --> `user.account`, as in the prior structure account represented something not owned by a user directly (pretty distinct) and I'm wondering if nesting these is now creating a doubling of these branches unnecessarily. I'm going to go back and look at the ISO spec and get back on this later today...
from fideslang.
just for some perspective on how these choices can impact things in a practical example, here's a label mapping I'm attempting to do for a separate effort.
https://docs.google.com/spreadsheets/d/1myvSpNPXT5U78B5XZb5p9mFAu1sIJqwsMs6vW36xkE8/edit#gid=0
from fideslang.
That's helpful @brentonmallen1. I think it's worth noting that this decision impacts both the classifier work (so how you're thinking about it here) and the manual labeling, including the cognitive load expected of any dev trying to manually annotate something.
As a reminder for all, these are the broad distinctions (based on ISO 19944) between `account` and `user`:
Account data
A class of data specific to each customer of the service that is required to sign up for, purchase or administer the service. This data includes information such as names, addresses, payment information, etc. Account data is generally under the control of the service provider.
User data
This includes content directly created by users and all data, including all text, sound, software or image files that customers provide to the service, or are provided to the cloud service on behalf of customers, through the capabilities of the service or application. This includes directly provided or derived data.
I understand why we agreed to move `account` under `user`; I just want to sanity check that we understand the impact on the thinking of a developer using the tool - I would have to understand the distinction between a user's contact information and a user's account-related contact information.
Fwiw, to double check myself on this I've asked Damien (aka my evil twin brother, Chief Privacy Officer at Twitter and advisor to Ethyca) to play out a thought experiment and answer for us whether the distinction of account related to any specific single element of personal data would matter either in policy enforcement, risk evaluation or mapping. He said he'd get back to me on this overnight...
from fideslang.
Hey all - I followed up with @cilliankieran separately as well and we agree with the general recommendation here: let's remove `account` entirely. This removes the duplication of `...email` as a subcategory and will force us to declare "Account" data with a separate dimension like `data_use`, or a different kind of grammar like an "attribute", etc.
So to be laser-clear, I think the list of changes here would be:
- Remove `identifiable`, `nonidentifiable`, `derived`, and `provided` subcategories
  - Remove the parent categories, combining the subcategories into one
- Rename `provide.system` to `provide.service`
- Remove `account` entirely, and all subcategories (e.g. `account.contact`)
- Add `user.contact.address` as a parent category for the address-related categories like `...city`, `...street`, etc.
Do those four points sound right @SteveDMurphy ?
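For anyone trying to picture what the four changes listed above would mean for existing labels, here is a minimal sketch of a key translation in plain Python. Everything in it is an assumption made for illustration: the `migrate` helper, the old-style key shapes it accepts, and the set of address leaves beyond `city` and `street` are hypothetical and not taken from the fideslang codebase.

```python
from typing import Optional

# Path segments dropped from category keys, per the first bullet above.
REMOVED_SEGMENTS = {"identifiable", "nonidentifiable", "derived", "provided"}
# Assumed address-related leaves; the thread names only city and street explicitly.
ADDRESS_LEAVES = {"city", "street", "state", "postal_code", "country"}


def migrate(key: str) -> Optional[str]:
    """Translate an old-style key to the proposed layout, or None if it is dropped."""
    if key == "provide.system":
        return "provide.service"
    parts = [p for p in key.split(".") if p not in REMOVED_SEGMENTS]
    # `account` is removed whether it sits at the top level or under `user`.
    if parts[0] == "account" or parts[:2] == ["user", "account"]:
        return None
    # Address fields directly under user.contact gain an `address` parent.
    if parts[:2] == ["user", "contact"] and len(parts) == 3 and parts[2] in ADDRESS_LEAVES:
        parts.insert(2, "address")
    return ".".join(parts)
```

Under these assumptions a key such as `user.provided.identifiable.contact.city` would come out as `user.contact.address.city`, while anything under `account` returns `None` and would need to be expressed through another dimension such as `data_use`.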
from fideslang.
All good to me @NevilleS, it certainly feels like this better achieves the goal of reducing hesitancy when annotating a dataset while providing more value as described by @brentonmallen1 - thanks for driving this to a conclusion and for the thoughts and feedback from everyone!
from fideslang.
ship it!
from fideslang.
would `location` be a more generic parent than `address`?
from fideslang.
> would `location` be a more generic parent than `address`?

I feel like your original suggestion of `address` is pretty comprehensive for the purposes here (`street`, `city`, etc.). I think I remember from a live meeting someone mentioning `location` was generally more in line with point or coordinate data as well
from fideslang.