
goggles-quickstart's Introduction

Brave Search Goggles

Search · Open Ranking · Algorithmic Transparency


Goggles enable anyone, be it individuals or a community, to alter the ranking of Brave Search by using a set of instructions (rules and filters). Anyone can create, apply, or extend a Goggle. Essentially, Goggles act as a custom re-ranking on top of Brave’s search index.

This means that, instead of a single ranking, Brave Search can offer an almost limitless number of ranking options. While Brave Search doesn't have editorial biases, all search engines have some level of intrinsic bias due to the underlying data (the Web) and the algorithmic choices about which features to select. Goggles allow users to counter such intrinsic biases in the ranking, and also to create custom search experiences that couldn’t be covered by an all-purpose search engine such as Brave Search.

Goggles are owned solely by their creators and, if public, can be used by anyone on top of the Brave Search index.

Want to start building your own goggles, or fork or extend existing ones? Check out our Goggles quickstart guide.

Want to learn more about the motivation for Goggles? Check out the Goggles whitepaper.

Privacy considerations

To apply a Goggle, a user needs to pass the Goggle URL together with a query. We treat Goggle URLs with the same strict provisions we use to handle other data elements (like IP or geo-coordinates) that could serve as identifiers. Brave does not track the queries of any of its users, and searching with Goggles is no exception. However, note that if a Goggle is used by only one person, or by a very small number of people, the Goggle URL could serve as an identifier and enable the creation of a profile of the user’s queries while using that particular Goggle.

Learn more about our privacy policy.

goggles-quickstart's People

Contributors

remusao, thypon


goggles-quickstart's Issues

Cannot add separate files from the same GitLab snippet

Steps to reproduce

Result

When you try to submit the second file, Brave just reloads the first one.

Expected result

Since it is possible to have multiple files in the same GitLab snippet, Brave should look at the full URL for each file instead of relying only on the snippet ID.

Real number boost amounts

Hi,
I've started using Goggles and they exceed my expectations. They really give me a way to escape what is imposed by the "one size fits all" approach. Thanks.

In applying reranking offsets I'd like to use real numbers instead of integers.
For example: "/SomeDetail$boost=3.54,"
Instead of "/SomeDetail$boost=4,"

I tried that and the syntax checker doesn't accept it.

I suggest that boost or demotion amounts of real numbers would be useful.

My use of that might not need more than 3 extra digits of precision, like 3.141.

Feature Request: Stackranking results based on recent dates/views

When there are different results displaying for the same websites, I am getting searches that display older articles at the top (1 week or more, sometimes months ago), and newer articles from the past day or so at the bottom.

For example, $site=u.today,boost=4 would show articles with the following dates (in order):

  • 3 weeks ago
  • 2 weeks ago
  • 1 day ago
  • 2 days ago

If the goggle has just one website, this wouldn't be as much of a problem. But when I have other sites with the same boost ranking, it will also show older results for them as well, so I'll have to scroll past ~8 results before getting the more recent articles from those same websites.

If there were an attribute to specify how the results for individual sites are displayed (recent date vs views/hits), it would help to more finely tune search results.

For more news-oriented goggles, results could be based on most recent dates. But it would also give the option to specify more static websites like wiki or informational sites, based on views/hits.

No avatar mark on search result

I found one case (there could be more) where the avatar mark isn't shown next to the search result.
Search keyword: test
One of the results was a pl.pornhub (NSFW) page, which wasn't marked.

Goggle:

$discard

! pl.(wikipedia)
|http://pl.$boost=1
|https://pl.$boost=1

Feature Requests: Goggles DSL Implementation

I’m really impressed with the new Goggles feature and believe it to be a very powerful tool to refine search results. I’m currently working on a few and have published one for public use. I know Goggles are still new and not all intended features are working or implemented, but I have found two shortcomings that are hindering me from creating a more complex and refined Goggle.

The first is that I can’t specifically target the domain name when discarding websites; instead, the code looks at everything in the URL, including the subdirectory. There are websites with specific wording in their domain that I would like to discard, but I don’t want to remove dictionary results that also have that wording in their URL subdirectory.

For example, instead of individually discarding every website with the word “Hollywood” in its domain, I would like to be able to discard every website with that keyword in one line of code without the code also targeting the subdirectory in the URL. This way results from news outlets and dictionaries that have that same keyword in their subdirectory will remain.

Secondly, I am aware that when writing a Goggle it’s possible to have lines of code that conflict with one another. Although not intended, I understand people want their code/Goggle to work a certain way. In such cases I believe it would be nice to have an exception rule.

For example, if the code was:

$downrank=3,site=example.abc
/posts/$boost=3

There could be an exception that would apply to “example.abc” where they have a subdirectory of /posts/ (example.abc/posts/)–allowing for only those specific results to be boosted despite the website itself being downranked.

I hope my feedback helps improve this already powerful technology. And thank you for all the great innovation you’ve brought us, such as Brave Goggles–which I think have become one of my favorite features!

Limits to boost amount

I explored what values I can submit for a boost amount.

I can submit Goggles with boosts of 20 and 100 to Brave, and both are accepted, BUT the results for the two appear identical, implying that neither value is processed as submitted. The pop-up inclusion explainer on a search page indicates that a boost of 20 or 100 was used.

The Gearchy / Goggledy REPL at https://gearchy.wolf.gdn/repl/, however, accepts boost values up to 10 but not higher.

Is 10 the actual limit?

Should I be allowed to submit Goggles with higher values?

Goggle can bypass the Safe search strict restriction

I noticed that when Goggle (example) is active, the Safe search restriction doesn't work properly.
On strict setting (without Goggle) Safe search hides the pl.pornhub result (searched by the title found in an example).
But with Goggle active, that link was still visible despite setting Safe search to strict.

What does ^ match?

I've tried to read quickstart.goggle but I'm not sure I understand what is meant by:

! The special character '^' can be used to indicate that an URL delimiter such
! as '/' or end-of-url can be matched (note: the number of carets allowed in a
! given instruction is limited):
/this/is/a/pattern^

What is an example of matched URL, and one of not matched? And how is it different from |?
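
One possible reading of the quoted comment, assuming Goggles patterns follow Adblock-style filter syntax (an assumption, not something the quoted text confirms): `^` stands for a single separator character or the end of the URL, while `|` anchors the pattern to the start or end of the URL. The Python sketch below translates a pattern into a regular expression under that reading. The separator set and the names `pattern_to_regex` and `matches` are illustrative only.

```python
import re

# Assumed separator: any character that is not a letter, digit, or one of
# "_-.%", or the end of the URL (borrowed from Adblock-style filters; the
# real Goggles matcher may behave differently).
SEPARATOR = r"(?:[^\w\-.%]|$)"


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a Goggles-like URL pattern into a compiled regex (sketch)."""
    left_anchor = pattern.startswith("|")
    right_anchor = len(pattern) > 1 and pattern.endswith("|")
    start = 1 if left_anchor else 0
    end = len(pattern) - 1 if right_anchor else len(pattern)

    parts = []
    for ch in pattern[start:end]:
        if ch == "^":
            parts.append(SEPARATOR)      # delimiter or end-of-URL
        elif ch == "*":
            parts.append(".*")           # wildcard
        else:
            parts.append(re.escape(ch))  # literal character

    regex = "".join(parts)
    if left_anchor:
        regex = "^" + regex
    if right_anchor:
        regex = regex + "$"
    return re.compile(regex)


def matches(pattern: str, url: str) -> bool:
    """Return True if the pattern matches anywhere in the URL."""
    return pattern_to_regex(pattern).search(url) is not None
```

Under these assumptions, `/this/is/a/pattern^` would match `https://example.org/this/is/a/pattern` (end of URL) and `https://example.org/this/is/a/pattern/more` (followed by `/`), but not `https://example.org/this/is/a/patterns`. By contrast, `|https://pl.` only constrains where the match starts.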

`version` identifier as metadata field

! version: 1.0.0

Could be a helpful addition to the default goggle metadata fields.

Used for what?

  • To track and identify different versions of a goggle
  • Perhaps Brave Search could regularly check submitted goggle URLs, if a version change is identified, the goggle is automatically refreshed without the need of resubmitting it on every update by the user
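
A minimal sketch of how such a field could be consumed, assuming metadata lines have the form `! key: value` as in the example above. The helper names `parse_metadata`, `version_tuple`, and `needs_refresh` are hypothetical, and the comparison assumes plain dotted integers such as 1.0.0 rather than any versioning scheme Brave has specified.

```python
def parse_metadata(source: str) -> dict[str, str]:
    """Collect `! key: value` lines from a goggle source into a dict."""
    meta = {}
    for line in source.splitlines():
        line = line.strip()
        if not line.startswith("!") or ":" not in line:
            continue
        key, _, value = line[1:].partition(":")
        key = key.strip()
        # Keys with spaces are treated as ordinary comments, not metadata.
        if key and " " not in key:
            meta[key] = value.strip()
    return meta


def version_tuple(version: str) -> tuple[int, ...]:
    """Turn '1.0.0' into (1, 0, 0); assumes dotted integers only."""
    return tuple(int(part) for part in version.split("."))


def needs_refresh(cached_source: str, fetched_source: str) -> bool:
    """Return True if the fetched goggle declares a newer version."""
    old = parse_metadata(cached_source).get("version")
    new = parse_metadata(fetched_source).get("version")
    if old is None or new is None:
        return False  # no version declared, nothing to compare
    return version_tuple(new) > version_tuple(old)
```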

Feature Request: Personal Goggle

Hey,
I like the ideas of goggles, but I currently find the hurdle for creating a custom search experience too high.

Proposal

A personal Goggle that resides inside the browser state (or is synced with other devices) and can be updated without ever touching it directly or writing goggle syntax.

Example

  1. You search for something
  2. You see results from a news outlet you don't trust.
  3. You can downrank it right there next to the search result, adding a downrank entry to your personal goggle.

That's it! No need to even know about goggles.

UX

For improved UX there could be several Buttons like:
Hide example.org from my results. → add $discard,site=example.org to personal goggle.
Show more results from example.org → add $boost,site=example.org to personal goggle.
Show less results from example.org → add $downrank,site=example.org to personal goggle.
(And a lot more, but I think I made my point.)
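
The button-to-instruction mapping above can be sketched in a few lines of Python. This is only an illustration of the proposal: the action names, `ACTION_TEMPLATES`, and `add_rule` are hypothetical, and the rule that a later click replaces an earlier entry for the same site is an assumption.

```python
# Hypothetical mapping from a UI action to the instruction it would add.
ACTION_TEMPLATES = {
    "hide": "$discard,site={site}",
    "more": "$boost,site={site}",
    "less": "$downrank,site={site}",
}


def add_rule(goggle_lines: list[str], action: str, site: str) -> list[str]:
    """Return a new rule list with the instruction for `action` on `site`."""
    new_rule = ACTION_TEMPLATES[action].format(site=site)
    suffix = f",site={site}"
    # Drop any earlier rule for the same site so the latest click wins.
    kept = [line for line in goggle_lines if not line.endswith(suffix)]
    return kept + [new_rule]
```

For instance, clicking "Hide example.org from my results" on an empty personal goggle would yield the single line `$discard,site=example.org`.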

Even further, it could suggest other discoverable goggles.

This site appears in the Copycats removal Goggle. Do you want to add that Goggle?

Benefits

  1. Easy to discover, understand and use by new or less tech-savvy users.
  2. Less friction to maintain your personal goggle, even if you are proficient.
  3. Improved privacy, as the goggle is not visible for everyone who knows the URL.

Other sites like YouTube already offer options to click on the three dots next to a video and mark it as Not interested or Don't recommend videos from this channel. But they don't allow you to see, modify, or share your profile, or enrich it with profiles/goggles from others.

Is there no wildcard for something like this?

$discard,site=pinterest.es
$discard,site=pinterest.fr
$discard,site=pinterest.ie
$discard,site=pinterest.it
$discard,site=pinterest.nz
$discard,site=pinterest.ph
$discard,site=pinterest.pt
$discard,site=pinterest.ru

Something like
$discard,site=pinterest.* or similar?
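
Whether or not the DSL offers such a wildcard, one workaround is to generate the repetitive lines with a small script. The Python sketch below assumes you maintain the list of TLDs yourself and only emits the `$discard,site=` form shown above. The function name `discard_rules` is illustrative.

```python
def discard_rules(name: str, tlds: list[str]) -> str:
    """Build one `$discard,site=` line per TLD, skipping duplicates."""
    seen = set()
    lines = []
    for tld in tlds:
        tld = tld.strip().lstrip(".").lower()
        if not tld or tld in seen:
            continue
        seen.add(tld)
        lines.append(f"$discard,site={name}.{tld}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Paste the printed lines into the goggle file.
    print(discard_rules("pinterest", ["es", "fr", "ie", "it", "nz", "ph", "pt", "ru"]))
```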

Documentation on boost/discard numerical importance

My apologies if I've missed it in the docs, but does ranking something with a higher number give it more importance? It's not clear how $boost=3 interacts with $boost=1. Will $boost=3 items show up above $boost=1 items?

No warnings for invalid syntax submissions

I noticed that invalid syntax gets silently submitted without any warnings.

Would be great to get notified about these syntax issues in submitted goggles.

This way, authors could fix them in their source files.

That way, the goggle source code displayed on the Brave Search about page would match the hosted code.

Here are some of the issues which don't result in warnings:

Invalid Value

Options

$discard=4,site=msn.com

https://github.com/Killthenews/brave/blob/de522bcc456646547f815a3de9ad4fcfa3494f35/news#L11-L18

Metadata

! public: foo
  • Expected behavior: Failing submission
  • Actual behavior: Gets submitted as private goggle

Illegal Range Value

Same for out of scope values for the action options:

$boost=30,site=github.com

https://github.com/9ktz/googles-technical/blob/cb75f9a9e4a9ce2f1d0a3304dbfddc57674fb96d/github.google#L8-L10

Multiple $'s

Same for having multiple $ inside the options part of the instructions.

bill_id=201720180SB1121$inurl,$boost=8

https://github.com/clening/DataProtectionGoggle/blob/19a3f9fd4bc9543c8f45ebfe4d0780826dc0ac18/dataprotection.goggle#L85
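
Until submission reports such problems, the option cases above could be caught with a small local check. The Python sketch below is not Brave's validator. It assumes instructions have the form `pattern$option,option=value`, that `discard` takes no value, that a pattern never contains `$`, and that boost and downrank values must be integers from 1 to 10 (the upper bound is an assumption and is exposed as a parameter). `lint_goggle` is a hypothetical name, and the metadata case is not covered.

```python
def lint_goggle(source: str, max_value: int = 10) -> list[str]:
    """Return a list of warnings for suspicious instruction lines (sketch)."""
    warnings = []
    for number, raw in enumerate(source.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("!"):
            continue  # blank line, comment, or metadata
        if "$" not in line:
            continue  # bare pattern without options
        if line.count("$") > 1:
            warnings.append(f"line {number}: more than one '$'")
            continue
        options = line.split("$", 1)[1]
        for option in options.split(","):
            key, sep, value = option.partition("=")
            if key == "discard" and sep:
                warnings.append(f"line {number}: 'discard' takes no value")
            elif key in ("boost", "downrank") and sep:
                # Assumed valid range: integers 1..max_value.
                if not value.isdecimal() or not 1 <= int(value) <= max_value:
                    warnings.append(
                        f"line {number}: '{key}={value}' is not an integer "
                        f"in 1..{max_value}"
                    )
    return warnings
```

Running it over the three examples above would produce one warning per line, while a line such as `$boost=3,site=example.org` would pass silently.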


Feature Request: Better clarity on error messages

I've recently created a Goggle and noticed that the error messages displayed are often unhelpful to a user (especially to someone who isn't a software engineer). Case in point:

Line 70: Invalid character in site value

Here's my line 70: $boost=10,$site=^ca.gov

Line 77: There was an error processing your Goggle

$boost=10,$site=lis.virginia.gov

It would be helpful if errors were a bit more verbose - for example, what character is making Goggles sad? For Line 77, is there an error code from the website, or is this a formatting issue?

Suggestion: Please consider adding more verbose error handling.

Brave Search doesn't know what a Brave Goggle is

Thank you for offering this service for free. It's exciting to be able to finally have a fully customized search engine.

I've found out something that you might want to fix, though:

brave

Also, is there a page with the full list of URL parameters that Brave Search supports? I'd like to create separate browser shortcuts for searching for pages in different regions.

Support Gitea instances

Please add the ability to host a goggle on a Gitea instance, so that authors are not limited to GitHub and GitLab.

Gitea is a self-hosted, open source alternative to GitHub and GitLab.


Feature Request: Please add date instructions to allow for timeliness bias

Adding the ability to weight instructions using time-based mechanics would be a tremendous addition to Brave Search Goggles. For example, I may want to flag discussions about API functionality from the past 12 months above standard documentation in case there are any concerns I may need to know.

Is there a generic `downrank` (like the generic `discard`)?

Instead of discarding non-matched sites entirely (which could exclude useful results too), I tried downranking them all with a

$downrank

statement. This works for the $discard instruction, as shown in some of the example goggles and mentioned here. But based on some experimentation, this doesn't really seem to work when it's the downrank instruction being used this way. As far as I can tell, this instruction is just being silently ignored.

Is there a generic downrank feature that I'm just not seeing the effect of in my tests? Or is it not implemented? If it's not yet implemented, please consider this a feature request for a catch-all generic downrank.

Behavior of default option values: `inurl`

Quoting the quickstart:

By default any instruction will apply to a URL

Does this mean that the following are the same?

pattern
pattern$inurl

Is there a way to not match the URL and, for example, match only the content?

pattern$incontent

Or is this still equal to:

pattern$incontent,inurl
