quantified-uncertainty / metaforecast Goto Github PK

Fetch forecasts from prediction markets/forecasting platforms to make them searchable. Integrate these forecasts into other services.

Home Page: https://metaforecast.org/

License: MIT License

JavaScript 1.89% Shell 1.19% Procfile 0.02% TypeScript 95.99% CSS 0.30% HCL 0.60%

forecasts prediction-markets forecasting-platforms

metaforecast's Introduction

What this is

Metaforecast is a search engine for probabilities from various prediction markes and forecasting platforms. Try searching "Trump", "China" or "Semiconductors".

This repository includes the source code for both the website and the library that fetches forecasts needed to replace them. We also aim to provide tooling to integrate metaforecast with other services.

How to run

1. Download this repository

$ git clone https://github.com/quantified-uncertainty/metaforecast
$ cd metaforecast
$ npm install

2. Set up a database and environment variables

You'll need a PostgreSQL instance, either local (see https://www.postgresql.org/download/) or in the cloud (for example, you can spin one up on https://www.digitalocean.com/products/managed-databases-postgresql or https://supabase.com/).

Environment can be set up with an .env file. You'll need to configure at least DIGITALOCEAN_POSTGRES.

See ./docs/configuration.md for details.

3. Actually run

After installing and building (npm run build) the application, npm run cli starts a local CLI which presents the user with choices. If you would like to skip that step, use the option name instead, e.g., npm run cli wildeford.

npm run next-dev starts a Next.js dev server with the website on http://localhost:3000.

So overall this would look like

$ git clone https://github.com/quantified-uncertainty/metaforecast
$ cd metaforecast
$ npm install
$ npm run build
$ npm run cli
$ npm run next-dev

4. Example: download the metaforecasts database

$ git clone https://github.com/quantified-uncertainty/metaforecast
$ cd metaforecast
$ npm install
$ node src/backend/manual/manualDownload.js

Integrations

Metaforecast has been integrated into:

Twitter, using our @metaforecast bot
Global Guessing, which integrates our dashboards
Fletcher, a popular Discord bot. You can invoke metaforecast with !metaforecast search-term
Elicit, which uses GPT-3 to deliver vastly superior semantic search (as opposed to fuzzy word matching). If you have access to the Elicit IDE, you can use the action "Search Metaforecast database. This is not being updated regularly.

We also provide a public database, which can be accessed with a script similar to this one. We are also open to integrating our Elasticsearch instance with other trusted services (in addition to Fletcher.)

In general, if you want to integrate metaforecast into your service, we want to hear from you.

Code layout

frontend code is in src/pages/, src/web/ and in a few other places which are required by Next.js (e.g. root-level configs in postcss.config.js and tailwind.config.js)
various backend code is in src/backend/
fetching libraries for various platforms is in src/backend/platforms/
rudimentary documentation is in docs/

What are "stars" and how are they computed

Star ratings—e.g. ★★★☆☆—are an indicator of the quality of an aggregate forecast for a question. These ratings currently try to reflect my own best judgment and the best judgment of forecasting experts I've asked, based on our collective experience forecasting on these platforms. Thus, stars have a strong subjective component which could be formalized and refined in the future. You can see the code used to decide how many stars a forecast should get by looking at the function calculateStars() in the files for every platform here.

With regards the quality, I am most uncertain about Smarkets, Hypermind, Ladbrokes and WilliamHill, as I haven't used them as much. Also note that, whatever other redeeming features they might have, prediction markets rarely go above 95% or below 5%.

Tech stack

Overall, the services which we use are:

Elasticsearch for search
Vercel for website deployment
Heroku for background jobs, e.g. fetching new forecasts
Postgres on DigitalOcean for database

Various notes

This repository is released under the MIT license. See LICENSE.md
Commits follow conventional commits
For elicit and metaculus, this library currently filters out questions with <10 predictions.
The database is updated once a day, at 3:00 AM UTC, with the command ts-node -T src/backend/flow/doEverythingForScheduler.ts. The frontpage is updated after that, at 6:00 AM UTC with the command ts-node -T src/backend/index.ts frontpage. It's possible that either of these two operations makes the webpage briefly go down.

To do

metaforecast's People

Contributors

Stargazers

Watchers

Forkers

goncaloperes greenboat11 berekuk nikosbosse uvafan

metaforecast's Issues

UX changes to dashboards

image

Hide the form from the dashboard page. Maybe add a button saying "create your own dashboard"
Have a sentence saying that dashboards cannot be changed after they are created.

Taken from quantified-uncertainty/metaforecast-frontend#12

Try Terraform

Arguments in favor of Terraform:

terraform would help with reproducible infra
- terraform configs automatically document how everything is set up and are always up-to-date since they serve a a single source of truth
it would prevent most wire-everything-up-properly issues (e.g. forgetting to set up an env variable on Heroku or Vercel)
it would help with setting up dev environments from scratch too

I don't want to spend too much time on this, but I think it's worth an hour or two. Going to set up a small proof-of-concept config for one aspect, e.g. Vercel, and then continue to improve from there from time to time.

Save resolutions

Write parsers to also fetch resolved questions. In the case of e.g., Metaculus, this is trivial. In the case of Good Judgment Open, this would require writing new parsers.

This turns out to be important, because we'll want to look at resolutions to compare different platforms in the future.

Automatically post new questions on twitter?

Nathan Young mentions:

I think there would be a lot more @metaculus chat on Twitter if we all were automatically posting each question to a thread and each thread was put in a personal megathread

Sounds plausible, seems doable. This would just require there to be a database with new questions which the twitter bot could fetch.

Api not working

Right now, https://metaforecast.org/api/questions returns

{
  "errorMessage": "Response payload size exceeded maximum allowed payload size (6291556 bytes).",
  "errorType": "Function.ResponseSizeTooLarge"
}

My guess is that this issue is kinda unavoidable, because the api fetches stuff on the server side. I'm thinking we might want to go back to api.metaforecast.org. Not sure how using graphql changes this.

Good Judgment Open typo

See: 8553dfa#diff-ecb10100efafa655c8e01dd6c70f025003b4a2f1934f93252c26ac21e57a7631R41

Build frontpage table by storing ids only

Right now frontpage table is filled with huge json fields.

It'd be better to keep a simple table with 50 ids, and then join it with questions in getFrontpage. I'm pretty sure the performance won't degrade.

(And frontpage_full is entirely unnecessary, since the same effect can be achieved with SELECT * FROM questions).

PS: This is not really important, but it's part of the work I'd like to do on normalizing the database.

[semi-urgent] Footer display in dsahboards is fucked up

See e.g., https://metaforecast.org/dashboards?dashboardId=ad190968c6, or the dashboards embedded in https://globalguessing.com/russia-ukraine-forecasts/

Estimated timeline until graphql API?

Hey @berekuk, someone recently asked for a timeline for the metaforecast graphql API. Thoughts, e.g., as a 90% confidence interval?

[Umbrella] New DB layer

"Merge platform tables" is the new one here. I think it would be better if we just stored all forecasts in a single forecasts (currently combined) table.

There's also a possible further roadmap with normalizing the database structure, i.e. extracting most of JSON fields into separate tables. But that can be done later in a separate issue.

Time estimate: uh, 5-10 hours, I guess? Maybe less. Prisma client might take a bit more time, but all these steps seem quite straightforward to me.

Scheduler on Heroku OOMs (critical)

latest.combined haven't updated since 26th:

metaforecastpg=> select date(timestamp) as d, count(1) from latest.combined group by d;
     d      | count
------------+-------
 2022-03-26 |  4807
(1 row)

Logs:

2022-03-27T11:59:04.388668+00:00 app[scheduler.4186]: ****************************
2022-03-27T11:59:04.388689+00:00 app[scheduler.4186]: polymarket
2022-03-27T11:59:04.388707+00:00 app[scheduler.4186]: ****************************
2022-03-27T11:59:04.388900+00:00 app[scheduler.4186]: Initial try
2022-03-27T11:59:18.435686+00:00 app[scheduler.4186]:
2022-03-27T11:59:18.435693+00:00 app[scheduler.4186]: <--- Last few GCs --->
2022-03-27T11:59:18.435694+00:00 app[scheduler.4186]:
2022-03-27T11:59:18.435696+00:00 app[scheduler.4186]: [4:0x4e3b870] 10716253 ms: Scavenge (reduce) 254.0 (257.6) -> 253.9 (258.4) MB, 1.9 / 0.0 ms  (average mu = 0.975, current mu = 0.863) allocation failure
2022-03-27T11:59:18.435697+00:00 app[scheduler.4186]: [4:0x4e3b870] 10716319 ms: Mark-sweep (reduce) 254.9 (258.4) -> 254.8 (259.4) MB, 57.1 / 0.0 ms  (+ 0.4 ms in 14 steps since start of marking, biggest step 0.1 ms, walltime since start of marking 126 ms) (average mu = 0.946, current mu = 0.631) allocation
2022-03-27T11:59:18.435702+00:00 app[scheduler.4186]:
2022-03-27T11:59:18.435702+00:00 app[scheduler.4186]: <--- JS stacktrace --->
2022-03-27T11:59:18.435702+00:00 app[scheduler.4186]:
2022-03-27T11:59:18.436392+00:00 app[scheduler.4186]: FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
2022-03-27T11:59:18.443246+00:00 app[scheduler.4186]: 1: 0xb09980 node::Abort() [node]
2022-03-27T11:59:18.444078+00:00 app[scheduler.4186]: 2: 0xa1c235 node::FatalError(char const*, char const*) [node]
2022-03-27T11:59:18.444720+00:00 app[scheduler.4186]: 3: 0xcf784e v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [node]
2022-03-27T11:59:18.445464+00:00 app[scheduler.4186]: 4: 0xcf7bc7 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [node]
2022-03-27T11:59:18.446186+00:00 app[scheduler.4186]: 5: 0xeaf465  [node]
2022-03-27T11:59:18.446934+00:00 app[scheduler.4186]: 6: 0xebf12d v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [node]
2022-03-27T11:59:18.447900+00:00 app[scheduler.4186]: 7: 0xec1e2e v8::internal::Heap::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [node]
2022-03-27T11:59:18.448632+00:00 app[scheduler.4186]: 8: 0xe830a2 v8::internal::Factory::AllocateRaw(int, v8::internal::AllocationType, v8::internal::AllocationAlignment) [node]
2022-03-27T11:59:18.449686+00:00 app[scheduler.4186]: 9: 0xe7b6b4 v8::internal::FactoryBase<v8::internal::Factory>::AllocateRawWithImmortalMap(int, v8::internal::AllocationType, v8::internal::Map, v8::internal::AllocationAlignment) [node]
2022-03-27T11:59:18.450613+00:00 app[scheduler.4186]: 10: 0xe7d3c0 v8::internal::FactoryBase<v8::internal::Factory>::NewRawOneByteString(int, v8::internal::AllocationType) [node]
2022-03-27T11:59:18.451589+00:00 app[scheduler.4186]: 11: 0xfa14c9 v8::internal::JsonParser<unsigned short>::MakeString(v8::internal::JsonString const&, v8::internal::Handle<v8::internal::String>) [node]
2022-03-27T11:59:18.452739+00:00 app[scheduler.4186]: 12: 0xfa359d v8::internal::JsonParser<unsigned short>::ParseJsonValue() [node]
2022-03-27T11:59:18.453848+00:00 app[scheduler.4186]: 13: 0xfa3d2f v8::internal::JsonParser<unsigned short>::ParseJson() [node]
2022-03-27T11:59:18.454520+00:00 app[scheduler.4186]: 14: 0xd7973b v8::internal::Builtin_JsonParse(int, unsigned long*, v8::internal::Isolate*) [node]
2022-03-27T11:59:18.455444+00:00 app[scheduler.4186]: 15: 0x15f0bf9  [node]
2022-03-27T11:59:18.608389+00:00 heroku[scheduler.4186]: Process exited with status 134
2022-03-27T11:59:18.670672+00:00 heroku[scheduler.4186]: State changed from up to complete

(it also failed today due to .js/.ts misconfiguration, but I fixed that already)

I'm looking into ways to optimize the memory usage now, but we might also want to increase the dyno size on Heroku in the meantime?

Incorporate dashboards and forecasts into lesswrong/EA Forum.

Previously, Habryka expressed some interest on incorporating dashboards into the EA Forum/Lesswrong. Before doing that, I think it would be nice it to:

Make the backend for dashboards more robust (the graphql changes should suffice)
- Separately, dashboards are now being indexed by the hash of the forecasts, rather than by the hash of the forecast-ids. Should be easy to change.
Make dashboards searchable by name?
Make it easier to embed questions.
Deal with the edge-case where forecasts have resolved (possibly fetch resolutions; see #10)
- Right now, resolved questions simply disappear from dashboards.

I imagine that people embedding individual forecasts would also want to display the history (#28), but that's a separate feature.

x-risk database forecasts gone missing in frontend

E.g., try searching for something in https://docs.google.com/spreadsheets/d/1W10B6NJjicD8O0STPiT3tNV3oFnT8YsfjmtYR8RO_RI/edit#gid=0

GraphQL RFC

I'll explain my opinions and decisions I'm planning to make regarding the upcoming GraphQL API.

Some of these should IMO be done upfront, some are more like a future roadmap and braindump of my previous experience.

Short version:

let's use urql, nexus and graphql-code-generator
mutations, inputs and error handling conventions are important from the start
we need to decide on API stability guarantees, but this can be done later

(long version)

We face multiple choices:

which client-side library to use for graphql requests
how to implement graphql server on the backend side of API
how to handle non-trivial parts of API, e.g. mutations and errors
how much of backward compatibility we're going to provide for third-party users

1. Client-side library

There are multiple possible options for how to do graphql queries from the frontend.

Most of my own experience with Apollo Client, and I have some experience with using urql in a minor unfinished project (both are pretty powerful and include a normalized document cache; both seem pretty much equivalent to me). IMO Apollo Client's implementation is bloated and I'd go with urql for any new project.

We could also live on just graphql-request (which is already used in the codebase on the backend) or react-query for a while, but eventually any non-trivial graphql codebase requires a normalized document cache, so I propose we go with urql to avoid later refactorings.

I'd also like to set up graphql-code-generator immediately; it makes the overall dev experience much nicer.

2. GraphQL server

It's possible to write a graphql server from scratch with graphql-js, but that's quite low-level and annoying.

Prisma (which we probably will adopt later as an ORM, but that's a different topic) recommends three solutions: graphql-tools, Nexus and TypeGraphQL.

Graphql-tools is SDL-first (first you write GraphQL schema and then you implement the methods listed in schema); in my experience SDL-first approach can become limiting, e.g. if we list all platforms in code and then want to turn them into graphql enum then we'd have to do it by hand, or deal with string templates to generate an SDL from code.

So I prefer code-first approach (I've tried both). This leaves us with Nexus vs TypeGraphQL. I haven't tried TypeGraphQL, but I remember evaluating this choice in the past (a year ago or so) and deciding that Nexus is a safer choice, though TypeGraphQL APIs might be more convenient, with more decorator magic. I've used Nexus in a few small projects in the last year and had no significant issues.

Three minor issues I had with Nexus:

It's kind of weird how it regenerates the types based on code, so you often type the code which is not valid in typescript's opinion, then you save it, then typescript decides that it's ok; this is not a bug, though, it's how it's supposed to work with Nexus, and overall the experience is nice when you get used to it.
Its integration with Prisma was deprecated and there were no good solution for some time; but looks like they have recently updated the https://nexusjs.org/docs/adoption-guides/prisma-users page so it should be better.
Its source types is hard to wrap your head around first, but they might be doing as good as possible there, considering that source type vs GraphQL type issue is intrinsic with how GraphQL resolvers always work; at least they provide several different ways to configure types matching (see the link if you want to know more), so it's hard to draw yourself into a corner.

I lean towards just going with Nexus, but considering that GraphQL server usually requires a lot of repetitive code and the cost of switching to a different framework later is O(number-of-implemented-fields), maybe I should evaluate TypeGraphQL vs Nexus (vs something else?) for a few hours first before settling on a decision.

Other considerations:

current database structure is not really GraphQL-friendly since most data is stored in JSON fields; it might make sense to implement the really basic GraphQL API first, then work on Prisma/ORM refactoring (I think the entire DB layer of metaforecast should be rewritten, btw), and then expand on GraphQL
dataloaders are great for avoiding N+1 performance issues in any non-trivial GraphQL server; we'll get them when we get Prisma (if we settle on Prisma)
auth can be done by cookie initially and by user-generated tokens later when we get user accounts

3. API design

I'll describe my previous experience with GraphQL API design here. I probably put more details here than necessary, but this can be the draft for the future coding style docs, so I hope it's not a waste.

GraphQL queries are mostly straightforward and easy to evolve.

GraphQL mutations are a bit more tricky: there's a temptation of cutting the corners and doing something like:

type Mutation {
  createForecast(title: String!, description: String) Boolean
}

Issues with this example:

First, there are just two input fields for now, but it might grow up to ten or more.

And then you'd have to query it like this, even for two fields:

const q = gql```
mutation CreateForecast($title: String!, $description: String!) {
  createForecast(title: $title, description: $description)
}
```;

graphqlClient.request(q, { title, description });

It gets annoying really quickly.

The solution for this is to create Input types for every mutation, even if it feels excessive. Having common conventions for everything is great, and backend code can be simplified with helper functions to avoid copy-paste (this is one of the reasons why code-based is better than SDL-based).

input CreateForecastInput {
  title: String!
  description: String!
}

type Mutation {
  createForecast(input: CreateForecastInput!) Boolean
}

I'm currently agnostic on whether the input pattern is needed for non-mutation fields. Probably not by default, since those inputs usually evolve much slower.

Second, return values. You might think initially that your mutation is trivial and you don't need to return anything, but it's hard (read "impossible without breaking prod") to change a graphql field from scalar value to object.

Also, if you return an "obvious" object (e.g., Forecast in case of createForecast example), then you won't be able to return anything else beside it.

The best practice for this is to always create a Result object for every GraphQL mutation.

So:

type CreateForecastResult {
  forecast: Forecast
  error: String
}

type Mutation {
  createForecast(input: CreateForecastInput!) CreateForecastResult!
}

Or, maybe:

union CreateForecastResult = Forecast | GenericError

type Mutation {
  createForecast(input: CreateForecastInput!) CreateForecastResult!
}

Or:

type CreateForecastOkResult {
  forecast: Forecast!
  # room for more fields
}

union CreateForecastResult = CreateForecastOkResult | GenericError

type Mutation {
  createForecast(input: CreateForecastInput!) CreateForecastResult!
}

I'm unsure if the last one is not an overkill, but all of these are better than returning a scalar or something like { ok: true }.

This might feel like too much stuff for every minor mutation, but... well, I tried to cut these corners and didn't like the consequences.

Third, error handling. As can be seen in the examples above, I'm returning a GenericError object instead of relying on GraphQL native errors. That's because native GraphQL errors suck and everyone agrees that you shouldn't rely on them if code didn't fail through an exception. This is a large topic and I won't expand on it here, but basically we should treat error objects as first-class citizens in our API design.

Fourth, on nulls. This is a topic on which I diverge from the mainstream opinions in GraphQL community, or at least from the opinions of the initial GraphQL standard authors. Standards basically say "mark fields as non-nullable only when you know what you're doing".

This helps with graceful degradation when your project is built from multiple backend microservices where each microservice provides its chunk of data and can fail separately; but in my experience it's too much pain to no clear benefit to check every field if it's null on the frontend, and I prefer to just mark everything with !s by default.

Fifth, on pagination. Don't have a custom opinion here, let's just use Connections. Again, in my previous project I tried to cut this corner with custom page: $id inputs and had to refactor later. Relay-style pagination is clunky but other approaches are worse.

4. Stability guarantees

GraphQL is easy to evolve, but when we get third-party clients eventually, we'll have to avoid changing it too frequently.

This means we'll have to go through the "deprecate a field, wait for a few weeks or months, remove the field" dance for all refactorings.

Depending on how large the project is going to be, this might mean setting up a mailing list for devs and sending notifications about backward-incompatible API changes there.

Having some statistics on "how many clients have requested this field over last N days" might be nice too. Apollo Studio does this, but I don't have any experience with it since it's a bit costly and seems too much like a corporate lock-in. This might be done by hand with GraphQL tracing tools.

But all this is just stuff for later discussions and we don't need to worry about it for now, I guess.

Picking a platform

If we do my proposal from #4, which platform should we pick for the main Next.js instance?

Top options for me are: Vercel, DO and Netlify.

My own preferences:

Vercel IMO is awesome and zero-conf serverless is great, but $40 for two-people team is a bit costly
- I’m pretty sure that Vercel is enough for everything (if we keep an external Postgres instance on DO, or maybe use something like https://supabase.com/)
- previews and built-in analytics are nice
Digital Ocean is fine too; I could spend some time formalizing the deployment (e.g. with ansible or just shell scripts, or with docker), but I understand that's not a priority right now
- depending on how much metaforecast will grow in the future, auto-scaling would be nice to have; though I don't know how much traffic metaforecast gets right now but it could probably scale up to 100x on a single instance easily
- SLA (not going down due to single server issues or deployment accidents) might be more important than the load-balancing then, and it's easier to achieve with Vercel
- flexibility is higher with self-managed VPS, but I'm not sure we'll ever need it
  - Vercel plays nice re: platform lock-ins, so it'd be easy to move back to VPS
Netlify is probably a slightly worse choice than Vercel just because Vercel owns Next.js, but I don't have that much experience with Netlify

Figure out how to bring env & secrets to Netlify/Vercel/our hosting platform.

As I mentioned in #8 (comment), there's an issue with using env variables as a source of configuration: AWS Lambda doesn't allow env larger than 4kb.

The same issue would follow us on Vercel, and their workaround article on this is too hacky.

I'm actually surprised that I couldn't google up tons of blog posts about this, there are some threads on Netlify forum, but I expected "I need to bring my configs to production and don't want to commit them to the public github repo" to be a common use case.

Some solutions I've thought up so far:

implement a custom build step on Netlify which would run curl https://my-secret-url/env.production?token=supersecrettoken >.env (I don't like this, exposing secrets via GET request, even with a secret url; is... well, technically it's ok, but feels hacky and risky; also, there's a question of where to host it).
the same solution with fetching secrets on build, but with an external secrets provider such as AWS Secrets Manager; this would add an extra dependency.
we could move all platform cookies to the database and keep a tiny enough env file; technically only DIGITALOCEAN_POSTGRES could be secret and everything else we could store in the DB or commit to the repo.
I think it might also possible to deploy to netlify through the github actions; I don't have any experience with github actions, though, and not sure (my CI/CD experience is on gitlab, which is somewhat different); I expect this to be a viable route, but too complicated compared to the current "netlify pulls everything with zero configuration" approach.

Out of all these, I think (3) is the best route.

Especially since we might like to store cookies in the DB for other reasons anyway. For example, if some platform has some short-lived cookies then we'll want to store login+password in the configuration, and a cookie somewhere else.

Extract consensus forecast from similar questions.

Proposal in Spanish:

Me pregunto si consideraste la posibilidad de extender Metaforecast para que extraiga un "consensus forecast" y luego identifique desviaciones significativas en los mercados de apuestas? Supongo que el principal obstáculo es la clasificación de las distintas preguntas entre aquellas que expresan la misma proposición con distinto lenguaje. Pero este proceso se podría automatizar, delegándolo a un asistente humano (yo conozco una persona a quien se le podría pagar para que se encargue de esta tarea).

Note: could be automated for dashboards easily?

Optimize database upserts.

Right now upserts are done really inefficiently: the whole latest.platform_name database is deleted, and forecasts are inserted one by one. This is inefficient and could be much improved.

It's not clear whether it's worth it to optimize on postgres directly, rather than waiting until the graphql code is set up, and in fact I'm leaning towards the later.

Suggestion: Generate image previews using Vercel

Saw that your images seem to include an upload-to-imgur step that's running into API limits. Obviously, you could upload to a cloud service (GCP, AWS) instead, but I'd actually suggest rendering a dynamic image using Vercel -- it's actually really easy!

Manifold did this basically by copying https://github.com/vercel/og-image and tweaking it, resulting in a nice preview image that gets rendered whenever a link is posted on Twitter/Discord/etc. Happy to share our code if it'd help, or explain more!

(Sidenote: from a UX perspective, would be cool if Capture was a thing you could get by clicking on an "image"/"share" icon directly from the main Metaforecast site; right now, it's fairly hard to discover)

Terminology

So, the "forecast" word is a kinda ambiguous, it can mean either an entire topic, or a single prediction on a given topic.

This ambiguity is already present in the current codebase: "forecast" in numforecasts has a different meaning from the Forecast type. Or maybe I should've named the type differently?

Word "Option" from Forecast.options seems sub-optimal too, since it's so generic.

Maybe "Topic" or "Question" for questions and "Outcome" for options would be clearer? But I don't have any good ideas for response-on-topic ("bet" is too finance-related, "forecast" is still ambiguous, "prediction" is ok, I guess).

@NunoSempere, you have more experience with various platforms and their terminology, what do you think?

I'm writing this now because I'm doing DB migrations from #33, and if we merge all platform tables into a single table then we should change its name from combined to something else; also, we need to settle on a stable terminology before exposing any public APIs.

[Umbrella] GraphQL

Continuing from #21, here's the umbrella issue for adopting GraphQL.

decide on Nexus vs TypeGraphQL
scaffolding: GraphQL Playground
scaffolding: basic server side code with Nexus/TypeGraphQL (even if empty / with one trivial method)
scaffolding: graphql-code-generator
scaffolding: urql (might require writing a few utility wrapper components)
convert existing REST APIs to GraphQL (dashboard, frontpage, squiggle)

All this is going to take me ~10-15 hours, I think (unless I pick TypeGraphQL, then maybe more since I'm not familiar with it).

This issue doesn't require adopting Prisma, but I might do them both in parallel, not sure yet.

(If any of these subtasks will become large enough then I'll extract them into separate issues and link from here, hence "umbrella".)

Get rid of postgres schemas - ok?

I was surprised that the current codebase uses PG schemas (latest and history) to separate table namespaces.

I admit I don't have much experience with postgres (most of my experience is with mysql), so it might be that, but also I couldn't understand why it's necessary.

I planned to just go with the flow and adopt this instead of complaining, but then I tried to write a PR which uses Prisma Migrate for database migrations (since managing it by hand with pgInitizalize* functions and YOLO flags is becoming unhandy)... and it turns out Prisma doesn't support tables in multiple schemas. Or at least doesn't mention it in the docs.

Which updated me in the direction of "maybe it's not the best practice for postgres either", so I googled around a bit and I think people mostly say that schemas are for multi-tenant cases or for multiple apps, not for a single app with namespacing needs (which are better managed by underscore prefixes).

So maybe we should just move everything to public?

Create a map of all services we are using.

This pull request introduces Terraform, and argues for using Terraform cloud, but mentions that this would imply adding yet another service.

I would feel more comfortable adding a new service with a map of how these services fit together. Here is an example from a small previous project:

Thoughts?

Typescript?

Can I (gradually) convert everything to Typescript? :)

It seems to me that TS have "won" and everyone converts their projects to it eventually (unless they pick something like ReasonML or Elm).

The costs for converting to TS are proportional to the amount of code, so it might be better to start this early.

I think even with "noImplicitAny": false (e.g., most of JS code is a valid TS code) it still might be beneficial for safer refactorings and adopting types gradually.

Query can't be shortened to its prefix in URL

Steps to reproduce:

search "trump123"
remove "123"
observe that url still contains "trump123" (due to href.includes check in commonDisplay code)

(I'll fix this later)

Unify backend/frontend/API code?

Since getStaticProps and getServerSideProps code (and everything downstream of those) is not embedded in the frontend bundle, querying the DB directly from those is actually fine and recommended (see e.g. https://nextjs.org/docs/basic-features/data-fetching/get-server-side-props)

Unless I’m missing something, this means it’d be fine to unify all the code (frontend, backend and api) in a single Next.js app.

Secrets e.g. Postgres URL can be stored in env too (only the ones with NEXT_PUBLIC_ prefix are passed to the frontend, see https://nextjs.org/docs/basic-features/environment-variables).

If I understand correctly, the backend includes some kind of scheduler for updating the DB? (I haven't looked into the details yet). That can be handled with an API route and something like https://vercel.com/docs/concepts/solutions/cron-jobs (if we decide to use Vercel, see the next issue).

Metaforecast was briefly down

Metaforecast was briefly down, by guess as a result of merging #53, perhaps because of the same reason as this comment. I restored a previous version on netlify.

Metaculus fetch treats prediction as 0% when no community prediction

See e.g. "Will an additional state join NATO by 2024?" in https://metaforecast.org/dashboards?dashboardId=6fa27dad1e

Is deleting deprecated code ok?

I'm continuing my typescript refactorings and some code is too broken to be converted (platforms/deprecated/ stuff) and too dead to care about.

Is it ok if I just remove it from the repo? It can always be brought back from git archives if necessary.

Similar question about backend/utils/misc stuff, most of it seem like one-shot scripts which probably shouldn't be maintained indefinitely. But the intent is not always clear to me. And also flow/history/old.

[RFC] New event-based `history` storage

This is something I've had in mind for the last few days, but it's still incomplete.

Right now the history table is populated with question snapshots. This seems suboptimal:

We'll probably want to normalize the questions data eventually, e.g. extract options in a separate table (for performance and to unlock the possibility of more complex SQL queries), and it's unclear how to adapt the current history schema for that
history table is denormalized and includes a lot of duplicate data; also, it grows proportionally to the frequency with which we fetch the sources, and so doesn't play well with plans from #35
Performance will probably suffer too; this might affect #28, though I'm not sure by how much

Alternative: implement an event-based storage which tracks only the changes in fields.

E.g., list of fields for the new table:

pk (serial id)
question_id
field (can be title, description, stars)
value (new value)
timestamp

Unique index by question_id + field.

This table would be populated only if the field value has changed. If the field hasn't changed from the previous fetch then there's no need to save it again.

This proposal is incomplete:

it doesn't explain how to track "deep" properties, e.g. if question had a change in one of the option titles, it's unclear what to put in field
I'm still confused on "we just fetched the new question data with its entire forecasts history from the platform" (because the platform provides the historical data) vs "we fetched the new question data and store its snapshot in our history table" — these are two different scenarios, ideally we need to handle both and abstract it away from the end users

I'll think about this some more before doing any code changes, and I'll wait until I become more familiar with the specifics of different platforms that we support. Just throwing this idea out there to gestate for now.

Decide whether to proxy plausible.io queries, pay netlify for analytics, or do nothing.

Plausible was being queried through a netlify snippet injection (https://app.netlify.com/sites/metaforecast/settings/deploys#post-processing), which wasn't working. I changed this to use https://github.com/4lejandrito/next-plausible in _app.tsx. But plausible recommends proxying it instead (https://plausible.io/docs/proxy/introduction). We could do that, though it sounds kind of complicated.

Instead, we could pay netlify $9/month for better analytics (e.g., because netlify serves the webpage, its analytics can't be blocked by adblockers). Personally, I think it would be worth it, but it also hurts a little bit.

numDisplay + 50 causes unstable results

Steps to reproduce:

visit https://metaforecast.org/?query=trump&numDisplay=1
click "show more" at the bottom a few times
compare with https://metaforecast.org/?query=trump&numDisplay=200

This is due to "show more" not performing a real search, and searchAccordingToQueryData requesting just numDisplay + 50 results.

DB latency

Continuing from comments on #37.

Heroku<-> DO ping latency is still not good:

> ping({address: 'postgres-green-do-user-11243032-0.b.db.ondigitalocean.com', port: 25060}, (err, data) => {console.log(data)})
undefined
> {
  address: 'postgres-green-do-user-11243032-0.b.db.ondigitalocean.com',
  port: 25060,
  attempts: 10,
  avg: 73.2166266,
  max: 80.681642,
  min: 71.470983,
  results: [
    { seq: 0, time: 71.649331 },
    { seq: 1, time: 72.01795 },
    { seq: 2, time: 71.470983 },
    { seq: 3, time: 71.492197 },
    { seq: 4, time: 71.61182 },
    { seq: 5, time: 71.533632 },
    { seq: 6, time: 74.224676 },
    { seq: 7, time: 73.444478 },
    { seq: 8, time: 80.681642 },
    { seq: 9, time: 74.039557 }
  ]
}

DO instance was in SFO3, and Heroku is US East on AWS.

Turns out DO allows to change the region while keeping the DB available (what kind of black magic is this? I really didn't expect it to work, but I took the risk and changed it to NYC1), will check again after the migration.

Bettter logging and error checking

Right now, fetchers occasionally fail when the website is updated, when I forget to update cookies, etc. Right now, logging and error checking exists a bit (using Papertrail on Heroku). But it could be improved

Initial: Suggestions

Limit randomly displayed forecasts on homepage to collection of whitelisted forecasts
Change forecast update time until >12pm GMT such that you capture the daily updates from Good Judgment Inc
Display forecasts graphically with certain parameters (minimum, time: 24hr; 7d; 14d; 1mo; etc | also consider: log & delta probability)
In addition to current probability on forecast display, also show delta of Q probability given certain time parameters (24hr; 7d; 14d; 1mo; etc)
Editable dashboards (re-organize order of questions; add or remove questions from dashboard)
Sections in dashboards (topic: Russia-Ukraine; section 1 = sanctions Qs; section 2 = NATO Qs; etc.)
Display options for dashboards (toggle description text; number of columns; enable graphs for selected questions; etc)
Aggregate similar forecasts into a blended forecast given certain parameters (length of question; stars / subjective weights; etc) and feature said forecasts in dashboards
Organize successive forecasts into a single chart / display and feature in dashboards (e.g., 10K, 25K, 50K, 100K War Death Qs all together)
Select forecasts to compare, including plotting on same chart (e.g., probability of peace deal and forecasts on wheat prices)

Later: Updated front-end design

Figure out how to display forecast history

Right now, we are saving forecasts' history, but we are not doing much with it. It would be nice to display forecast evolution. But then we'd have to figure out how to display it (e.g., in forecast-specific pages? in the normal view?), which might be a bit tricky. If we have forecast-specific pages, we might also want to do some other stuff with it...

What is `INSERT INTO latest.frontpage(frontpage_full, frontpage_sliced) VALUES($1, $2)` doing, in `src/backend/frontpage.ts`?

I don't get what this line is doing, and the Postgres documentation isn't much help.

In particular, I'm confused by inserting two tables into another table. Just pointing me to some documentation on this is fine.

Heroku not automatically building from Github

See https://status.heroku.com/

Frontpage is slow

I occasionally get "over 10 seconds" timeout errors from the frontpage. Fetching the frontpage data is the culprit and it's pretty bad currently:

$ ab -n 10 https://metaforecast.org/api/frontpage
This is ApacheBench, Version 2.3 <$Revision: 1879490 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking metaforecast.org (be patient).....done


Server Software:        Netlify
Server Hostname:        metaforecast.org
Server Port:            443
SSL/TLS Protocol:       TLSv1.2,ECDHE-ECDSA-CHACHA20-POLY1305,256,256
Server Temp Key:        ECDH X25519 253 bits
TLS Server Name:        metaforecast.org

Document Path:          /api/frontpage
Document Length:        115 bytes

Concurrency Level:      1
Time taken for tests:   33.563 seconds
Complete requests:      10
Failed requests:        9
   (Connect: 0, Receive: 0, Length: 9, Exceptions: 0)
Non-2xx responses:      1
Total transferred:      1103648 bytes
HTML transferred:       1100869 bytes
Requests per second:    0.30 [#/sec] (mean)
Time per request:       3356.261 [ms] (mean)
Time per request:       3356.261 [ms] (mean, across all concurrent requests)
Transfer rate:          32.11 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:      575  581   4.4    581     590
Processing:  1812 2775 2648.0   1883   10284
Waiting:     1229 2246 2833.0   1308   10284
Total:       2393 3356 2646.9   2469   10862

Percentage of the requests served within a certain time (ms)
  50%   2469
  66%   2485
  75%   2539
  80%   3149
  90%  10862
  95%  10862
  98%  10862
  99%  10862
 100%  10862 (longest request)

Not sure yet what's causing this. Selecting a single value from the DB, even a large one, shouldn't take that long.

Asking for frontpage_sliced from Heroku is ok:

defaultdb=> \timing
Timing is on.
defaultdb=> select frontpage_sliced from frontpage order by id desc limit 1;
Time: 31.995 ms

This might be related to the ping distance/latency (Netlify <-> DO) again. But then /api/dashboard-by-id would be even slower since it does multiple queries while /api/frontpage does a single one. (/api/dashboard-by-id it not great either, its median time is over 1s, but it's still fater). Need to investigate further.

Switch to Vercel

So, I was still annoyed by #44 and looked further, and it turns out that, uh, Netlify is just slow.

I spinned up a test instance on Vercel to compare, and the difference is huge.

Frontpage from my server in Europe:

# ab -n 30 'https://metaforecast.vercel.app/'
[...]

Percentage of the requests served within a certain time (ms)
  50%    334
  66%    367
  75%    454
  80%    520
  90%    617
  95%    797
  98%    823
  99%    823
 100%    823 (longest request)

# ab -n 30 'https://metaforecast.org/'
[...]

Percentage of the requests served within a certain time (ms)
  50%   1369
  66%   1396
  75%   1400
  80%   1429
  90%   1516
  95%   1521
  98%   1522
  99%   1522
 100%   1522 (longest request)

Frontpage from a temporary DO droplet in NYC1:

#  ab -n 30 'https://metaforecast.vercel.app/'
[...]

Percentage of the requests served within a certain time (ms)
  50%    167
  66%    183
  75%    209
  80%    238
  90%    265
  95%    332
  98%   4294
  99%   4294
 100%   4294 (longest request)

(4294ms response is an outlier, I left this run as an example, but most of my runs with -n 30 give ~300 ms longest request time)

# ab -n 30 'https://metaforecast.org/'
[...]

Percentage of the requests served within a certain time (ms)
  50%    683
  66%    700
  75%    720
  80%    733
  90%    815
  95%    897
  98%    993
  99%    993
 100%    993 (longest request)

Static pages, e.g. /about (from NYC1):

# ab -n 30 'https://metaforecast.vercel.app/about'
[...]

Percentage of the requests served within a certain time (ms)
  50%     42
  66%     43
  75%     45
  80%     51
  90%     62
  95%     75
  98%    115
  99%    115
 100%    115 (longest request)

# ab -n 30 'https://metaforecast.org/about'
[...]

Percentage of the requests served within a certain time (ms)
  50%    240
  66%    241
  75%    241
  80%    241
  90%    241
  95%    243
  98%    322
  99%    322
 100%    322 (longest request)

I don't know why Netlify is so slow, but I believe it's worth switching from.

Vercel is a bit costlier ($20 per team member vs $15/$19 on Netlify), but it has (mostly) better limits:

1 TB bandwidth per month instead of 400 GB on Netlify (though costlier additional bandwidth)
serverless function timeout is 60 seconds on Pro plan
24000 minutes of build time (vs just 300 or 1000 on Netlify, and we've been hitting these limits on Netlify already)

There are some other limits, it's hard to compare all of them, see https://vercel.com/pricing vs https://www.netlify.com/pricing, but I don't see anything that would be a problem until metaforecast grows 10x or more.

Create a storybook (https://storybook.js.org/) with reusable metaforecast components

Would be nice, https://storybook.js.org/, but isn't critical

[Umbrella/RFC] Improvements to the scheduler & platforms layer

So I took a closer look at how backend/platforms are implemented, and I have a few ideas on how to improve.

@NunoSempere, let me know if you're ok with all these or if you have other suggestions.

Separate fetching and storing layers more clearly (see #34)
CLI with names instead of numbers (current approach seems quite fragile to me)
#9
Unify platforms list on frontend and backend (will require figuring out how to avoid bringing backend platforms code to the frontend bundle, but I think it's doable with lazy loading or some other trick)
#35
Support incremental updates, as I mentioned in PS in my #30 (comment) comment

This is a longer term roadmap, and the last two items are a bit more complex, but I believe that they're quite beneficial. Getting closer to the "metaforecast data is never stale" goal would be great.

forecastingPlatforms in URL is broken

getServerSideProps from /index and /capture passes platformsWithLabels array of objects, and commonDisplay code expects an array of objects.

But when forecastingPlatforms is passes in URL, it's just a string of "Foo|Bar|Baz" which is not handled anywhere as far as I can see.

So, any URL such as https://metaforecast.org/?forecastingPlatforms=Betfair doesn't really filter by forecastingPlatforms, and also doesn't show platforms in advanced options' select widget.

npm errors

I'm getting a fair number of npm errors:

npm WARN deprecated [email protected]: See https://github.com/lydell/source-map-url#deprecated
npm WARN deprecated [email protected]: flatten is deprecated in favor of utility frameworks such as lodash.
npm WARN deprecated [email protected]: request-promise-native has been deprecated because it extends the now deprecated request package, see https://github.com/request/request/issues/3142
npm WARN deprecated [email protected]: Please see https://github.com/lydell/urix#deprecated
npm WARN deprecated [email protected]: this library is no longer supported
npm WARN deprecated [email protected]: This package has been deprecated and is no longer maintained. Please use @rollup/plugin-inject.
npm WARN deprecated [email protected]: https://github.com/lydell/resolve-url#deprecated
npm WARN deprecated [email protected]: See https://github.com/lydell/source-map-resolve#deprecated
npm WARN deprecated [email protected]: The querystring API is considered Legacy. new code should use the URLSearchParams API instead.
npm WARN deprecated [email protected]: Package no longer supported. Contact Support at https://www.npmjs.com/support for more info.
npm WARN deprecated [email protected]: Please upgrade  to version 7 or higher.  Older versions may use Math.random() in certain circumstances, which is known to be problematic.  See https://v8.dev/blog/math-random for details.
npm WARN deprecated [email protected]: request has been deprecated, see https://github.com/request/request/issues/3142
npm WARN deprecated [email protected]: Please upgrade to @mapbox/node-pre-gyp: the non-scoped node-pre-gyp package is deprecated and only the @mapbox scoped package will recieve updates in the future
npm WARN deprecated [email protected]: core-js@<3 is no longer maintained and not recommended for usage due to the number of issues. Please, upgrade your dependencies to the actual version of core-js@3.
npm WARN deprecated [email protected]: core-js@<3.4 is no longer maintained and not recommended for usage due to the number of issues. Because of the V8 engine whims, feature detection in old core-js versions could cause a slowdown up to 100x even if nothing is polyfilled. Please, upgrade your dependencies to the actual version of core-js.

In addition, the project maybe somehow depends on npm.team.kocherga.club. Is this really necessary? I think this also ended up affecting the heroku build: https://dashboard.heroku.com/apps/metaforecast-backend/activity/builds/5760aab8-f054-484e-ae71-10eca5f1fe40

Integrate with search engines?

E.g., with searx, or with you.com, or duckduckgo?

REST or GraphQL?

If we do #4 then API might not even be necessary right now (it's necessary for in-page operations, but not for the single page data fetching).

So, do we even want REST API if we expect to eventually grow to GraphQL (which is also self-documented, which is a nice bonus)?

I could implement a rudimentary GraphQL endpoint which covers everything from https://github.com/QURIresearch/metaforecast-api-server/blob/master/index.js in a day or two, in case we want to share any kind of API for third-party clients ASAP.

Supporting both REST and GraphQL is a possibility too, but more costly in maintenance in the long run. So this depends on how many API clients we're expecting to get and how much effort we can afford to put into making the life easier for them (things to consider: backward compatibility, documentation, feature parity between APIs and between an API and the website).

Rootclaim fetcher not working

Rootclaim fetcher not working. This is not a priority, since they don't often add new forecasts

Dashboard ids hashing leads to unexpected UX

If I go to https://metaforecast.org/dashboards and create a dashboard with example ids but a modified title, I'll get someone else's dashboard (e.g., the default one), because INSERT in pgInsertIntoDashboard fails quietly and dashboard id is built from question ids only.

Is this a feature or a bug? Seems more like a bug to me, but I'm not entirely sure.

Maybe we should switch to the random uuids/cuids instead of relying on hashing?

Avoid flashing on initial load

Behavior: On initial site load, the site flashes. That is, the content is briefly shown, then it disappears, and then it's shown again.

Desired behavior: That doesn't happen.

You can see a video of this behavior on: https://imgur.com/a/hN50EhG

Steps to reproduce: Load a page you haven't searched for before on an incognito tab on Chrome.

Independent update schedules for different platforms

What I have in mind here: let's keep a platform_status DB table with a timestamp of the last update for each platform; ask every platform to update every minute in scheduler; and skip an update for a platform if its data is fresh enough ("fresh enough" definition can be specified by each platform code separately).

This would allow us to e.g. update polymarket once per day and metaculus every 30 minutes, or vice versa.

It would also give us an ability to force an update from web UI: mark a platform as "needs-to-be-updated" flag with a single button click and don't worry about Heroku.

And also a better observability: it'd be easy to build a (secret? login-only?) page listing all platforms and their current update status. We could also log an exception if platform update failed, store it in the same table, and display it on that web page.