wilsonfreitas / brasa
Extract Brazilian financial data from a wide range of Internet sources: B3, ANBIMA, CVM
License: MIT License
Create a table to manage all files: imported, downloaded, and produced.
Each generated file has a checksum, which serves as the file's key and guarantees its uniqueness.
This makes it possible to track ETL file generation and the dependencies between ETL steps.
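A minimal sketch of such a table, assuming SQLite for storage and SHA-256 for the checksum. Neither choice comes from the project, and the table name, the columns, and the `register_file` helper are hypothetical.

```python
import hashlib
import sqlite3


def file_checksum(data: bytes) -> str:
    """Return the SHA-256 hex digest of the file contents."""
    return hashlib.sha256(data).hexdigest()


def create_files_table(conn: sqlite3.Connection) -> None:
    # Hypothetical schema: the checksum is the primary key, so identical
    # content can be registered only once. parent_checksum links a produced
    # file to the file it was generated from.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS files (
            checksum TEXT PRIMARY KEY,
            kind TEXT NOT NULL
                CHECK (kind IN ('imported', 'downloaded', 'produced')),
            path TEXT NOT NULL,
            parent_checksum TEXT REFERENCES files (checksum)
        )
        """
    )


def register_file(conn, data, kind, path, parent_checksum=None):
    """Insert the file into the table and return its checksum."""
    checksum = file_checksum(data)
    # INSERT OR IGNORE keeps the first registration when content repeats.
    conn.execute(
        "INSERT OR IGNORE INTO files (checksum, kind, path, parent_checksum) "
        "VALUES (?, ?, ?, ?)",
        (checksum, kind, path, parent_checksum),
    )
    return checksum
```

Following the `parent_checksum` column from a produced file back to its source is one way to reconstruct the ETL dependency chain.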
I was reading the article "QuantLib, dez anos depois …" (https://www.wilsonfreitas.net/posts/2023-07-19-quantlib/quantlib.html), which uses data from this project.
Now that the download server no longer exists, the way to build the history is to collect the data daily from the page at this URL: https://www2.bmf.com.br/pages/portal/bmfbovespa/lumis/lum-ajustes-do-pregao-ptBR.asp?
Thank you
It is necessary to handle imported files with templates.
The source encompasses more than just downloaded files; imported files are also included and must be duly logged within the cache management system.
Download/import workflow:
Improve display of brasa.cli
The template filename and id must be the same.
This must be checked during the template loading.
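One way this check could look, as a sketch: it assumes the template has already been parsed into a dictionary and that the id should equal the file name without its extension. The function name `check_template_id` is hypothetical and not part of brasa.

```python
from pathlib import Path


def check_template_id(template_path: str, template: dict) -> None:
    """Raise ValueError when the file name (without extension) differs from the id."""
    stem = Path(template_path).stem
    template_id = template.get("id")
    if stem != template_id:
        raise ValueError(
            f"template file {template_path!r} declares id {template_id!r}; "
            f"expected {stem!r}"
        )
```

Calling this once per file during template loading would surface a mismatch early, before any download is attempted.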
Use logging to register every important detail, for example, company-info, company-details, ...
It is important to register events such as download failures and their causes.
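A minimal sketch of this kind of logging with Python's standard `logging` module. The logger name `brasa.download` and the `logged_download` wrapper are assumptions for illustration, and `fetch` stands in for whatever function actually performs the download.

```python
import logging
from typing import Callable, Optional

# Hypothetical logger name; the project may organize its loggers differently.
logger = logging.getLogger("brasa.download")


def logged_download(
    template_id: str, url: str, fetch: Callable[[str], bytes]
) -> Optional[bytes]:
    """Call fetch(url) and log the outcome; return None on failure."""
    try:
        data = fetch(url)
    except Exception as exc:
        # Record which template failed, which URL was used, and the cause.
        logger.error(
            "download failed: template=%s url=%s cause=%r", template_id, url, exc
        )
        return None
    logger.info(
        "download ok: template=%s url=%s bytes=%d", template_id, url, len(data)
    )
    return data
```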
Format returned tables
Parsers
Utils
New files
The templates provided below include a 'refdate' parameter, which specifies the initial dates or periods to be configured when initiating the download of all files.
Build a command that runs the initial setup, downloading all files related to these templates.
Build a graph to connect templates.
Connected templates improve the dependency system.
Luigi should be used as the dependency system.
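To illustrate what a template graph provides, here is a small sketch that orders templates so that dependencies are processed first. It uses the standard library's `graphlib` rather than Luigi, only to keep the example self-contained. The template ids are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical template ids; each key depends on the templates in its set.
dependencies = {
    "futures-settlement-prices": set(),
    "di1-curve-traded": {"futures-settlement-prices"},
    "di1-curve-standardized": {"di1-curve-traded"},
}


def processing_order(deps: dict) -> list:
    """Return template ids ordered so that every dependency comes first."""
    return list(TopologicalSorter(deps).static_order())
```

In Luigi the same relation would be expressed by each task declaring its upstream tasks; the graph itself is the same.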
Each asset (equity, company, curve, ...) has its symbol and structure (columns, for example).
A central table with all assets would facilitate access to all assets.
Queries against this table could return the matching assets, and a function like get_asset
would return the dataset corresponding to the requested asset.
get_asset("DI1F16") # returns dataset with historical data of DI1F16 future contract
get_asset("DI1") # returns dataset with historical data of all DI1 future contracts
get_asset("DI1T") # returns dataset with historical data of DI1 curve based on traded contracts
get_asset("DI1ST") # returns dataset with historical data of DI1 curve with standardized terms
get_asset("PETR4") # returns dataset with historical data of PETR4
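A rough sketch of how a central asset table could back `get_asset`, under several assumptions: the table layout, the `family` column, and the in-memory `HISTORY` rows are all hypothetical, and the prices are placeholders rather than market data. Curve symbols such as DI1T are left out for brevity.

```python
# Hypothetical central table: one row per asset.
ASSET_TABLE = [
    {"symbol": "DI1F16", "kind": "future", "family": "DI1"},
    {"symbol": "DI1F17", "kind": "future", "family": "DI1"},
    {"symbol": "PETR4", "kind": "equity", "family": None},
]

# Placeholder historical rows; values are dummies, not real prices.
HISTORY = [
    {"refdate": "2015-12-01", "symbol": "DI1F16", "price": 1.0},
    {"refdate": "2015-12-01", "symbol": "DI1F17", "price": 2.0},
    {"refdate": "2015-12-01", "symbol": "PETR4", "price": 3.0},
]


def find_assets(query: str) -> list:
    """Return asset rows whose symbol or family equals the query."""
    return [
        row
        for row in ASSET_TABLE
        if row["symbol"] == query or row["family"] == query
    ]


def get_asset(query: str) -> list:
    """Return the historical rows for every asset matching the query."""
    symbols = {row["symbol"] for row in find_assets(query)}
    if not symbols:
        raise KeyError(query)
    return [row for row in HISTORY if row["symbol"] in symbols]
```

With this layout, `get_asset("DI1F16")` resolves a single contract while `get_asset("DI1")` resolves every contract in the family.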
Tesouro Direto files can be downloaded from the same URL but the arguments vary.
The easy solution is to create different template files, one for each varied argument.
The following questions must be answered:
Proposed structure for downloaded files:
raw/<template>/<varied-argument>/checksum/file
Another proposal:
raw/<template>-<varied-argument>/checksum/file
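The two layouts can be compared side by side with a short sketch. It assumes the checksum is a SHA-256 digest of the file contents, and the function names are hypothetical.

```python
import hashlib
from pathlib import PurePosixPath


def raw_path_nested(template: str, varied_argument: str, data: bytes, filename: str) -> PurePosixPath:
    """First proposal: raw/<template>/<varied-argument>/checksum/file."""
    checksum = hashlib.sha256(data).hexdigest()
    return PurePosixPath("raw", template, varied_argument, checksum, filename)


def raw_path_flat(template: str, varied_argument: str, data: bytes, filename: str) -> PurePosixPath:
    """Second proposal: raw/<template>-<varied-argument>/checksum/file."""
    checksum = hashlib.sha256(data).hexdigest()
    return PurePosixPath("raw", f"{template}-{varied_argument}", checksum, filename)
```

The nested form keeps every variation of a template under one directory, while the flat form makes each variation look like an independent template.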
Template example:
id: td-historical-prices
filename: ~
filetype: CUSTOM
description: Tesouro Direto preços e taxas históricas
downloader:
  verifyssl: false
  function: brasa.downloaders.multi_download
  url: https://cdn.tesouro.gov.br/sistemas-internos/apex/producao/sistemas/sistd/{year}/{contract}_{year}.xls
  format: xls
  use-filename: true
  args:
    year: ~
    contract: ~
  multi:
    contract:
      lft: LFT
      ltn: LTN
      ntnb: NTN-B
      ntnb_principal: NTN-B_Principal
      ntnf: NTN-F
      ntnc: NTN-C
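As an illustration of what the `multi` section in the template above implies, the sketch below expands the URL pattern once per contract for a given year. This is only a guess at the intended behaviour, not how `brasa.downloaders.multi_download` is implemented, and the `expand_urls` helper is hypothetical.

```python
URL_TEMPLATE = (
    "https://cdn.tesouro.gov.br/sistemas-internos/apex/producao/"
    "sistemas/sistd/{year}/{contract}_{year}.xls"
)

# Values of the 'contract' argument, copied from the template's multi section.
MULTI_CONTRACT = {
    "lft": "LFT",
    "ltn": "LTN",
    "ntnb": "NTN-B",
    "ntnb_principal": "NTN-B_Principal",
    "ntnf": "NTN-F",
    "ntnc": "NTN-C",
}


def expand_urls(url_template: str, year: int, contracts: dict) -> dict:
    """Return one URL per contract key for the given year."""
    return {
        key: url_template.format(year=year, contract=value)
        for key, value in contracts.items()
    }
```

Each key of the result could also serve as the varied-argument segment of the download path.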