abjdata's Issues

Extra space in an abbreviation in the pnud_siglas dataset

Hi! I was trying to generate a file of abbreviations (siglas) for pnud_min and noticed that one of them has an extra space. I fixed it with a simple case_when, but in case you want to change it in the package: it is in the abjData::pnud_siglas dataset, column sigla, and the value containing the extra space is "idhm_ e".

Example:

library(dplyr)

# Replace the malformed abbreviation, then keep only the ones present in pnud_min
abjData::pnud_siglas |>
  mutate(
    sigla = case_when(
      sigla == "idhm_ e" ~ "idhm_e",
      TRUE ~ sigla
    )
  ) |>
  filter(sigla %in% names(abjData::pnud_min))
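
A more general variant of the case_when fix above would strip every whitespace character from the column, so any similar stray space is caught as well. The following is only a sketch in base R on a toy vector: "idhm_ e" comes from the report, while "sigla_a" and " sigla_b" are hypothetical values made up for illustration.

```r
# Toy input: "idhm_ e" is the value from the report; the others are hypothetical
siglas <- c("idhm_ e", "sigla_a", " sigla_b")

# Remove every run of whitespace, wherever it occurs in the string
siglas_limpas <- gsub("\\s+", "", siglas)

print(siglas_limpas)
```

This assumes that no abbreviation is supposed to contain a space, which seems reasonable for values that are meant to match column names.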

Missing municipalities in Brazil

When you run the command:

pnud_muni %>% filter(ano == 2010) %>% select(municipio) %>% count()

you find that the pnud database contains 5565 municipalities.

However, it is known that there are 5570 municipalities in Brazil.

I suspect some municipalities are missing from the pnud database, but I am not yet sure which ones.
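
One possible way to identify the missing ones is to compare municipality codes against an up-to-date reference list. The sketch below uses base R on toy data only: both vectors and all the codes in them are hypothetical, and the name of the code column in pnud_muni is not assumed here.

```r
# Hypothetical municipality codes, standing in for an official reference list
codigos_referencia <- c("1000001", "1000002", "1000003", "1000004")

# Hypothetical codes standing in for the ones found in pnud_muni for 2010
codigos_pnud <- c("1000001", "1000003")

# Codes present in the reference list but absent from the pnud data
faltantes <- setdiff(codigos_referencia, codigos_pnud)

print(faltantes)
```

With real data, the reference vector would come from whichever current list of municipalities is being treated as authoritative.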

Release abjData 1.0.0

Prepare for release:

  • Check that description is informative
  • Check licensing of included files
  • devtools::build_readme()
  • usethis::use_cran_comments()
  • devtools::check(remote = TRUE, manual = TRUE)
  • devtools::check_win_devel()
  • rhub::check_for_cran()
  • Update cran-comments.md
  • Review pkgdown reference index for, e.g., missing topics
  • Draft blog post

Submit to CRAN:

  • usethis::use_version('major')
  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted 🎉
  • usethis::use_news_md()
  • usethis::use_github_release()
  • usethis::use_dev_version()
  • Update install instructions in README
  • Finish blog post
  • Tweet
  • Add link to blog post in pkgdown news menu

Object br_uf_map missing in the new version

Hello, how are you?
First, I would like to thank you for making this package available. It has been a great help.

One of the objects we use is br_uf_map, which seems to be missing from this new version. Is it still possible to access it in some way? Has it been renamed?

Web scraping TJSP first-instance cases

Good afternoon,

I am scraping the first-instance case pages of the TJSP using R.
So far I have written the following code:

library(devtools)
library(esaj)
library(xml2)
library(rvest)
library(httr)
library(stringr)

ids <- c( "0000337-06.2013.8.26.0431",
"0000536-96.2011.8.26.0431",
"0000566-63.2013.8.26.0431",
"0000743-27.2013.8.26.0431",
"0001076-18.2009.8.26.0431",
"0001861-09.2011.8.26.0431",
"0001891-10.2012.8.26.0431",
"0002500-32.2008.8.26.0431",
"0002583-43.2011.8.26.0431",
"0003372-08.2012.8.26.0431",
"0003389-44.2012.8.26.0431",
"0003457-91.2012.8.26.0431",
"0003697-22.2008.8.26.0431",
"0004104-57.2010.8.26.0431",
"0004924-08.2012.8.26.0431")
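esaj::download_cpopg(ids, "C:/Users/Danilo/Desktop/bases/html/")

# Note: `arq` is never defined in this snippet; it presumably holds the path to one of the downloaded HTML files
h <- arq %>%
  xml2::read_html(encoding = 'UTF-8')
todas_partes <- h %>% rvest::html_nodes('table') %>% length()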

## Extract the parent table from the HTML file
nodes <- h %>% rvest::html_nodes('table')

## Extract the parent 'td' from the HTML file
lixo <- nodes %>% rvest::html_nodes('td')

## Extract the tr rows from the file
lixo2 <- lixo %>% rvest::html_nodes('tr')

## Clean the file
lixo3 <- lixo2 %>%
  rvest::html_text() %>%
  stringr::str_trim() %>%
  stringr::str_replace_all('&nbsp', '') %>%
  stringr::str_replace_all('[\t]', '') %>%
  stringr::str_replace_all('[\n]', '') %>%
  stringr::str_replace_all('[\r]', '') %>%
  data.frame(stringsAsFactors = FALSE)

However, in the end I get a table in which each element holds several pieces of information; element [29], for example, contains all the information from "Dados do processo" (case data).
Given that, should I write a while loop to split the content into columns?
What is the best practice for building a dataset from TJ-SP case data?

Thank you in advance for the help.

Best regards,
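
As an illustration of the column-splitting question above, a while loop is usually not needed in R: vectorised string functions can split every element at once. The sketch below is only an assumption about the shape of the data. It supposes the cleaned strings look like "label: value" pairs, which may not match the actual TJSP page; the labels and the values "exemplo A" and "exemplo B" are hypothetical, and the case number is taken from the ids above.

```r
# Hypothetical cleaned strings shaped like "label: value" (not real TJSP output)
campos <- c(
  "Processo: 0000337-06.2013.8.26.0431",
  "Classe: exemplo A",
  "Assunto: exemplo B"
)

# Split each string at the first colon only, so values containing ":" stay intact
partes <- regmatches(campos, regexpr(":", campos), invert = TRUE)

# Long format: one row per field, with label and value in separate columns
tabela <- data.frame(
  campo = trimws(vapply(partes, `[`, character(1), 1)),
  valor = trimws(vapply(partes, `[`, character(1), 2)),
  stringsAsFactors = FALSE
)

# Wide format: a single row for the case, one column per field
linha <- as.data.frame(
  as.list(setNames(tabela$valor, tabela$campo)),
  stringsAsFactors = FALSE
)

print(tabela)
print(linha)
```

Applying the same steps to each downloaded file and stacking the resulting one-row data frames would give one row per case, although the real page structure may call for different selectors or separators.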
